Commit bdde010e authored by Josh Rosen, committed by Wenchen Fan

[SPARK-14863][SQL] Cache TreeNode's hashCode by default

Caching TreeNode's `hashCode` can lead to orders-of-magnitude performance improvement in certain optimizer rules when operating on huge/complex schemas.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #12626 from JoshRosen/cache-treenode-hashcode.
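
The effect described above can be sketched in isolation. The following is a minimal illustration, not Spark's actual `TreeNode`: `Node`, `HashCounter`, and `CachedHashDemo` are hypothetical names introduced here, and the only call taken from the change itself is `scala.util.hashing.MurmurHash3.productHash`. It assumes the node is immutable, which is what makes caching the hash safe.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical counter used only to observe how often a hash is computed.
object HashCounter {
  var computations: Int = 0
}

// Hypothetical immutable tree node; a stand-in for a Product-based tree type.
final case class Node(name: String, children: Seq[Node]) {
  // Evaluated at most once per instance, on the first hashCode() call.
  private lazy val cachedHash: Int = {
    HashCounter.computations += 1
    MurmurHash3.productHash(this)
  }

  // Later calls return the stored value instead of re-walking the subtree.
  override def hashCode(): Int = cachedHash
}

object CachedHashDemo {
  def main(args: Array[String]): Unit = {
    val leaf = Node("leaf", Nil)
    val root = Node("root", Seq(leaf, leaf))

    val before = HashCounter.computations
    val first = root.hashCode()
    val second = root.hashCode()

    assert(first == second)
    // One computation for root and one for the shared leaf instance.
    assert(HashCounter.computations - before == 2)
    println(s"hash computations: ${HashCounter.computations - before}")
  }
}
```

Without the cache, every `hashCode()` call on a node would recompute the hashes of its whole subtree, which is the cost that grows with large or deeply nested schemas.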
parent 39a77e15
@@ -71,7 +71,9 @@ object CurrentOrigin {
}
}
// scalastyle:off
abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
// scalastyle:on
self: BaseType =>
val origin: Origin = CurrentOrigin.get
@@ -84,6 +86,9 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
lazy val containsChild: Set[TreeNode[_]] = children.toSet
private lazy val _hashCode: Int = scala.util.hashing.MurmurHash3.productHash(this)
override def hashCode(): Int = _hashCode
/**
* Faster version of equality which short-circuits when two treeNodes are the same instance.
* We don't just override Object.equals, as doing so prevents the scala compiler from
......