[SPARK-14419] [SQL] Improve HashedRelation for key fit within Long
## What changes were proposed in this pull request?

Currently, we use a Java HashMap for HashedRelation if the key fits within a Long. The Java HashMap and CompactBuffer are not memory efficient, and the memory they use is also not accounted for accurately.

This PR introduces a LongToUnsafeRowMap (similar to BytesToBytesMap) for better memory efficiency and performance.

## How was this patch tested?

Updated existing tests.

Author: Davies Liu <davies@databricks.com>

Closes #12190 from davies/long_map2.
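To illustrate why a long-keyed map backed by flat primitive arrays can be more compact than a HashMap with one buffer per key, here is a minimal sketch in Java. It is not the LongToUnsafeRowMap from this patch: the class name `LongKeyMultiMap`, its methods, the hash mixing constant, and the record layout are all assumptions made for illustration, and plain `long` values stand in for rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, NOT Spark's LongToUnsafeRowMap: a long-keyed multimap
// that keeps keys and values in two primitive arrays, so there are no
// per-entry objects and the memory footprint is simply the array lengths.
class LongKeyMultiMap {
    // slots[2*i] = key; slots[2*i+1] = 1-based index of the newest record
    // for that key (0 marks an empty slot).
    private long[] slots;
    // page[2*r] = value; page[2*r+1] = 1-based index of the previous record
    // with the same key (0 marks the end of the chain).
    private long[] page = new long[16];
    private int capacity;      // number of slots, always a power of two
    private int numKeys = 0;
    private int numRecords = 0;

    LongKeyMultiMap(int initialCapacity) {
        // Round up to a power of two so the probe can use a bit mask.
        capacity = Integer.highestOneBit(Math.max(initialCapacity, 2) - 1) << 1;
        slots = new long[2 * capacity];
    }

    // Linear probing: stop at the slot holding `key` or at the first empty slot.
    private int findSlot(long key) {
        int mask = capacity - 1;
        int pos = (int) ((key * 0x9E3779B97F4A7C15L) >>> 32) & mask;
        while (slots[2 * pos + 1] != 0 && slots[2 * pos] != key) {
            pos = (pos + 1) & mask;
        }
        return pos;
    }

    // Double the slot array and re-insert every occupied slot.
    private void grow() {
        long[] old = slots;
        capacity *= 2;
        slots = new long[2 * capacity];
        for (int i = 0; i < old.length; i += 2) {
            if (old[i + 1] != 0) {
                int pos = findSlot(old[i]);
                slots[2 * pos] = old[i];
                slots[2 * pos + 1] = old[i + 1];
            }
        }
    }

    void append(long key, long value) {
        // Keep the load factor at or below 0.5 so probing always terminates.
        if ((numKeys + 1) * 2 > capacity) grow();
        int pos = findSlot(key);
        long prev = slots[2 * pos + 1];   // 0 when the key is new
        if (prev == 0) {
            slots[2 * pos] = key;
            numKeys++;
        }
        if (2 * (numRecords + 1) > page.length) {
            page = Arrays.copyOf(page, page.length * 2);
        }
        page[2 * numRecords] = value;
        page[2 * numRecords + 1] = prev;  // link to the older record, if any
        numRecords++;
        slots[2 * pos + 1] = numRecords;  // 1-based index of the new record
    }

    // Returns all values for `key`, newest first; empty if the key is absent.
    List<Long> get(long key) {
        List<Long> out = new ArrayList<>();
        long rec = slots[2 * findSlot(key) + 1];
        while (rec != 0) {
            int r = (int) (rec - 1);
            out.add(page[2 * r]);
            rec = page[2 * r + 1];
        }
        return out;
    }

    // Bytes held by the backing arrays, which is what makes accounting simple.
    long usedBytes() {
        return 8L * (slots.length + page.length);
    }
}
```

In this sketch, appending `(7, 70)` and then `(7, 71)` makes `get(7)` return the two values newest first, and `usedBytes()` can be computed directly from the two array lengths, which is the kind of accounting that is hard to do for a HashMap of object buffers.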
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala (2 additions, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoin.scala (7 additions, 11 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala (10 additions, 21 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala (467 additions, 221 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoin.scala (8 additions, 43 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala (100 additions, 32 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/ExchangeSuite.scala (4 additions, 4 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/joins/HashedRelationSuite.scala (35 additions, 13 deletions)