[SPARK-14419] [SQL] Improve HashedRelation for key fit within Long
## What changes were proposed in this pull request?

Currently, we use a Java HashMap for HashedRelation if the key fits within a Long. The Java HashMap and CompactBuffer are not memory efficient, and the memory they use is also not accounted for accurately.

This PR introduces a LongToUnsafeRowMap (similar to BytesToBytesMap) for better memory efficiency and performance.

This PR reopens #12190 to fix bugs.

## How was this patch tested?

Existing tests.

Author: Davies Liu <davies@databricks.com>

Closes #12278 from davies/long_map3.
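To illustrate why a map specialised for long keys can be more compact than a general-purpose `java.util.HashMap`, here is a minimal Java sketch of an open-addressing table backed by primitive arrays. It is not the `LongToUnsafeRowMap` from this PR. The class name `LongKeyMapSketch`, its methods, the linear-probing scheme, and the 50% load factor are all assumptions made for illustration, and it stores a single long per key where the real map presumably has to locate row data.

```java
/** Hypothetical sketch: open-addressing map from long keys to long values. */
final class LongKeyMapSketch {
    private long[] keys;     // keys stored inline, no boxed Long objects
    private long[] values;   // one long per key (could encode an offset into row storage)
    private boolean[] used;  // marks occupied slots
    private int mask;        // capacity - 1, capacity is a power of two
    private int size;

    LongKeyMapSketch(int initialCapacity) {
        int cap = 2;
        while (cap < initialCapacity) {
            cap <<= 1;       // round up to a power of two
        }
        allocate(cap);
    }

    private void allocate(int cap) {
        keys = new long[cap];
        values = new long[cap];
        used = new boolean[cap];
        mask = cap - 1;
        size = 0;
    }

    /** Mixes the key bits and maps them to a slot index. */
    private int slot(long key) {
        long h = key * 0x9E3779B97F4A7C15L;
        return ((int) (h ^ (h >>> 32))) & mask;
    }

    /** Inserts or overwrites without checking the load factor. */
    private void insert(long key, long value) {
        int pos = slot(key);
        while (used[pos] && keys[pos] != key) {
            pos = (pos + 1) & mask;  // linear probing
        }
        if (!used[pos]) {
            used[pos] = true;
            keys[pos] = key;
            size++;
        }
        values[pos] = value;
    }

    /** Doubles the table and re-inserts every entry. */
    private void grow() {
        long[] oldKeys = keys;
        long[] oldValues = values;
        boolean[] oldUsed = used;
        allocate(oldKeys.length * 2);
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                insert(oldKeys[i], oldValues[i]);
            }
        }
    }

    void put(long key, long value) {
        if ((size + 1) * 2 > keys.length) {
            grow();  // keep the table at most half full so probing terminates
        }
        insert(key, value);
    }

    long getOrDefault(long key, long missing) {
        int pos = slot(key);
        while (used[pos]) {
            if (keys[pos] == key) {
                return values[pos];
            }
            pos = (pos + 1) & mask;
        }
        return missing;
    }

    int size() {
        return size;
    }
}
```

Because the keys and values live in primitive arrays, the footprint of such a table is a simple function of its capacity. That makes its memory use easier to account for than a `HashMap` of boxed keys and per-entry objects.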
Changed files:
- core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java (5 additions, 9 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala (2 additions, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoin.scala (7 additions, 11 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala (14 additions, 27 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala (427 additions, 221 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoin.scala (8 additions, 43 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala (100 additions, 32 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/ExchangeSuite.scala (4 additions, 4 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/joins/HashedRelationSuite.scala (35 additions, 13 deletions)