[SPARK-13095] [SQL] improve performance for broadcast join with dimension table
This PR improves the performance of broadcast joins with dimension tables, which are common in data warehouses. If the join key fits in a long, we use a special API, `get(Long)`, to fetch the rows from the HashedRelation. If the HashedRelation has only unique keys, we use a special API, `getValue(Long)` or `getValue(InternalRow)`. If the keys fit in a long and are also dense, we use an array of UnsafeRow instead of a hash map.

TODO: will do cleanup

Author: Davies Liu <davies@databricks.com>

Closes #11065 from davies/gen_dim.
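The dense-key idea described in the commit message can be illustrated with a minimal sketch. This is not the code added to `HashedRelation.scala`: the names `UniqueLongRelation`, `MapBackedRelation`, `DenseArrayRelation`, and `build`, the use of `String` as a stand-in for `UnsafeRow`, and the "key range is less than twice the row count" density threshold are all assumptions made for illustration.

```scala
// Hypothetical sketch only: names, the Row stand-in, and the density
// threshold are assumptions, not the classes introduced by this PR.
object DimensionLookupSketch {
  type Row = String // stand-in for UnsafeRow

  // A relation whose long keys are unique, so a lookup yields at most one row.
  trait UniqueLongRelation {
    // Returns the row for `key`, or null when the key has no match.
    def getValue(key: Long): Row
  }

  // General case: unique keys stored in a hash map.
  final class MapBackedRelation(rows: Map[Long, Row]) extends UniqueLongRelation {
    def getValue(key: Long): Row = rows.getOrElse(key, null)
  }

  // Dense case: rows stored in an array indexed by (key - minKey).
  final class DenseArrayRelation(minKey: Long, rows: Array[Row]) extends UniqueLongRelation {
    def getValue(key: Long): Row = {
      val idx = key - minKey
      if (idx >= 0 && idx < rows.length) rows(idx.toInt) else null
    }
  }

  // Picks the array layout when the key range is small relative to the row count.
  def build(pairs: Seq[(Long, Row)]): UniqueLongRelation = {
    val keys = pairs.map(_._1)
    require(keys.distinct.size == keys.size, "keys must be unique")
    if (pairs.isEmpty) {
      new MapBackedRelation(Map.empty)
    } else {
      val minKey = keys.min
      val span = keys.max - minKey // negative if the subtraction overflowed
      // Assumed threshold: range under 2x the row count, and small enough to index.
      val dense = span >= 0 && span < math.min(pairs.size * 2L, Int.MaxValue.toLong)
      if (dense) {
        val rows = new Array[Row]((span + 1).toInt)
        pairs.foreach { case (k, r) => rows((k - minKey).toInt) = r }
        new DenseArrayRelation(minKey, rows)
      } else {
        new MapBackedRelation(pairs.toMap)
      }
    }
  }
}
```

With the array layout, a probe is one subtraction, a bounds check, and an array read, with no hashing or key comparison. The trade-off is wasted slots for gaps in the key range, which is why the sketch falls back to a map for sparse keys.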
Changed files:
- sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala (1 addition, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoin.scala (59 additions, 37 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashOuterJoin.scala (1 addition, 6 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastLeftSemiJoinHash.scala (1 addition, 5 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala (41 additions, 3 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala (284 additions, 13 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala (23 additions, 4 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/joins/HashedRelationSuite.scala (28 additions, 1 deletion)