Commit ff0af0dd authored by Davies Liu, committed by Davies Liu

[SPARK-13095] [SQL] improve performance for broadcast join with dimension table

This PR improves the performance of broadcast joins with dimension tables, which are common in data warehouses.

If the join key fits in a long, we use a special API, `get(Long)`, to get the rows from the HashedRelation.

If the HashedRelation has only unique keys, we use a special API, `getValue(Long)` or `getValue(InternalRow)`.
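
To illustrate the difference between the two lookup paths, here is a minimal sketch in Java. It is not the code from this PR: the class `LongKeyedRelation`, its `put` method, and the use of `String` as a stand-in for a row are all hypothetical, and a real implementation would avoid boxing the key. It only shows why a relation that knows its keys are unique can hand back a single row instead of a collection of rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a long-keyed hashed relation; rows are plain Strings here. */
final class LongKeyedRelation {
    private final Map<Long, List<String>> table = new HashMap<>();
    private boolean keyIsUnique = true;

    /** Adds a row under the given key and tracks whether every key is still unique. */
    void put(long key, String row) {
        List<String> rows = table.computeIfAbsent(key, k -> new ArrayList<>());
        rows.add(row);
        if (rows.size() > 1) {
            keyIsUnique = false;
        }
    }

    boolean keyIsUnique() {
        return keyIsUnique;
    }

    /** General path: all rows matching the key (empty list if none). */
    List<String> get(long key) {
        return table.getOrDefault(key, Collections.emptyList());
    }

    /** Unique-key path: the single matching row, or null if there is no match. */
    String getValue(long key) {
        if (!keyIsUnique) {
            throw new IllegalStateException("getValue requires unique keys");
        }
        List<String> rows = table.get(key);
        return rows == null ? null : rows.get(0);
    }
}
```

With unique keys the caller on the join side needs no inner loop over matches; it checks one result for null and moves on.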

If the keys fit in a long and are also dense, we use an array of UnsafeRow instead of a hash map.
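
The dense-key case can be sketched as follows, again in Java and again as an illustration rather than the PR's implementation. The class `DenseLongLookup`, the method `tryBuild`, the `String` rows, and the density threshold (key range smaller than twice the number of keys) are assumptions made for this example. The idea is that when unique keys cover a narrow range, a row can be found by indexing an array at `key - minKey` with no hashing at all.

```java
/** Hypothetical array-backed lookup for unique, dense long keys; rows are plain Strings here. */
final class DenseLongLookup {
    private final long minKey;
    private final String[] rows; // rows[i] holds the row for key minKey + i, or null

    private DenseLongLookup(long minKey, String[] rows) {
        this.minKey = minKey;
        this.rows = rows;
    }

    /**
     * Builds the array-backed lookup, or returns null when the keys are empty,
     * duplicated, or too sparse (the caller would then fall back to a hash map).
     * Values are assumed to be non-null.
     */
    static DenseLongLookup tryBuild(long[] keys, String[] values) {
        if (keys.length == 0 || keys.length != values.length) {
            return null;
        }
        long min = keys[0];
        long max = keys[0];
        for (long k : keys) {
            min = Math.min(min, k);
            max = Math.max(max, k);
        }
        long span = max - min; // wraps to a negative value on overflow
        // Assumed density rule: the key range must be under 2x the key count.
        if (span < 0 || span >= 2L * keys.length || span >= Integer.MAX_VALUE) {
            return null;
        }
        String[] rows = new String[(int) span + 1];
        for (int i = 0; i < keys.length; i++) {
            int idx = (int) (keys[i] - min);
            if (rows[idx] != null) {
                return null; // duplicate key, so the unique-key assumption fails
            }
            rows[idx] = values[i];
        }
        return new DenseLongLookup(min, rows);
    }

    /** Returns the row for the key, or null if the key is out of range or absent. */
    String getValue(long key) {
        if (key < minKey) {
            return null;
        }
        long idx = key - minKey;
        if (idx < 0 || idx >= rows.length) {
            return null;
        }
        return rows[(int) idx];
    }
}
```

The trade-off is memory for speed: gaps in the key range cost one empty array slot each, which is why some density check is needed before choosing this layout.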

TODO: will do cleanup

Author: Davies Liu <davies@databricks.com>

Closes #11065 from davies/gen_dim.
parent 37bc203c
Showing with 438 additions and 69 deletions