[SPARK-14929] [SQL] Disable vectorized map for wide schemas & high-precision decimals
## What changes were proposed in this pull request?

While the vectorized hash map in `TungstenAggregate` is currently supported for all primitive data types during partial aggregation, this patch enables the hash map only for the subset of cases that have been verified to show performance improvements on our benchmarks, subject to an internal conf that sets an upper limit on the length of the aggregate key/value schema. This list of supported use cases should be expanded over time.

## How was this patch tested?

This patch introduces no new functionality, so existing tests should suffice. Performance tests were run on TPCDS benchmarks.

Author: Sameer Agarwal <sameer@databricks.com>

Closes #12710 from sameeragarwal/vectorized-enable.
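The gating described in this commit message (a cap on the aggregate key/value schema length, plus the exclusion of high-precision decimals named in the title) could be sketched roughly as follows. This is an illustrative, self-contained sketch and not the code from the patch: `FieldType`, `DecimalField`, `VectorizedMapGate`, `enableVectorizedMap`, and the `maxColumns` parameter are hypothetical names, and the precision cutoff of 18 digits is an assumption based on the largest decimal that fits in a `Long`.

```scala
// Hypothetical, simplified stand-ins for Spark SQL data types (not the real Spark API).
sealed trait FieldType
case object IntField extends FieldType
case object LongField extends FieldType
case object DoubleField extends FieldType
final case class DecimalField(precision: Int) extends FieldType

object VectorizedMapGate {
  // Assumption: decimals with more digits than this cannot be stored in a Long.
  val MaxLongDigits = 18

  def isHighPrecisionDecimal(t: FieldType): Boolean = t match {
    case DecimalField(p) => p > MaxLongDigits
    case _               => false
  }

  // Enable the vectorized map only when the combined key/value schema is
  // narrow enough and contains no high-precision decimal column.
  // `maxColumns` plays the role of the internal conf mentioned in the commit message.
  def enableVectorizedMap(
      keys: Seq[FieldType],
      values: Seq[FieldType],
      maxColumns: Int): Boolean = {
    val schemaLength = keys.length + values.length
    schemaLength <= maxColumns && !(keys ++ values).exists(isHighPrecisionDecimal)
  }
}
```

With a hypothetical limit of 3 columns, a single `IntField` key with a single `LongField` value would pass the check, while a four-column schema or a `DecimalField(38)` value would fall back to the regular hash map.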
Showing 4 changed files:
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala: 28 additions, 7 deletions
- sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: 9 additions, 7 deletions
- sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala: 10 additions, 10 deletions
- sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala: 8 additions, 7 deletions