[SPARK-17115][SQL] decrease the threshold when split expressions
## What changes were proposed in this pull request?

In 2.0, we changed the threshold for splitting expressions from 16K to 64K, which causes very bad performance on wide tables, because the generated method can't be JIT-compiled by default (it exceeds the 8K bytecode limit). This PR decreases the threshold to 1K, based on benchmark results for a wide table with 400 columns of LongType. It also fixes a bug around splitting expressions in whole-stage codegen (they should not be split there).

## How was this patch tested?

Added a benchmark suite.

Author: Davies Liu <davies@databricks.com>

Closes #14692 from davies/split_exprs.
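To illustrate the idea behind the threshold, here is a minimal, hypothetical sketch of how generated expression code can be chunked into separate methods so each one stays small enough for the JIT. This is not Spark's actual `CodeGenerator.splitExpressions` implementation; the object name, signature, and the 1K character threshold as a stand-in for generated-code size are all illustrative assumptions.

```scala
// Hedged sketch only: a simplified stand-in for the splitting idea in
// CodeGenerator.scala, not Spark's real API.
object SplitDemo {
  // Group expression snippets into chunks whose combined source length
  // stays under the threshold. HotSpot skips JIT compilation of methods
  // above ~8K bytecode by default, so smaller generated methods (the PR
  // targets ~1K) remain JIT-friendly.
  def splitExpressions(exprs: Seq[String], threshold: Int = 1024): Seq[String] = {
    val blocks = Seq.newBuilder[String]
    val current = new StringBuilder
    for (code <- exprs) {
      // Flush the current chunk before it would exceed the threshold.
      if (current.nonEmpty && current.length + code.length > threshold) {
        blocks += current.toString
        current.clear()
      }
      current.append(code)
    }
    if (current.nonEmpty) blocks += current.toString
    blocks.result()
  }

  def main(args: Array[String]): Unit = {
    // A wide-table-like workload: 400 small generated assignments.
    val exprs = (1 to 400).map(i => s"value_$i = input.getLong($i);\n")
    val methods = splitExpressions(exprs, threshold = 1024)
    // No code is lost, and each chunk fits under the threshold.
    assert(methods.map(_.length).sum == exprs.map(_.length).sum)
    assert(methods.forall(_.length <= 1024))
    println(s"${exprs.size} expressions -> ${methods.size} methods")
  }
}
```

Note that in whole-stage codegen the expressions share local variables across the fused pipeline, which is why the PR disables splitting there: moving a snippet into a separate method would cut it off from those locals.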
Showing 3 changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala (6 additions, 3 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala (0 additions, 2 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/BenchmarkWideTable.scala (53 additions, 0 deletions)