[SPARK-13376] [SQL] improve column pruning
## What changes were proposed in this pull request?

This PR mostly rewrites the ColumnPruning rule to support most of the SQL logical plans (except those for Dataset).

## How was this patch tested?

This is tested by unit tests and was also manually tested with TPCDS Q78, for which all unused columns were pruned successfully, improving performance by 78% (from 22s to 12s).

Author: Davies Liu <davies@databricks.com>

Closes #11256 from davies/fix_column_pruning.
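For readers unfamiliar with the optimization, the idea behind column pruning can be sketched on a toy plan tree. This is only an illustration under stated assumptions: `Plan`, `Relation`, `Filter`, `Project`, and `ToyColumnPruning` are hypothetical names invented here, not Catalyst classes, and the logic is not the rule rewritten in this PR.

```scala
// Toy logical plan. These are hypothetical classes, NOT Spark's Catalyst operators.
sealed trait Plan { def output: Seq[String] }

// Leaf node: a table that produces all of its columns.
case class Relation(columns: Seq[String]) extends Plan {
  def output: Seq[String] = columns
}

// Keeps rows based on a single column; passes through the child's columns.
case class Filter(column: String, child: Plan) extends Plan {
  def output: Seq[String] = child.output
}

// Keeps only the listed columns.
case class Project(columns: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = columns
}

object ToyColumnPruning {
  // Walk the plan top-down, tracking which columns the parent still needs,
  // and insert a narrowing Project above any leaf that produces extra columns.
  def prune(plan: Plan, required: Set[String]): Plan = plan match {
    case Project(cols, child) =>
      val kept = cols.filter(required.contains)
      Project(kept, prune(child, kept.toSet))
    case Filter(col, child) =>
      // The filter needs its own column in addition to what the parent needs.
      Filter(col, prune(child, required + col))
    case r @ Relation(cols) =>
      val kept = cols.filter(required.contains)
      if (kept.size == cols.size) r else Project(kept, r)
  }

  // Entry point: the root's own output is what the query needs.
  def prune(plan: Plan): Plan = prune(plan, plan.output.toSet)

  // Example: SELECT a FROM t WHERE <predicate on b>, where t has columns a, b, c, d.
  val example: Plan =
    Project(Seq("a"), Filter("b", Relation(Seq("a", "b", "c", "d"))))
}
```

In this sketch, `ToyColumnPruning.prune(ToyColumnPruning.example)` places a `Project` over columns `a` and `b` directly above the relation, so the unused columns `c` and `d` are dropped before the filter runs. Reading fewer columns at the scan is the kind of saving the TPCDS Q78 measurement above reflects.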
Showing 4 changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala (58 additions, 70 deletions)
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ColumnPruningSuite.scala (126 additions, 2 deletions)
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FilterPushdownSuite.scala (0 additions, 80 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarTableScan.scala (3 additions, 4 deletions)