[SPARK-12218] Fixes ORC conjunction predicate push down
This PR is a follow-up of PR #10362.

Two major changes:

1. The fix introduced in #10362 is OK for Parquet, but may disable ORC PPD in many cases.

   PR #10362 stops converting an `AND` predicate if any branch is inconvertible. On the other hand, `OrcFilters` combines all filters into a single big conjunction first and then tries to convert it into an ORC `SearchArgument`. This means that if any filter is inconvertible, no filters can be pushed down. This PR fixes the issue by finding all convertible filters first, before doing the actual conversion.

   The reason behind the current implementation is mostly the limitation of the ORC `SearchArgument` builder, which is documented in detail in this PR.

2. Copied the `AND` predicate fix for ORC from #10362 to avoid merge conflict.

Same as #10362, this PR targets master (2.0.0-SNAPSHOT), branch-1.6, and branch-1.5.

Author: Cheng Lian <lian@databricks.com>

Closes #10377 from liancheng/spark-12218.fix-orc-conjunction-ppd.
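The difference between converting one big conjunction and selecting the convertible filters first can be illustrated with a toy model. This is only a sketch: the `Filter` types, the `convert` function, and the string it produces are hypothetical stand-ins rather than Spark's `OrcFilters` code or the ORC `SearchArgument` builder API. `StringContains` is an assumed example of a filter that cannot be converted.

```scala
object PpdSketch {
  // Hypothetical, simplified filter model (not Spark's data source filter classes).
  sealed trait Filter
  final case class EqualTo(attribute: String, value: Any) extends Filter
  final case class StringContains(attribute: String, value: String) extends Filter
  final case class And(left: Filter, right: Filter) extends Filter

  // Stand-in for the conversion step: Some(rendered predicate) when the
  // filter is convertible, None otherwise. An And is converted only when
  // both branches are convertible, mirroring the AND fix from #10362.
  def convert(filter: Filter): Option[String] = filter match {
    case EqualTo(a, v) => Some(s"($a = $v)")
    case And(l, r) =>
      for {
        lhs <- convert(l)
        rhs <- convert(r)
      } yield s"(and $lhs $rhs)"
    case _: StringContains => None // assumed to be unsupported in this sketch
  }

  // Old behaviour: build one big conjunction, then convert it.
  // A single inconvertible filter makes the whole conversion fail.
  def conjunctionFirst(filters: Seq[Filter]): Option[String] =
    filters.reduceOption[Filter](And(_, _)).flatMap(convert)

  // Behaviour described in this PR: keep only the filters that convert
  // on their own, then build and convert the conjunction of those.
  def convertibleFirst(filters: Seq[Filter]): Option[String] = {
    val convertible = filters.filter(f => convert(f).isDefined)
    convertible.reduceOption[Filter](And(_, _)).flatMap(convert)
  }
}
```

With the filters `EqualTo("a", 1)`, `StringContains("b", "x")`, and `EqualTo("c", 2)`, `conjunctionFirst` yields `None`, so nothing is pushed down, while `convertibleFirst` still yields a predicate over `a` and `c`. Each convertible filter is converted twice in this sketch (once to test it, once as part of the conjunction), which trades a little extra work for simplicity.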
Changed files:
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala (40 additions, 2 deletions)
- sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFilters.scala (41 additions, 27 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcSourceSuite.scala (31 additions, 1 deletion)