-
- Downloads
[SPARK-12850][SQL] Support Bucket Pruning (Predicate Pushdown for Bucketed Tables)
JIRA: https://issues.apache.org/jira/browse/SPARK-12850 This PR is to support bucket pruning when the predicates are `EqualTo`, `EqualNullSafe`, `IsNull`, `In`, and `InSet`. Like HIVE, in this PR, the bucket pruning works when the bucketing key has one and only one column. So far, I do not find a way to verify how many buckets are actually scanned. However, I did verify it when doing the debug. Could you provide a suggestion how to do it properly? Thank you! cloud-fan yhuai rxin marmbrus BTW, we can add more cases to support complex predicate including `Or` and `And`. Please let me know if I should do it in this PR. Maybe we also need to add test cases to verify if bucket pruning works well for each data type. Author: gatorsmile <gatorsmile@gmail.com> Closes #10942 from gatorsmile/pruningBuckets.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala 76 additions, 2 deletions.../spark/sql/execution/datasources/DataSourceStrategy.scala
- sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala 12 additions, 3 deletions.../main/scala/org/apache/spark/sql/sources/interfaces.scala
- sql/hive/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala 157 additions, 5 deletions...cala/org/apache/spark/sql/sources/BucketedReadSuite.scala
Loading
Please register or sign in to comment