[SPARK-17213][SQL] Disable Parquet filter push-down for string and binary columns due to PARQUET-686

This PR targets both master and branch-2.1.

## What changes were proposed in this pull request?

Due to PARQUET-686, Parquet doesn't compare strings correctly when doing filter push-down for string columns. This PR disables filter push-down for both string and binary columns to work around this issue. Binary columns are also affected because some Parquet data models (like Hive) may store string columns as a plain Parquet `binary` instead of a `binary (UTF8)`.

## How was this patch tested?

New test case added in `ParquetFilterSuite`.

Author: Cheng Lian <lian@databricks.com>

Closes #16106 from liancheng/spark-17213-bad-string-ppd.

(cherry picked from commit ca639163)
Signed-off-by: Reynold Xin <rxin@databricks.com>
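
As a rough illustration of the workaround described above, the following is a minimal, self-contained sketch of a filter converter that refuses to push down predicates on string and binary columns. It is not the code from `ParquetFilters.scala`; the names `ColumnType`, `EqualTo`, `PushedPredicate`, and `PushDownSketch.createFilter` are hypothetical and exist only for this example.

```scala
// Hypothetical, simplified model of filter push-down; not Spark's actual code.
sealed trait ColumnType
case object IntType extends ColumnType
case object LongType extends ColumnType
case object StringType extends ColumnType
case object BinaryType extends ColumnType

// A source-side predicate of the form `column = value`.
final case class EqualTo(column: String, value: Any)

// Stand-in for a predicate that would be handed to the Parquet reader.
final case class PushedPredicate(description: String)

object PushDownSketch {
  // Returns Some(predicate) when the filter may be pushed down, None otherwise.
  def createFilter(
      schema: Map[String, ColumnType],
      filter: EqualTo): Option[PushedPredicate] =
    schema.get(filter.column).flatMap {
      // Push-down is disabled for string and binary columns, mirroring the
      // workaround for PARQUET-686 described in the commit message.
      case StringType | BinaryType => None
      // Numeric columns are assumed unaffected in this sketch.
      case IntType | LongType =>
        Some(PushedPredicate(s"${filter.column} = ${filter.value}"))
    }
}
```

In this model, a `None` result means the predicate is simply not handed to the Parquet reader. The assumption is that the query engine still evaluates the filter on the rows it reads, so results stay correct while the reader loses the chance to skip data for those columns.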
Changed files:
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala (24 additions, 0 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala (23 additions, 3 deletions)