[SPARK-15916][SQL] JDBC filter push down should respect operator precedence
## What changes were proposed in this pull request?

This PR fixes the problem that operator precedence is lost when a where-clause expression is pushed down to the JDBC layer.

**Case 1:** For the SQL `select * from table where (a or b) and c`, the where-clause is wrongly converted to the JDBC where-clause `a or (b and c)` after filter push down. As a consequence, JDBC may return fewer or more rows than expected.

**Case 2:** For the SQL `select * from table where always_false_condition`, the result table may not be empty if the JDBC RDD is partitioned using where-clauses:

```
spark.read.jdbc(url, table, predicates = Array("partition 1 where clause", "partition 2 where clause"...))
```

## How was this patch tested?

Unit test.

This PR also closes #13640

Author: hyukjinkwon <gurwls223@gmail.com>
Author: Sean Zhong <seanzhong@databricks.com>

Closes #13743 from clockfly/SPARK-15916.
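The precedence bug in Case 1 can be sketched with a toy filter-to-SQL compiler. The `Filter` ADT and both `compile*` functions below are hypothetical simplifications for illustration, not Spark's actual `org.apache.spark.sql.sources.Filter` API or the real `JDBCRDD` code; the fix shown (parenthesizing each compiled sub-expression) mirrors the idea of the patch:

```scala
// Hypothetical, simplified filter tree -- not Spark's real Filter API.
sealed trait Filter
case class Col(name: String) extends Filter
case class And(l: Filter, r: Filter) extends Filter
case class Or(l: Filter, r: Filter) extends Filter

object PrecedenceDemo extends App {
  // Buggy: no parentheses, so (a OR b) AND c renders as "a OR b AND c",
  // which the database parses as a OR (b AND c) because AND binds tighter.
  def compileBuggy(f: Filter): String = f match {
    case Col(n)    => n
    case And(l, r) => s"${compileBuggy(l)} AND ${compileBuggy(r)}"
    case Or(l, r)  => s"${compileBuggy(l)} OR ${compileBuggy(r)}"
  }

  // Fixed: wrap every compiled sub-expression in parentheses so the
  // generated SQL preserves the original tree's structure.
  def compileFixed(f: Filter): String = f match {
    case Col(n)    => n
    case And(l, r) => s"(${compileFixed(l)}) AND (${compileFixed(r)})"
    case Or(l, r)  => s"(${compileFixed(l)}) OR (${compileFixed(r)})"
  }

  val filter = And(Or(Col("a"), Col("b")), Col("c"))
  println(compileBuggy(filter)) // a OR b AND c  -- wrong grouping
  println(compileFixed(filter)) // ((a) OR (b)) AND (c)  -- grouping preserved
}
```

The extra parentheses are redundant for simple comparisons but are the cheapest way to guarantee the pushed-down predicate means the same thing as the Catalyst filter tree, regardless of the target database's precedence rules.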
Showing 2 changed files:

- `sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala` (2 additions, 2 deletions)
- `sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala` (26 additions, 0 deletions)