-
- Downloads
[SPARK-21759][SQL] In.checkInputDataTypes should not wrongly report unresolved...
[SPARK-21759][SQL] In.checkInputDataTypes should not wrongly report unresolved plans for IN correlated subquery ## What changes were proposed in this pull request? With the check for structural integrity proposed in SPARK-21726, it is found that the optimization rule `PullupCorrelatedPredicates` can produce unresolved plans. For a correlated IN query looks like: SELECT t1.a FROM t1 WHERE t1.a IN (SELECT t2.c FROM t2 WHERE t1.b < t2.d); The query plan might look like: Project [a#0] +- Filter a#0 IN (list#4 [b#1]) : +- Project [c#2] : +- Filter (outer(b#1) < d#3) : +- LocalRelation <empty>, [c#2, d#3] +- LocalRelation <empty>, [a#0, b#1] After `PullupCorrelatedPredicates`, it produces query plan like: 'Project [a#0] +- 'Filter a#0 IN (list#4 [(b#1 < d#3)]) : +- Project [c#2, d#3] : +- LocalRelation <empty>, [c#2, d#3] +- LocalRelation <empty>, [a#0, b#1] Because the correlated predicate involves another attribute `d#3` in subquery, it has been pulled out and added into the `Project` on the top of the subquery. When `list` in `In` contains just one `ListQuery`, `In.checkInputDataTypes` checks if the size of `value` expressions matches the output size of subquery. In the above example, there is only `value` expression and the subquery output has two attributes `c#2, d#3`, so it fails the check and `In.resolved` returns `false`. We should not let `In.checkInputDataTypes` wrongly report unresolved plans to fail the structural integrity check. ## How was this patch tested? Added test. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes #18968 from viirya/SPARK-21759.
Showing
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala 4 additions, 2 deletions...ala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala 3 additions, 2 deletions...org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala 30 additions, 35 deletions...rg/apache/spark/sql/catalyst/expressions/predicates.scala
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala 10 additions, 3 deletions.../org/apache/spark/sql/catalyst/expressions/subquery.scala
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala 5 additions, 5 deletions...la/org/apache/spark/sql/catalyst/optimizer/subquery.scala
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PullupCorrelatedPredicatesSuite.scala 52 additions, 0 deletions.../catalyst/optimizer/PullupCorrelatedPredicatesSuite.scala
- sql/core/src/test/resources/sql-tests/results/subquery/negative-cases/subq-input-typecheck.sql.out 2 additions, 4 deletions...ults/subquery/negative-cases/subq-input-typecheck.sql.out
Loading
Please register or sign in to comment