-
- Downloads
[SPARK-19933][SQL] Do not change output of a subquery
## What changes were proposed in this pull request? The `RemoveRedundantAlias` rule can change the output attributes (the expression id's to be precise) of a query by eliminating the redundant alias producing them. This is no problem for a regular query, but can cause problems for correlated subqueries: The attributes produced by the subquery are used in the parent plan; changing them will break the parent plan. This PR fixes this by wrapping a subquery in a `Subquery` top level node when it gets optimized. The `RemoveRedundantAlias` rule now recognizes `Subquery` and makes sure that the output attributes of the `Subquery` node are retained. ## How was this patch tested? Added a test case to `RemoveRedundantAliasAndProjectSuite` and added a regression test to `SubquerySuite`. Author: Herman van Hovell <hvanhovell@databricks.com> Closes #17278 from hvanhovell/SPARK-19933. (cherry picked from commit e04c05cf) Signed-off-by:Herman van Hovell <hvanhovell@databricks.com>
Showing
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala 12 additions, 3 deletions...a/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala 8 additions, 0 deletions...rk/sql/catalyst/plans/logical/basicLogicalOperators.scala
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAliasAndProjectSuite.scala 8 additions, 0 deletions...alyst/optimizer/RemoveRedundantAliasAndProjectSuite.scala
- sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala 14 additions, 0 deletions...e/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
Loading
Please register or sign in to comment