[SPARK-19931][SQL] InMemoryTableScanExec should rewrite output partitioning and ordering when aliasing output attributes

## What changes were proposed in this pull request?

Now `InMemoryTableScanExec` simply takes the `outputPartitioning` and `outputOrdering` from the associated `InMemoryRelation`'s `child.outputPartitioning` and `outputOrdering`.

However, `InMemoryTableScanExec` can alias the output attributes. In this case, its `outputPartitioning` and `outputOrdering` are not correct and its parent operators can't correctly determine its data distribution.

## How was this patch tested?

Jenkins tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #17175 from viirya/ensure-no-unnecessary-shuffle.
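The aliasing problem described above can be illustrated with a small, self-contained sketch. It does not use Spark's classes and is not the code from this patch: `Attr`, `HashPartitioning`, `SortOrder`, and the `AliasRewrite` helpers are hypothetical stand-ins. The assumption is that the scan exposes the cached plan's columns in the same positions but under fresh attribute ids, so the partitioning and ordering inherited from the cached plan have to be rewritten to refer to the scan's own attributes.

```scala
// Toy model (hypothetical, not Spark's API): an attribute is a name plus a unique id.
final case class Attr(name: String, id: Long)

sealed trait Partitioning
final case class HashPartitioning(keys: Seq[Attr], numPartitions: Int) extends Partitioning
case object UnknownPartitioning extends Partitioning

final case class SortOrder(child: Attr, ascending: Boolean)

object AliasRewrite {
  // Pair each attribute of the cached plan with the attribute the scan
  // exposes at the same position, keyed by the cached attribute's id.
  def attrMap(cachedOutput: Seq[Attr], scanOutput: Seq[Attr]): Map[Long, Attr] =
    cachedOutput.map(_.id).zip(scanOutput).toMap

  // Replace an attribute with its alias if one exists; otherwise keep it.
  def rewrite(a: Attr, m: Map[Long, Attr]): Attr = m.getOrElse(a.id, a)

  // Only hash partitioning carries attribute references in this toy model;
  // any other partitioning is returned unchanged.
  def rewritePartitioning(p: Partitioning, m: Map[Long, Attr]): Partitioning = p match {
    case HashPartitioning(keys, n) => HashPartitioning(keys.map(rewrite(_, m)), n)
    case other                     => other
  }

  // Rewrite the attribute referenced by each sort order, keeping its direction.
  def rewriteOrdering(o: Seq[SortOrder], m: Map[Long, Attr]): Seq[SortOrder] =
    o.map(s => s.copy(child = rewrite(s.child, m)))
}
```

Without such a rewrite, a parent operator comparing the scan's output attributes against the reported partitioning keys would see different ids and could not recognize that the data is already distributed as required. With it, the reported keys match the scan's output.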
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala (18 additions, 3 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarQuerySuite.scala (26 additions, 0 deletions)