[SPARK-17474] [SQL] fix python udf in TakeOrderedAndProjectExec
## What changes were proposed in this pull request?

When there is a Python UDF in the Project between Sort and Limit, the Project is collected into TakeOrderedAndProjectExec. ExtractPythonUDFs then fails to pull the Python UDFs out, because `QueryPlan.expressions` does not include expressions inside an `Option[Seq[Expression]]`.

Ideally, we should fix `QueryPlan.expressions`, but I tried that with no luck (it always runs into an infinite loop). In this PR, I changed TakeOrderedAndProjectExec to not use `Option[Seq[Expression]]`, as a workaround.

cc JoshRosen

## How was this patch tested?

Added a regression test.

Author: Davies Liu <davies@databricks.com>

Closes #15030 from davies/all_expr.
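To illustrate why the UDF goes unnoticed, here is a toy Python model of an expression collector that walks an operator's constructor fields. It is only a sketch of the behaviour described above, not Spark's Scala code: `Expr`, `Some`, and `collect_expressions` are hypothetical names, and the set of field shapes it recognises is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Expr:
    """Hypothetical stand-in for a Catalyst Expression."""
    name: str


@dataclass
class Some:
    """Hypothetical stand-in for a non-empty Scala Option."""
    value: Any


def collect_expressions(fields: List[Any]) -> List[Expr]:
    """Collect expressions from an operator's fields.

    Assumed to recognise only: a bare Expr, Some(Expr), and a
    sequence of Expr. A Some wrapping a sequence matches none of
    these shapes and is silently skipped.
    """
    found: List[Expr] = []
    for field in fields:
        if isinstance(field, Expr):
            found.append(field)
        elif isinstance(field, Some) and isinstance(field.value, Expr):
            found.append(field.value)
        elif isinstance(field, (list, tuple)):
            found.extend(e for e in field if isinstance(e, Expr))
        # Everything else, including Some([...]), is ignored.
    return found


udf = Expr("python_udf(id)")
sort_order = [Expr("id ASC")]

# Project list wrapped in an Option: the UDF is invisible to the collector.
before = collect_expressions([1, sort_order, Some([udf])])

# Project list as a plain sequence (the workaround): the UDF is found.
after = collect_expressions([1, sort_order, [udf]])

print([e.name for e in before])
print([e.name for e in after])
```

In this model, a rule that relies on the collected expressions to find Python UDFs never sees the one hidden inside `Some([...])`; passing the project list as a plain sequence makes it visible without changing the collector itself.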
Changed files:
- python/pyspark/sql/tests.py (8 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala (4 additions, 4 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala (6 additions, 6 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/TakeOrderedAndProjectSuite.scala (2 additions, 2 deletions)