[SPARK-18589][SQL] Fix Python UDF accessing attributes from both sides of join
PythonUDF is unevaluable, so it cannot be used inside a join condition. Currently the optimizer pushes a PythonUDF that accesses both sides of a join into the join condition, and the query then fails to plan.

This PR fixes the issue by checking whether the expression is evaluable before pushing it into the Join. It also adds a regression test.

Author: Davies Liu <davies@databricks.com>

Closes #16581 from davies/pyudf_join.
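The check described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Catalyst code touched by this PR: `Predicate`, `split_for_join`, and the `evaluable` flag are hypothetical names invented for illustration, and predicates that reference only one side are simply left in the filter here for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Predicate:
    """Toy stand-in for an optimizer expression (hypothetical, not a Spark class)."""
    name: str
    references: FrozenSet[str]  # attributes the predicate reads
    evaluable: bool = True      # False models an unevaluable expression such as a Python UDF


def split_for_join(
    predicates: List[Predicate],
    left_attrs: FrozenSet[str],
    right_attrs: FrozenSet[str],
) -> Tuple[List[Predicate], List[Predicate]]:
    """Return (pushed_into_join_condition, kept_in_filter_above_join)."""
    pushed: List[Predicate] = []
    kept: List[Predicate] = []
    for p in predicates:
        spans_both = bool(p.references & left_attrs) and bool(p.references & right_attrs)
        # Only push a predicate into the join condition if it can be evaluated there.
        if spans_both and p.evaluable:
            pushed.append(p)
        else:
            kept.append(p)
    return pushed, kept


left, right = frozenset({"a"}), frozenset({"b"})
eq = Predicate("a = b", frozenset({"a", "b"}))
py_udf = Predicate("f(a, b)", frozenset({"a", "b"}), evaluable=False)

pushed, kept = split_for_join([eq, py_udf], left, right)
print([p.name for p in pushed])
print([p.name for p in kept])
```

In this model the ordinary comparison ends up in the join condition, while the unevaluable predicate stays in a filter above the join, which mirrors the behavior the commit message describes.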
Showing
- python/pyspark/sql/tests.py (9 additions, 0 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala (12 additions, 1 deletion)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala (1 addition, 1 deletion)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala (2 additions, 3 deletions)