[SPARK-21848][SQL] Add trait UserDefinedExpression to identify user-defined functions
## What changes were proposed in this pull request?

Add trait `UserDefinedExpression` to identify user-defined functions.

UDFs can be expensive, so the optimizer may need to avoid executing a UDF multiple times. For example:

```scala
table.select(UDF as 'a).select('a, ('a + 1) as 'b)
```

If the UDF is expensive in this case, the optimizer should not collapse the two projections into:

```scala
table.select(UDF as 'a, (UDF + 1) as 'b)
```

Currently, UDF classes such as `PythonUDF` and `HiveGenericUDF` are not defined in catalyst. This PR adds a new trait to make it easier to identify user-defined functions.

## How was this patch tested?

Unit test

Author: Wang Gengliang <ltnwgl@gmail.com>

Closes #19064 from gengliangwang/UDFType.
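To illustrate how a marker trait can help here, the following is a minimal, self-contained sketch. It does not use Spark's catalyst classes: `Expr`, `Column`, `Literal`, `Add`, `ToyUdf`, and `UdfCheck` are hypothetical names invented for this example, and the `UserDefinedExpression` trait is redefined locally as an empty marker. The `safeToInline` rule is only an assumption about how an optimizer might use the trait, not the rule this patch implements.

```scala
// Toy expression tree for illustration only; these are NOT Spark's catalyst classes.
sealed trait Expr { def children: Seq[Expr] }

// Empty marker trait, redefined locally to mirror the idea of UserDefinedExpression.
trait UserDefinedExpression

case class Column(name: String) extends Expr {
  val children: Seq[Expr] = Nil
}

case class Literal(value: Int) extends Expr {
  val children: Seq[Expr] = Nil
}

case class Add(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
}

// Hypothetical UDF node: it is tagged with the marker trait.
case class ToyUdf(name: String, args: Seq[Expr]) extends Expr with UserDefinedExpression {
  val children: Seq[Expr] = args
}

object UdfCheck {
  // True if the expression or any of its descendants is tagged as user-defined.
  def containsUdf(e: Expr): Boolean = e match {
    case _: UserDefinedExpression => true
    case other                    => other.children.exists(containsUdf)
  }

  // Assumed rule: inlining an aliased expression is fine if it is referenced
  // at most once, or if it contains no user-defined function.
  def safeToInline(aliased: Expr, referenceCount: Int): Boolean =
    referenceCount <= 1 || !containsUdf(aliased)
}
```

In the example from the description, the alias `'a` is referenced twice by the outer projection, so under this assumed rule `UdfCheck.safeToInline(ToyUdf("expensive", Seq(Column("x"))), 2)` is `false` and the projections would be left uncollapsed. The check needs no knowledge of the concrete UDF class, which is the point of a shared trait.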
Changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala (6 additions, 0 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala (1 addition, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/udaf.scala (5 additions, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonUDF.scala (2 additions, 2 deletions)
- sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala (14 additions, 4 deletions)