-
- Downloads
[SPARK-15051][SQL] Create a TypedColumn alias
## What changes were proposed in this pull request? Currently when we create an alias against a TypedColumn from user-defined Aggregator(for example: agg(aggSum.toColumn as "a")), spark is using the alias' function from Column( as), the alias function will return a column contains a TypedAggregateExpression, which is unresolved because the inputDeserializer is not defined. Later the aggregator function (agg) will inject the inputDeserializer back to the TypedAggregateExpression, but only if the aggregate columns are TypedColumn, in the above case, the TypedAggregateExpression will remain unresolved because it is under column and caused the problem reported by this jira [15051](https://issues.apache.org/jira/browse/SPARK-15051?jql=project%20%3D%20SPARK). This PR propose to create an alias function for TypedColumn, it will return a TypedColumn. It is using the similar code path as Column's alia function. For the spark build in aggregate function, like max, it is working with alias, for example val df1 = Seq(1 -> "a", 2 -> "b", 3 -> "b").toDF("i", "j") checkAnswer(df1.agg(max("j") as "b"), Row(3) :: Nil) Thanks for comments. ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) Add test cases in DatasetAggregatorSuite.scala run the sql related queries against this patch. Author: Kevin Yu <qyu@us.ibm.com> Closes #12893 from kevinyu98/spark-15051.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/Column.scala 13 additions, 6 deletionssql/core/src/main/scala/org/apache/spark/sql/Column.scala
- sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala 8 additions, 0 deletions...t/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
Loading
Please register or sign in to comment