[SPARK-17124][SQL] RelationalGroupedDataset.agg should preserve order and allow multiple aggregates per column

## What changes were proposed in this pull request?

This patch fixes a longstanding issue with one of the RelationalGroupedDataset.agg functions. Even though the signature accepts a vararg of pairs, the underlying implementation turns the seq into a map, so it neither preserves order nor allows multiple aggregates per column. This change also allows users to use this function to run multiple different aggregations on a single column, e.g.

```
agg("age" -> "max", "age" -> "count")
```

## How was this patch tested?

Added a test case in DataFrameAggregateSuite.

Author: petermaxlee <petermaxlee@gmail.com>

Closes #14697 from petermaxlee/SPARK-17124.
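
The problem described in the commit message can be reproduced without Spark. The following is a minimal sketch in plain Scala, not the code from this patch: `AggPairsDemo`, `pairs`, `viaMap`, and `viaSeq` are hypothetical names used only for illustration. It shows what happens to the `"age"` pairs from the example when a sequence of pairs is converted to a map.

```scala
// Plain Scala, no Spark dependency. Illustrative only; the names here
// are hypothetical and do not appear in RelationalGroupedDataset.
object AggPairsDemo {
  // The (column -> aggregate) pairs a caller would pass to agg(...).
  val pairs: Seq[(String, String)] = Seq("age" -> "max", "age" -> "count")

  // Converting to a Map keeps one entry per key, so the second "age"
  // pair overwrites the first. A Map also makes no ordering guarantee
  // in general.
  val viaMap: Map[String, String] = pairs.toMap

  // Keeping the pairs as a Seq retains both entries in call order.
  val viaSeq: Seq[(String, String)] = pairs

  def main(args: Array[String]): Unit = {
    println(s"via Map: ${viaMap.size} aggregate(s): $viaMap")
    println(s"via Seq: ${viaSeq.size} aggregate(s): $viaSeq")
  }
}
```

With the map, only the `"count"` aggregate on `"age"` survives; with the sequence, both `"max"` and `"count"` remain, in the order they were given.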
Showing 2 changed files:
- sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala: 4 additions, 2 deletions
- sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala: 10 additions, 0 deletions