[SPARK-9301][SQL] Add collect_set and collect_list aggregate functions
For now they are thin wrappers around the corresponding Hive UDAFs.

One limitation with these in Hive 0.13.0 is that they only support aggregating primitive types.

I chose snake_case here instead of camelCase because it seems to be used in the majority of the multi-word functions.

Do we also want to add these to `functions.py`?

This approach was recommended here: https://github.com/apache/spark/pull/8592#issuecomment-154247089

marmbrus rxin

Author: Nick Buroojy <nick.buroojy@civitaslearning.com>

Closes #9526 from nburoojy/nick/udaf-alias.

(cherry picked from commit a6ee4f98)
Signed-off-by: Michael Armbrust <michael@databricks.com>
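
For readers unfamiliar with the two aggregates, here is a minimal plain-Python sketch of what they compute per group. It is not the Spark or Hive implementation and does not use the Spark API; the helper names `group_collect_list` and `group_collect_set` and the sample rows are hypothetical, chosen only for illustration. It assumes primitive values, matching the Hive 0.13.0 limitation noted above.

```python
from collections import defaultdict

# Hypothetical helper: gathers every value per key, keeping duplicates
# (the behaviour collect_list is expected to have per group).
def group_collect_list(rows):
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return dict(groups)

# Hypothetical helper: gathers the distinct values per key
# (the behaviour collect_set is expected to have per group).
def group_collect_set(rows):
    groups = defaultdict(set)
    for key, value in rows:
        groups[key].add(value)
    return dict(groups)

# Made-up sample data: (key, primitive value) pairs.
rows = [("a", 1), ("a", 1), ("a", 2), ("b", 3)]

lists = group_collect_list(rows)
sets = group_collect_set(rows)
```

In this sketch the list variant preserves input order because it runs in a single process; in a distributed engine the order of collected elements should not be relied upon.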
Showing
- python/pyspark/sql/functions.py (14 additions, 11 deletions)
- python/pyspark/sql/tests.py (17 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/functions.scala (20 additions, 0 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDataFrameAnalyticsSuite.scala (13 additions, 2 deletions)