[SPARK-11489][SQL] Only include common first order statistics in GroupedData
We added a bunch of higher-order statistics such as skewness and kurtosis to GroupedData. I don't think they are common enough to justify being listed, since users can always use the normal statistics aggregate functions.

That is to say, after this change, we won't support

```scala
df.groupBy("key").kurtosis("colA", "colB")
```

However, we will still support

```scala
df.groupBy("key").agg(kurtosis(col("colA")), kurtosis(col("colB")))
```

Author: Reynold Xin <rxin@databricks.com>

Closes #9446 from rxin/SPARK-11489.
Showing
- python/pyspark/sql/group.py: 0 additions, 88 deletions
- sql/core/src/main/scala/org/apache/spark/sql/GroupedData.scala: 28 additions, 118 deletions
- sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java: 0 additions, 1 deletion