    [SPARK-11489][SQL] Only include common first order statistics in GroupedData · 5051262d
    Reynold Xin authored
    We added several higher-order statistics, such as skewness and kurtosis, as methods on GroupedData. I don't think they are common enough to justify being listed there, since users can always pass the regular statistical aggregate functions to `agg`.
    
    That is to say, after this change, we won't support
    ```scala
    df.groupBy("key").kurtosis("colA", "colB")
    ```
    
    However, we will still support
    ```scala
    df.groupBy("key").agg(kurtosis(col("colA")), kurtosis(col("colB")))
    ```
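
    For callers who relied on the varargs-by-name form, the same result can be rebuilt on top of `agg`. The following is a minimal sketch, not part of this patch: the `GroupedStats` object and the `kurtosisBy` helper are hypothetical names introduced only for illustration, and it assumes `col` and `kurtosis` are imported from `org.apache.spark.sql.functions`.

    ```scala
    import org.apache.spark.sql.{Column, DataFrame}
    import org.apache.spark.sql.functions.{col, kurtosis}

    // Hypothetical helper object; it is not part of Spark or of this patch.
    object GroupedStats {
      // Groups `df` by `key` and computes kurtosis for each named column,
      // using only the generic `agg` entry point.
      def kurtosisBy(df: DataFrame, key: String, cols: String*): DataFrame = {
        require(cols.nonEmpty, "at least one column name is required")
        // Turn each column name into a kurtosis aggregate expression.
        val aggs: Seq[Column] = cols.map(c => kurtosis(col(c)))
        // agg takes one Column followed by varargs, hence the head/tail split.
        df.groupBy(key).agg(aggs.head, aggs.tail: _*)
      }
    }
    ```

    With this in scope, `GroupedStats.kurtosisBy(df, "key", "colA", "colB")` should be equivalent to the `agg` call above. The same pattern applies to `skewness` or any other aggregate function that takes a `Column`.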
    
    Author: Reynold Xin <rxin@databricks.com>
    
    Closes #9446 from rxin/SPARK-11489.