[SPARK-21100][SQL] Add summary method as alternative to describe that gives quartiles similar to Pandas

## What changes were proposed in this pull request?

Adds a `summary` method that lets the user specify which statistics and percentiles to calculate. By default it includes the existing statistics from `describe` plus the quartiles (25th, 50th, and 75th percentiles), similar to Pandas. Also changes the implementation of `describe` to delegate to `summary`.

## How was this patch tested?

Additional unit test.

Author: Andrew Ray <ray.andrew@gmail.com>

Closes #18307 from aray/SPARK-21100.
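For illustration, the sketch below shows how the `summary` method described above might be called next to `describe`. It assumes Spark SQL is on the classpath and a local `SparkSession`. The object name `SummaryExample`, the helper `sampleData`, and the sample rows are hypothetical and do not come from this patch.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object; not part of the patch itself.
object SummaryExample {

  // Builds a small illustrative DataFrame (made-up data).
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Alice", 29, 165.0), ("Bob", 41, 180.5), ("Carol", 35, 172.3))
      .toDF("name", "age", "height")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("summary-example")
      .getOrCreate()

    val people = sampleData(spark)

    // No arguments: the describe statistics plus the 25%, 50%, and 75% percentiles.
    people.summary().show()

    // Explicit statistics: percentiles are given as strings such as "25%".
    people.summary("count", "min", "25%", "75%", "max").show()

    // describe keeps its existing output and, per this patch, delegates to summary.
    people.describe("age").show()

    spark.stop()
  }
}
```

Both methods return a DataFrame whose first column, `summary`, names the statistic, with one further column per input column.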
Showing 3 changed files:
- sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala (73 additions, 40 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala (96 additions, 2 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala (89 additions, 23 deletions)