-
- Downloads
[SPARK-16429][SQL] Include `StringType` columns in `describe()`
## What changes were proposed in this pull request? Currently, Spark `describe` supports `StringType`. However, `describe()` returns a dataset for only all numeric columns. This PR aims to include `StringType` columns in `describe()`, `describe` without argument. **Background** ```scala scala> spark.read.json("examples/src/main/resources/people.json").describe("age", "name").show() +-------+------------------+-------+ |summary| age| name| +-------+------------------+-------+ | count| 2| 3| | mean| 24.5| null| | stddev|7.7781745930520225| null| | min| 19| Andy| | max| 30|Michael| +-------+------------------+-------+ ``` **Before** ```scala scala> spark.read.json("examples/src/main/resources/people.json").describe().show() +-------+------------------+ |summary| age| +-------+------------------+ | count| 2| | mean| 24.5| | stddev|7.7781745930520225| | min| 19| | max| 30| +-------+------------------+ ``` **After** ```scala scala> spark.read.json("examples/src/main/resources/people.json").describe().show() +-------+------------------+-------+ |summary| age| name| +-------+------------------+-------+ | count| 2| 3| | mean| 24.5| null| | stddev|7.7781745930520225| null| | min| 19| Andy| | max| 30|Michael| +-------+------------------+-------+ ``` ## How was this patch tested? Pass the Jenkins with a update testcase. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #14095 from dongjoon-hyun/SPARK-16429.
Showing
- R/pkg/R/DataFrame.R 2 additions, 2 deletionsR/pkg/R/DataFrame.R
- R/pkg/inst/tests/testthat/test_sparkSQL.R 2 additions, 2 deletionsR/pkg/inst/tests/testthat/test_sparkSQL.R
- python/pyspark/sql/dataframe.py 4 additions, 4 deletionspython/pyspark/sql/dataframe.py
- sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala 13 additions, 3 deletionssql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
- sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala 18 additions, 18 deletions.../src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
Loading
Please register or sign in to comment