[SPARK-7231] [SPARKR] Changes to make SparkR DataFrame dplyr friendly.
Changes include:
1. Rename `sortDF` to `arrange`.
2. Add new aliases `group_by`, `sample_frac`, and `summarize`.
3. Add more user-friendly column addition (`mutate`) and `rename`.
4. Support `mean` as an alias for `avg` in Scala, and also support `n_distinct` and `n` as in dplyr.

With these changes we can pretty much run the examples described in http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html with the same syntax.

The only thing missing in SparkR is automatic resolution of column names used in an expression: `select(flights, delay)` works in dplyr, but SparkR currently needs `select(flights, flights$delay)` or `select(flights, "delay")`. This is a complicated change, and I'll file a new issue for it.

cc sun-rui rxin

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #6005 from shivaram/sparkr-df-api and squashes the following commits:

5e0716a [Shivaram Venkataraman] Fix some roxygen bugs
1254953 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into sparkr-df-api
0521149 [Shivaram Venkataraman] Changes to make SparkR DataFrame dplyr friendly.
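As an illustration of the renamed and aliased functions listed above, here is a minimal sketch. It assumes a local Spark installation from the release that includes this change, with the SparkR package available; the context setup (`sparkR.init`, `sparkRSQL.init`), the use of R's built-in `faithful` data set, and all result and column names such as `avg_eruptions` are assumptions made for the example, not part of this commit.

```r
library(SparkR)

# Assumption: local Spark with the SparkR package from the matching release.
sc <- sparkR.init(master = "local")
sqlContext <- sparkRSQL.init(sc)

# Hypothetical sample data: R's built-in `faithful` data set
# (columns: eruptions, waiting).
df <- createDataFrame(sqlContext, faithful)

# 1. `arrange` replaces `sortDF`.
sorted <- arrange(df, df$waiting)

# 2 and 4. `group_by` and `summarize` aliases, with `n` and `mean`
# used as in dplyr. Columns are still referenced as df$col because
# bare column names are not resolved automatically.
byWaiting <- summarize(group_by(df, df$waiting),
                       count = n(df$waiting),
                       avg_eruptions = mean(df$eruptions))

# 3. `mutate` adds a column; `rename` renames one.
withSecs <- mutate(df, eruptions_secs = df$eruptions * 60)
renamed <- rename(df, wait_mins = df$waiting)

# 2. `sample_frac` alias: sample about 10% of rows without replacement.
sampled <- sample_frac(df, FALSE, 0.1)

head(byWaiting)

sparkR.stop()
```

Every column reference is written as `df$column`, which reflects the limitation noted in the commit message.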
Showing
- R/pkg/NAMESPACE: 9 additions, 2 deletions
- R/pkg/R/DataFrame.R: 115 additions, 12 deletions
- R/pkg/R/column.R: 28 additions, 4 deletions
- R/pkg/R/generics.R: 38 additions, 3 deletions
- R/pkg/R/group.R: 8 additions, 2 deletions
- R/pkg/inst/tests/test_sparkSQL.R: 30 additions, 6 deletions
- sql/core/src/main/scala/org/apache/spark/sql/functions.scala: 16 additions, 0 deletions
- sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala: 5 additions, 0 deletions