[SPARK-15110] [SPARKR] Implement repartitionByColumn for SparkR DataFrames
## What changes were proposed in this pull request?

Implement repartitionByColumn on DataFrame. This will allow us to run R functions on each partition identified by column groups with dapply() method.

## How was this patch tested?

Unit tests

Author: NarineK <narine.kokhlikyan@us.ibm.com>

Closes #12887 from NarineK/repartitionByColumns.
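
To illustrate the idea described above (repartition by column, then run a function once per partition), here is a minimal sketch in plain Python. It is not the SparkR implementation and does not use any Spark API: the helper names `repartition_by_column` and `apply_per_partition`, the sample rows, and the use of Python's built-in `hash` are all assumptions made for illustration only.

```python
from collections import defaultdict


def repartition_by_column(rows, column, num_partitions):
    """Hash-partition rows so that equal values of `column` share a partition.

    Hypothetical helper: a stand-in for the concept, not the SparkR API.
    """
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # The same key always hashes to the same partition index.
        index = hash(row[column]) % num_partitions
        partitions[index].append(row)
    return partitions


def apply_per_partition(partitions, func):
    """Call `func` once per non-empty partition and concatenate the results."""
    out = []
    for part in partitions:
        if part:
            out.extend(func(part))
    return out


def count_by_cyl(part):
    """Example per-partition function: count rows for each `cyl` value."""
    counts = defaultdict(int)
    for row in part:
        counts[row["cyl"]] += 1
    return [{"cyl": key, "n": n} for key, n in counts.items()]


# Made-up sample data, loosely shaped like a few rows of a car dataset.
rows = [
    {"cyl": 4, "mpg": 22.8},
    {"cyl": 6, "mpg": 21.0},
    {"cyl": 4, "mpg": 24.4},
    {"cyl": 8, "mpg": 18.7},
]

partitions = repartition_by_column(rows, "cyl", 3)
result = sorted(apply_per_partition(partitions, count_by_cyl),
                key=lambda r: r["cyl"])
print(result)
```

Because every row with a given `cyl` value lands in the same partition, the per-partition function sees each column group in full, which is the property that makes a per-partition call meaningful for grouped data.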
Showing
- R/pkg/R/DataFrame.R: 32 additions, 5 deletions
- R/pkg/R/RDD.R: 6 additions, 2 deletions
- R/pkg/R/generics.R: 1 addition, 1 deletion
- R/pkg/inst/tests/testthat/test_sparkSQL.R: 36 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala: 3 additions, 2 deletions