- May 26, 2017
Zheng RuiFeng authored
## What changes were proposed in this pull request?
1. Add an example for SparkR `decisionTree`.
2. Document it in the user guide.

## How was this patch tested?
Local submit.

Author: Zheng RuiFeng <ruifengz@foxmail.com>
Closes #18067 from zhengruifeng/dt_example.
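For context, a minimal hedged sketch of the `spark.decisionTree` call the new example covers (the dataset, formula, and parameters here are illustrative, not taken from the PR):

```r
library(SparkR)
sparkR.session()

# iris columns are renamed Sepal.Length -> Sepal_Length on conversion
df <- createDataFrame(iris)

# Fit a decision tree classifier; the R formula picks the label and features
model <- spark.decisionTree(df, Species ~ Petal_Length + Petal_Width,
                            type = "classification", maxDepth = 5)
summary(model)
head(predict(model, df))
```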
- May 04, 2017
Felix Cheung authored
[SPARK-20015][SPARKR][SS][DOC][EXAMPLE] Document R Structured Streaming (experimental) in R vignettes and R & SS programming guide, R example

## What changes were proposed in this pull request?
Add:
- R vignettes
- R programming guide
- SS programming guide
- R example

Also disable spark.als in vignettes for now since it's failing (SPARK-20402).

## How was this patch tested?
Manually.

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #17814 from felixcheung/rdocss.
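For context, a minimal sketch of the R Structured Streaming API this documents, following the standard word-count example (the socket host/port are placeholders):

```r
library(SparkR)
sparkR.session()

# Stream lines from a socket source (host/port are placeholders)
lines <- read.stream("socket", host = "localhost", port = 9999)

# Split the lines into words and count them
words <- selectExpr(lines, "explode(split(value, ' ')) as word")
wordCounts <- count(groupBy(words, "word"))

# Emit the complete counts table to the console on every trigger
query <- write.stream(wordCounts, "console", outputMode = "complete")
awaitTermination(query)
```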
- May 01, 2017
Felix Cheung authored
## What changes were proposed in this pull request?
Updating R Programming Guide.

## How was this patch tested?
Manually.

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #17816 from felixcheung/r22relnote.
- Apr 29, 2017
wangmiao1981 authored
## What changes were proposed in this pull request?
Add a hyperlink in the SparkR programming guide.

## How was this patch tested?
Built the doc and manually checked the doc link.

Author: wangmiao1981 <wm624@hotmail.com>
Closes #17805 from wangmiao1981/doc.
- Apr 28, 2017
wangmiao1981 authored
## What changes were proposed in this pull request?
Add a link to svmLinear in the SparkR programming document.

## How was this patch tested?
Built the doc manually and clicked the link to the document. It looks good.

Author: wangmiao1981 <wm624@hotmail.com>
Closes #17797 from wangmiao1981/doc.
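For reference, a hedged sketch of the `spark.svmLinear` API the new link points to (the dataset, filtering, and regParam value are illustrative):

```r
library(SparkR)
sparkR.session()

# Binary classification needs two label values, so drop one iris species
df <- createDataFrame(iris[iris$Species != "setosa", ])

model <- spark.svmLinear(df, Species ~ Sepal_Length + Sepal_Width,
                         regParam = 0.01)
summary(model)
head(predict(model, df))
```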
- Apr 27, 2017
zero323 authored
## What changes were proposed in this pull request?
Add `spark.fpGrowth` to SparkR programming guide.

## How was this patch tested?
Manual tests.

Author: zero323 <zero323@users.noreply.github.com>
Closes #17775 from zero323/SPARK-20208-FOLLOW-UP.
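A minimal sketch of the `spark.fpGrowth` usage being documented (the toy transactions and thresholds are illustrative):

```r
library(SparkR)
sparkR.session()

# Each row is a transaction; split the raw string into an array of items
df <- selectExpr(
  createDataFrame(data.frame(rawItems = c("1,2,5", "1,2,3,5", "1,2"))),
  "split(rawItems, ',') AS items")

model <- spark.fpGrowth(df, itemsCol = "items",
                        minSupport = 0.5, minConfidence = 0.6)

# Frequent itemsets and association rules mined from the transactions
head(spark.freqItemsets(model))
head(spark.associationRules(model))
```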
- Apr 26, 2017
zero323 authored
## What changes were proposed in this pull request?
- Add `rollup` and `cube` methods and corresponding generics.
- Add a short description to the vignette.

## How was this patch tested?
- Existing unit tests.
- Additional unit tests covering new features.
- `check-cran.sh`.

Author: zero323 <zero323@users.noreply.github.com>
Closes #17728 from zero323/SPARK-20437.
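A short sketch of the new methods (the grouping columns and aggregate are illustrative):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(mtcars)

# rollup: hierarchical subtotals - (cyl, gear), (cyl), and the grand total
head(agg(rollup(df, "cyl", "gear"), avg(df$mpg)))

# cube: subtotals for every combination of the grouping columns
head(agg(cube(df, "cyl", "gear"), avg(df$mpg)))
```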
- Mar 27, 2017
Yanbo Liang authored
The section `Data type mapping between R and Spark` is currently in the wrong place in the SparkR doc; we should move it to a separate section.

## What changes were proposed in this pull request?
Before/after screenshots of the doc layout were attached to the PR (images omitted).

Author: Yanbo Liang <ybliang8@gmail.com>
Closes #17440 from yanboliang/sparkr-doc.
- Dec 17, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
Reorganizing content (copy/paste).

## How was this patch tested?
https://felixcheung.github.io/sparkr-vignettes.html
Previous: https://felixcheung.github.io/sparkr-vignettes_old.html

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #16301 from felixcheung/rvignettespass2.
- Dec 08, 2016
Yanbo Liang authored
## What changes were proposed in this pull request?
- Add all R examples for ML wrappers which were added during the 2.1 release cycle.
- Split the whole `ml.R` example file into an individual example for each algorithm, which will be convenient for users to rerun.
- Add corresponding examples to the ML user guide.
- Update the ML section of the SparkR user guide.

Note: MLlib Scala/Java/Python examples will be consistent; however, SparkR examples may differ from them, since R users may use the algorithms in a different way, for example, using an R `formula` to specify `featuresCol` and `labelCol`.

## How was this patch tested?
Ran all examples manually.

Author: Yanbo Liang <ybliang8@gmail.com>
Closes #16148 from yanboliang/spark-18325.
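To illustrate the note about R `formula`, a hedged sketch of a SparkR ML wrapper call (the dataset and model are illustrative): the formula's left side is the label and its right side the features, so no explicit `featuresCol`/`labelCol` is needed.

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(iris)

# The R formula implicitly sets labelCol (left side) and featuresCol (right side)
model <- spark.glm(df, Sepal_Length ~ Sepal_Width + Petal_Length,
                   family = "gaussian")
summary(model)
```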
- Dec 04, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
If SparkR is running as a package and it has previously downloaded the Spark jar, it should be able to run as before without having to set SPARK_HOME. With this bug, the auto-installed Spark only works in the first session; this seems to be a regression from the earlier behavior.

The fix is to always try to install or check for the cached Spark if running in an interactive session. As discussed before, we should probably only install Spark iff running in an interactive session (R shell, RStudio, etc.).

## How was this patch tested?
Manually.

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #16077 from felixcheung/rsessioninteractive.
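For context, the interactive flow this fix targets, sketched with the public API (assumes no SPARK_HOME is set; the cache location is whatever `install.spark` chooses):

```r
library(SparkR)

# First interactive session: downloads and caches Spark when SPARK_HOME is unset
install.spark()
sparkR.session()

# With this fix, later interactive sessions find the cached install again
# instead of only working the first time.
```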
- Nov 23, 2016
Sean Owen authored
## What changes were proposed in this pull request?
Updates links to the wiki to links to the new location of the content on spark.apache.org.

## How was this patch tested?
Doc builds.

Author: Sean Owen <sowen@cloudera.com>
Closes #15967 from srowen/SPARK-18073.1.
- Oct 27, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
API and programming guide doc changes for Scala, Python and R.

## How was this patch tested?
Manual test.

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #15629 from felixcheung/jsondoc.
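A minimal sketch of the SparkR side of the JSON reading being documented (the path is a placeholder; by default Spark expects JSON Lines, one object per line):

```r
library(SparkR)
sparkR.session()

# path is a placeholder; each line of the file must be a JSON object
people <- read.json("examples/src/main/resources/people.json")
printSchema(people)
head(people)
```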
- Oct 21, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
Add crossJoin and do not default to a cross join if joinExpr is left out.

## How was this patch tested?
Unit test.

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #15559 from felixcheung/rcrossjoin.
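A sketch of the resulting behavior (toy tables): a Cartesian product now has to be requested explicitly with `crossJoin` rather than falling out of a `join` with no join expression.

```r
library(SparkR)
sparkR.session()

df1 <- createDataFrame(data.frame(x = 1:3))
df2 <- createDataFrame(data.frame(y = c("a", "b")))

# Explicit Cartesian product: 3 x 2 = 6 rows
head(crossJoin(df1, df2))
```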
- Sep 23, 2016
Jeff Zhang authored
## What changes were proposed in this pull request?
Spark adds sparkr.zip to the archives only when in yarn mode (SparkSubmit.scala):

```scala
if (args.isR && clusterManager == YARN) {
  val sparkRPackagePath = RUtils.localSparkRPackagePath
  if (sparkRPackagePath.isEmpty) {
    printErrorAndExit("SPARK_HOME does not exist for R application in YARN mode.")
  }
  val sparkRPackageFile = new File(sparkRPackagePath.get, SPARKR_PACKAGE_ARCHIVE)
  if (!sparkRPackageFile.exists()) {
    printErrorAndExit(s"$SPARKR_PACKAGE_ARCHIVE does not exist for R application in YARN mode.")
  }
  val sparkRPackageURI = Utils.resolveURI(sparkRPackageFile.getAbsolutePath).toString

  // Distribute the SparkR package.
  // Assigns a symbol link name "sparkr" to the shipped package.
  args.archives = mergeFileLists(args.archives, sparkRPackageURI + "#sparkr")

  // Distribute the R package archive containing all the built R packages.
  if (!RUtils.rPackages.isEmpty) {
    val rPackageFile =
      RPackageUtils.zipRLibraries(new File(RUtils.rPackages.get), R_PACKAGE_ARCHIVE)
    if (!rPackageFile.exists()) {
      printErrorAndExit("Failed to zip all the built R packages.")
    }
    val rPackageURI = Utils.resolveURI(rPackageFile.getAbsolutePath).toString
    // Assigns a symbol link name "rpkg" to the shipped package.
    args.archives = mergeFileLists(args.archives, rPackageURI + "#rpkg")
  }
}
```

So it is necessary to pass spark.master from the R process to the JVM; otherwise sparkr.zip won't be distributed to the executors. Besides that, I also pass spark.yarn.keytab/spark.yarn.principal to the Spark side, because the JVM process needs them to access a secured cluster.

## How was this patch tested?
Verified manually in RStudio using the following code:

```r
Sys.setenv(SPARK_HOME = "/Users/jzhang/github/spark")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sparkR.session(master = "yarn-client",
               sparkConfig = list(spark.executor.instances = "1"))
df <- as.DataFrame(mtcars)
head(df)
```
…

Author: Jeff Zhang <zjffdu@apache.org>
Closes #14784 from zjffdu/SPARK-17210.
- Sep 14, 2016
Sean Owen authored
## What changes were proposed in this pull request?
Point references to spark-packages.org to https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects

This will be accompanied by a parallel change to the spark-website repo, and additional changes to this wiki.

## How was this patch tested?
Jenkins tests.

Author: Sean Owen <sowen@cloudera.com>
Closes #15075 from srowen/SPARK-17445.
- Jul 25, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
Fix a broken url. Also, the sparkR.session.stop doc page should have it in the header, instead of saying "sparkR.stop". The Data type section is in the middle of a list of gapply/gapplyCollect subsections (screenshots omitted).

## How was this patch tested?
Manual test.

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #14329 from felixcheung/rdoclinkfix.
- Jul 18, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
Fix code style from an ad hoc review of the RC4 doc.

## How was this patch tested?
Manual. shivaram

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #14250 from felixcheung/rdocs2rc4.
- Jul 16, 2016
Narine Kokhlikyan authored
## What changes were proposed in this pull request?
Updates the programming guide for spark.gapply/spark.gapplyCollect. Similar to other examples, I used the `faithful` dataset to demonstrate gapply's functionality. Please let me know if you prefer another example.

## How was this patch tested?
Existing test cases in R.

Author: Narine Kokhlikyan <narine@slice.com>
Closes #14090 from NarineK/gapplyProgGuide.
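A condensed sketch of the `gapply` pattern the updated guide demonstrates on `faithful` (the aggregation and schema are illustrative):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(faithful)

# Group by waiting time and compute the max eruption duration per group;
# gapply requires the output schema to be declared up front.
schema <- structType(structField("waiting", "double"),
                     structField("max_eruption", "double"))
result <- gapply(df, "waiting",
                 function(key, x) data.frame(key, max(x$eruptions)),
                 schema)
head(collect(arrange(result, "max_eruption", decreasing = TRUE)))
```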
- Jul 13, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
Minor documentation update for a code example, code style, and a missed reference to "sparkR.init".

## How was this patch tested?
Manual. shivaram

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #14178 from felixcheung/rcsvprogrammingguide.
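The branch name suggests the CSV example in the guide was among those touched; a hedged sketch of the Spark 2.0-era CSV read in SparkR (the path and options are illustrative):

```r
library(SparkR)
sparkR.session()

# Built-in csv source; no third-party spark-csv package needed in 2.0+
df <- read.df("examples/src/main/resources/people.csv", "csv",
              header = "true", inferSchema = "true", na.strings = "NA")
head(df)
```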
- Jul 11, 2016
Yanbo Liang authored
## What changes were proposed in this pull request?
- Update the SparkR ML section to make it consistent with the SparkR API docs.
- Since #13972 adds labelling support for the `include_example` Jekyll plugin, we can split the single `ml.R` example file into multiple line blocks with different labels, and include them in different algorithms/models in the generated HTML page.

## How was this patch tested?
Only a docs update; manually checked the generated docs.

Author: Yanbo Liang <ybliang8@gmail.com>
Closes #14011 from yanboliang/r-user-guide-update.
- Jun 23, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
Updated setJobGroup, cancelJobGroup, clearJobGroup to not require sc/SparkContext as a parameter. Also updated the roxygen2 doc and R programming guide on deprecations.

## How was this patch tested?
Unit tests.

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #13838 from felixcheung/rjobgroup.
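A sketch of the updated call shape (the group name, description, and flag are illustrative); before this change each of these calls took `sc` as its first argument:

```r
library(SparkR)
sparkR.session()

# No SparkContext argument needed after this change
setJobGroup("myJobGroup", "a demo job group", interruptOnCancel = TRUE)
# ... trigger some jobs here ...
cancelJobGroup("myJobGroup")
clearJobGroup()
```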
- Jun 22, 2016
Kai Jiang authored
## What changes were proposed in this pull request?
Guide for:
- UDFs with dapply, dapplyCollect
- spark.lapply for running parallel R functions

## How was this patch tested?
Built locally. (Screenshot omitted.)

Author: Kai Jiang <jiangkai@gmail.com>
Closes #13660 from vectorijk/spark-15672-R-guide-update.
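A condensed sketch of the two features the guide covers (the schema and functions are illustrative):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(mtcars)

# dapply: run an R function on each partition; the output schema is declared
schema <- structType(structField("mpg", "double"),
                     structField("mpg2", "double"))
df2 <- dapply(df, function(x) data.frame(mpg = x$mpg, mpg2 = x$mpg * 2),
              schema)
head(df2)

# spark.lapply: run an R function over a local list in parallel on the cluster
res <- spark.lapply(1:4, function(i) i * i)
str(res)
```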
- Jun 21, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
Update the doc as per discussion in PR #13592.

## How was this patch tested?
Manual. shivaram liancheng

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #13799 from felixcheung/rsqlprogrammingguide.
- Jun 20, 2016
Felix Cheung authored
## What changes were proposed in this pull request?
roxygen2 doc, programming guide, and example updates.

## How was this patch tested?
Manual checks. shivaram

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #13751 from felixcheung/rsparksessiondoc.
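The branch name points at the `sparkR.session` docs; a minimal sketch of the 2.0 entry point (the app name and config are illustrative):

```r
library(SparkR)

# Single entry point in Spark 2.0, replacing sparkR.init/sparkRSQL.init
sparkR.session(appName = "my-sparkr-app",
               sparkConfig = list(spark.driver.memory = "2g"))

df <- as.DataFrame(faithful)
head(df)

sparkR.session.stop()
```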
- Jun 17, 2016
GayathriMurali authored
## What changes were proposed in this pull request?
Make user guide changes to the SparkR documentation for all changes that happened in 2.0 to the Machine Learning APIs.

Author: GayathriMurali <gayathri.m@intel.com>
Closes #13285 from GayathriMurali/SPARK-15129.
- May 26, 2016
felixcheung authored
## What changes were proposed in this pull request?
Follow-up on the earlier PR: here we are fixing up the roxygen2 doc examples. Also adds to the programming guide migration section.

## How was this patch tested?
SparkR tests.

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #13340 from felixcheung/sqlcontextdoc.
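The branch name suggests the migration notes cover the SQLContext deprecation; a hedged before/after sketch:

```r
library(SparkR)

# Before Spark 2.0: methods took an explicit sqlContext
# sqlContext <- sparkRSQL.init(sc)
# df <- createDataFrame(sqlContext, faithful)

# Spark 2.0+: the session is implicit
sparkR.session()
df <- createDataFrame(faithful)
```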
- May 25, 2016
Krishna Kalyan authored
## What changes were proposed in this pull request?
Under the "Upgrading From SparkR 1.5.x to 1.6.x" section, added the information that Spark SQL converts `NA` in R to `null`.

## How was this patch tested?
Document update; no tests.

Author: Krishna Kalyan <krishnakalyan3@gmail.com>
Closes #13268 from krishnakalyan3/spark-12071-1.
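A small sketch of the documented conversion (toy data): an R `NA` becomes a SQL `null` once the data is in a SparkDataFrame, so it answers to `isNull` on the Spark side.

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(data.frame(x = c(1, NA, 3)))

# The R NA arrives as a SQL null
head(filter(df, isNull(df$x)))
```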
- May 09, 2016
Yanbo Liang authored
## What changes were proposed in this pull request?
- Since Spark now has a native csv reader, it is not necessary to use the third-party `spark-csv` in `examples/src/main/r/data-manipulation.R`. Meanwhile, remove all `spark-csv` usage in SparkR.
- Running R applications through `sparkR` is not supported as of Spark 2.0, so we change to using `./bin/spark-submit` to run the example.

## How was this patch tested?
Offline test.

Author: Yanbo Liang <ybliang8@gmail.com>
Closes #13005 from yanboliang/r-df-examples.
- Apr 25, 2016
Dongjoon Hyun authored
## What changes were proposed in this pull request?
This issue aims to fix some errors in R examples and make them up-to-date in the docs and example modules.

- Remove the wrong usage of `map`. We need to use `lapply` in `sparkR` if needed; however, `lapply` is private so far. The corrected example will be added later.
- Fix the wrong example in the `Generic Load/Save Functions` section of `docs/sql-programming-guide.md` for consistency.
- Fix datatypes in `sparkr.md`.
- Update a data result in `sparkr.md`.
- Replace deprecated functions to remove warnings: jsonFile -> read.json, parquetFile -> read.parquet.
- Use up-to-date R-like functions: loadDF -> read.df, saveDF -> write.df, saveAsParquetFile -> write.parquet.
- Replace `SparkR DataFrame` with `SparkDataFrame` in `dataframe.R` and `data-manipulation.R`.
- Other minor syntax fixes and a typo.

## How was this patch tested?
Manual.

Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #12649 from dongjoon-hyun/SPARK-14883.
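For reference, a hedged sketch of the deprecated-to-current renames listed above (paths are placeholders):

```r
library(SparkR)
sparkR.session()

# jsonFile(sqlContext, path)    ->  read.json(path)
# parquetFile(sqlContext, path) ->  read.parquet(path)
# loadDF / saveDF               ->  read.df / write.df
# saveAsParquetFile(df, path)   ->  write.parquet(df, path)

people <- read.json("examples/src/main/resources/people.json")
write.parquet(people, "/tmp/people.parquet")
df <- read.parquet("/tmp/people.parquet")
```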
- Apr 23, 2016
felixcheung authored
## What changes were proposed in this pull request?
Fixed inadvertent roxygen2 doc changes, and added the class name change to the programming guide. Follow-up of #12621.

## How was this patch tested?
Manually checked.

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #12647 from felixcheung/rdataframe.
- Jan 19, 2016
felixcheung authored
shivaram, sorry it took longer to fix some conflicts; this is the change to add an alias for `table`.

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #10406 from felixcheung/readtable.
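Assuming the alias added here is `tableToDF` (the view name below and the 2.x `createOrReplaceTempView` spelling are illustrative), the call shape would be:

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(faithful)
createOrReplaceTempView(df, "faithful_tbl")

# Read a registered table/view back as a SparkDataFrame by name
df2 <- tableToDF("faithful_tbl")
head(df2)
```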
- Jan 04, 2016
felixcheung authored
Checked that the change is in Spark 1.6.0. shivaram

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #10574 from felixcheung/rwritemodedoc.
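The branch name suggests this documents the write mode; a hedged sketch of the mode argument (the path is a placeholder):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(faithful)

# mode is one of "error" (the default), "overwrite", "append", "ignore"
write.df(df, path = "/tmp/faithful_parquet", source = "parquet",
         mode = "overwrite")
```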
- Dec 16, 2015
Jeff Zhang authored
shivaram, please help review.

Author: Jeff Zhang <zjffdu@apache.org>
Closes #10290 from zjffdu/SPARK-12318.
- Dec 03, 2015
felixcheung authored
shivaram

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #10119 from felixcheung/rdocdplyrmasked.
- Nov 19, 2015
felixcheung authored
[SPARK-11339][SPARKR] Document the list of functions in R base package that are masked by functions with same name in SparkR

Added tests for functions that are reported as masked, to make sure the base:: or stats:: function can be called. For those we can't call, added them to the SparkR programming guide. It would seem to me that `table, sample, subset, filter, cov` not working is not actually expected - I investigated/experimented with them but couldn't get them to work. It looks like, as they are defined in base or stats, they are missing the S3 generic, e.g.:

```r
> methods("transform")
[1] transform,ANY-method       transform.data.frame
[3] transform,DataFrame-method transform.default
see '?methods' for accessing help and source code
> methods("subset")
[1] subset.data.frame       subset,DataFrame-method subset.default
[4] subset.matrix
see '?methods' for accessing help and source code
Warning message:
In .S3methods(generic.function, class, parent.frame()) :
  function 'subset' appears not to be S3 generic; found functions that look like S3 methods
```

Any idea?

More information on masking:
http://www.ats.ucla.edu/stat/r/faq/referencing_objects.htm
http://www.sfu.ca/~sweldon/howTo/guide4.pdf

This is what the output doc looks like (minus css): (screenshot omitted)

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #9785 from felixcheung/rmasked.
- Nov 18, 2015
Yanbo Liang authored
This PR includes:
- Update SparkR:::glm, SparkR:::summary API docs.
- Update the SparkR machine learning user guide and example code to show:
  - support for feature interaction in R formulas.
  - summary for gaussian GLM models.
  - coefficients for binomial GLM models.

mengxr

Author: Yanbo Liang <ybliang8@gmail.com>
Closes #9727 from yanboliang/spark-11684.
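To make the listed features concrete, a hedged sketch against the 1.6-era SparkR API (the dataset and interaction term are illustrative):

```r
library(SparkR)
sc <- sparkR.init()
sqlContext <- sparkRSQL.init(sc)

df <- createDataFrame(sqlContext, iris)

# Feature interaction expressed in the R formula; gaussian GLM
model <- glm(Sepal_Length ~ Sepal_Width + Sepal_Width:Petal_Length,
             data = df, family = "gaussian")

# summary() reports the coefficients of the fitted model
summary(model)
```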
- Nov 03, 2015
felixcheung authored
(Screenshot of the rendered guide omitted; this is without css, but you get the idea.) shivaram

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #9401 from felixcheung/rstudioprogrammingguide.
- Oct 30, 2015
felixcheung authored
[SPARK-11340][SPARKR] Support setting driver properties when starting Spark from R programmatically or from RStudio

Mapping spark.driver.memory from sparkEnvir to spark-submit command line arguments.

shivaram suggested that we possibly add other spark.driver.* properties - do we want to add all of those? I thought those could be set in SparkConf? sun-rui

Author: felixcheung <felixcheung_m@hotmail.com>
Closes #9290 from felixcheung/rdrivermem.
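A sketch of the call this enables from R or RStudio (the memory value is illustrative); `spark.driver.memory` must reach spark-submit before the JVM launches, which is why it is mapped from `sparkEnvir` rather than set on SparkConf afterwards:

```r
library(SparkR)

# Driver properties must be set at launch time; setting them later is too late
sc <- sparkR.init(master = "local[*]",
                  sparkEnvir = list(spark.driver.memory = "2g"))
```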
- Aug 11, 2015
Eric Liang authored
This documents the use of R model formulae in the SparkR guide. Also fixes some bugs in the R API doc. mengxr

Author: Eric Liang <ekl@databricks.com>
Closes #8085 from ericl/docs.