  1. May 26, 2017
    • [SPARK-20849][DOC][SPARKR] Document R DecisionTree · a97c4970
      Zheng RuiFeng authored
      ## What changes were proposed in this pull request?
      1, Add an example for SparkR `decisionTree`
      2, Document it in the user guide
      
      ## How was this patch tested?
      local submit
      
      Author: Zheng RuiFeng <ruifengz@foxmail.com>
      
      Closes #18067 from zhengruifeng/dt_example.
      a97c4970
  2. Feb 28, 2017
  3. Feb 21, 2017
  4. Dec 08, 2016
    • [SPARK-18325][SPARKR][ML] SparkR ML wrappers example code and user guide · 9bf8f3cd
      Yanbo Liang authored
      ## What changes were proposed in this pull request?
      * Add all R examples for ML wrappers which were added during 2.1 release cycle.
      * Split the whole ```ml.R``` example file into an individual example for each algorithm, which makes it convenient for users to rerun them.
      * Add corresponding examples to ML user guide.
      * Update ML section of SparkR user guide.
      
      Note: MLlib Scala/Java/Python examples will be consistent; however, SparkR examples may differ from them, since R users may use the algorithms in a different way, for example, using an R ```formula``` to specify ```featuresCol``` and ```labelCol```.
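
To illustrate the note above, here is a minimal sketch in plain Python of how an R-style formula maps onto a label column and a list of feature columns. It is not SparkR's or Spark's parser: `parse_formula` is a hypothetical helper, the column names are made up, and only the simple `label ~ a + b` form is handled (no `.`, `-`, or interaction terms).

```python
def parse_formula(formula):
    """Split a simple 'label ~ f1 + f2' formula into (label, [features]).

    Hypothetical helper for illustration only; real R formulas support
    many more operators than '+'.
    """
    label, separator, right_hand_side = formula.partition("~")
    if not separator or not label.strip():
        raise ValueError("expected a formula of the form 'label ~ f1 + f2'")
    # Each '+'-separated term on the right-hand side names one feature column.
    features = [term.strip() for term in right_hand_side.split("+") if term.strip()]
    return label.strip(), features


label_col, feature_cols = parse_formula("Species ~ Sepal_Length + Sepal_Width")
print(label_col, feature_cols)
```

In the Scala/Java/Python APIs the same information is typically passed as separate label and features column parameters, which is why the SparkR examples can look different.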
      
      ## How was this patch tested?
      Run all examples manually.
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #16148 from yanboliang/spark-18325.
      9bf8f3cd
  5. Dec 07, 2016
    • [SPARK-18633][ML][EXAMPLE] Add multiclass logistic regression summary python example and document · aad11209
      wm624@hotmail.com authored
      ## What changes were proposed in this pull request?
      The Logistic Regression summary has been added to the Python API. We need to add an example and documentation for the summary.
      
      The newly added example is consistent with Scala and Java examples.
      
      ## How was this patch tested?
      
      Manual tests: run the example with spark-submit; copy and paste the code into pyspark; build the documentation and check it.
      
      Author: wm624@hotmail.com <wm624@hotmail.com>
      
      Closes #16064 from wangmiao1981/py.
      aad11209
  6. Dec 05, 2016
  7. Nov 17, 2016
    • [SPARK-18480][DOCS] Fix wrong links for ML guide docs · cdaf4ce9
      Zheng RuiFeng authored
      ## What changes were proposed in this pull request?
      1, There are two `[Graph.partitionBy]` in `graphx-programming-guide.md`; the first one had no effect.
      2, `DataFrame`, `Transformer`, `Pipeline` and `Parameter` in `ml-pipeline.md` were linked to `ml-guide.html` by mistake.
      3, `PythonMLLibAPI` in `mllib-linear-methods.md` was not accessible, because class `PythonMLLibAPI` is private.
      4, Other link updates.
      ## How was this patch tested?
       manual tests
      
      Author: Zheng RuiFeng <ruifengz@foxmail.com>
      
      Closes #15912 from zhengruifeng/md_fix.
      cdaf4ce9
  8. Nov 16, 2016
  9. Nov 08, 2016
  10. Oct 05, 2016
    • [SPARK-17239][ML][DOC] Update user guide for multiclass logistic regression · 9df54f53
      sethah authored
      ## What changes were proposed in this pull request?
      Updates user guide to reflect that LogisticRegression now supports multiclass. Also adds new examples to show multiclass training.
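
As a rough illustration of what multiclass support means here, the sketch below shows multinomial (softmax) prediction in plain Python. It is not Spark code: the function names, weights, and intercepts are invented for the example, and it assumes one weight row and one intercept per class.

```python
import math


def softmax(margins):
    """Turn per-class margins into probabilities that sum to 1."""
    shift = max(margins)  # subtract the max for numerical stability
    exps = [math.exp(m - shift) for m in margins]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, intercepts, features):
    """Return (predicted class index, class probabilities).

    weights: one row of coefficients per class (hypothetical values).
    intercepts: one intercept per class.
    """
    margins = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, intercepts)
    ]
    probabilities = softmax(margins)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best, probabilities


# Three classes, two features; all numbers are made up.
weights = [[1.0, -1.0], [0.0, 0.5], [-1.0, 1.0]]
intercepts = [0.0, 0.1, -0.1]
label, probs = predict(weights, intercepts, [2.0, 0.5])
print(label, probs)
```

With two classes this reduces to ordinary binary logistic regression up to a reparameterization of the coefficients.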
      
      ## How was this patch tested?
      Ran locally using spark-submit, run-example, and copy/paste from user guide into shells. Generated docs and verified correct output.
      
      Author: sethah <seth.hendrickson16@gmail.com>
      
      Closes #15349 from sethah/SPARK-17239.
      9df54f53
  11. Jul 15, 2016
    • [SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLlib guide · 5ffd5d38
      Joseph K. Bradley authored
      ## What changes were proposed in this pull request?
      
      Made DataFrame-based API primary
      * Spark doc menu bar and other places now link to ml-guide.html, not mllib-guide.html
      * mllib-guide.html keeps RDD-specific list of features, with a link at the top redirecting people to ml-guide.html
      * ml-guide.html includes a "maintenance mode" announcement about the RDD-based API
        * **Reviewers: please check this carefully**
      * (minor) Titles for DF API no longer include "- spark.ml" suffix.  Titles for RDD API have "- RDD-based API" suffix
      * Moved migration guide to ml-guide from mllib-guide
        * Also moved past guides from mllib-migration-guides to ml-migration-guides, with a redirect link on mllib-migration-guides
        * **Reviewers**: I did not change any of the content of the migration guides.
      
      Reorganized DataFrame-based guide:
      * ml-guide.html mimics the old mllib-guide.html page in terms of content: overview, migration guide, etc.
      * Moved Pipeline description into ml-pipeline.html and moved tuning into ml-tuning.html
        * **Reviewers**: I did not change the content of these guides, except some intro text.
      * Sidebar remains the same, but with pipeline and tuning sections added
      
      Other:
      * ml-classification-regression.html: Moved text about linear methods to new section in page
      
      ## How was this patch tested?
      
      Generated docs locally
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #14213 from jkbradley/ml-guide-2.0.
      5ffd5d38
  12. Jun 16, 2016
    • [SPARK-15608][ML][EXAMPLES][DOC] add examples and documents of ml.isotonic regression · 9040d83b
      WeichenXu authored
      ## What changes were proposed in this pull request?
      
      add ml doc for ml isotonic regression
      add scala example for ml isotonic regression
      add java example for ml isotonic regression
      add python example for ml isotonic regression
      
      modify scala example for mllib isotonic regression
      modify java example for mllib isotonic regression
      modify python example for mllib isotonic regression
      
      add data/mllib/sample_isotonic_regression_libsvm_data.txt
      delete data/mllib/sample_isotonic_regression_data.txt
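
For readers unfamiliar with the algorithm these examples cover, here is a minimal sketch of isotonic regression using the pool adjacent violators approach, in plain Python. It is an illustration only, not the code from the Spark examples; `isotonic_fit` is a hypothetical name, and the inputs are assumed to be already sorted by feature value.

```python
def isotonic_fit(ys, weights=None):
    """Fit a non-decreasing sequence to ys by pooling adjacent violators.

    ys must already be ordered by the feature value. Returns one fitted
    value per input point.
    """
    if weights is None:
        weights = [1.0] * len(ys)
    blocks = []  # each block is [weighted mean, total weight, point count]
    for y, w in zip(ys, weights):
        blocks.append([float(y), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, weight2, count2 = blocks.pop()
            mean1, weight1, count1 = blocks.pop()
            total = weight1 + weight2
            pooled = (mean1 * weight1 + mean2 * weight2) / total
            blocks.append([pooled, total, count1 + count2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


print(isotonic_fit([1, 3, 2, 4]))
```

The two middle points violate the ordering, so they are pooled into their average while the first and last points are left unchanged.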
      ## How was this patch tested?
      
      N/A
      
      Author: WeichenXu <WeichenXu123@outlook.com>
      
      Closes #13381 from WeichenXu123/add_isotonic_regression_doc.
      9040d83b
  13. Jun 11, 2016
    • [SPARK-15883][MLLIB][DOCS] Fix broken links in mllib documents · ad102af1
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This issue fixes all broken links in the Spark 2.0 preview MLlib documents. It also contains some editorial changes.
      
      **Fix broken links**
        * mllib-data-types.md
        * mllib-decision-tree.md
        * mllib-ensembles.md
        * mllib-feature-extraction.md
        * mllib-pmml-model-export.md
        * mllib-statistics.md
      
      **Fix malformed section header and Scala coding style**
        * mllib-linear-methods.md
      
      **Replace indirect forward links with direct ones**
        * ml-classification-regression.md
      
      ## How was this patch tested?
      
      Manual tests (with `cd docs; jekyll build`.)
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #13608 from dongjoon-hyun/SPARK-15883.
      ad102af1
  14. Jun 07, 2016
    • [SPARK-13590][ML][DOC] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference · 6ecedf39
      Yanbo Liang authored
      ## What changes were proposed in this pull request?
      When fitting ```LinearRegressionModel``` (with the "l-bfgs" solver) and ```LogisticRegressionModel``` without an intercept on a dataset with a constant nonzero column, spark.ml produces the same model as R glmnet but a different one from LIBSVM.
      
      When fitting ```AFTSurvivalRegressionModel``` without an intercept on a dataset with a constant nonzero column, spark.ml produces a different model from R survival::survreg.
      
      We should output a warning message and clarify this condition in the documentation.
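
The condition described above (a constant nonzero feature column) can be checked on a small dataset before fitting. The following is a minimal sketch in plain Python, not the check Spark performs; `constant_nonzero_columns` is a hypothetical helper and the rows are assumed to be dense lists of equal length.

```python
def constant_nonzero_columns(rows):
    """Return the indices of columns whose value is the same nonzero
    number in every row."""
    if not rows:
        return []
    indices = []
    for j in range(len(rows[0])):
        first = rows[0][j]
        # A column qualifies only if it is nonzero and never changes.
        if first != 0 and all(row[j] == first for row in rows):
            indices.append(j)
    return indices


data = [
    [1.0, 5.0, 0.0],
    [2.0, 5.0, 0.0],
    [3.0, 5.0, 0.0],
]
print(constant_nonzero_columns(data))
```

Here only column 1 is reported: column 0 varies, and column 2 is constant but zero.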
      
      ## How was this patch tested?
      Document change, no unit test.
      
      cc mengxr
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #12731 from yanboliang/spark-13590.
      6ecedf39
  15. May 27, 2016
    • [SPARK-15186][ML][DOCS] Add user guide for generalized linear regression · c96244f5
      sethah authored
      ## What changes were proposed in this pull request?
      
      This patch adds a user guide section for generalized linear regression and includes the examples from [#12754](https://github.com/apache/spark/pull/12754).
      
      ## How was this patch tested?
      
      Documentation only, no tests required.
      
      ## Approach
      
      In general, it is a bit unclear what level of detail ought to be included in the user guide since there is a lot of variability within the current user guide. I tried to give a fairly brief mathematical introduction to GLMs, and cover what types of problems they could be used for. Additionally, I included a brief blurb on the IRLS solver. The input/output columns are given in a table as is found elsewhere in the docs (though, again, these appear rather intermittently in the current docs), as well as a table providing the supported families and their link functions.
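
To make the IRLS idea mentioned above concrete, here is a minimal sketch of iteratively reweighted least squares for a Poisson GLM with a log link, one feature, and an intercept, in plain Python. It is an illustration under those assumptions, not Spark's solver; `fit_poisson_irls`, the fixed iteration count, and the data are all made up for the example.

```python
import math


def fit_poisson_irls(xs, ys, iterations=25):
    """Fit y ~ Poisson(exp(b0 + b1 * x)) by IRLS; returns (b0, b1)."""
    # Start from an intercept-only fit: exp(b0) equals the mean of ys.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        sw = swx = swxx = swz = swxz = 0.0
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x            # linear predictor
            mu = math.exp(eta)           # inverse log link
            w = mu                       # IRLS weight for Poisson with log link
            z = eta + (y - mu) / mu      # working response
            sw += w
            swx += w * x
            swxx += w * x * x
            swz += w * z
            swxz += w * x * z
        # Solve the 2x2 weighted least squares normal equations.
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 4.0, 7.0, 12.0]
b0, b1 = fit_poisson_irls(xs, ys)
print(b0, b1)
```

Each pass solves a weighted least squares problem whose weights and working response depend on the current fit; other families and link functions change only how `mu`, `w`, and `z` are computed.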
      
      Author: sethah <seth.hendrickson16@gmail.com>
      
      Closes #13139 from sethah/SPARK-15186.
      c96244f5
  16. May 20, 2016
    • [SPARK-15394][ML][DOCS] User guide typos and grammar audit · 5e203505
      sethah authored
      ## What changes were proposed in this pull request?
      
      Correct some typos and incorrectly worded sentences.
      
      ## How was this patch tested?
      
      Doc changes only.
      
      Note that many of these changes were identified by whomfire01
      
      Author: sethah <seth.hendrickson16@gmail.com>
      
      Closes #13180 from sethah/ml_guide_audit.
      5e203505
  17. May 11, 2016
    • [SPARK-15141][EXAMPLE][DOC] Update OneVsRest Examples · ad1a8466
      Zheng RuiFeng authored
      ## What changes were proposed in this pull request?
      1, Add Python example for OneVsRest
      2, Remove args-parsing
      
      ## How was this patch tested?
      manual tests
      `./bin/spark-submit examples/src/main/python/ml/one_vs_rest_example.py`
      
      Author: Zheng RuiFeng <ruifengz@foxmail.com>
      
      Closes #12920 from zhengruifeng/ovr_pe.
      ad1a8466
  18. Apr 13, 2016
  19. Feb 22, 2016
  20. Feb 03, 2016
  21. Jan 05, 2016
  22. Dec 10, 2015
    • [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation. · 2ecbe02d
      Timothy Hunter authored
      
      Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in Spark).
      
      It also removes some files that I forgot to delete with #10207
      
      Author: Timothy Hunter <timhunter@databricks.com>
      
      Closes #10234 from thunterdb/12212.
      2ecbe02d
  23. Dec 08, 2015