  1. Nov 17, 2016
    • [SPARK-18480][DOCS] Fix wrong links for ML guide docs · 536a2159
      Zheng RuiFeng authored
      
      ## What changes were proposed in this pull request?
      1. There are two `[Graph.partitionBy]` links in `graphx-programming-guide.md`; the first one had no effect.
      2. `DataFrame`, `Transformer`, `Pipeline` and `Parameter` in `ml-pipeline.md` were linked to `ml-guide.html` by mistake.
      3. `PythonMLLibAPI` in `mllib-linear-methods.md` was not accessible, because class `PythonMLLibAPI` is private.
      4. Other link updates.
      ## How was this patch tested?
       manual tests
      
      Author: Zheng RuiFeng <ruifengz@foxmail.com>
      
      Closes #15912 from zhengruifeng/md_fix.
      
      (cherry picked from commit cdaf4ce9)
      Signed-off-by: Sean Owen <sowen@cloudera.com>
  2. Oct 03, 2016
  3. Jul 15, 2016
    • [SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLlib guide · 5ffd5d38
      Joseph K. Bradley authored
      ## What changes were proposed in this pull request?
      
      Made DataFrame-based API primary
      * Spark doc menu bar and other places now link to ml-guide.html, not mllib-guide.html
      * mllib-guide.html keeps RDD-specific list of features, with a link at the top redirecting people to ml-guide.html
      * ml-guide.html includes a "maintenance mode" announcement about the RDD-based API
        * **Reviewers: please check this carefully**
      * (minor) Titles for DF API no longer include "- spark.ml" suffix.  Titles for RDD API have "- RDD-based API" suffix
      * Moved migration guide to ml-guide from mllib-guide
        * Also moved past guides from mllib-migration-guides to ml-migration-guides, with a redirect link on mllib-migration-guides
        * **Reviewers**: I did not change any of the content of the migration guides.
      
      Reorganized DataFrame-based guide:
      * ml-guide.html mimics the old mllib-guide.html page in terms of content: overview, migration guide, etc.
      * Moved Pipeline description into ml-pipeline.html and moved tuning into ml-tuning.html
        * **Reviewers**: I did not change the content of these guides, except some intro text.
      * Sidebar remains the same, but with pipeline and tuning sections added
      
      Other:
      * ml-classification-regression.html: Moved text about linear methods to new section in page
      
      ## How was this patch tested?
      
      Generated docs locally
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #14213 from jkbradley/ml-guide-2.0.
  4. Jun 11, 2016
    • [SPARK-15883][MLLIB][DOCS] Fix broken links in mllib documents · ad102af1
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR fixes all broken links in the Spark 2.0 preview MLlib documents. It also contains some editorial changes.
      
      **Fix broken links**
        * mllib-data-types.md
        * mllib-decision-tree.md
        * mllib-ensembles.md
        * mllib-feature-extraction.md
        * mllib-pmml-model-export.md
        * mllib-statistics.md
      
      **Fix malformed section header and Scala coding style**
        * mllib-linear-methods.md
      
      **Replace indirect forward links with direct ones**
        * ml-classification-regression.md
      
      ## How was this patch tested?
      
      Manual tests (with `cd docs; jekyll build`.)
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #13608 from dongjoon-hyun/SPARK-15883.
  5. Feb 26, 2016
    • [SPARK-11381][DOCS] Replace example code in mllib-linear-methods.md using include_example · 7af0de07
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR replaces the example code in `mllib-linear-methods.md` with `include_example`
      by doing the following:
        * Extracts the example code (Scala, Java, Python) as files in the `example` module.
        * Merges some dialog-style examples into a single file.
        * Hides redundant code in HTML for consistency with other docs.
      
      ## How was this patch tested?
      
      manual test.
      This PR can be tested by document generations, `SKIP_API=1 jekyll build`.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11320 from dongjoon-hyun/SPARK-11381.
  6. Jan 12, 2016
  7. Dec 10, 2015
    • [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation. · 2ecbe02d
      Timothy Hunter authored
      
      Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in Spark).
      
      It also removes some files that I forgot to delete with #10207
      
      Author: Timothy Hunter <timhunter@databricks.com>
      
      Closes #10234 from thunterdb/12212.
  8. Nov 26, 2015
  9. Oct 15, 2015
  10. Oct 07, 2015
  11. Aug 18, 2015
  12. Jul 15, 2015
    • [SPARK-7555] [DOCS] Add doc for elastic net in ml-guide and mllib-guide · 303c1201
      Shuo Xiang authored
      jkbradley I put the elastic net under the **Algorithm guide** section. I also added the elastic net formula in `mllib-linear-methods#regularizers`.
      
      dbtsai I left the code tab for you to add example code. Do you think it is the right place?
      
      Author: Shuo Xiang <shuoxiangpub@gmail.com>
      
      Closes #6504 from coderxiang/elasticnet and squashes the following commits:
      
      f6061ee [Shuo Xiang] typo
      90a7c88 [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elasticnet
      0610a36 [Shuo Xiang] move out the elastic net to ml-linear-methods
      8747190 [Shuo Xiang] merge master
      706d3f7 [Shuo Xiang] add python code
      9bc2b4c [Shuo Xiang] typo
      db32a60 [Shuo Xiang] java code sample
      aab3b3a [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elasticnet
      a0dae07 [Shuo Xiang] simplify code
      d8616fd [Shuo Xiang] Update the definition of elastic net. Add scala code; Mention Lasso and Ridge
      df5bd14 [Shuo Xiang] use wikipedia page in ml-linear-methods.md
      78d9366 [Shuo Xiang] address comments
      8ce37c2 [Shuo Xiang] Merge branch 'elasticnet' of github.com:coderxiang/spark into elasticnet
      8f24848 [Shuo Xiang] Merge branch 'elastic-net-doc' of github.com:coderxiang/spark into elastic-net-doc
      998d766 [Shuo Xiang] Merge branch 'elastic-net-doc' of github.com:coderxiang/spark into elastic-net-doc
      89f10e4 [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elastic-net-doc
      9262a72 [Shuo Xiang] update
      7e07d12 [Shuo Xiang] update
      b32f21a [Shuo Xiang] add doc for elastic net in sparkml
      937eef1 [Shuo Xiang] Merge remote-tracking branch 'upstream/master' into elastic-net-doc
      180b496 [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
      aa0717d [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
      5f109b4 [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
      c5c5bfe [Shuo Xiang] Merge remote-tracking branch 'upstream/master'
      98804c9 [Shuo Xiang] fix bug in topBykey and update test
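
The commit above mentions an elastic net formula and its relation to Lasso and Ridge, but the formula itself is not shown in this log. As a rough sketch only, assuming the common parameterization in which a mixing parameter `alpha` blends an L1 term with half of a squared L2 term, scaled by a regularization strength, the penalty could be computed as follows. The function name `elastic_net_penalty` and its parameters are hypothetical and not taken from Spark.

```python
from typing import Sequence


def elastic_net_penalty(weights: Sequence[float], reg_param: float, alpha: float) -> float:
    """Assumed form: reg_param * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    l1 = sum(abs(w) for w in weights)          # L1 norm
    l2_squared = sum(w * w for w in weights)   # squared L2 norm
    return reg_param * (alpha * l1 + (1.0 - alpha) * 0.5 * l2_squared)


weights = [0.5, -2.0, 0.0]
lasso = elastic_net_penalty(weights, reg_param=0.1, alpha=1.0)  # L1 term only
ridge = elastic_net_penalty(weights, reg_param=0.1, alpha=0.0)  # L2 term only
mixed = elastic_net_penalty(weights, reg_param=0.1, alpha=0.5)  # blend of both
```

Under this assumed form, `alpha = 1` reduces to a Lasso-style penalty and `alpha = 0` to a Ridge-style penalty, which matches the squashed commit note about mentioning Lasso and Ridge.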
  13. Jul 01, 2015
    • [SPARK-8308] [MLLIB] add missing save load for python example · 20129133
      Yuhao Yang authored
      jira: https://issues.apache.org/jira/browse/SPARK-8308
      
      1. Add some missing save/load calls in Python examples: LogisticRegression, LinearRegression and NaiveBayes.
      2. Tune down iterations for MatrixFactorization, since the current number will trigger a StackOverflow under the default Java configuration (>1M).
      
      Author: Yuhao Yang <hhbyyh@gmail.com>
      
      Closes #6760 from hhbyyh/docUpdate and squashes the following commits:
      
      9bd3383 [Yuhao Yang] update scala example
      8a44692 [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into docUpdate
      077cbb8 [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into docUpdate
      3e948dc [Yuhao Yang] add missing save load for python example
  14. Jun 30, 2015
    • [SPARK-4127] [MLLIB] [PYSPARK] Python bindings for StreamingLinearRegressionWithSGD · 45281664
      MechCoder authored
      Python bindings for StreamingLinearRegressionWithSGD
      
      Author: MechCoder <manojkumarsivaraj334@gmail.com>
      
      Closes #6744 from MechCoder/spark-4127 and squashes the following commits:
      
      d8f6457 [MechCoder] Moved StreamingLinearAlgorithm to pyspark.mllib.regression
      d47cc24 [MechCoder] Inherit from StreamingLinearAlgorithm
      1b4ddd6 [MechCoder] minor
      4de6c68 [MechCoder] Minor refactor
      5e85a3b [MechCoder] Add tests for simultaneous training and prediction
      fb27889 [MechCoder] Add example and docs
      505380b [MechCoder] Add tests
      d42bdae [MechCoder] [SPARK-4127] Python bindings for StreamingLinearRegressionWithSGD
  15. Jun 03, 2015
  16. May 21, 2015
    • [DOCS] [MLLIB] Fixing broken link in MLlib Linear Methods documentation. · e4136ea6
      Mike Dusenberry authored
      Just a small change: fixed a broken link in the MLlib Linear Methods documentation by removing a newline character between the link title and link address.
      
      Author: Mike Dusenberry <dusenberrymw@gmail.com>
      
      Closes #6340 from dusenberrymw/Fix_MLlib_Linear_Methods_link and squashes the following commits:
      
      0a57818 [Mike Dusenberry] Fixing broken link in MLlib Linear Methods documentation.
  17. Apr 18, 2015
    • Fixed doc · 729885ec
      Gaurav Nanda authored
      Just fixed a doc.
      
      Author: Gaurav Nanda <gaurav324@gmail.com>
      
      Closes #5576 from gaurav324/master and squashes the following commits:
      
      8a7323f [Gaurav Nanda] Fixed doc
  18. Mar 03, 2015
  19. Mar 02, 2015
    • [SPARK-5537] Add user guide for multinomial logistic regression · 9d6c5aee
      Xiangrui Meng authored
      This is based on #4801 from dbtsai. The linear method guide is re-organized a little bit for this change.
      
      Closes #4801
      
      Author: Xiangrui Meng <meng@databricks.com>
      Author: DB Tsai <dbtsai@alpinenow.com>
      
      Closes #4861 from mengxr/SPARK-5537 and squashes the following commits:
      
      47af0ac [Xiangrui Meng] update user guide for multinomial logistic regression
      cdc2e15 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into AlpineNow-mlor-doc
      096d0ca [DB Tsai] first commit
  20. Feb 27, 2015
    • [SPARK-4587] [mllib] [docs] Fixed save,load calls in ML guide examples · d17cb2ba
      Joseph K. Bradley authored
      Should pass spark context to save/load
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #4816 from jkbradley/ml-io-doc-fix and squashes the following commits:
      
      83d369d [Joseph K. Bradley] added comment to save,load parts of ML guide examples
      2841170 [Joseph K. Bradley] Fixed save,load calls in ML guide examples
  21. Feb 25, 2015
    • [SPARK-5974] [SPARK-5980] [mllib] [python] [docs] Update ML guide with save/load, Python GBT · d20559b1
      Joseph K. Bradley authored
      * Add GradientBoostedTrees Python examples to ML guide
        * I ran these in the pyspark shell, and they worked.
      * Add save/load to examples in ML guide
      * Added note to Python docs about predict and transform not working within RDD actions and transformations in some cases (see SPARK-5981)
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #4750 from jkbradley/SPARK-5974 and squashes the following commits:
      
      c410e38 [Joseph K. Bradley] Added note to LabeledPoint about attributes
      bcae18b [Joseph K. Bradley] Added import of models for save/load examples in ml guide.  Fixed line length for tree.py, feature.py (but not other ML Pyspark files yet).
      6d81c3e [Joseph K. Bradley] completed python GBT examples
      9903309 [Joseph K. Bradley] Added note to python docs about predict,transform not working within RDD actions,transformations in some cases
      c7dfad8 [Joseph K. Bradley] Added model save/load to ML guide.  Added GBT examples to ML guide
  22. Dec 03, 2014
    • [SPARK-4711] [mllib] [docs] Programming guide advice on choosing optimizer · 27ab0b8a
      Joseph K. Bradley authored
      I have heard requests for the docs to include advice about choosing an optimization method. The programming guide could include a brief statement about this (so the user does not have to read the whole optimization section).
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #3569 from jkbradley/lr-doc and squashes the following commits:
      
      654aeb5 [Joseph K. Bradley] updated section header for mllib-optimization
      5035ad0 [Joseph K. Bradley] updated based on review
      94f6dec [Joseph K. Bradley] Updated linear methods and optimization docs with quick advice on choosing an optimization method
  23. Oct 14, 2014
    • SPARK-1307 [DOCS] Don't use term 'standalone' to refer to a Spark Application · 18ab6bd7
      Sean Owen authored
      HT to Diana, just proposing an implementation of her suggestion, which I rather agreed with. Is there a second/third for the motion?
      
      Refer to "self-contained" rather than "standalone" apps to avoid confusion with standalone deployment mode. And fix placement of reference to this in MLlib docs.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #2787 from srowen/SPARK-1307 and squashes the following commits:
      
      b5b82e2 [Sean Owen] Refer to "self-contained" rather than "standalone" apps to avoid confusion with standalone deployment mode. And fix placement of reference to this in MLlib docs.
  24. Sep 25, 2014
    • [SPARK-1484][MLLIB] Warn when running an iterative algorithm on uncached data. · ff637c93
      Aaron Staple authored
      Add warnings to KMeans, GeneralizedLinearAlgorithm, and computeSVD when called with input data that is not cached. KMeans is implemented iteratively, and I believe that GeneralizedLinearAlgorithm’s current optimizers are iterative and its future optimizers are also likely to be iterative. RowMatrix’s computeSVD is iterative against an RDD when run in DistARPACK mode. ALS and DecisionTree are iterative as well, but they implement RDD caching internally so do not require a warning.
      
      I added a warning to GeneralizedLinearAlgorithm rather than inside its optimizers, where the iteration actually occurs, because internally GeneralizedLinearAlgorithm maps its input data to an uncached RDD before passing it to an optimizer. (In other words, the warning would be printed for every GeneralizedLinearAlgorithm run, regardless of whether its input is cached, if the warning were in GradientDescent or another optimizer.) I assume that use of an uncached RDD by GeneralizedLinearAlgorithm is intentional, and that the mapping there (adding label, intercepts and scaling) is a lightweight operation. Arguably a user calling an optimizer such as GradientDescent will be knowledgeable enough to cache their data without needing a log warning, so the lack of a warning in the optimizers may be OK.
      
      Some of the documentation examples making use of these iterative algorithms did not cache their training RDDs (while others did). I updated the examples to always cache. I also fixed some (unrelated) minor errors in the documentation examples.
      
      Author: Aaron Staple <aaron.staple@gmail.com>
      
      Closes #2347 from staple/SPARK-1484 and squashes the following commits:
      
      bd49701 [Aaron Staple] Address review comments.
      ab2d4a4 [Aaron Staple] Disable warnings on python code path.
      a7a0f99 [Aaron Staple] Change code comments per review comments.
      7cca1dc [Aaron Staple] Change warning message text.
      c77e939 [Aaron Staple] [SPARK-1484][MLLIB] Warn when running an iterative algorithm on uncached data.
      3b6c511 [Aaron Staple] Minor doc example fixes.
  25. Aug 19, 2014
    • [SPARK-3112][MLLIB] Add documentation and example for StreamingLR · c7252b00
      freeman authored
      Added a documentation section on StreamingLR to the `MLlib - Linear Methods` guide, including a worked example.
      
      mengxr tdas
      
      Author: freeman <the.freeman.lab@gmail.com>
      
      Closes #2047 from freeman-lab/streaming-lr-docs and squashes the following commits:
      
      568d250 [freeman] Tweaks to wording / formatting
      05a1139 [freeman] Added documentation and example for StreamingLR
  26. Aug 12, 2014
    • SPARK-2830 [MLlib]: re-organize mllib documentation · c235b83e
      Ameet Talwalkar authored
      As per discussions with Xiangrui, I've reorganized and edited the mllib documentation.
      
      Author: Ameet Talwalkar <atalwalkar@gmail.com>
      
      Closes #1908 from atalwalkar/master and squashes the following commits:
      
      fe6938a [Ameet Talwalkar] made xiangruis suggested changes
      840028b [Ameet Talwalkar] made xiangruis suggested changes
      7ec366a [Ameet Talwalkar] reorganize and edit mllib documentation
  27. Jul 20, 2014
    • [SPARK-1945][MLLIB] Documentation Improvements for Spark 1.0 · db56f2df
      Michael Giannakopoulos authored
      Standalone application examples written in Java are added to the 'mllib-linear-methods.md' file.
      This commit is related to the issue [Add full Java Examples in MLlib docs](https://issues.apache.org/jira/browse/SPARK-1945).
      I also changed the name of the sigmoid function from 'logit' to 'f', because the logit function
      is the inverse of the sigmoid.
      
      Thanks,
      Michael
      
      Author: Michael Giannakopoulos <miccagiann@gmail.com>
      
      Closes #1311 from miccagiann/master and squashes the following commits:
      
      8ffe5ab [Michael Giannakopoulos] Update code so as to comply with code standards.
      f7ad5cc [Michael Giannakopoulos] Merge remote-tracking branch 'upstream/master'
      38d92c7 [Michael Giannakopoulos] Adding PCA, SVD and LBFGS examples in Java. Performing minor updates in the already committed examples so as to eradicate the call of 'productElement' function whenever is possible.
      cc0a089 [Michael Giannakopoulos] Modified Java examples so as to comply with coding standards.
      b1141b2 [Michael Giannakopoulos] Added Java examples for Clustering and Collaborative Filtering [mllib-clustering.md & mllib-collaborative-filtering.md].
      837f7a8 [Michael Giannakopoulos] Merge remote-tracking branch 'upstream/master'
      15f0eb4 [Michael Giannakopoulos] Java examples included in 'mllib-linear-methods.md' file.
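
The naming point in this commit (the logit function is the inverse of the sigmoid, so calling the sigmoid 'logit' was misleading) can be checked numerically. The following is a small illustrative sketch in plain Python, not code from the Spark docs; the function names `sigmoid` and `logit` are chosen here for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function: maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Log-odds: maps a probability in (0, 1) back to a real number."""
    return math.log(p / (1.0 - p))


# Applying logit after sigmoid recovers the input, up to floating-point error.
for z in (-2.0, 0.0, 3.5):
    assert math.isclose(logit(sigmoid(z)), z, abs_tol=1e-9)
```

The round trip only holds for moderate inputs; for very large `|z|` the sigmoid saturates to 0 or 1 in floating point and the logit is no longer defined.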
  28. Jul 13, 2014
    • SPARK-2363. Clean MLlib's sample data files · 635888cb
      Sean Owen authored
      (Just made a PR for this, mengxr was the reporter of:)
      
      MLlib has sample data under several folders:
      1) data/mllib
      2) data/
      3) mllib/data/*
      Per previous discussion with Matei Zaharia, we want to put them under `data/mllib` and clean outdated files.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #1394 from srowen/SPARK-2363 and squashes the following commits:
      
      54313dd [Sean Owen] Move ML example data from /mllib/data/ and /data/ into /data/mllib/
  29. May 18, 2014
    • [WIP][SPARK-1871][MLLIB] Improve MLlib guide for v1.0 · df0aa835
      Xiangrui Meng authored
      Some improvements to MLlib guide:
      
      1. [SPARK-1872] Update API links for unidoc.
      2. [SPARK-1783] Added `page.displayTitle` to the global layout. If it is defined, use it instead of `page.title` for title display.
      3. Add more Java/Python examples.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #816 from mengxr/mllib-doc and squashes the following commits:
      
      ec2e407 [Xiangrui Meng] format scala example for ALS
      cd9f40b [Xiangrui Meng] add a paragraph to summarize distributed matrix types
      4617f04 [Xiangrui Meng] add python example to loadLibSVMFile and fix Java example
      d6509c2 [Xiangrui Meng] [SPARK-1783] update mllib titles
      561fdc0 [Xiangrui Meng] add a displayTitle option to global layout
      195d06f [Xiangrui Meng] add Java example for summary stats and minor fix
      9f1ff89 [Xiangrui Meng] update java api links in mllib-basics
      7dad18e [Xiangrui Meng] update java api links in NB
      3a0f4a6 [Xiangrui Meng] api/pyspark -> api/python
      35bdeb9 [Xiangrui Meng] api/mllib -> api/scala
      e4afaa8 [Xiangrui Meng] explicitly state what might change
  30. May 08, 2014
    • MLlib documentation fix · d38febee
      DB Tsai authored
      Fixed the documentation to reflect that `loadLibSVMData` was renamed to `loadLibSVMFile`.
      
      Author: DB Tsai <dbtsai@alpinenow.com>
      
      Closes #703 from dbtsai/dbtsai-docfix and squashes the following commits:
      
      71dd508 [DB Tsai] loadLibSVMData is changed to loadLibSVMFile
  31. May 06, 2014
    • SPARK-1727. Correct small compile errors, typos, and markdown issues in (primarily) MLlib docs · 25ad8f93
      Sean Owen authored
      While play-testing the Scala and Java code examples in the MLlib docs, I noticed a number of small compile errors, and some typos. This led to finding and fixing a few similar items in other docs.
      
      Then in the course of building the site docs to check the result, I found a few small suggestions for the build instructions. I also found a few more formatting and markdown issues uncovered when I accidentally used maruku instead of kramdown.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #653 from srowen/SPARK-1727 and squashes the following commits:
      
      6e7c38a [Sean Owen] Final doc updates - one more compile error, and use of mean instead of sum and count
      8f5e847 [Sean Owen] Fix markdown syntax issues that maruku flags, even though we use kramdown (but only those that do not affect kramdown's output)
      99966a9 [Sean Owen] Update issue tracker URL in docs
      23c9ac3 [Sean Owen] Add Scala Naive Bayes example, to use existing example data file (whose format needed a tweak)
      8c81982 [Sean Owen] Fix small compile errors and typos across MLlib docs
  32. May 05, 2014
    • [SPARK-1594][MLLIB] Cleaning up MLlib APIs and guide · 98750a74
      Xiangrui Meng authored
      Final pass before the v1.0 release.
      
      * Remove `VectorRDDs`
      * Move `BinaryClassificationMetrics` from `evaluation.binary` to `evaluation`
      * Change the default value of `addIntercept` to false and allow adding an intercept in Ridge and Lasso.
      * Clean `DecisionTree` package doc and test suite.
      * Mark model constructors `private[spark]`
      * Rename `loadLibSVMData` to `loadLibSVMFile` and hide `LabelParser` from users.
      * Add `saveAsLibSVMFile`.
      * Add `appendBias` to `MLUtils`.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #524 from mengxr/mllib-cleaning and squashes the following commits:
      
      295dc8b [Xiangrui Meng] update loadLibSVMFile doc
      1977ac1 [Xiangrui Meng] fix doc of appendBias
      649fcf0 [Xiangrui Meng] rename loadLibSVMData to loadLibSVMFile; hide LabelParser from user APIs
      54b812c [Xiangrui Meng] add appendBias
      a71e7d0 [Xiangrui Meng] add saveAsLibSVMFile
      d976295 [Xiangrui Meng] Merge branch 'master' into mllib-cleaning
      b7e5cec [Xiangrui Meng] remove some experimental annotations and make model constructors private[mllib]
      9b02b93 [Xiangrui Meng] minor code style update
      a593ddc [Xiangrui Meng] fix python tests
      fc28c18 [Xiangrui Meng] mark more classes experimental
      f6cbbff [Xiangrui Meng] fix Java tests
      0af70b0 [Xiangrui Meng] minor
      6e139ef [Xiangrui Meng] Merge branch 'master' into mllib-cleaning
      94e6dce [Xiangrui Meng] move BinaryLabelCounter and BinaryConfusionMatrixImpl to evaluation.binary
      df34907 [Xiangrui Meng] clean DecisionTreeSuite to use LocalSparkContext
      c81807f [Xiangrui Meng] set the default value of AddIntercept to false
      03389c0 [Xiangrui Meng] allow to add intercept in Ridge and Lasso
      c66c56f [Xiangrui Meng] move tree md to package object doc
      a2695df [Xiangrui Meng] update guide for BinaryClassificationMetrics
      9194f4c [Xiangrui Meng] move BinaryClassificationMetrics one level up
      1c1a0e3 [Xiangrui Meng] remove VectorRDDs because it only contains one function that is not necessary for us to maintain
  33. Apr 22, 2014
    • [SPARK-1506][MLLIB] Documentation improvements for MLlib 1.0 · 26d35f3f
      Xiangrui Meng authored
      Preview: http://54.82.240.23:4000/mllib-guide.html
      
      Table of contents:
      
      * Basics
        * Data types
        * Summary statistics
      * Classification and regression
        * linear support vector machine (SVM)
        * logistic regression
        * linear least squares, Lasso, and ridge regression
        * decision tree
        * naive Bayes
      * Collaborative Filtering
        * alternating least squares (ALS)
      * Clustering
        * k-means
      * Dimensionality reduction
        * singular value decomposition (SVD)
        * principal component analysis (PCA)
      * Optimization
        * stochastic gradient descent
        * limited-memory BFGS (L-BFGS)
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #422 from mengxr/mllib-doc and squashes the following commits:
      
      944e3a9 [Xiangrui Meng] merge master
      f9fda28 [Xiangrui Meng] minor
      9474065 [Xiangrui Meng] add alpha to ALS examples
      928e630 [Xiangrui Meng] initialization_mode -> initializationMode
      5bbff49 [Xiangrui Meng] add imports to labeled point examples
      c17440d [Xiangrui Meng] fix python nb example
      28f40dc [Xiangrui Meng] remove localhost:4000
      369a4d3 [Xiangrui Meng] Merge branch 'master' into mllib-doc
      7dc95cc [Xiangrui Meng] update linear methods
      053ad8a [Xiangrui Meng] add links to go back to the main page
      abbbf7e [Xiangrui Meng] update ALS argument names
      648283e [Xiangrui Meng] level down statistics
      14e2287 [Xiangrui Meng] add sample libsvm data and use it in guide
      8cd2441 [Xiangrui Meng] minor updates
      186ab07 [Xiangrui Meng] update section names
      6568d65 [Xiangrui Meng] update toc, level up lr and svm
      162ee12 [Xiangrui Meng] rename section names
      5c1e1b1 [Xiangrui Meng] minor
      8aeaba1 [Xiangrui Meng] wrap long lines
      6ce6a6f [Xiangrui Meng] add summary statistics to toc
      5760045 [Xiangrui Meng] claim beta
      cc604bf [Xiangrui Meng] remove classification and regression
      92747b3 [Xiangrui Meng] make section titles consistent
      e605dd6 [Xiangrui Meng] add LIBSVM loader
      f639674 [Xiangrui Meng] add python section to migration guide
      c82ffb4 [Xiangrui Meng] clean optimization
      31660eb [Xiangrui Meng] update linear algebra and stat
      0a40837 [Xiangrui Meng] first pass over linear methods
      1fc8271 [Xiangrui Meng] update toc
      906ed0a [Xiangrui Meng] add a python example to naive bayes
      5f0a700 [Xiangrui Meng] update collaborative filtering
      656d416 [Xiangrui Meng] update mllib-clustering
      86e143a [Xiangrui Meng] remove data types section from main page
      8d1a128 [Xiangrui Meng] move part of linear algebra to data types and add Java/Python examples
      d1b5cbf [Xiangrui Meng] merge master
      72e4804 [Xiangrui Meng] one pass over tree guide
      64f8995 [Xiangrui Meng] move decision tree guide to a separate file
      9fca001 [Xiangrui Meng] add first version of linear algebra guide
      53c9552 [Xiangrui Meng] update dependencies
      f316ec2 [Xiangrui Meng] add migration guide
      f399f6c [Xiangrui Meng] move linear-algebra to dimensionality-reduction
      182460f [Xiangrui Meng] add guide for naive Bayes
      137fd1d [Xiangrui Meng] re-organize toc
      a61e434 [Xiangrui Meng] update mllib's toc