  1. Jul 18, 2015
    • [MLLIB] [DOC] Seed fix in mllib naive bayes example · b9ef7ac9
      Paweł Kozikowski authored
      Previous seed resulted in empty test data set.
      
      Author: Paweł Kozikowski <mupakoz@gmail.com>
      
      Closes #7477 from mupakoz/patch-1 and squashes the following commits:
      
      f5d41ee [Paweł Kozikowski] Mllib Naive Bayes example data set enlarged
  2. Jul 02, 2015
  3. May 22, 2015
    • [SPARK-7574] [ML] [DOC] User guide for OneVsRest · 509d55ab
      Ram Sriharsha authored
      Includes the Iris dataset (after shuffling and relabeling 3 -> 0 to conform to the 0 -> numClasses-1 labeling). Could not find an existing dataset in data/mllib for multiclass classification.
      
      Author: Ram Sriharsha <rsriharsha@hw11853.local>
      
      Closes #6296 from harsha2010/SPARK-7574 and squashes the following commits:
      
      645427c [Ram Sriharsha] cleanup
      46c41b1 [Ram Sriharsha] cleanup
      2f76295 [Ram Sriharsha] Code Review Fixes
      ebdf103 [Ram Sriharsha] Java Example
      c026613 [Ram Sriharsha] Code Review fixes
      4b7d1a6 [Ram Sriharsha] minor cleanup
      13bed9c [Ram Sriharsha] add wikipedia link
      bb9dbfa [Ram Sriharsha] Clean up naming
      6f90db1 [Ram Sriharsha] [SPARK-7574][ml][doc] User guide for OneVsRest
  4. Feb 23, 2015
    • [SPARK-5939][MLLib] make FPGrowth example app take parameters · 651a1c01
      Jacky Li authored
      Add parameter parsing to the FPGrowth example app in Scala and Java.
      A sample data file is also added to the data/mllib folder.
      
      Author: Jacky Li <jacky.likun@huawei.com>
      
      Closes #4714 from jackylk/parameter and squashes the following commits:
      
      8c478b3 [Jacky Li] fix according to comments
      3bb74f6 [Jacky Li] make FPGrowth example app take parameters
      f0e4d10 [Jacky Li] make FPGrowth example app take parameters
  5. Feb 20, 2015
    • [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] Doc cleanups for 1.3 release · 4a17eedb
      Joseph K. Bradley authored
      For SPARK-5867:
      * The spark.ml programming guide needs to be updated to use the new SQL DataFrame API instead of the old SchemaRDD API.
      * It should also include Python examples now.
      
      For SPARK-5892:
      * Fix Python docs
      * Various other cleanups
      
      BTW, I accidentally merged this with master.  If you want to compile it on your own, use this branch which is based on spark/branch-1.3 and cherry-picks the commits from this PR: [https://github.com/jkbradley/spark/tree/doc-review-1.3-check]
      
      CC: mengxr  (ML),  davies  (Python docs)
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #4675 from jkbradley/doc-review-1.3 and squashes the following commits:
      
      f191bb0 [Joseph K. Bradley] small cleanups
      e786efa [Joseph K. Bradley] small doc corrections
      6b1ab4a [Joseph K. Bradley] fixed python lint test
      946affa [Joseph K. Bradley] Added sample data for ml.MovieLensALS example.  Changed spark.ml Java examples to use DataFrames API instead of sql()
      da81558 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into doc-review-1.3
      629dbf5 [Joseph K. Bradley] Updated based on code review: * made new page for old migration guides * small fixes * moved inherit_doc in python
      b9df7c4 [Joseph K. Bradley] Small cleanups: toDF to toDF(), adding s for string interpolation
      34b067f [Joseph K. Bradley] small doc correction
      da16aef [Joseph K. Bradley] Fixed python mllib docs
      8cce91c [Joseph K. Bradley] GMM: removed old imports, added some doc
      695f3f6 [Joseph K. Bradley] partly done trying to fix inherit_doc for class hierarchies in python docs
      a72c018 [Joseph K. Bradley] made ChiSqTestResult appear in python docs
      b05a80d [Joseph K. Bradley] organize imports. doc cleanups
      e572827 [Joseph K. Bradley] updated programming guide for ml and mllib
  6. Feb 15, 2015
    • [MLLIB][SPARK-5502] User guide for isotonic regression · 61eb1267
      martinzapletal authored
      User guide for isotonic regression added to docs/mllib-regression.md including code examples for Scala and Java.
      
      Author: martinzapletal <zapletal-martin@email.cz>
      
      Closes #4536 from zapletal-martin/SPARK-5502 and squashes the following commits:
      
      67fe773 [martinzapletal] SPARK-5502 reworded model prediction rules to use more general language rather than the code/implementation specific terms
      80bd4c3 [martinzapletal] SPARK-5502 created docs page for isotonic regression, added links to the page, updated data and examples
      7d8136e [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java
      504b5c3 [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java
  7. Feb 09, 2015
    • [SPARK-5539][MLLIB] LDA guide · 855d12ac
      Xiangrui Meng authored
      This is the LDA user guide from jkbradley, with Java and Scala code examples.
      
      Author: Xiangrui Meng <meng@databricks.com>
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #4465 from mengxr/lda-guide and squashes the following commits:
      
      6dcb7d1 [Xiangrui Meng] update java example in the user guide
      76169ff [Xiangrui Meng] update java example
      36c3ae2 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into lda-guide
      c2a1efe [Joseph K. Bradley] Added LDA programming guide, plus Java example (which is in the guide and probably should be removed).
  8. Feb 06, 2015
    • [SPARK-5013] [MLlib] Added documentation and sample data file for GaussianMixture · 9ad56ad2
      Travis Galoppo authored
      Simple description and code samples (and sample data) for GaussianMixture
      
      Author: Travis Galoppo <tjg2107@columbia.edu>
      
      Closes #4401 from tgaloppo/spark-5013 and squashes the following commits:
      
      c9ff9a5 [Travis Galoppo] Fixed link in mllib-clustering.md. Added Gaussian mixture and power iteration as available clustering techniques in mllib-guide
      2368690 [Travis Galoppo] Minor fixes
      3eb41fa [Travis Galoppo] [SPARK-5013] Added documentation and sample data file for GaussianMixture
  9. Jul 13, 2014
    • SPARK-2363. Clean MLlib's sample data files · 635888cb
      Sean Owen authored
      (Just made a PR for this, mengxr was the reporter of:)
      
      MLlib has sample data under several folders:
      1) data/mllib
      2) data/
      3) mllib/data/*
      Per previous discussion with Matei Zaharia, we want to put them under `data/mllib` and clean outdated files.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #1394 from srowen/SPARK-2363 and squashes the following commits:
      
      54313dd [Sean Owen] Move ML example data from /mllib/data/ and /data/ into /data/mllib/
  10. May 19, 2014
    • [SPARK-1874][MLLIB] Clean up MLlib sample data · bcb9dce6
      Xiangrui Meng authored
      1. Added synthetic datasets for `MovieLensALS`, `LinearRegression`, `BinaryClassification`.
      2. Embedded instructions in the help message of those example apps.
      
      Per discussion with Matei on the JIRA page, new example data is under `data/mllib`.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #833 from mengxr/mllib-sample-data and squashes the following commits:
      
      59f0a18 [Xiangrui Meng] add sample binary classification data
      3c2f92f [Xiangrui Meng] add linear regression data
      050f1ca [Xiangrui Meng] add a sample dataset for MovieLensALS example
  11. Sep 22, 2013