  1. Dec 16, 2015
  2. Sep 21, 2015
  3. Sep 17, 2015
  4. Aug 28, 2015
    • [SPARK-10188] [PYSPARK] Pyspark CrossValidator with RMSE selects incorrect model · 7583681e
      noelsmith authored
      * Added isLargerBetter() method to Pyspark Evaluator to match the Scala version.
      * JavaEvaluator delegates isLargerBetter() to underlying Scala object.
      * Added check for isLargerBetter() in CrossValidator to determine whether to use argmin or argmax.
      * Added test cases for where smaller is better (RMSE) and larger is better (R-Squared).
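
The argmin/argmax choice described in the bullets above can be sketched in plain Python. This is a minimal illustration, not the code from the commit: `select_best_index` and its `is_larger_better` argument are hypothetical names standing in for the check on the evaluator's `isLargerBetter()`.

```python
def select_best_index(metrics, is_larger_better):
    """Return the index of the best metric in `metrics`.

    Hypothetical helper: `is_larger_better` plays the role of the
    evaluator's isLargerBetter() result described in the commit message.
    """
    if not metrics:
        raise ValueError("metrics must not be empty")
    # argmax when larger is better (e.g. R-squared), argmin otherwise (e.g. RMSE)
    chooser = max if is_larger_better else min
    return chooser(range(len(metrics)), key=metrics.__getitem__)


# Illustrative, made-up scores for three candidate models.
rmse_scores = [0.9, 0.4, 0.7]
r2_scores = [0.2, 0.8, 0.5]

best_rmse_index = select_best_index(rmse_scores, is_larger_better=False)
best_r2_index = select_best_index(r2_scores, is_larger_better=True)
```

Always taking the argmax would pick the model with the largest RMSE, which is the incorrect selection named in the commit title.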
      
      (This contribution is my original work and I license the work to the project under Spark's open source license.)
      
      Author: noelsmith <mail@noelsmith.com>
      
      Closes #8399 from noel-smith/pyspark-rmse-xval-fix.
  5. Aug 14, 2015
  6. Jun 02, 2015
    • [SPARK-7432] [MLLIB] fix flaky CrossValidator doctest · bd97840d
      Xiangrui Meng authored
      The new test uses CV to compare `maxIter=0` and `maxIter=1`, and validates the evaluation result. jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #6572 from mengxr/SPARK-7432 and squashes the following commits:
      
      c236bb8 [Xiangrui Meng] fix flacky cv doctest
  7. May 18, 2015
    • [SPARK-7380] [MLLIB] pipeline stages should be copyable in Python · 9c7e802a
      Xiangrui Meng authored
      This PR makes pipeline stages in Python copyable and hence simplifies some implementations. It also includes the following changes:
      
      1. Rename `paramMap` and `defaultParamMap` to `_paramMap` and `_defaultParamMap`, respectively.
      2. Accept a list of param maps in `fit`.
      3. Use parent uid and name to identify param.
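
As a rough sketch of how a copyable stage lets `fit` accept either one param map or a list of them, the toy class below copies itself with the extra params and fits the copy. `ToyEstimator` is a hypothetical stand-in, its param keys are plain strings rather than real param objects, and its "model" is just a dict; only the names `_paramMap`, `_defaultParamMap`, `_fit`, and `fit` come from the commit message.

```python
class ToyEstimator:
    """Hypothetical estimator used only to illustrate copy-then-fit."""

    def __init__(self, **defaults):
        self._defaultParamMap = dict(defaults)  # default values
        self._paramMap = {}                     # user-supplied values

    def copy(self, extra=None):
        # Return an independent copy with `extra` params applied on top.
        that = ToyEstimator(**self._defaultParamMap)
        that._paramMap = dict(self._paramMap)
        that._paramMap.update(extra or {})
        return that

    def _fit(self, dataset):
        # The "model" records the effective params and the dataset size.
        params = {**self._defaultParamMap, **self._paramMap}
        return {"params": params, "n": len(dataset)}

    def fit(self, dataset, params=None):
        if params is None:
            return self._fit(dataset)
        if isinstance(params, (list, tuple)):
            # A list of param maps yields one fitted model per map.
            return [self.fit(dataset, p) for p in params]
        if isinstance(params, dict):
            # Fit a copy so the original estimator is left unchanged.
            return self.copy(params)._fit(dataset)
        raise TypeError("params must be a dict or a list/tuple of dicts")


estimator = ToyEstimator(maxIter=10)
models = estimator.fit([1, 2, 3], [{"maxIter": 0}, {"maxIter": 1}])
```

Because each fit works on a copy, the estimator's own params are not mutated by the per-call overrides.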
      
      jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #6088 from mengxr/SPARK-7380 and squashes the following commits:
      
      413c463 [Xiangrui Meng] remove unnecessary doc
      4159f35 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-7380
      611c719 [Xiangrui Meng] fix python style
      68862b8 [Xiangrui Meng] update _java_obj initialization
      927ad19 [Xiangrui Meng] fix ml/tests.py
      0138fc3 [Xiangrui Meng] update feature transformers and fix a bug in RegexTokenizer
      9ca44fb [Xiangrui Meng] simplify Java wrappers and add tests
      c7d84ef [Xiangrui Meng] update ml/tests.py to test copy params
      7e0d27f [Xiangrui Meng] merge master
      46840fb [Xiangrui Meng] update wrappers
      b6db1ed [Xiangrui Meng] update all self.paramMap to self._paramMap
      46cb6ed [Xiangrui Meng] merge master
      a163413 [Xiangrui Meng] fix style
      1042e80 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-7380
      9630eae [Xiangrui Meng] fix Identifiable._randomUID
      13bd70a [Xiangrui Meng] update ml/tests.py
      64a536c [Xiangrui Meng] use _fit/_transform/_evaluate to simplify the impl
      02abf13 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into copyable-python
      66ce18c [Joseph K. Bradley] some cleanups before sending to Xiangrui
      7431272 [Joseph K. Bradley] Rebased with master
  8. May 10, 2015
    • [SPARK-7431] [ML] [PYTHON] Made CrossValidatorModel call parent init in PySpark · 3038443e
      Joseph K. Bradley authored
      Fixes bug with PySpark cvModel not having UID
      Also made small PySpark fixes: Evaluator should inherit from Params.  MockModel should inherit from Model.
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #5968 from jkbradley/pyspark-cv-uid and squashes the following commits:
      
      57f13cd [Joseph K. Bradley] Made CrossValidatorModel call parent init in PySpark
  9. May 08, 2015
  10. May 07, 2015
    • [SPARK-7432] [MLLIB] disable cv doctest · 773aa252
      Xiangrui Meng authored
      Temporarily disable flaky doctest for CrossValidator. jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5962 from mengxr/disable-pyspark-cv-test and squashes the following commits:
      
      5db7e5b [Xiangrui Meng] disable cv doctest
  11. May 06, 2015
    • [SPARK-6940] [MLLIB] Add CrossValidator to Python ML pipeline API · 32cdc815
      Xiangrui Meng authored
      Since CrossValidator is a meta algorithm, we copy the implementation in Python. jkbradley
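
To illustrate why cross-validation counts as a meta algorithm (it only calls fit and evaluate on other components), here is a minimal pure-Python sketch of k-fold model selection. It is not the PySpark implementation: `cross_validate`, `k_fold_indices`, `fit_mean`, `rmse`, and the `bias` parameter are all hypothetical, and the data is made up.

```python
import math
import random


def k_fold_indices(n_rows, num_folds, seed=0):
    # Shuffle row indices once, then deal them round-robin into folds.
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    return [order[i::num_folds] for i in range(num_folds)]


def cross_validate(fit, evaluate, rows, param_maps,
                   num_folds=3, is_larger_better=True, seed=0):
    """Average a metric over folds for each param map and refit the best."""
    totals = [0.0] * len(param_maps)
    for held_out in k_fold_indices(len(rows), num_folds, seed):
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        validation = [rows[i] for i in held_out]
        for j, params in enumerate(param_maps):
            totals[j] += evaluate(fit(train, params), validation)
    averages = [t / num_folds for t in totals]
    pick = max if is_larger_better else min
    best = pick(range(len(param_maps)), key=averages.__getitem__)
    # Refit on all rows with the winning param map.
    return fit(rows, param_maps[best]), param_maps[best], averages


def fit_mean(train, params):
    # Toy "model": the training mean shifted by a tunable bias.
    return sum(train) / len(train) + params["bias"]


def rmse(prediction, validation):
    return math.sqrt(
        sum((y - prediction) ** 2 for y in validation) / len(validation)
    )


rows = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
param_maps = [{"bias": 0.0}, {"bias": 100.0}]
best_model, best_params, averages = cross_validate(
    fit_mean, rmse, rows, param_maps, is_larger_better=False
)
```

The selection loop never inspects the estimator or the metric, which is what makes it straightforward to reimplement in another language.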
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5926 from mengxr/SPARK-6940 and squashes the following commits:
      
      6af181f [Xiangrui Meng] add TODOs
      8285134 [Xiangrui Meng] update doc
      060f7c3 [Xiangrui Meng] update doctest
      acac727 [Xiangrui Meng] add keyword args
      cdddecd [Xiangrui Meng] add CrossValidator in Python
  12. May 03, 2015
    • [SPARK-7329] [MLLIB] simplify ParamGridBuilder impl · 1ffa8cb9
      Xiangrui Meng authored
      as suggested by justinuang on #5601.
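
One compact way to build a parameter grid is a Cartesian product over the per-parameter value lists. The sketch below assumes that approach; the commit message does not spell out the simplification, so treat `ToyParamGridBuilder` as a hypothetical illustration with string keys rather than the actual change.

```python
import itertools


class ToyParamGridBuilder:
    """Hypothetical grid builder based on itertools.product."""

    def __init__(self):
        self._param_grid = {}

    def addGrid(self, param, values):
        # Record the candidate values for one parameter; return self to chain.
        self._param_grid[param] = list(values)
        return self

    def build(self):
        keys = list(self._param_grid)
        value_lists = [self._param_grid[k] for k in keys]
        # One dict per combination of values, in insertion order of the keys.
        return [dict(zip(keys, combo))
                for combo in itertools.product(*value_lists)]


grid = (ToyParamGridBuilder()
        .addGrid("maxIter", [0, 1])
        .addGrid("regParam", [0.1, 0.01])
        .build())
```

With two values for each of two parameters, the grid holds four param maps.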
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5873 from mengxr/SPARK-7329 and squashes the following commits:
      
      d08f9cf [Xiangrui Meng] simplify tests
      b7a7b9b [Xiangrui Meng] simplify grid build
    • [SPARK-7022] [PYSPARK] [ML] Add ML.Tuning.ParamGridBuilder to PySpark · f4af9255
      Omede Firouz authored
      Author: Omede Firouz <ofirouz@palantir.com>
      Author: Omede <omedefirouz@gmail.com>
      
      Closes #5601 from oefirouz/paramgrid and squashes the following commits:
      
      c9e2481 [Omede Firouz] Make test a doctest
      9a8ce22 [Omede] Fix linter issues
      8b8a6d2 [Omede Firouz] [SPARK-7022][PySpark][ML] Add ML.Tuning.ParamGridBuilder to PySpark