  1. May 10, 2015
    • [SPARK-7427] [PYSPARK] Make sharedParams match in Scala, Python · c5aca0c2
      Glenn Weidner authored
      Modified 2 files:
      python/pyspark/ml/param/_shared_params_code_gen.py
      python/pyspark/ml/param/shared.py
      
      Generated shared.py with Python 2.6.6 on Red Hat Enterprise Linux Server 6.6:
      python _shared_params_code_gen.py > shared.py
      
      Only maxIter, regParam, and rawPredictionCol changed, based on the strings in SharedParamsCodeGen.scala. Note that a warning was displayed when committing shared.py:
      warning: LF will be replaced by CRLF in python/pyspark/ml/param/shared.py.
      
      Author: Glenn Weidner <gweidner@us.ibm.com>
      
      Closes #6023 from gweidner/br-7427 and squashes the following commits:
      
      db72e32 [Glenn Weidner] [SPARK-7427] [PySpark] Make sharedParams match in Scala, Python
      825e4a9 [Glenn Weidner] [SPARK-7427] [PySpark] Make sharedParams match in Scala, Python
      e6a865e [Glenn Weidner] [SPARK-7427] [PySpark] Make sharedParams match in Scala, Python
      1eee702 [Glenn Weidner] Merge remote-tracking branch 'upstream/master'
      1ac10e5 [Glenn Weidner] Merge remote-tracking branch 'upstream/master'
      cafd104 [Glenn Weidner] Merge remote-tracking branch 'upstream/master'
      9bea1eb [Glenn Weidner] Merge remote-tracking branch 'upstream/master'
      4a35c20 [Glenn Weidner] Merge remote-tracking branch 'upstream/master'
      9790cbe [Glenn Weidner] Merge remote-tracking branch 'upstream/master'
      d9c30f4 [Glenn Weidner] [SPARK-7275] [SQL] [WIP] Make LogicalRelation public
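The code-generation step in the commit above (a script whose stdout is redirected into `shared.py`) can be illustrated with a minimal sketch. This is not the contents of `_shared_params_code_gen.py`: the template, the function names `gen_param_code` and `generate`, and the doc strings are all hypothetical and only show the general idea of rendering one mixin class per shared parameter.

```python
# Hypothetical sketch of a shared-param code generator. The template and all
# names here are illustrative assumptions, not copied from the Spark source.
TEMPLATE = '''class Has{Name}(object):
    """Mixin for param {name}: {doc}."""

    def __init__(self):
        self.{name} = {default}

    def set{Name}(self, value):
        self.{name} = value
        return self

    def get{Name}(self):
        return self.{name}
'''


def gen_param_code(name, doc, default):
    """Render the source of one mixin class for a shared parameter."""
    cap = name[0].upper() + name[1:]
    return TEMPLATE.format(Name=cap, name=name, doc=doc, default=default)


# (name, doc string, default value as source text). The doc strings are
# placeholders, not the strings from SharedParamsCodeGen.scala.
SHARED = [
    ("maxIter", "maximum number of iterations", "None"),
    ("regParam", "regularization parameter", "None"),
    ("rawPredictionCol", "raw prediction column name", "'rawPrediction'"),
]


def generate():
    """Return the full generated module source as one string."""
    return "\n\n".join(gen_param_code(*spec) for spec in SHARED)


if __name__ == "__main__":
    # Redirecting this output to a file mirrors the workflow in the commit.
    print(generate())
```

Keeping the parameter names and doc strings in one table is what makes it practical to keep the generated Python text in step with a Scala counterpart: a wording change is made once and the file is regenerated.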
  2. May 08, 2015
    • [SPARK-7488] [ML] Feature Parity in PySpark for ml.recommendation · 84bf931f
      Burak Yavuz authored
      Adds a Python API for `ALS` under `ml.recommendation` in PySpark. Also adds seed as a settable parameter in the Scala implementation of ALS.
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #6015 from brkyvz/ml-rec and squashes the following commits:
      
      be6e931 [Burak Yavuz] addressed comments
      eaed879 [Burak Yavuz] readd numFeatures
      0bd66b1 [Burak Yavuz] fixed seed
      7f6d964 [Burak Yavuz] merged master
      52e2bda [Burak Yavuz] added ALS
    • [SPARK-7383] [ML] Feature Parity in PySpark for ml.features · f5ff4a84
      Burak Yavuz authored
      Implemented Python wrappers for the Scala functions in `ml.features` that did not yet exist in PySpark.
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #5991 from brkyvz/ml-feat-PR and squashes the following commits:
      
      adcca55 [Burak Yavuz] add regex tokenizer to __all__
      b91cb44 [Burak Yavuz] addressed comments
      bd39fd2 [Burak Yavuz] remove addition
      b82bd7c [Burak Yavuz] Parity in PySpark for ml.features
  3. May 07, 2015
    • [SPARK-7388] [SPARK-7383] wrapper for VectorAssembler in Python · 9e2ffb13
      Burak Yavuz authored
      The wrapper required implementing `ArrayParam`, because `Array[T]` is hard to obtain from Python. `ArrayParam` has an extra internal function, `wCast`, that obtains `Array[T]` from `Seq[T]`.
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5930 from brkyvz/ml-feat and squashes the following commits:
      
      73e745f [Burak Yavuz] Merge pull request #3 from mengxr/SPARK-7388
      c221db9 [Xiangrui Meng] overload StringArrayParam.w
      c81072d [Burak Yavuz] addressed comments
      99c2ebf [Burak Yavuz] add to python_shared_params
      39ecb07 [Burak Yavuz] fix scalastyle
      7f7ea2a [Burak Yavuz] [SPARK-7388][SPARK-7383] wrapper for VectorAssembler in Python
  4. May 05, 2015
    • [SPARK-7333] [MLLIB] Add BinaryClassificationEvaluator to PySpark · ee374e89
      Xiangrui Meng authored
      This PR adds `BinaryClassificationEvaluator` to the Python ML Pipelines API as a simple wrapper around the Scala implementation. oefirouz
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5885 from mengxr/SPARK-7333 and squashes the following commits:
      
      25d7451 [Xiangrui Meng] fix tests in python 3
      babdde7 [Xiangrui Meng] fix doc
      cb51e6a [Xiangrui Meng] add BinaryClassificationEvaluator in PySpark
  5. Apr 16, 2015
    • [SPARK-6893][ML] default pipeline parameter handling in python · 57cd1e86
      Xiangrui Meng authored
      Same as #5431 but for Python. jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5534 from mengxr/SPARK-6893 and squashes the following commits:
      
      d3b519b [Xiangrui Meng] address comments
      ebaccc6 [Xiangrui Meng] style update
      fce244e [Xiangrui Meng] update explainParams with test
      4d6b07a [Xiangrui Meng] add tests
      5294500 [Xiangrui Meng] update default param handling in python
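As a rough illustration of what "default parameter handling" can mean for a pipeline component, the sketch below keeps user-set values separate from declared defaults and falls back to the default on lookup. The class and method names (`Params`, `set_default`, `get_or_default`, `explain_params`) are assumptions made for this example and are not taken from the commit or from the PySpark source.

```python
class Params(object):
    """Toy parameter holder that separates user-set values from defaults."""

    def __init__(self):
        self._param_map = {}    # values explicitly set by the user
        self._default_map = {}  # fallback values declared by the component

    def set_default(self, **kwargs):
        """Declare default values; does not mark the params as set."""
        self._default_map.update(kwargs)
        return self

    def set(self, **kwargs):
        """Record values explicitly chosen by the user."""
        self._param_map.update(kwargs)
        return self

    def is_set(self, name):
        return name in self._param_map

    def has_default(self, name):
        return name in self._default_map

    def get_or_default(self, name):
        """Return the user-set value if present, else the default."""
        if name in self._param_map:
            return self._param_map[name]
        if name in self._default_map:
            return self._default_map[name]
        raise KeyError("param %r has neither a value nor a default" % name)

    def explain_params(self):
        """Describe every known param with its default and current value."""
        lines = []
        for name in sorted(set(self._param_map) | set(self._default_map)):
            parts = []
            if name in self._default_map:
                parts.append("default: %r" % (self._default_map[name],))
            if name in self._param_map:
                parts.append("current: %r" % (self._param_map[name],))
            lines.append("%s (%s)" % (name, ", ".join(parts)))
        return "\n".join(lines)
```

Storing defaults in their own map means a caller can still tell whether a value was chosen deliberately, which matters when parameters are copied between stages or overridden at fit time.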
  6. Jan 28, 2015
    • [SPARK-4586][MLLIB] Python API for ML pipeline and parameters · e80dc1c5
      Xiangrui Meng authored
      This PR adds a Python API for ML pipelines and parameters. The design doc can be found on the JIRA page. It includes transformers and an estimator to demonstrate the simple text classification example code.
      
      TODO:
      - [x] handle parameters in LRModel
      - [x] unit tests
      - [x] missing some docs
      
      CC: davies jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      Author: Davies Liu <davies@databricks.com>
      
      Closes #4151 from mengxr/SPARK-4586 and squashes the following commits:
      
      415268e [Xiangrui Meng] remove inherit_doc from __init__
      edbd6fe [Xiangrui Meng] move Identifiable to ml.util
      44c2405 [Xiangrui Meng] Merge pull request #2 from davies/ml
      dd1256b [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-4586
      14ae7e2 [Davies Liu] fix docs
      54ca7df [Davies Liu] fix tests
      78638df [Davies Liu] Merge branch 'SPARK-4586' of github.com:mengxr/spark into ml
      fc59a02 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-4586
      1dca16a [Davies Liu] refactor
      090b3a3 [Davies Liu] Merge branch 'master' of github.com:apache/spark into ml
      0882513 [Xiangrui Meng] update doc style
      a4f4dbf [Xiangrui Meng] add unit test for LR
      7521d1c [Xiangrui Meng] add unit tests to HashingTF and Tokenizer
      ba0ba1e [Xiangrui Meng] add unit tests for pipeline
      0586c7b [Xiangrui Meng] add more comments to the example
      5153cff [Xiangrui Meng] simplify java models
      036ca04 [Xiangrui Meng] gen numFeatures
      46fa147 [Xiangrui Meng] update mllib/pom.xml to include python files in the assembly
      1dcc17e [Xiangrui Meng] update code gen and make param appear in the doc
      f66ba0c [Xiangrui Meng] make params a property
      d5efd34 [Xiangrui Meng] update doc conf and move embedded param map to instance attribute
      f4d0fe6 [Xiangrui Meng] use LabeledDocument and Document in example
      05e3e40 [Xiangrui Meng] update example
      d3e8dbe [Xiangrui Meng] more docs optimize pipeline.fit impl
      56de571 [Xiangrui Meng] fix style
      d0c5bb8 [Xiangrui Meng] a working copy
      bce72f4 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-4586
      17ecfb9 [Xiangrui Meng] code gen for shared params
      d9ea77c [Xiangrui Meng] update doc
      c18dca1 [Xiangrui Meng] make the example working
      dadd84e [Xiangrui Meng] add base classes and docs
      a3015cf [Xiangrui Meng] add Estimator and Transformer
      46eea43 [Xiangrui Meng] a pipeline in python
      33b68e0 [Xiangrui Meng] a working LR
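The relationship between transformers, estimators, and a pipeline mentioned in the commit above can be sketched in plain Python. This is a simplified illustration only: it uses lists of dicts instead of Spark DataFrames, and the `MajorityLabel` and `ConstantModel` classes are hypothetical stand-ins rather than anything from the PR.

```python
class Transformer(object):
    """Maps a dataset to a new dataset."""

    def transform(self, dataset):
        raise NotImplementedError


class Estimator(object):
    """Learns from a dataset and returns a fitted Transformer."""

    def fit(self, dataset):
        raise NotImplementedError


class Tokenizer(Transformer):
    """Adds a lower-cased, whitespace-split 'words' field to each row."""

    def transform(self, dataset):
        return [dict(row, words=row["text"].lower().split()) for row in dataset]


class ConstantModel(Transformer):
    """Fitted model that predicts the same label for every row."""

    def __init__(self, label):
        self.label = label

    def transform(self, dataset):
        return [dict(row, prediction=self.label) for row in dataset]


class MajorityLabel(Estimator):
    """Hypothetical toy estimator: learns the most frequent label."""

    def fit(self, dataset):
        counts = {}
        for row in dataset:
            counts[row["label"]] = counts.get(row["label"], 0) + 1
        return ConstantModel(max(sorted(counts), key=counts.get))


class PipelineModel(Transformer):
    """Applies a sequence of fitted stages in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, dataset):
        for stage in self.stages:
            dataset = stage.transform(dataset)
        return dataset


class Pipeline(Estimator):
    """Fits estimator stages in order, feeding each the transformed data."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, dataset):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(dataset)  # replace estimator with its model
            fitted.append(stage)
            dataset = stage.transform(dataset)
        return PipelineModel(fitted)
```

Because `Pipeline` is itself an estimator and `PipelineModel` a transformer, a pipeline in this sketch can be nested as a stage inside another pipeline.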