Skip to content
Snippets Groups Projects
  1. May 19, 2016
  2. May 17, 2016
  3. Apr 30, 2016
    • Xiangrui Meng's avatar
      [SPARK-14653][ML] Remove json4s from mllib-local · 0847fe4e
      Xiangrui Meng authored
      ## What changes were proposed in this pull request?
      
      This PR moves Vector.toJson/fromJson to ml.linalg.VectorEncoder under mllib/ to keep mllib-local's dependency minimal. The json encoding is used by Params. So we still need this feature in SPARK-14615, where we will switch to ml.linalg in spark.ml APIs.
      
      ## How was this patch tested?
      
      Copied existing unit tests over.
      
      cc; dbtsai
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #12802 from mengxr/SPARK-14653.
      0847fe4e
  4. Apr 28, 2016
  5. Apr 26, 2016
    • Joseph K. Bradley's avatar
      [SPARK-14732][ML] spark.ml GaussianMixture should use MultivariateGaussian in mllib-local · bd2c9a6d
      Joseph K. Bradley authored
      ## What changes were proposed in this pull request?
      
      Before, spark.ml GaussianMixtureModel used the spark.mllib MultivariateGaussian in its public API.  This was added after 1.6, so we can modify this API without breaking APIs.
      
      This PR copies MultivariateGaussian to mllib-local in spark.ml, with a few changes:
      * Renamed fields to match numpy, scipy: mu => mean, sigma => cov
      
      This PR then uses the spark.ml MultivariateGaussian in the spark.ml GaussianMixtureModel, which involves:
      * Modifying the constructor
      * Adding a computeProbabilities method
      
      Also:
      * Added EPSILON to mllib-local for use in MultivariateGaussian
      
      ## How was this patch tested?
      
      Existing unit tests
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #12593 from jkbradley/sparkml-gmm-fix.
      bd2c9a6d
  6. Apr 22, 2016
    • Joan's avatar
      [SPARK-6429] Implement hashCode and equals together · bf95b8da
      Joan authored
      ## What changes were proposed in this pull request?
      
      Implement some `hashCode` and `equals` together in order to enable the scalastyle.
      This is a first batch, I will continue to implement them but I wanted to know your thoughts.
      
      Author: Joan <joan@goyeau.com>
      
      Closes #12157 from joan38/SPARK-6429-HashCode-Equals.
      bf95b8da
  7. Apr 15, 2016
    • DB Tsai's avatar
      [SPARK-14549][ML] Copy the Vector and Matrix classes from mllib to ml in mllib-local · 96534aa4
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      This task will copy the Vector and Matrix classes from mllib to ml package in mllib-local jar. The UDTs and `since` annotation in ml vector and matrix will be removed from now. UDTs will be achieved by #SPARK-14487, and `since` will be replaced by /*  since 1.2.0 */
      
      The BLAS implementation will be copied, and some of the test utilities will be copies as well.
      
      Summary of changes:
      
      1. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/BLAS.scala
        - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/BLAS.scala
        - logDebug("gemm: alpha is equal to 0 and beta is equal to 1. Returning C.") is removed in ml version.
      2. In  mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/Matrices.scala
        - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Matrices.scala
        - `Since` was removed, and we'll use standard `/* Since /*` Java doc. Will be in another PR.
        - `UDT` related code was removed, and will use `SPARK-13944` https://github.com/apache/spark/pull/12259  to replace the annotation.
      3. In mllib-local/src/main/scala/org/apache/spark/**ml**/linalg/Vectors.scala
        - Copied from mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Vectors.scala
        - `Since` was removed.
        - `UDT` related code was removed.
        - In `def parseNumeric`, it was throwing `throw new SparkException(s"Cannot parse $other.")`, and now it's throwing `throw new IllegalArgumentException(s"Cannot parse $other.")`
      4. In mllib/src/main/scala/org/apache/spark/**mllib**/linalg/Vectors.scala
        - For consistency with ML version of vector, `def parseNumeric` is now throwing `throw new IllegalArgumentException(s"Cannot parse $other.")`
      5. mllib/src/main/scala/org/apache/spark/**mllib**/util/NumericParser.scala is moved to mllib-local/src/main/scala/org/apache/spark/**ml**/util/NumericParser.scala
        - All the `throw new SparkException` were replaced by `throw new IllegalArgumentException`
      
      ## How was this patch tested?
      
      unit tests
      
      Author: DB Tsai <dbt@netflix.com>
      
      Closes #12317 from dbtsai/dbtsai-ml-vector.
      96534aa4
  8. Apr 14, 2016
  9. Apr 11, 2016
    • DB Tsai's avatar
      [SPARK-14462][ML][MLLIB] Add the mllib-local build to maven pom · efaf7d18
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      In order to separate the linear algebra, and vector matrix classes into a standalone jar, we need to setup the build first. This PR will create a new jar called mllib-local with minimal dependencies.
      
      The previous PR was failing the build because of `spark-core:test` dependency, and that was reverted. In this PR, `FunSuite` with `// scalastyle:ignore funsuite` in mllib-local test was used, similar to sketch.
      
      Thanks.
      
      ## How was this patch tested?
      
      Unit tests
      
      mengxr tedyu holdenk
      
      Author: DB Tsai <dbt@netflix.com>
      
      Closes #12298 from dbtsai/dbtsai-mllib-local-build-fix.
      efaf7d18
  10. Apr 09, 2016
    • Xiangrui Meng's avatar
      415446cc
    • DB Tsai's avatar
      [SPARK-14462][ML][MLLIB] add the mllib-local build to maven pom · 1598d11b
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      In order to separate the linear algebra, and vector matrix classes into a standalone jar, we need to setup the build first. This PR will create a new jar called mllib-local with minimal dependencies. The test scope will still depend on spark-core and spark-core-test in order to use the common utilities, but the runtime will avoid any platform dependency. Couple platform independent classes will be moved to this package to demonstrate how this work.
      
      ## How was this patch tested?
      
      Unit tests
      
      Author: DB Tsai <dbt@netflix.com>
      
      Closes #12241 from dbtsai/dbtsai-mllib-local-build.
      1598d11b
Loading