    7450a992
    [SPARK-4749] [mllib]: Allow initializing KMeans clusters using a seed · 7450a992
    nate.crosswhite authored
    This implements the functionality for SPARK-4749 and provides unit tests in Scala and PySpark.
    
    Author: nate.crosswhite <nate.crosswhite@stresearch.com>
    Author: nxwhite-str <nxwhite-str@users.noreply.github.com>
    Author: Xiangrui Meng <meng@databricks.com>
    
    Closes #3610 from nxwhite-str/master and squashes the following commits:
    
    a2ebbd3 [nxwhite-str] Merge pull request #1 from mengxr/SPARK-4749-kmeans-seed
    7668124 [Xiangrui Meng] minor updates
    f8d5928 [nate.crosswhite] Addressing PR issues
    277d367 [nate.crosswhite] Merge remote-tracking branch 'upstream/master'
    9156a57 [nate.crosswhite] Merge remote-tracking branch 'upstream/master'
    5d087b4 [nate.crosswhite] Adding KMeans train with seed and Scala unit test
    616d111 [nate.crosswhite] Merge remote-tracking branch 'upstream/master'
    35c1884 [nate.crosswhite] Add kmeans initial seed to pyspark API
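
    To illustrate what seeding the initialization buys, here is a minimal sketch in plain Python. It is not Spark's implementation and does not use the MLlib API; `init_centers` and the sample points are hypothetical, introduced only for this example. It assumes the simplest strategy of picking k input points at random as the starting centers.

```python
import random


def init_centers(points, k, seed=None):
    """Pick k distinct input points as initial cluster centers.

    Hypothetical helper for illustration only; not Spark's code.
    """
    # A dedicated generator keeps the draw independent of global random state.
    rng = random.Random(seed)
    return rng.sample(points, k)


points = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0), (5.0, 5.0)]

# The same seed gives the same starting centers on every call.
first = init_centers(points, k=2, seed=42)
second = init_centers(points, k=2, seed=42)
print(first == second)  # prints True
```

    With a fixed seed the starting centers are reproducible, which is what makes seeded runs convenient to compare in unit tests. With `seed=None` the generator is seeded from system entropy, so the centers can differ between runs.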