[SPARK-6517][MLLIB] Implement the Algorithm of Hierarchical Clustering
I implemented a hierarchical clustering algorithm again. This PR doesn't include examples, documentation, or spark.ml APIs. I am going to send further PRs for those later.

https://issues.apache.org/jira/browse/SPARK-6517

- This implementation is based on bisecting k-means clustering.
- It derives from freeman-lab's implementation.
- The basic idea is unchanged from the previous version (#2906).
- However, it is 1000x faster than the previous version thanks to parallel processing.

Thank you for your great cooperation, RJ Nowling (rnowling), Jeremy Freeman (freeman-lab), Xiangrui Meng (mengxr) and Sean Owen (srowen).

Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Author: Xiangrui Meng <meng@databricks.com>
Author: Yu ISHIKAWA <yu-iskw@users.noreply.github.com>

Closes #5267 from yu-iskw/new-hierarchical-clustering.
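For readers unfamiliar with the bisecting k-means idea named in the commit message, here is a minimal, single-machine sketch in plain Scala. It is not the code added by this PR and it uses no Spark APIs. The object name `BisectingSketch`, the helper names, the deterministic seeding (first point plus the point farthest from it), and the "split the cluster with the largest sum of squared errors" rule are all assumptions made for illustration.

```scala
// Illustrative sketch only: NOT the BisectingKMeans.scala from this PR.
// All names and the seeding/splitting heuristics are hypothetical.
object BisectingSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Component-wise mean of a non-empty set of points.
  def mean(ps: Seq[Point]): Point =
    ps.transpose.map(_.sum / ps.size).toVector

  // Sum of squared distances from each point to the cluster mean.
  def sse(ps: Seq[Point]): Double = {
    val c = mean(ps)
    ps.map(sqDist(_, c)).sum
  }

  // Split one cluster in two with a few Lloyd (k = 2) iterations.
  // Seeds: the first point and the point farthest from it (an assumption).
  def bisect(ps: Seq[Point], iters: Int = 10): (Seq[Point], Seq[Point]) = {
    var c0 = ps.head
    var c1 = ps.maxBy(sqDist(_, c0))
    var parts = ps.partition(p => sqDist(p, c0) <= sqDist(p, c1))
    var i = 0
    while (i < iters && parts._1.nonEmpty && parts._2.nonEmpty) {
      c0 = mean(parts._1)
      c1 = mean(parts._2)
      parts = ps.partition(p => sqDist(p, c0) <= sqDist(p, c1))
      i += 1
    }
    parts
  }

  // Start with one cluster holding every point, then repeatedly bisect
  // the cluster with the largest SSE until k clusters exist or no
  // further split is possible.
  def cluster(points: Seq[Point], k: Int): Seq[Seq[Point]] = {
    var clusters = Vector(points)
    var done = false
    while (clusters.size < k && !done) {
      val target = clusters.maxBy(sse)
      val (a, b) = bisect(target)
      if (a.isEmpty || b.isEmpty) done = true // e.g. all points identical
      else clusters = clusters.filterNot(_ eq target) ++ Vector(a, b)
    }
    clusters
  }

  def main(args: Array[String]): Unit = {
    val data = Seq(0.0, 0.1, 0.2, 5.0, 5.1, 9.0, 9.1, 9.2).map(Vector(_))
    cluster(data, 3).foreach(c => println(c.map(_.head).mkString(", ")))
  }
}
```

Because each step splits only one existing cluster, the result forms a binary tree of clusters, which is what makes the method hierarchical. The sketch makes no attempt to reproduce the parallel processing the commit message credits for the speedup.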
Showing
- mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala (491 additions, 0 deletions)
- mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala (95 additions, 0 deletions)
- mllib/src/test/java/org/apache/spark/mllib/clustering/JavaBisectingKMeansSuite.java (73 additions, 0 deletions)
- mllib/src/test/scala/org/apache/spark/mllib/clustering/BisectingKMeansSuite.scala (182 additions, 0 deletions)