[SPARK-19636][ML] Feature parity for correlation statistics in MLlib
## What changes were proposed in this pull request?

This patch adds DataFrame-based support for the correlation statistics found in `org.apache.spark.mllib.stat.correlation.Statistics`, following the design doc discussed in the JIRA ticket.

The current implementation is a simple wrapper around the `spark.mllib` implementation. Future optimizations can be implemented at a later stage.

## How was this patch tested?

```
build/sbt "testOnly org.apache.spark.ml.stat.StatisticsSuite"
```

Author: Timothy Hunter <timhunter@databricks.com>

Closes #17108 from thunterdb/19636.
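For illustration, here is a minimal sketch of how the DataFrame-based correlation API might be called. It assumes the `Correlation.corr(dataset, column, method)` signature found in released Spark versions; the exact signature at this commit may differ. The names `CorrelationSketch` and `correlationMatrix`, the column name `features`, and the sample data are hypothetical and not part of the patch.

```scala
import org.apache.spark.ml.linalg.{Matrix, Vectors}
import org.apache.spark.ml.stat.Correlation
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical helper object, for illustration only (not part of the patch).
object CorrelationSketch {

  // Builds a small DataFrame with one Vector column and returns its
  // correlation matrix. `method` is "pearson" or "spearman".
  def correlationMatrix(spark: SparkSession, method: String): Matrix = {
    import spark.implicits._

    // Made-up sample data: column 1 is 2x column 0, column 2 is -1x column 0.
    val data = Seq(
      Vectors.dense(1.0, 2.0, -1.0),
      Vectors.dense(2.0, 4.0, -2.0),
      Vectors.dense(3.0, 6.0, -3.0),
      Vectors.dense(4.0, 8.0, -4.0)
    )
    val df = data.map(Tuple1.apply).toDF("features")

    // Assumed API: corr returns a one-row DataFrame holding the matrix.
    val Row(coeff: Matrix) = Correlation.corr(df, "features", method).head
    coeff
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("CorrelationSketch")
      .getOrCreate()
    try {
      println(correlationMatrix(spark, "pearson"))
    } finally {
      spark.stop()
    }
  }
}
```

Returning the result as a `Matrix` inside a single-row DataFrame keeps the call within the `spark.ml` API while the computation itself is still delegated to `spark.mllib`, as the description above notes.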
Changed files:
- mllib-local/src/test/scala/org/apache/spark/ml/util/TestingUtils.scala (8 additions, 0 deletions)
- mllib/src/main/scala/org/apache/spark/ml/stat/Correlation.scala (86 additions, 0 deletions)
- mllib/src/test/scala/org/apache/spark/ml/stat/CorrelationSuite.scala (77 additions, 0 deletions)