[SPARK-16561][MLLIB] fix multivarOnlineSummary min/max bug
## What changes were proposed in this pull request?

Rename variables to make the code clearer:

- `nnz` => `weightSum`
- `weightSum` => `totalWeightSum`

Add a new member vector `nnz` (distinct from the `nnz` in the previous code, which is now named `weightSum`) that counts the number of non-zero values in each dimension.

Use the new `nnz` instead of `weightSum` when calculating min/max, which fixes numerical errors in some extreme cases.

## How was this patch tested?

A new test case was added.

Author: WeichenXu <WeichenXu123@outlook.com>

Closes #14216 from WeichenXu123/multivarOnlineSummary.
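To illustrate the kind of numerical error that a count-based `nnz` can avoid, here is a minimal sketch for a single dimension. It is not the Spark code: `MinMaxSketch`, `maxByWeight`, `maxByCount`, and `totalCnt` are hypothetical names, the weights `1e10` and `1e-7` are chosen for illustration, and comparing `nnz` against a total sample count is an assumption about how the count is used.

```scala
// Hypothetical, simplified stand-in for one dimension of a weighted summarizer.
object MinMaxSketch {
  // A (value, weight) pair for a single dimension.
  type Sample = (Double, Double)

  // Weight-based check: a zero is assumed to be present only if the summed
  // weight of non-zero values is strictly less than the total weight.
  def maxByWeight(samples: Seq[Sample]): Double = {
    var currMax = Double.MinValue
    var weightSum = 0.0      // summed weight of non-zero values
    var totalWeightSum = 0.0 // summed weight of all values
    for ((value, weight) <- samples if weight > 0.0) {
      totalWeightSum += weight
      if (value != 0.0) {
        weightSum += weight
        currMax = math.max(currMax, value)
      }
    }
    if (weightSum < totalWeightSum && currMax < 0.0) 0.0 else currMax
  }

  // Count-based check (assumed shape of the fix): integer counts are exact,
  // so a zero-valued sample is never lost to floating-point rounding.
  def maxByCount(samples: Seq[Sample]): Double = {
    var currMax = Double.MinValue
    var nnz = 0L      // number of non-zero values
    var totalCnt = 0L // number of samples with positive weight
    for ((value, weight) <- samples if weight > 0.0) {
      totalCnt += 1
      if (value != 0.0) {
        nnz += 1
        currMax = math.max(currMax, value)
      }
    }
    if (nnz < totalCnt && currMax < 0.0) 0.0 else currMax
  }

  // One large-weight negative value and one tiny-weight zero.
  val samples: Seq[Sample] = Seq((-10.0, 1e10), (0.0, 1e-7))
}
```

With these samples, `1e10 + 1e-7` rounds back to `1e10` in double precision, so `maxByWeight` does not notice the zero and returns `-10.0`, while `maxByCount` sees one non-zero value out of two samples and returns `0.0`.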
Showing
- mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala (35 additions, 28 deletions)
- mllib/src/test/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizerSuite.scala (25 additions, 0 deletions)