[SPARK-21681][ML] fix bug of MLOR do not work correctly when featureStd contains zero
## What changes were proposed in this pull request?

Fix a bug where MLOR (multinomial logistic regression) does not work correctly when featureStd contains zero.

The bug can be reproduced with a dataset such as the following, in which one feature has zero variance. Training on it generates a wrong result (all coefficients become 0):

```
val multinomialDatasetWithZeroVar = {
  val nPoints = 100
  val coefficients = Array(
    -0.57997, 0.912083, -0.371077,
    -0.16624, -0.84355, -0.048509)

  val xMean = Array(5.843, 3.0)
  val xVariance = Array(0.6856, 0.0) // including zero variance

  val testData = generateMultinomialLogisticInput(
    coefficients, xMean, xVariance, addIntercept = true, nPoints, seed)

  val df = sc.parallelize(testData, 4).toDF().withColumn("weight", lit(1.0))
  df.cache()
  df
}
```

## How was this patch tested?

Test case added.

Author: WeichenXu <WeichenXu123@outlook.com>

Closes #18896 from WeichenXu123/fix_mlor_stdvalue_zero_bug.
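To illustrate why a zero entry in featureStd is hazardous, here is a minimal standalone sketch, not the patch itself. It assumes the margin for a class is computed from features divided by their standard deviation; the names `ZeroStdMargins`, `unguardedMargin`, and `guardedMargin` are hypothetical and do not appear in Spark. Dividing by a zero standard deviation yields a non-finite margin, whereas skipping such features keeps the margin finite.

```scala
object ZeroStdMargins {
  // Hypothetical helper: divides every feature by its std with no guard,
  // so a std of 0.0 produces Infinity or NaN in the sum.
  def unguardedMargin(
      features: Array[Double],
      featuresStd: Array[Double],
      coefficients: Array[Double]): Double = {
    require(features.length == featuresStd.length &&
      features.length == coefficients.length)
    features.indices.map { j =>
      coefficients(j) * (features(j) / featuresStd(j))
    }.sum
  }

  // Hypothetical helper: skips features whose std is zero (they carry no
  // information) and features whose value is zero (they contribute nothing).
  def guardedMargin(
      features: Array[Double],
      featuresStd: Array[Double],
      coefficients: Array[Double]): Double = {
    require(features.length == featuresStd.length &&
      features.length == coefficients.length)
    var sum = 0.0
    var j = 0
    while (j < features.length) {
      if (featuresStd(j) != 0.0 && features(j) != 0.0) {
        sum += coefficients(j) * (features(j) / featuresStd(j))
      }
      j += 1
    }
    sum
  }
}
```

With features `Array(5.0, 3.0)` and stds `Array(0.5, 0.0)`, the unguarded version returns a non-finite value, while the guarded version uses only the first feature.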
Changed files:
- mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/LogisticAggregator.scala (7 additions, 5 deletions)
- mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala (78 additions, 0 deletions)
- mllib/src/test/scala/org/apache/spark/ml/optim/aggregator/LogisticAggregatorSuite.scala (33 additions, 4 deletions)