[SPARK-18701][ML] Fix Poisson GLM failure due to wrong initialization
Poisson GLM fails for many standard data sets (see the example in the test or the JIRA). The cause is an incorrect initialization that produces near-zero probabilities and weights. Specifically, the mean is initialized to the response, which can be zero. Applying the log link then yields very large negative numbers (protected against -Inf), which in turn gives near-zero probabilities and weights in the weighted least squares step. The fix and a test are included in the commits.

## What changes were proposed in this pull request?

Update initialization in Poisson GLM

## How was this patch tested?

Add test in GeneralizedLinearRegressionSuite

srowen sethah yanboliang HyukjinKwon mengxr

Author: actuaryzhang <actuaryzhang10@gmail.com>

Closes #16131 from actuaryzhang/master.
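The failure mode described in the message can be reproduced with plain arithmetic, without Spark. The sketch below is illustrative only and is not the code from this commit: the object name, the helper names, the `epsilon` guard value, and the `0.1` floor are all assumptions chosen for the example. It uses the standard IRLS working weight for a Poisson family with log link, which simplifies to the mean itself, so a zero response used as the initial mean gives a weight of essentially zero.

```scala
// Illustrative sketch only: names, epsilon, and the 0.1 floor are assumptions,
// not values taken from the commit.
object PoissonInitSketch {
  // Assumed guard that keeps log(mu) finite (the "protected against -Inf" part).
  val epsilon: Double = 1e-16

  // Log link: eta = log(mu), with mu clamped away from zero.
  def logLink(mu: Double): Double = math.log(math.max(mu, epsilon))

  // IRLS working weight: 1 / (Var(mu) * g'(mu)^2).
  // For Poisson, Var(mu) = mu; for the log link, g'(mu) = 1 / mu,
  // so the weight reduces to mu.
  def workingWeight(mu: Double): Double = {
    val m = math.max(mu, epsilon)
    val variance = m
    val linkDeriv = 1.0 / m
    1.0 / (variance * linkDeriv * linkDeriv)
  }

  // Problematic initialization: the mean starts at the response itself.
  def naiveInit(y: Double): Double = y

  // One possible remedy (hypothetical): keep the initial mean above a small floor.
  def flooredInit(y: Double, floor: Double = 0.1): Double = math.max(y, floor)

  def main(args: Array[String]): Unit = {
    for (y <- Seq(0.0, 1.0, 3.0)) {
      val naive = naiveInit(y)
      val floored = flooredInit(y)
      println(s"y=$y naive: eta=${logLink(naive)} weight=${workingWeight(naive)}")
      println(s"y=$y floored: eta=${logLink(floored)} weight=${workingWeight(floored)}")
    }
  }
}
```

Running `PoissonInitSketch.main` prints the linear predictor and working weight for both initializations. Observations with a zero response are the only ones that differ, and those are the ones that effectively drop out of the weighted least squares step under the naive initialization.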
Changed files:
- mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala (5 additions, 1 deletion)
- mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala (12 additions, 9 deletions)