[SPARK-9835][ML] Implement IterativelyReweightedLeastSquares solver
Implement ```IterativelyReweightedLeastSquares``` solver for GLM. I consider it a solver rather than an estimator; it is only used internally, so I keep it ```private[ml]```.

There are two limitations in the current implementation compared with R:
* It does not support a ```Tuple``` as the response for the ```Binomial``` family, as in the following code:
```
glm( cbind(using, notUsing) ~ age + education + wantsMore , family = binomial)
```
* It does not support ```offset```.

Because ```RFormula``` does not support ```Tuple``` as a label or the ```offset``` keyword, I simplified the implementation. Adding support for these two features is not very hard, and I can do it in a follow-up PR if necessary. Meanwhile, we can also add an R-like statistical summary for IRLS.

The implementation refers to R, [statsmodels](https://github.com/statsmodels/statsmodels) and [sparkGLM](https://github.com/AlteryxLabs/sparkGLM).

Please focus on the main structure and overlook minor issues and docs, which I will update later. Any comments and opinions will be appreciated.

cc mengxr jkbradley

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #10639 from yanboliang/spark-9835.
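For readers unfamiliar with IRLS, the general technique can be sketched as below. This is a standalone illustration, not the ```private[ml]``` class added by this commit: the object name ```IrlsSketch```, its method names, and the restriction to a single feature with a logit link are all assumptions made to keep the example small. Each iteration builds a working response and working weights from the current fit, then solves one weighted least squares problem.

```scala
// Standalone sketch of IRLS for a binomial GLM with a logit link.
// Hypothetical names throughout; this is not Spark's implementation.
object IrlsSketch {

  /** Weighted least squares for z ~ b0 + b1 * x, via the 2x2 normal equations. */
  def weightedLeastSquares(
      x: Array[Double],
      z: Array[Double],
      w: Array[Double]): (Double, Double) = {
    val sw = w.sum
    val swx = x.indices.map(i => w(i) * x(i)).sum
    val swz = x.indices.map(i => w(i) * z(i)).sum
    val swxx = x.indices.map(i => w(i) * x(i) * x(i)).sum
    val swxz = x.indices.map(i => w(i) * x(i) * z(i)).sum
    val det = sw * swxx - swx * swx
    val b1 = (sw * swxz - swx * swz) / det
    val b0 = (swz - b1 * swx) / sw
    (b0, b1)
  }

  /** Returns (intercept, slope) after iterating until the coefficients stop changing. */
  def fit(
      x: Array[Double],
      y: Array[Double],
      maxIter: Int = 25,
      tol: Double = 1e-8): (Double, Double) = {
    var b0 = 0.0
    var b1 = 0.0
    var iter = 0
    var converged = false
    while (iter < maxIter && !converged) {
      // Linear predictor and mean under the current coefficients.
      val eta = x.map(xi => b0 + b1 * xi)
      val mu = eta.map(e => 1.0 / (1.0 + math.exp(-e)))
      // Working weights for the logit link: mu * (1 - mu).
      val w = mu.map(m => m * (1.0 - m))
      // Working response: eta + (y - mu) / (dmu/deta).
      val z = x.indices.map(i => eta(i) + (y(i) - mu(i)) / w(i)).toArray
      val (n0, n1) = weightedLeastSquares(x, z, w)
      converged = math.max(math.abs(n0 - b0), math.abs(n1 - b1)) < tol
      b0 = n0
      b1 = n1
      iter += 1
    }
    (b0, b1)
  }
}
```

Calling ```IrlsSketch.fit(Array(0.0, 1.0, 2.0, 3.0, 4.0, 5.0), Array(0.0, 0.0, 1.0, 0.0, 1.0, 1.0))``` returns the fitted intercept and slope for a small, non-separable data set. The sketch has no guard against perfectly separable data, where the working weights approach zero.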
Showing
- mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala (108 additions, 0 deletions)
- mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala (6 additions, 1 deletion)
- mllib/src/test/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquaresSuite.scala (200 additions, 0 deletions)