[SPARK-14852][ML] refactored GLM summary into training, non-training summaries
## What changes were proposed in this pull request?

This splits GeneralizedLinearRegressionSummary into 2 summary types:
* GeneralizedLinearRegressionSummary, which does not store info from fitting (diagInvAtWA)
* GeneralizedLinearRegressionTrainingSummary, which is a subclass of GeneralizedLinearRegressionSummary and stores info from fitting

This also adds a method evaluate() which can produce a GeneralizedLinearRegressionSummary on a new dataset.

The summary no longer provides the model itself as a public val.

Also:
* Fixes bug where GeneralizedLinearRegressionTrainingSummary was created with model, not summaryModel.
* Adds hasSummary method.
* Renames findSummaryModelAndPredictionCol -> getSummaryModel and simplifies that method.
* In summary, extract values from model immediately in case user later changes those (e.g., predictionCol).
* Pardon the style fixes; that is IntelliJ being obnoxious.

## How was this patch tested?

Existing unit tests + updated test for evaluate and hasSummary

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #12624 from jkbradley/model-summary-api.
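The split described above can be illustrated with a minimal, Spark-free sketch. All names here (`GlmSummarySketch`, `GlmSummary`, `GlmTrainingSummary`, `GlmModel`, `setSummary`) are hypothetical stand-ins for illustration and are not the classes touched by this patch; the data is plain `Seq` values rather than a Spark dataset, and the `deviance` shown is only the Gaussian, unit-weight case. It shows a base summary that copies `predictionCol` at construction and does not expose the model, a training subclass that additionally keeps `diagInvAtWA`, and a model offering `hasSummary` and `evaluate()`.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
object GlmSummarySketch {

  // Base summary: holds only what can be computed from (label, prediction) pairs.
  // `model` is a plain constructor parameter, so it is not exposed as a public val.
  class GlmSummary(val predictions: Seq[(Double, Double)], model: GlmModel) {
    // Copied at construction time, so later changes to the model do not leak in.
    val predictionCol: String = model.predictionCol

    // Sum of squared residuals (the deviance of a Gaussian GLM with unit weights).
    lazy val deviance: Double =
      predictions.map { case (label, pred) => (label - pred) * (label - pred) }.sum
  }

  // Training summary: additionally keeps a by-product of fitting.
  class GlmTrainingSummary(
      predictions: Seq[(Double, Double)],
      model: GlmModel,
      val diagInvAtWA: Array[Double])
    extends GlmSummary(predictions, model)

  class GlmModel(val coefficient: Double, val intercept: Double) {
    var predictionCol: String = "prediction"
    private var trainingSummary: Option[GlmTrainingSummary] = None

    // Hypothetical hook that a fitting routine would call once training is done.
    def setSummary(s: GlmTrainingSummary): this.type = {
      trainingSummary = Some(s)
      this
    }

    // True only if a training summary was attached.
    def hasSummary: Boolean = trainingSummary.isDefined

    // Fails loudly when no training summary exists.
    def summary: GlmTrainingSummary = trainingSummary.getOrElse(
      throw new NoSuchElementException("No training summary available for this model"))

    def predict(x: Double): Double = intercept + coefficient * x

    // Produces a non-training summary on new (feature, label) pairs.
    def evaluate(data: Seq[(Double, Double)]): GlmSummary =
      new GlmSummary(data.map { case (x, label) => (label, predict(x)) }, this)
  }

  def main(args: Array[String]): Unit = {
    val model = new GlmModel(coefficient = 2.0, intercept = 1.0)
    println(s"hasSummary: ${model.hasSummary}")

    val heldOut = Seq((1.0, 3.0), (2.0, 6.0)) // (feature, label)
    val summary = model.evaluate(heldOut)
    println(s"deviance on held-out data: ${summary.deviance}")
  }
}
```

Guarding access with `hasSummary` before calling `summary` avoids the exception on models that carry no training summary, for example ones constructed directly rather than fitted.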
Showing 2 changed files:
- mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala (101 additions, 55 deletions)
- mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala (14 additions, 0 deletions)