[SPARK-5436] [MLlib] Validate GradientBoostedTrees using runWithValidation
Training can stop early if the decrease in validation error is smaller than a given tolerance, or if the error increases because the model is overfitting the training data. This introduces a new method, runWithValidation, which takes a pair of RDDs: one for the training data and one for the validation data.

Author: MechCoder <manojkumarsivaraj334@gmail.com>

Closes #4677 from MechCoder/spark-5436 and squashes the following commits:

1bb21d4 [MechCoder] Combine regression and classification tests into a single one
e4d799b [MechCoder] Addresses indentation and doc comments
b48a70f [MechCoder] COSMIT
b928a19 [MechCoder] Move validation while training section under usage tips
fad9b6e [MechCoder] Made the following changes
    1. Add section to documentation
    2. Return corresponding to bestValidationError
    3. Allow negative tolerance.
55e5c3b [MechCoder] One liner for prevValidateError
3e74372 [MechCoder] TST: Add test for classification
77549a9 [MechCoder] [SPARK-5436] Validate GradientBoostedTrees using runWithValidation
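The stopping rule described in the commit message can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code in GradientBoostedTrees.scala: the function name `early_stop_index` and its inputs are hypothetical, and the validation error after each boosting iteration is assumed to have been computed already. It stops when the improvement over the best validation error so far drops below the tolerance, and reports how many iterations to keep for the best validation error, which also shows why a negative tolerance lets training continue through small increases in error.

```python
from typing import Sequence, Tuple


def early_stop_index(validation_errors: Sequence[float], tol: float) -> Tuple[int, float]:
    """Return (iterations to keep, best validation error).

    Hypothetical helper: validation_errors[m] is assumed to be the
    validation error measured after boosting iteration m + 1.
    """
    if not validation_errors:
        raise ValueError("validation_errors must not be empty")

    best_error = validation_errors[0]
    best_m = 1  # number of iterations that produced best_error

    for m in range(1, len(validation_errors)):
        current = validation_errors[m]
        # Stop when the improvement over the best error is below tol.
        # With tol > 0 this also covers any increase in error; with
        # tol < 0 a small increase is tolerated and training continues.
        if best_error - current < tol:
            break
        if current < best_error:
            best_error = current
            best_m = m + 1

    return best_m, best_error


if __name__ == "__main__":
    # Improvement shrinks below 0.05 at the fourth iteration.
    print(early_stop_index([1.0, 0.6, 0.4, 0.39, 0.5], tol=0.05))
    # A negative tolerance rides out the small increase at iteration two.
    print(early_stop_index([1.0, 1.05, 0.7, 0.9], tol=-0.1))
```

In both calls the sketch keeps the first three iterations, because that is where the best validation error was observed before the stopping condition fired.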
Changed files:
- docs/mllib-ensembles.md: 11 additions, 0 deletions
- mllib/src/main/scala/org/apache/spark/mllib/tree/GradientBoostedTrees.scala: 70 additions, 5 deletions
- mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala: 5 additions, 1 deletion
- mllib/src/test/scala/org/apache/spark/mllib/tree/GradientBoostedTreesSuite.scala: 36 additions, 0 deletions