[SPARK-18036][ML][MLLIB] Fixing decision trees handling edge cases
## What changes were proposed in this pull request?

Decision trees/GBT/RF do not handle edge cases such as constant features or empty features. In the case of constant features we choose any arbitrary split instead of failing with a cryptic error message. In the case of empty features we fail with a better error message stating:

DecisionTree requires number of features > 0, but was given an empty features vector

Instead of the cryptic error message:

java.lang.UnsupportedOperationException: empty.max

## How was this patch tested?

Unit tests are added in the patch for:
- DecisionTreeRegressor
- GBTRegressor
- Random Forest Regressor

Author: Ilya Matiach <ilmat@microsoft.com>

Closes #16377 from imatiach-msft/ilmat/fix-decision-tree.
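The fail-fast behaviour described for empty feature vectors can be illustrated with a small standalone sketch. This is not the code from the patch: the object `FeatureChecks` and the method `validateNumFeatures` are hypothetical names chosen for illustration, and only the error text is taken from the description above. It assumes the check is a plain precondition evaluated before any split computation.

```scala
// Hypothetical stand-in for the precondition described in this commit;
// names are illustrative and not taken from the Spark source.
object FeatureChecks {

  /** Fails early with a readable message when there are no features.
    * Scala's `require` throws IllegalArgumentException and prefixes the
    * message with "requirement failed: ".
    */
  def validateNumFeatures(numFeatures: Int): Unit =
    require(
      numFeatures > 0,
      "DecisionTree requires number of features > 0, " +
        "but was given an empty features vector")
}

object FeatureChecksDemo {
  def main(args: Array[String]): Unit = {
    // A non-empty feature count passes silently.
    FeatureChecks.validateNumFeatures(3)

    // An empty feature vector is rejected before any call such as
    // `.max` on an empty collection can raise the cryptic error.
    val result = scala.util.Try(FeatureChecks.validateNumFeatures(0))
    println(result.failed.map(_.getMessage).getOrElse("no error"))
  }
}
```

Checking the precondition up front means the caller sees a message naming the actual problem rather than an exception from deep inside the split search.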
Showing
- mllib/src/main/scala/org/apache/spark/ml/tree/impl/DecisionTreeMetadata.scala (2 additions, 0 deletions)
- mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala (20 additions, 2 deletions)
- mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala (29 additions, 4 deletions)