[SPARK-14862][ML] Updated Classifiers to not require labelCol metadata
## What changes were proposed in this pull request?

Updated Classifier, DecisionTreeClassifier, RandomForestClassifier, GBTClassifier to not require input column metadata.
* They first check for metadata.
* If numClasses is not specified in metadata, they identify the largest label value (up to a limit).

This functionality is implemented in a new Classifier.getNumClasses method.

Also:
* Updated Classifier.extractLabeledPoints to (a) check label values and (b) include a second version which takes a numClasses value for validity checking.

## How was this patch tested?

* Unit tests in ClassifierSuite for helper methods
* Unit tests for DecisionTreeClassifier, RandomForestClassifier, GBTClassifier with toy datasets lacking label metadata

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #12663 from jkbradley/trees-no-metadata.
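The fallback described in the commit message (use metadata if present, otherwise take the largest label value, subject to a limit) can be illustrated with a minimal sketch. This is not the code from the patch: plain Scala collections stand in for a Spark Dataset, and `NumClassesSketch`, `inferNumClasses`, `isValidLabel`, and the `maxNumClasses` parameter are hypothetical names chosen for illustration. The sketch also assumes labels are meant to be non-negative integers stored as doubles.

```scala
// Sketch only: plain Scala collections stand in for a Spark Dataset.
// All names below are hypothetical and are not part of Spark's API.
object NumClassesSketch {

  // True if x is a non-negative whole number that fits in an Int.
  private def isClassIndex(x: Double): Boolean =
    x >= 0 && x == math.floor(x) && x < Int.MaxValue

  /** Infer the class count: prefer metadata, else use (largest label + 1). */
  def inferNumClasses(
      metadataNumClasses: Option[Int],
      labels: Seq[Double],
      maxNumClasses: Int): Int = metadataNumClasses match {
    case Some(n) => n // metadata wins when it is available
    case None =>
      require(labels.nonEmpty, "Cannot infer numClasses from an empty dataset.")
      val maxLabel = labels.max
      require(isClassIndex(maxLabel),
        s"Largest label $maxLabel is not a non-negative integer.")
      val numClasses = maxLabel.toInt + 1
      require(numClasses <= maxNumClasses,
        s"Inferred $numClasses classes, above the limit of $maxNumClasses.")
      numClasses
  }

  /** Validity check for one label given a known class count. */
  def isValidLabel(label: Double, numClasses: Int): Boolean =
    isClassIndex(label) && label < numClasses
}
```

For example, `NumClassesSketch.inferNumClasses(None, Seq(0.0, 2.0, 1.0), maxNumClasses = 100)` infers three classes, where the limit of 100 is an arbitrary value picked for this example. The limit exists so that a continuous-valued column passed by mistake as a label fails fast instead of producing a very large class count.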
Changed files:
- mllib/src/main/scala/org/apache/spark/ml/classification/Classifier.scala (67 additions, 3 deletions)
- mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala (2 additions, 8 deletions)
- mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala (14 additions, 11 deletions)
- mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala (2 additions, 8 deletions)
- mllib/src/test/scala/org/apache/spark/ml/classification/ClassifierSuite.scala (108 additions, 0 deletions)
- mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala (6 additions, 0 deletions)
- mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala (39 additions, 1 deletion)
- mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala (7 additions, 0 deletions)