[SPARK-14500] [ML] Accept Dataset[_] instead of DataFrame in MLlib APIs
## What changes were proposed in this pull request?

This PR updates MLlib APIs to accept `Dataset[_]` as input where `DataFrame` was the input type. This PR doesn't change the output type. In Java, `Dataset[_]` maps to `Dataset<?>`, which includes `Dataset<Row>`. Some implementations were changed in order to return `DataFrame`. Tests and examples were updated. Note that this is a breaking change for subclasses of `Transformer`/`Estimator`.

Lol, we don't have to rename the input argument, which has been `dataset` since Spark 1.2.

TODOs:
- [x] update MiMaExcludes (seems all covered by explicit filters from SPARK-13920)
- [x] Python
- [x] add a new test to accept `Dataset[LabeledPoint]`
- [x] remove unused imports of `Dataset`

## How was this patch tested?

Existing unit tests with some modifications.

cc: rxin jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes #12274 from mengxr/SPARK-14500.
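For subclass authors, the breaking change amounts to a signature update. The following is a minimal sketch, assuming a Spark 2.x build where `Dataset.withColumn` returns a `DataFrame`; the class `AddOne` and the column names `value` and `valuePlusOne` are hypothetical and not part of this patch.

```scala
import org.apache.spark.ml.Transformer
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.functions.{col, lit}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

// Hypothetical transformer used only to illustrate the new signature.
class AddOne(override val uid: String) extends Transformer {

  def this() = this(Identifiable.randomUID("addOne"))

  // Before this change: override def transform(dataset: DataFrame): DataFrame
  // After this change the input is Dataset[_]; the output stays DataFrame.
  override def transform(dataset: Dataset[_]): DataFrame = {
    transformSchema(dataset.schema)
    // Untyped column operations on a Dataset yield a DataFrame.
    dataset.withColumn("valuePlusOne", col("value") + lit(1.0))
  }

  // Declares the appended column (assumed here to be a nullable double).
  override def transformSchema(schema: StructType): StructType =
    StructType(schema.fields :+ StructField("valuePlusOne", DoubleType, nullable = true))

  override def copy(extra: ParamMap): AddOne = defaultCopy(extra)
}
```

With this signature a caller can pass either a `DataFrame` or a typed dataset such as `Dataset[Double]` without converting first; the argument name `dataset` is unchanged, so named-argument call sites keep working.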
Showing
- examples/src/main/java/org/apache/spark/examples/ml/JavaDeveloperApiExample.java: 1 addition, 1 deletion
- examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala: 2 additions, 2 deletions
- mllib/src/main/scala/org/apache/spark/ml/Estimator.scala: 10 additions, 6 deletions
- mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala: 6 additions, 6 deletions
- mllib/src/main/scala/org/apache/spark/ml/Predictor.scala: 7 additions, 7 deletions
- mllib/src/main/scala/org/apache/spark/ml/Transformer.scala: 9 additions, 6 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/Classifier.scala: 3 additions, 3 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala: 2 additions, 2 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala: 3 additions, 3 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala: 4 additions, 4 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala: 2 additions, 2 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala: 2 additions, 2 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala: 5 additions, 5 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala: 3 additions, 3 deletions
- mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala: 3 additions, 3 deletions
- mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala: 4 additions, 4 deletions
- mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala: 3 additions, 3 deletions
- mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala: 7 additions, 7 deletions
- mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala: 12 additions, 12 deletions
- mllib/src/main/scala/org/apache/spark/ml/evaluation/BinaryClassificationEvaluator.scala: 3 additions, 3 deletions