Skip to content
Snippets Groups Projects
Commit accbed7c authored by Zhe Sun's avatar Zhe Sun Committed by Sean Owen
Browse files

[SPARK-19797][DOC] ML pipeline document correction

## What changes were proposed in this pull request?
Description about pipeline in this paragraph is incorrect https://spark.apache.org/docs/latest/ml-pipeline.html#how-it-works



> If the Pipeline had more **stages**, it would call the LogisticRegressionModel’s transform() method on the DataFrame before passing the DataFrame to the next stage.

Reason: Transformer could also be a stage. But only another Estimator will invoke an transform call and pass the data to next stage. The description in the document misleads ML pipeline users.

## How was this patch tested?
This is a tiny modification of **docs/ml-pipelines.md**. I jekyll build the modification and check the compiled document.

Author: Zhe Sun <ymwdalex@gmail.com>

Closes #17137 from ymwdalex/SPARK-19797-ML-pipeline-document-correction.

(cherry picked from commit 0bac3e4c)
Signed-off-by: default avatarSean Owen <sowen@cloudera.com>
parent 1237aaea
No related branches found
No related tags found
No related merge requests found
......@@ -132,7 +132,7 @@ The `Pipeline.fit()` method is called on the original `DataFrame`, which has raw
The `Tokenizer.transform()` method splits the raw text documents into words, adding a new column with words to the `DataFrame`.
The `HashingTF.transform()` method converts the words column into feature vectors, adding a new column with those vectors to the `DataFrame`.
Now, since `LogisticRegression` is an `Estimator`, the `Pipeline` first calls `LogisticRegression.fit()` to produce a `LogisticRegressionModel`.
If the `Pipeline` had more stages, it would call the `LogisticRegressionModel`'s `transform()`
If the `Pipeline` had more `Estimator`s, it would call the `LogisticRegressionModel`'s `transform()`
method on the `DataFrame` before passing the `DataFrame` to the next stage.
A `Pipeline` is an `Estimator`.
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment