    2ecbe02d
    [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib... · 2ecbe02d
    Timothy Hunter authored
    [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, spark.mllib and mllib in the documentation.
    
    Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. This should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in Spark).
    
    It also removes some files that I forgot to delete with #10207.
    
    Author: Timothy Hunter <timhunter@databricks.com>
    
    Closes #10234 from thunterdb/12212.
mllib-classification-regression.md 1.72 KiB
layout: global
title: Classification and Regression - spark.mllib
displayTitle: Classification and Regression - spark.mllib

The spark.mllib package supports various methods for binary classification, multiclass classification, and regression analysis. The table below outlines the supported algorithms for each type of problem.

| Problem Type | Supported Methods |
|---|---|
| Binary Classification | linear SVMs, logistic regression, decision trees, random forests, gradient-boosted trees, naive Bayes |
| Multiclass Classification | logistic regression, decision trees, random forests, naive Bayes |
| Regression | linear least squares, Lasso, ridge regression, decision trees, random forests, gradient-boosted trees, isotonic regression |
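
As a quick way to read the table in the other direction (from a method to the problem types it covers), the rows can be encoded as a small lookup. This is only an illustrative sketch in plain Python: `SUPPORTED_METHODS` and `problem_types_for` are hypothetical names introduced here and are not part of the `spark.mllib` API.

```python
# Illustrative encoding of the table above.
# These names are assumptions for this sketch, not part of spark.mllib.
SUPPORTED_METHODS = {
    "binary classification": [
        "linear SVMs", "logistic regression", "decision trees",
        "random forests", "gradient-boosted trees", "naive Bayes",
    ],
    "multiclass classification": [
        "logistic regression", "decision trees", "random forests", "naive Bayes",
    ],
    "regression": [
        "linear least squares", "Lasso", "ridge regression", "decision trees",
        "random forests", "gradient-boosted trees", "isotonic regression",
    ],
}


def problem_types_for(method):
    """Return the problem types whose table row lists `method`."""
    return [ptype for ptype, methods in SUPPORTED_METHODS.items() if method in methods]


print(problem_types_for("gradient-boosted trees"))
print(problem_types_for("linear SVMs"))
```

Read this way, the table lists gradient-boosted trees under binary classification and regression but not under multiclass classification, and linear SVMs only under binary classification.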

More details for these methods can be found here: