    [SPARK-6255] [MLLIB] Support multiclass classification in Python API · b5bd75d9
    Yanbo Liang authored
    Python API parity check for classification, including multiclass classification support. The major disparities that need to be added to the Python API are:
    ```scala
    LogisticRegressionWithLBFGS
        setNumClasses
        setValidateData
    LogisticRegressionModel
        getThreshold
        numClasses
        numFeatures
    SVMWithSGD
        setValidateData
    SVMModel
        getThreshold
    ```
    For users, the greatest benefit of this PR is that the Python API now supports multiclass classification.
    Users can train a multiclass classification model and use it for prediction in PySpark.
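
To make the multiclass prediction step concrete, here is a minimal pure-Python sketch of a pivot-class multinomial prediction rule. It is not code from this PR. The function name `predict_multinomial`, the flat row-major weight layout, and the convention that class 0 is the pivot with margin 0 are all assumptions made for illustration.

```python
def predict_multinomial(weights, features, num_classes):
    """Predict a class index with a pivot-class multinomial model.

    Assumed layout (illustrative, not taken from the PR): `weights` is a
    flat list holding num_classes - 1 rows. Each row has one weight per
    feature, optionally followed by a single intercept term.
    Class 0 is the pivot and always has margin 0.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    num_features = len(features)
    num_rows = num_classes - 1
    if len(weights) % num_rows != 0:
        raise ValueError("weights length is not a multiple of num_classes - 1")
    row_size = len(weights) // num_rows
    if row_size not in (num_features, num_features + 1):
        raise ValueError("weights do not match the number of features")
    has_intercept = row_size == num_features + 1

    best_class = 0      # pivot class wins unless some margin is positive
    max_margin = 0.0
    for i in range(num_rows):
        row = weights[i * row_size:(i + 1) * row_size]
        # zip stops at len(features), so a trailing intercept is skipped here
        margin = sum(w * x for w, x in zip(row, features))
        if has_intercept:
            margin += row[num_features]
        if margin > max_margin:
            max_margin = margin
            best_class = i + 1
    return best_class


# Three classes, two features, no intercept: rows are [1, 0] and [0, 1].
example_weights = [1.0, 0.0, 0.0, 1.0]
print(predict_multinomial(example_weights, [2.0, 0.5], 3))
print(predict_multinomial(example_weights, [-1.0, -1.0], 3))
```

Only the largest margin is needed to pick a class, so the sketch never computes the softmax probabilities.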
    
    Author: Yanbo Liang <ybliang8@gmail.com>
    
    Closes #5137 from yanboliang/spark-6255 and squashes the following commits:
    
    0bd531e [Yanbo Liang] address comments
    444d5e2 [Yanbo Liang] LogisticRegressionModel.predict() optimization
    fc7990b [Yanbo Liang] address comments
    b0d9c63 [Yanbo Liang] Support Multinomial LR model predict in Python API
    ded847c [Yanbo Liang] Python API parity check for classification (support multiclass classification)