    32218307
    [SPARK-4372][MLLIB] Make LR and SVM's default parameters consistent in Scala and Python · 32218307
    Xiangrui Meng authored
    In Python, the current default regParam is 1.0 and regType is claimed to be none (but it is actually l2), while in Scala the defaults are regParam = 0.0 and regType = L2. We should make the default values consistent. This PR sets the default regType to L2 and regParam to 0.01. Note that the default regParam value in LIBLINEAR (and hence scikit-learn) is 1.0. However, our formulation uses the average loss rather than the total loss, so the same regParam weighs much more heavily against the loss term. Hence regParam=1.0 is too heavy as a default.
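
    The scaling argument above can be checked numerically. The following is a minimal sketch, not Spark or LIBLINEAR code: the function names, the toy data, and the choice of logistic loss with an L2 penalty of 0.5 * ||w||^2 are all assumptions made for illustration. It shows that an average-loss objective with a given regParam is, up to a factor of n, the same as a total-loss objective whose penalty weight is n times larger.

```python
import math

# Illustrative helpers only; none of these names come from Spark or LIBLINEAR.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def logistic_loss(w, x, y):
    # Per-example logistic loss for a label y in {-1, +1}.
    return math.log1p(math.exp(-y * dot(w, x)))

def l2_penalty(w):
    # Assumed penalty form: 0.5 * ||w||^2.
    return 0.5 * dot(w, w)

def average_loss_objective(w, data, reg_param):
    # Mean of the per-example losses plus the penalty.
    mean_loss = sum(logistic_loss(w, x, y) for x, y in data) / len(data)
    return mean_loss + reg_param * l2_penalty(w)

def total_loss_objective(w, data, reg_param):
    # Sum of the per-example losses plus the penalty.
    total_loss = sum(logistic_loss(w, x, y) for x, y in data)
    return total_loss + reg_param * l2_penalty(w)

# Toy data set (hypothetical): four examples with two features each.
data = [((1.0, 2.0), 1), ((-1.0, 0.5), -1), ((0.5, -1.5), -1), ((2.0, 1.0), 1)]
w = (0.3, -0.2)
n = len(data)

avg_obj = average_loss_objective(w, data, reg_param=1.0)
tot_obj = total_loss_objective(w, data, reg_param=n * 1.0)

# n * (average-loss objective at regParam) == total-loss objective at n * regParam
print(math.isclose(n * avg_obj, tot_obj))
```

    Under these assumptions, regParam=1.0 on the average loss corresponds to a penalty weight of n on the total loss, so the effective regularization grows with the size of the data set, whereas a total-loss penalty weight of 1.0 corresponds to regParam = 1/n on the average loss.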
    
    In LinearRegression, we set regParam=0.0 and regType=None, because we have separate classes for Lasso and Ridge, both of which use regParam=0.01 as the default.
    
    @davies @atalwalkar
    
    Author: Xiangrui Meng <meng@databricks.com>
    
    Closes #3232 from mengxr/SPARK-4372 and squashes the following commits:
    
    9979837 [Xiangrui Meng] update Ridge/Lasso to use default regParam 0.01 cast input arguments
    d3ba096 [Xiangrui Meng] change 'none' back to None
    1909a6e [Xiangrui Meng] change default regParam to 0.01 and regType to L2 in LR and SVM