    [SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit for ML · 01f09b16
    Joseph K. Bradley authored
    ## What changes were proposed in this pull request?
    
    General decisions to follow, except where noted:
    * spark.mllib, pyspark.mllib: Remove all Experimental annotations.  Leave DeveloperApi annotations alone.
    * spark.ml, pyspark.ml
    ** Annotate Estimator-Model pairs of classes and companion objects the same way.
    ** For all algorithms marked Experimental with Since tag <= 1.6, remove Experimental annotation.
    ** For all algorithms marked Experimental with Since tag = 2.0, leave Experimental annotation.
    * DeveloperApi annotations are left alone, except where noted.
    * No changes to which types are sealed.
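
The general decisions above can be read as a small decision rule. The following is only an illustrative sketch: the helper names `parse_version` and `keep_experimental` and the `is_listed_exception` flag are hypothetical and are not part of Spark or of this patch. It assumes Since tags look like "1.6.0" and that any item named as an exception in this message is flagged by the caller.

```python
def parse_version(since):
    """Turn a Since tag such as "1.6.0" into a comparable tuple, e.g. (1, 6)."""
    major, minor = since.split(".")[:2]
    return int(major), int(minor)


def keep_experimental(package, since, is_listed_exception=False):
    """Return True if an item currently marked Experimental should stay Experimental.

    Hypothetical helper that mirrors the rules stated in this commit message:
      * spark.mllib / pyspark.mllib: Experimental is always removed.
      * spark.ml / pyspark.ml: removed when Since <= 1.6, kept when Since is 2.0.
      * Items named as exceptions stay Experimental (assumption: the caller
        decides which items those are and passes is_listed_exception=True).
    """
    if package in ("spark.mllib", "pyspark.mllib"):
        return False
    if is_listed_exception:
        return True
    return parse_version(since) > (1, 6)
```

For example, `keep_experimental("spark.ml", "1.6.0")` is False and `keep_experimental("spark.ml", "2.0.0")` is True. DeveloperApi annotations and sealed types are outside the scope of this sketch.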
    
    Exceptions where I am leaving items Experimental in spark.ml, pyspark.ml, mainly because the items are new:
    * Model Summary classes
    * MLWriter, MLReader, MLWritable, MLReadable
    * Evaluator and subclasses: There is discussion of changes around evaluating multiple metrics at once for efficiency.
    * RFormula: Its behavior may need to change slightly to match R in edge cases.
    * AFTSurvivalRegression
    * MultilayerPerceptronClassifier
    
    DeveloperApi changes:
    * ml.tree.Node, ml.tree.Split, and subclasses should no longer be DeveloperApi
    
    ## How was this patch tested?
    
    N/A
    
    Note to reviewers:
    * spark.ml.clustering.LDA underwent significant changes (additional methods), so let me know if you want me to leave it Experimental.
    * Be careful to check for cases where a class should no longer be Experimental but has an Experimental method, val, or other feature.  I did not find such cases, but please verify.
    
    Author: Joseph K. Bradley <joseph@databricks.com>
    
    Closes #14147 from jkbradley/experimental-audit.