    7bf6cc97
    [SPARK-3751] [mllib] DecisionTree: example update + print options · 7bf6cc97
    Joseph K. Bradley authored
    DecisionTreeRunner functionality additions:
    * Allow user to pass in a test dataset
    * Do not print full model if the model is too large.
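The "do not print the full model if it is too large" behaviour can be sketched as a size check before printing. This is a minimal illustration, not the DecisionTreeRunner code: `ToyModel`, `describe_for_print`, and the `max_nodes_to_print` threshold of 20 are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    """Hypothetical stand-in for a trained tree model (not a Spark class)."""
    num_nodes: int
    depth: int

    def summary(self) -> str:
        # One-line description, cheap to print for any model size.
        return f"ToyModel of depth {self.depth} with {self.num_nodes} nodes"

    def full_description(self) -> str:
        # Summary followed by one line per node; grows with the model.
        lines = [self.summary()]
        lines += [f"  node {i}" for i in range(self.num_nodes)]
        return "\n".join(lines)


def describe_for_print(model: ToyModel, max_nodes_to_print: int = 20) -> str:
    """Return the full description for small models, the summary otherwise.

    The threshold is an assumption made for this sketch.
    """
    if model.num_nodes <= max_nodes_to_print:
        return model.full_description()
    return model.summary()


if __name__ == "__main__":
    print(describe_for_print(ToyModel(num_nodes=3, depth=1)))
    print(describe_for_print(ToyModel(num_nodes=1000, depth=10)))
```

Keeping the threshold as a parameter lets a caller decide what "too large" means for its own output.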
    
    As part of this, modify DecisionTreeModel and RandomForestModel to allow printing less info.  Proposed updates:
    * toString: prints model summary
    * toDebugString: prints full model (named after RDD.toDebugString)
    
    Similar update to Python API:
    * __repr__() now prints a model summary
    * toDebugString() now prints the full model
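The split between a short `__repr__()` and a verbose `toDebugString()` can be illustrated with a toy class. This is a sketch of the pattern only, assuming a tree stored as nested dictionaries; `ToyTreeModel` and its output format are hypothetical and do not reproduce the PySpark implementation.

```python
class ToyTreeModel:
    """Hypothetical binary tree stored as nested dicts (not a PySpark class).

    A leaf is {"prediction": value}; an internal node is
    {"feature": int, "threshold": float, "left": node, "right": node}.
    """

    def __init__(self, root):
        self._root = root

    def _num_nodes(self, node):
        if "prediction" in node:
            return 1
        return 1 + self._num_nodes(node["left"]) + self._num_nodes(node["right"])

    def _depth(self, node):
        if "prediction" in node:
            return 0
        return 1 + max(self._depth(node["left"]), self._depth(node["right"]))

    def __repr__(self):
        # Summary only: constant-size output regardless of tree size.
        return (f"ToyTreeModel of depth {self._depth(self._root)} "
                f"with {self._num_nodes(self._root)} nodes")

    def _render(self, node, indent, lines):
        pad = "  " * indent
        if "prediction" in node:
            lines.append(f"{pad}Predict: {node['prediction']}")
            return
        lines.append(f"{pad}If (feature {node['feature']} <= {node['threshold']})")
        self._render(node["left"], indent + 1, lines)
        lines.append(f"{pad}Else (feature {node['feature']} > {node['threshold']})")
        self._render(node["right"], indent + 1, lines)

    def toDebugString(self):
        # Full model: the summary line followed by every node.
        lines = [repr(self)]
        self._render(self._root, 1, lines)
        return "\n".join(lines)


if __name__ == "__main__":
    model = ToyTreeModel({
        "feature": 0, "threshold": 0.5,
        "left": {"prediction": 0.0},
        "right": {"prediction": 1.0},
    })
    print(repr(model))            # short summary
    print(model.toDebugString())  # summary plus the full tree
```

With this split, echoing a model in an interactive shell stays short, and the full tree is printed only when it is requested explicitly.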
    
    CC: mengxr chouqin manishamde codedeft. Small update (whoever can take a look). Thanks!
    
    Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>
    
    Closes #2604 from jkbradley/dtrunner-update and squashes the following commits:
    
    b2b3c60 [Joseph K. Bradley] re-added python sql doc test, temporarily removed before
    07b1fae [Joseph K. Bradley] repr() now prints a model summary toDebugString() now prints the full model
    1d0d93d [Joseph K. Bradley] Updated DT and RF to print less when toString is called. Added toDebugString for verbose printing.
    22eac8c [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dtrunner-update
    e007a95 [Joseph K. Bradley] Updated DecisionTreeRunner to accept a test dataset.