Skip to content
Snippets Groups Projects
  1. Oct 01, 2014
    • Joseph K. Bradley's avatar
      [SPARK-3751] [mllib] DecisionTree: example update + print options · 7bf6cc97
      Joseph K. Bradley authored
      DecisionTreeRunner functionality additions:
      * Allow user to pass in a test dataset
      * Do not print full model if the model is too large.
      
      As part of this, modify DecisionTreeModel and RandomForestModel to allow printing less info.  Proposed updates:
      * toString: prints model summary
      * toDebugString: prints full model (named after RDD.toDebugString)
      
      Similar update to Python API:
      * __repr__() now prints a model summary
      * toDebugString() now prints the full model
      
      CC: mengxr  chouqin manishamde codedeft  Small update (whomever can take a look).  Thanks!
      
      Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>
      
      Closes #2604 from jkbradley/dtrunner-update and squashes the following commits:
      
      b2b3c60 [Joseph K. Bradley] re-added python sql doc test, temporarily removed before
      07b1fae [Joseph K. Bradley] repr() now prints a model summary toDebugString() now prints the full model
      1d0d93d [Joseph K. Bradley] Updated DT and RF to print less when toString is called. Added toDebugString for verbose printing.
      22eac8c [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dtrunner-update
      e007a95 [Joseph K. Bradley] Updated DecisionTreeRunner to accept a test dataset.
      7bf6cc97
  2. Sep 19, 2014
    • Davies Liu's avatar
      [SPARK-3491] [MLlib] [PySpark] use pickle to serialize data in MLlib · fce5e251
      Davies Liu authored
      Currently, we serialize the data between JVM and Python case by case manually, this cannot scale to support so many APIs in MLlib.
      
      This patch will try to address this problem by serialize the data using pickle protocol, using Pyrolite library to serialize/deserialize in JVM. Pickle protocol can be easily extended to support customized class.
      
      All the modules are refactored to use this protocol.
      
      Known issues: There will be some performance regression (both CPU and memory, the serialized data increased)
      
      Author: Davies Liu <davies.liu@gmail.com>
      
      Closes #2378 from davies/pickle_mllib and squashes the following commits:
      
      dffbba2 [Davies Liu] Merge branch 'master' of github.com:apache/spark into pickle_mllib
      810f97f [Davies Liu] fix equal of matrix
      032cd62 [Davies Liu] add more type check and conversion for user_product
      bd738ab [Davies Liu] address comments
      e431377 [Davies Liu] fix cache of rdd, refactor
      19d0967 [Davies Liu] refactor Picklers
      2511e76 [Davies Liu] cleanup
      1fccf1a [Davies Liu] address comments
      a2cc855 [Davies Liu] fix tests
      9ceff73 [Davies Liu] test size of serialized Rating
      44e0551 [Davies Liu] fix cache
      a379a81 [Davies Liu] fix pickle array in python2.7
      df625c7 [Davies Liu] Merge commit '154d141' into pickle_mllib
      154d141 [Davies Liu] fix autobatchedpickler
      44736d7 [Davies Liu] speed up pickling array in Python 2.7
      e1d1bfc [Davies Liu] refactor
      708dc02 [Davies Liu] fix tests
      9dcfb63 [Davies Liu] fix style
      88034f0 [Davies Liu] rafactor, address comments
      46a501e [Davies Liu] choose batch size automatically
      df19464 [Davies Liu] memorize the module and class name during pickleing
      f3506c5 [Davies Liu] Merge branch 'master' into pickle_mllib
      722dd96 [Davies Liu] cleanup _common.py
      0ee1525 [Davies Liu] remove outdated tests
      b02e34f [Davies Liu] remove _common.py
      84c721d [Davies Liu] Merge branch 'master' into pickle_mllib
      4d7963e [Davies Liu] remove muanlly serialization
      6d26b03 [Davies Liu] fix tests
      c383544 [Davies Liu] classification
      f2a0856 [Davies Liu] mllib/regression
      d9f691f [Davies Liu] mllib/util
      cccb8b1 [Davies Liu] mllib/tree
      8fe166a [Davies Liu] Merge branch 'pickle' into pickle_mllib
      aa2287e [Davies Liu] random
      f1544c4 [Davies Liu] refactor clustering
      52d1350 [Davies Liu] use new protocol in mllib/stat
      b30ef35 [Davies Liu] use pickle to serialize data for mllib/recommendation
      f44f771 [Davies Liu] enable tests about array
      3908f5c [Davies Liu] Merge branch 'master' into pickle
      c77c87b [Davies Liu] cleanup debugging code
      60e4e2f [Davies Liu] support unpickle array.array for Python 2.6
      fce5e251
  3. Sep 15, 2014
    • qiping.lqp's avatar
      [SPARK-3516] [mllib] DecisionTree: Add minInstancesPerNode, minInfoGain params... · fdb302f4
      qiping.lqp authored
      [SPARK-3516] [mllib] DecisionTree: Add minInstancesPerNode, minInfoGain params to example and Python API
      
      Added minInstancesPerNode, minInfoGain params to:
      * DecisionTreeRunner.scala example
      * Python API (tree.py)
      
      Also:
      * Fixed typo in tree suite test "do not choose split that does not satisfy min instance per node requirements"
      * small style fixes
      
      CC: mengxr
      
      Author: qiping.lqp <qiping.lqp@alibaba-inc.com>
      Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>
      Author: chouqin <liqiping1991@gmail.com>
      
      Closes #2349 from jkbradley/chouqin-dt-preprune and squashes the following commits:
      
      61b2e72 [Joseph K. Bradley] Added max of 10GB for maxMemoryInMB in Strategy.
      a95e7c8 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into chouqin-dt-preprune
      95c479d [Joseph K. Bradley] * Fixed typo in tree suite test "do not choose split that does not satisfy min instance per node requirements" * small style fixes
      e2628b6 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into chouqin-dt-preprune
      19b01af [Joseph K. Bradley] Merge remote-tracking branch 'chouqin/dt-preprune' into chouqin-dt-preprune
      f1d11d1 [chouqin] fix typo
      c7ebaf1 [chouqin] fix typo
      39f9b60 [chouqin] change edge `minInstancesPerNode` to 2 and add one more test
      c6e2dfc [Joseph K. Bradley] Added minInstancesPerNode and minInfoGain parameters to DecisionTreeRunner.scala and to Python API in tree.py
      0278a11 [chouqin] remove `noSplit` and set `Predict` private to tree
      d593ec7 [chouqin] fix docs and change minInstancesPerNode to 1
      efcc736 [qiping.lqp] fix bug
      10b8012 [qiping.lqp] fix style
      6728fad [qiping.lqp] minor fix: remove empty lines
      bb465ca [qiping.lqp] Merge branch 'master' of https://github.com/apache/spark into dt-preprune
      cadd569 [qiping.lqp] add api docs
      46b891f [qiping.lqp] fix bug
      e72c7e4 [qiping.lqp] add comments
      845c6fa [qiping.lqp] fix style
      f195e83 [qiping.lqp] fix style
      987cbf4 [qiping.lqp] fix bug
      ff34845 [qiping.lqp] separate calculation of predict of node from calculation of info gain
      ac42378 [qiping.lqp] add min info gain and min instances per node parameters in decision tree
      fdb302f4
  4. Sep 08, 2014
    • Xiangrui Meng's avatar
      [SPARK-3443][MLLIB] update default values of tree: · 50a4fa77
      Xiangrui Meng authored
      Adjust the default values of decision tree, based on the memory requirement discussed in https://github.com/apache/spark/pull/2125 :
      
      1. maxMemoryInMB: 128 -> 256
      2. maxBins: 100 -> 32
      3. maxDepth: 4 -> 5 (in some example code)
      
      jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #2322 from mengxr/tree-defaults and squashes the following commits:
      
      cda453a [Xiangrui Meng] fix tests
      5900445 [Xiangrui Meng] update comments
      8c81831 [Xiangrui Meng] update default values of tree:
      50a4fa77
  5. Sep 03, 2014
    • Davies Liu's avatar
      [SPARK-3309] [PySpark] Put all public API in __all__ · 6481d274
      Davies Liu authored
      Put all public API in __all__, also put them all in pyspark.__init__.py, then we can got all the documents for public API by `pydoc pyspark`. It also can be used by other programs (such as Sphinx or Epydoc) to generate only documents for public APIs.
      
      Author: Davies Liu <davies.liu@gmail.com>
      
      Closes #2205 from davies/public and squashes the following commits:
      
      c6c5567 [Davies Liu] fix message
      f7b35be [Davies Liu] put SchemeRDD, Row in pyspark.sql module
      7e3016a [Davies Liu] add __all__ in mllib
      6281b48 [Davies Liu] fix doc for SchemaRDD
      6caab21 [Davies Liu] add public interfaces into pyspark.__init__.py
      6481d274
  6. Aug 18, 2014
    • Joseph K. Bradley's avatar
      [mllib] DecisionTree: treeAggregate + Python example bug fix · 115eeb30
      Joseph K. Bradley authored
      Small DecisionTree updates:
      * Changed main DecisionTree aggregate to treeAggregate.
      * Fixed bug in python example decision_tree_runner.py with missing argument (since categoricalFeaturesInfo is no longer an optional argument for trainClassifier).
      * Fixed same bug in python doc tests, and added tree.py to doc tests.
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>
      
      Closes #2015 from jkbradley/dt-opt2 and squashes the following commits:
      
      b5114fa [Joseph K. Bradley] Fixed python tree.py doc test (extra newline)
      8e4665d [Joseph K. Bradley] Added tree.py to python doc tests.  Fixed bug from missing categoricalFeaturesInfo argument.
      b7b2922 [Joseph K. Bradley] Fixed bug in python example decision_tree_runner.py with missing argument.  Changed main DecisionTree aggregate to treeAggregate.
      85bbc1f [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-opt2
      66d076f [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-opt2
      a0ed0da [Joseph K. Bradley] Renamed DTMetadata to DecisionTreeMetadata.  Small doc updates.
      3726d20 [Joseph K. Bradley] Small code improvements based on code review.
      ac0b9f8 [Joseph K. Bradley] Small updates based on code review. Main change: Now using << instead of math.pow.
      db0d773 [Joseph K. Bradley] scala style fix
      6a38f48 [Joseph K. Bradley] Added DTMetadata class for cleaner code
      931a3a7 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-opt2
      797f68a [Joseph K. Bradley] Fixed DecisionTreeSuite bug for training second level.  Needed to update treePointToNodeIndex with groupShift.
      f40381c [Joseph K. Bradley] Merge branch 'dt-opt1' into dt-opt2
      5f2dec2 [Joseph K. Bradley] Fixed scalastyle issue in TreePoint
      6b5651e [Joseph K. Bradley] Updates based on code review.  1 major change: persisting to memory + disk, not just memory.
      2d2aaaf [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-opt1
      26d10dd [Joseph K. Bradley] Removed tree/model/Filter.scala since no longer used.  Removed debugging println calls in DecisionTree.scala.
      356daba [Joseph K. Bradley] Merge branch 'dt-opt1' into dt-opt2
      430d782 [Joseph K. Bradley] Added more debug info on binning error.  Added some docs.
      d036089 [Joseph K. Bradley] Print timing info to logDebug.
      e66f1b1 [Joseph K. Bradley] TreePoint * Updated doc * Made some methods private
      8464a6e [Joseph K. Bradley] Moved TimeTracker to tree/impl/ in its own file, and cleaned it up.  Removed debugging println calls from DecisionTree.  Made TreePoint extend Serialiable
      a87e08f [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-opt1
      c1565a5 [Joseph K. Bradley] Small DecisionTree updates: * Simplification: Updated calculateGainForSplit to take aggregates for a single (feature, split) pair. * Internal doc: findAggForOrderedFeatureClassification
      b914f3b [Joseph K. Bradley] DecisionTree optimization: eliminated filters + small changes
      b2ed1f3 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-opt
      0f676e2 [Joseph K. Bradley] Optimizations + Bug fix for DecisionTree
      3211f02 [Joseph K. Bradley] Optimizing DecisionTree * Added TreePoint representation to avoid calling findBin multiple times. * (not working yet, but debugging)
      f61e9d2 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-timing
      bcf874a [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-timing
      511ec85 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-timing
      a95bc22 [Joseph K. Bradley] timing for DecisionTree internals
      115eeb30
  7. Aug 07, 2014
    • Joseph K. Bradley's avatar
      [SPARK-2851] [mllib] DecisionTree Python consistency update · 47ccd5e7
      Joseph K. Bradley authored
      Added 6 static train methods to match Python API, but without default arguments (but with Python default args noted in docs).
      
      Added factory classes for Algo and Impurity, but made private[mllib].
      
      CC: mengxr dorx  Please let me know if there are other changes which would help with API consistency---thanks!
      
      Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>
      
      Closes #1798 from jkbradley/dt-python-consistency and squashes the following commits:
      
      6f7edf8 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-python-consistency
      a0d7dbe [Joseph K. Bradley] DecisionTree: In Java-friendly train* methods, changed to use JavaRDD instead of RDD.
      ee1d236 [Joseph K. Bradley] DecisionTree API updates: * Removed train() function in Python API (tree.py) ** Removed corresponding function in Scala/Java API (the ones taking basic types)
      00f820e [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into dt-python-consistency
      fe6dbfa [Joseph K. Bradley] removed unnecessary imports
      e358661 [Joseph K. Bradley] DecisionTree API change: * Added 6 static train methods to match Python API, but without default arguments (but with Python default args noted in docs).
      c699850 [Joseph K. Bradley] a few doc comments
      eaf84c0 [Joseph K. Bradley] Added DecisionTree static train() methods API to match Python, but without default parameters
      47ccd5e7
  8. Aug 06, 2014
    • Nicholas Chammas's avatar
      [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically · d614967b
      Nicholas Chammas authored
      As described in [SPARK-2627](https://issues.apache.org/jira/browse/SPARK-2627), we'd like Python code to automatically be checked for PEP 8 compliance by Jenkins. This pull request aims to do that.
      
      Notes:
      * We may need to install [`pep8`](https://pypi.python.org/pypi/pep8) on the build server.
      * I'm expecting tests to fail now that PEP 8 compliance is being checked as part of the build. I'm fine with cleaning up any remaining PEP 8 violations as part of this pull request.
      * I did not understand why the RAT and scalastyle reports are saved to text files. I did the same for the PEP 8 check, but only so that the console output style can match those for the RAT and scalastyle checks. The PEP 8 report is removed right after the check is complete.
      * Updates to the ["Contributing to Spark"](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) guide will be submitted elsewhere, as I don't believe that text is part of the Spark repo.
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      Author: nchammas <nicholas.chammas@gmail.com>
      
      Closes #1744 from nchammas/master and squashes the following commits:
      
      274b238 [Nicholas Chammas] [SPARK-2627] [PySpark] minor indentation changes
      983d963 [nchammas] Merge pull request #5 from apache/master
      1db5314 [nchammas] Merge pull request #4 from apache/master
      0e0245f [Nicholas Chammas] [SPARK-2627] undo erroneous whitespace fixes
      bf30942 [Nicholas Chammas] [SPARK-2627] PEP8: comment spacing
      6db9a44 [nchammas] Merge pull request #3 from apache/master
      7b4750e [Nicholas Chammas] merge upstream changes
      91b7584 [Nicholas Chammas] [SPARK-2627] undo unnecessary line breaks
      44e3e56 [Nicholas Chammas] [SPARK-2627] use tox.ini to exclude files
      b09fae2 [Nicholas Chammas] don't wrap comments unnecessarily
      bfb9f9f [Nicholas Chammas] [SPARK-2627] keep up with the PEP 8 fixes
      9da347f [nchammas] Merge pull request #2 from apache/master
      aa5b4b5 [Nicholas Chammas] [SPARK-2627] follow Spark bash style for if blocks
      d0a83b9 [Nicholas Chammas] [SPARK-2627] check that pep8 downloaded fine
      dffb5dd [Nicholas Chammas] [SPARK-2627] download pep8 at runtime
      a1ce7ae [Nicholas Chammas] [SPARK-2627] space out test report sections
      21da538 [Nicholas Chammas] [SPARK-2627] it's PEP 8, not PEP8
      6f4900b [Nicholas Chammas] [SPARK-2627] more misc PEP 8 fixes
      fe57ed0 [Nicholas Chammas] removing merge conflict backups
      9c01d4c [nchammas] Merge pull request #1 from apache/master
      9a66cb0 [Nicholas Chammas] resolving merge conflicts
      a31ccc4 [Nicholas Chammas] [SPARK-2627] miscellaneous PEP 8 fixes
      beaa9ac [Nicholas Chammas] [SPARK-2627] fail check on non-zero status
      723ed39 [Nicholas Chammas] always delete the report file
      0541ebb [Nicholas Chammas] [SPARK-2627] call Python linter from run-tests
      12440fa [Nicholas Chammas] [SPARK-2627] add Scala linter
      61c07b9 [Nicholas Chammas] [SPARK-2627] add Python linter
      75ad552 [Nicholas Chammas] make check output style consistent
      d614967b
  9. Aug 02, 2014
    • Joseph K. Bradley's avatar
      [SPARK-2478] [mllib] DecisionTree Python API · 3f67382e
      Joseph K. Bradley authored
      Added experimental Python API for Decision Trees.
      
      API:
      * class DecisionTreeModel
      ** predict() for single examples and RDDs, taking both feature vectors and LabeledPoints
      ** numNodes()
      ** depth()
      ** __str__()
      * class DecisionTree
      ** trainClassifier()
      ** trainRegressor()
      ** train()
      
      Examples and testing:
      * Added example testing classification and regression with batch prediction: examples/src/main/python/mllib/tree.py
      * Have also tested example usage in doc of python/pyspark/mllib/tree.py which tests single-example prediction with dense and sparse vectors
      
      Also: Small bug fix in python/pyspark/mllib/_common.py: In _linear_predictor_typecheck, changed check for RDD to use isinstance() instead of type() in order to catch RDD subclasses.
      
      CC mengxr manishamde
      
      Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>
      
      Closes #1727 from jkbradley/decisiontree-python-new and squashes the following commits:
      
      3744488 [Joseph K. Bradley] Renamed test tree.py to decision_tree_runner.py Small updates based on github review.
      6b86a9d [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      affceb9 [Joseph K. Bradley] * Fixed bug in doc tests in pyspark/mllib/util.py caused by change in loadLibSVMFile behavior.  (It used to threshold labels at 0 to make them 0/1, but it now leaves them as they are.) * Fixed small bug in loadLibSVMFile: If a data file had no features, then loadLibSVMFile would create a single all-zero feature.
      67a29bc [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      cf46ad7 [Joseph K. Bradley] Python DecisionTreeModel * predict(empty RDD) returns an empty RDD instead of an error. * Removed support for calling predict() on LabeledPoint and RDD[LabeledPoint] * predict() does not cache serialized RDD any more.
      aa29873 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      bf21be4 [Joseph K. Bradley] removed old run() func from DecisionTree
      fa10ea7 [Joseph K. Bradley] Small style update
      7968692 [Joseph K. Bradley] small braces typo fix
      e34c263 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      4801b40 [Joseph K. Bradley] Small style update to DecisionTreeSuite
      db0eab2 [Joseph K. Bradley] Merge branch 'decisiontree-bugfix2' into decisiontree-python-new
      6873fa9 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      225822f [Joseph K. Bradley] Bug: In DecisionTree, the method sequentialBinSearchForOrderedCategoricalFeatureInClassification() indexed bins from 0 to (math.pow(2, featureCategories.toInt - 1) - 1). This upper bound is the bound for unordered categorical features, not ordered ones. The upper bound should be the arity (i.e., max value) of the feature.
      93953f1 [Joseph K. Bradley] Likely done with Python API.
      6df89a9 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      4562c08 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      665ba78 [Joseph K. Bradley] Small updates towards Python DecisionTree API
      188cb0d [Joseph K. Bradley] Merge branch 'decisiontree-bugfix' into decisiontree-python-new
      6622247 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      b8fac57 [Joseph K. Bradley] Finished Python DecisionTree API and example but need to test a bit more.
      2b20c61 [Joseph K. Bradley] Small doc and style updates
      1b29c13 [Joseph K. Bradley] Merge branch 'decisiontree-bugfix' into decisiontree-python-new
      584449a [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      dab0b67 [Joseph K. Bradley] Added documentation for DecisionTree internals
      8bb8aa0 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-bugfix
      978cfcf [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-bugfix
      6eed482 [Joseph K. Bradley] In DecisionTree: Changed from using procedural syntax for functions returning Unit to explicitly writing Unit return type.
      376dca2 [Joseph K. Bradley] Updated meaning of maxDepth by 1 to fit scikit-learn and rpart. * In code, replaced usages of maxDepth <-- maxDepth + 1 * In params, replace settings of maxDepth <-- maxDepth - 1
      e06e423 [Joseph K. Bradley] Merge branch 'decisiontree-bugfix' into decisiontree-python-new
      bab3f19 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      59750f8 [Joseph K. Bradley] * Updated Strategy to check numClassesForClassification only if algo=Classification. * Updates based on comments: ** DecisionTreeRunner *** Made dataFormat arg default to libsvm ** Small cleanups ** tree.Node: Made recursive helper methods private, and renamed them.
      52e17c5 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-bugfix
      f5a036c [Joseph K. Bradley] Merge branch 'decisiontree-bugfix' into decisiontree-python-new
      da50db7 [Joseph K. Bradley] Added one more test to DecisionTreeSuite: stump with 2 continuous variables for binary classification.  Caused problems in past, but fixed now.
      8e227ea [Joseph K. Bradley] Changed Strategy so it only requires numClassesForClassification >= 2 for classification
      cd1d933 [Joseph K. Bradley] Merge branch 'decisiontree-bugfix' into decisiontree-python-new
      8ea8750 [Joseph K. Bradley] Bug fix: Off-by-1 when finding thresholds for splits for continuous features.
      8a758db [Joseph K. Bradley] Merge branch 'decisiontree-bugfix' into decisiontree-python-new
      5fe44ed [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-python-new
      2283df8 [Joseph K. Bradley] 2 bug fixes.
      73fbea2 [Joseph K. Bradley] Merge remote-tracking branch 'upstream/master' into decisiontree-bugfix
      5f920a1 [Joseph K. Bradley] Demonstration of bug before submitting fix: Updated DecisionTreeSuite so that 3 tests fail.  Will describe bug in next commit.
      f825352 [Joseph K. Bradley] Wrote Python API and example for DecisionTree.  Also added toString, depth, and numNodes methods to DecisionTreeModel.
      3f67382e
Loading