  1. Jul 14, 2016
    • [SPARK-16403][EXAMPLES] Cleanup to remove unused imports, consistent style, minor fixes · e3f8a033
      Bryan Cutler authored
      ## What changes were proposed in this pull request?
      
      Cleanup of examples, mostly from PySpark-ML, to fix minor issues: unused imports, style consistency, a duplicated pipeline_example, use of the future print function, and a spelling error.
      
      * The "Pipeline Example" is duplicated by "Simple Text Classification Pipeline" in Scala, Python, and Java.
      
      * "Estimator Transformer Param Example" is duplicated by "Simple Params Example" in Scala, Python and Java
      
      * Synced random_forest_classifier_example.py with Scala by adding the IndexToString label converter
      
      * Synced train_validation_split.py (in Scala ModelSelectionViaTrainValidationExample) by adjusting data split, adding grid for intercept.
      
      * RegexTokenizer was doing nothing in tokenizer_example.py and JavaTokenizerExample.java; synced both with the Scala version
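
The tokenizer point above can be illustrated with a plain-Python analogue. This is only a sketch and does not use Spark: it assumes the synced examples split on non-word characters (a pattern such as `\W`), and the sample sentence and variable names are illustrative. It shows why a whitespace tokenizer and a regex tokenizer give different results on comma-separated text.

```python
import re

# Illustrative sentence with commas and no spaces (an assumption, not taken from the commit).
sentence = "Logistic,regression,models,are,neat"

# Whitespace tokenization: lowercase, then split on runs of whitespace.
# The sentence contains no whitespace, so it stays a single token.
whitespace_tokens = sentence.lower().split()

# Regex tokenization: lowercase, then split wherever a non-word character
# appears, dropping any empty strings produced by the split.
regex_tokens = [tok for tok in re.split(r"\W", sentence.lower()) if tok]

print(whitespace_tokens)  # ['logistic,regression,models,are,neat']
print(regex_tokens)       # ['logistic', 'regression', 'models', 'are', 'neat']
```

If a regex tokenizer's output matches the whitespace tokenizer's, the pattern is likely not being applied, which is one way such an example can end up "doing nothing".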
      
      ## How was this patch tested?
      Local tests and running the modified examples.
      
      Author: Bryan Cutler <cutlerb@gmail.com>
      
      Closes #14081 from BryanCutler/examples-cleanup-SPARK-16403.
      e3f8a033
  2. Jun 06, 2016
  3. May 11, 2016
    • [SPARK-15141][EXAMPLE][DOC] Update OneVsRest Examples · ad1a8466
      Zheng RuiFeng authored
      ## What changes were proposed in this pull request?
      1. Add Python example for OneVsRest
      2. Remove args-parsing
      
      ## How was this patch tested?
      manual tests
      `./bin/spark-submit examples/src/main/python/ml/one_vs_rest_example.py`
      
      Author: Zheng RuiFeng <ruifengz@foxmail.com>
      
      Closes #12920 from zhengruifeng/ovr_pe.
      ad1a8466
  4. May 04, 2016
    • [SPARK-15031][EXAMPLE] Use SparkSession in Scala/Python/Java example. · cdce4e62
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR aims to update Scala/Python/Java examples by replacing `SQLContext` with newly added `SparkSession`.
      
      - Use **SparkSession Builder Pattern** in 154 files (Scala 55, Java 52, Python 47).
      - Add `getConf` in Python SparkContext class: `python/pyspark/context.py`
      - Replace **SQLContext Singleton Pattern** with **SparkSession Singleton Pattern**:
        - `SqlNetworkWordCount.scala`
        - `JavaSqlNetworkWordCount.java`
        - `sql_network_wordcount.py`
      
      Now, `SQLContext`s are used only in the R examples and the following two Python examples. These Python examples are untouched in this PR since they already fail due to an unknown issue.
      - `simple_params_example.py`
      - `aft_survival_regression.py`
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12809 from dongjoon-hyun/SPARK-15031.
      cdce4e62
  5. Nov 13, 2015
    • [SPARK-11723][ML][DOC] Use LibSVM data source rather than MLUtils.loadLibSVMFile to load DataFrame · 99693fef
      Yanbo Liang authored
      Use the LibSVM data source rather than MLUtils.loadLibSVMFile to load DataFrames, including:
      * Use the LibSVM data source for all example code under examples/ml, and remove unused imports.
      * Use the LibSVM data source in the user guides under ml-*** that were omitted by #8697.
      * Fix bug: the Java side should use ```sqlContext.read().format("libsvm").load(path)```, but the API doc and user guides mistakenly use ```sqlContext.read.format("libsvm").load(path)```.
      * Code cleanup.
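
For readers unfamiliar with the text format that the `libsvm` data source reads, here is a minimal plain-Python sketch of parsing one line of the form `label index:value index:value ...`. The helper `parse_libsvm_line` is hypothetical and is not Spark's loader; it keeps the feature indices exactly as written in the file and ignores details a real loader handles, such as comments and index validation.

```python
def parse_libsvm_line(line):
    """Parse one LibSVM-format line into (label, {index: value}).

    Hypothetical helper for illustration only; not part of Spark.
    """
    parts = line.strip().split()
    label = float(parts[0])          # first field is the label
    features = {}
    for item in parts[1:]:           # remaining fields are index:value pairs
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features


label, features = parse_libsvm_line("1 3:0.5 10:2.0")
print(label, features)  # 1.0 {3: 0.5, 10: 2.0}
```

Only the non-zero features appear on each line, so the format maps naturally onto a label column plus a sparse feature vector.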
      
      mengxr
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #9690 from yanboliang/spark-11723.
      99693fef
  6. Nov 12, 2015