  1. May 16, 2016
    •
      [SPARK-12972][CORE][TEST-MAVEN][TEST-HADOOP2.2] Update... · fabc8e5b
      Sean Owen authored
      [SPARK-12972][CORE][TEST-MAVEN][TEST-HADOOP2.2] Update org.apache.httpcomponents.httpclient, commons-io
      
      ## What changes were proposed in this pull request?
      
      This is sort of a hot-fix for https://github.com/apache/spark/pull/13117, but the problem is limited to Hadoop 2.2. The change is to manage `commons-io` at version 2.4 for all Hadoop builds; this is a net change only for Hadoop 2.2, which was previously using 2.1.
      
      ## How was this patch tested?
      
      Jenkins tests -- normal PR builder, then the `[test-hadoop2.2] [test-maven]` if successful.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #13132 from srowen/SPARK-12972.3.
      fabc8e5b
  2. May 15, 2016
    •
      [SPARK-12972][CORE] Update org.apache.httpcomponents.httpclient · f5576a05
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      (Retry of https://github.com/apache/spark/pull/13049)
      
      - update to httpclient 4.5 / httpcore 4.4
      - remove some defunct exclusions
      - manage httpmime version to match
      - update selenium / httpunit to support 4.5 (possible now that Jetty 9 is used)
      
      ## How was this patch tested?
      
      Jenkins tests. Also, locally running the same test command of one Jenkins profile that failed: `mvn -Phadoop-2.6 -Pyarn -Phive -Phive-thriftserver -Pkinesis-asl ...`
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #13117 from srowen/SPARK-12972.2.
      f5576a05
  3. May 13, 2016
  4. May 12, 2016
  5. May 11, 2016
  6. May 05, 2016
    •
      [SPARK-15148][SQL] Upgrade Univocity library from 2.0.2 to 2.1.0 · ac12b35d
      hyukjinkwon authored
      ## What changes were proposed in this pull request?
      
      https://issues.apache.org/jira/browse/SPARK-15148
      
      Mainly, it improves performance by roughly 30%-40% according to the [release note](https://github.com/uniVocity/univocity-parsers/releases/tag/v2.1.0). The details of the purpose are described in the JIRA.
      
      This PR upgrades Univocity library from 2.0.2 to 2.1.0.
      
      ## How was this patch tested?
      
      Existing tests should cover this.
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      
      Closes #12923 from HyukjinKwon/SPARK-15148.
      ac12b35d
    •
      [SPARK-12154] Upgrade to Jersey 2 · b7fdc23c
      mcheah authored
      ## What changes were proposed in this pull request?
      
      Replace com.sun.jersey with org.glassfish.jersey. Changes to the Spark Web UI code were required to make it compile. The changes were relatively standard Jersey migration work.
      
      ## How was this patch tested?
      
      I did a manual test for the standalone web APIs. Although I didn't test the functionality of the security filter itself, the code that changed non-trivially is how we actually register the filter. I attached a debugger to the Spark master and verified that the SecurityFilter code is indeed invoked upon hitting /api/v1/applications.
      
      Author: mcheah <mcheah@palantir.com>
      
      Closes #12715 from mccheah/feature/upgrade-jersey.
      b7fdc23c
    •
      [SPARK-15123] upgrade org.json4s to 3.2.11 version · 592fc455
      Lining Sun authored
      ## What changes were proposed in this pull request?
      
      We hit this issue when using snowplow in our Spark applications. Snowplow requires json4s version 3.2.11 while Spark still uses the years-old version 3.2.10. The change upgrades the json4s jar to 3.2.11.
      
      ## How was this patch tested?
      
      We built Spark jar and successfully ran our applications in local and cluster modes.
      
      Author: Lining Sun <lining@gmail.com>
      
      Closes #12901 from liningalex/master.
      592fc455
  7. May 03, 2016
    •
      [SPARK-15053][BUILD] Fix Java Lint errors on Hive-Thriftserver module · a7444570
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This issue fixes or hides 181 Java linter errors introduced by SPARK-14987 which copied hive service code from Hive. We had better clean up these errors before releasing Spark 2.0.
      
      - Fix UnusedImports (15 lines), RedundantModifier (14 lines), SeparatorWrap (9 lines), MethodParamPad (6 lines), FileTabCharacter (5 lines), ArrayTypeStyle (3 lines), ModifierOrder (3 lines), RedundantImport (1 line), CommentsIndentation (1 line), UpperEll (1 line), FallThrough (1 line), OneStatementPerLine (1 line), NewlineAtEndOfFile (1 line) errors.
      - Ignore `LineLength` errors under `hive/service/*` (118 lines).
      - Ignore `MethodName` error in `PasswdAuthenticationProvider.java` (1 line).
      - Ignore `NoFinalizer` error in `ThreadWithGarbageCleanup.java` (1 line).
      
      ## How was this patch tested?
      
      After the Jenkins build passes, run `dev/lint-java` manually.
      ```bash
      $ dev/lint-java
      Checkstyle checks passed.
      ```
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12831 from dongjoon-hyun/SPARK-15053.
      a7444570
  8. Apr 29, 2016
    •
      [SPARK-14988][PYTHON] SparkSession catalog and conf API · a7d0fedc
      Andrew Or authored
      ## What changes were proposed in this pull request?
      
      The `catalog` and `conf` APIs were exposed in `SparkSession` in #12713 and #12669. This patch adds those to the python API.
      
      ## How was this patch tested?
      
      Python tests.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #12765 from andrewor14/python-spark-session-more.
      a7d0fedc
    •
      [SPARK-14987][SQL] inline hive-service (cli) into sql/hive-thriftserver · 7feeb82c
      Davies Liu authored
      ## What changes were proposed in this pull request?
      
      This PR copies the thrift-server from hive-service-1.2 (including TCLIService.thrift and generated Java source code) into sql/hive-thriftserver, so we can do further cleanup and improvements.
      
      ## How was this patch tested?
      
      Existing tests.
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #12764 from davies/thrift_server.
      7feeb82c
  9. Apr 28, 2016
  10. Apr 27, 2016
    •
      [SPARK-14867][BUILD] Remove `--force` option in `build/mvn` · f405de87
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      Currently, `build/mvn` provides a convenient option, `--force`, to use the recommended version of Maven without changing the PATH environment variable. However, there were two problems:

      - `dev/lint-java` does not use the newly installed Maven.

        ```bash
        $ ./build/mvn --force clean
        $ ./dev/lint-java
        Using `mvn` from path: /usr/local/bin/mvn
        ```
      - Typing the `--force` option every time is inconvenient.

      Once `--force` has been used, we should prefer the Maven installation recommended by Spark.
      This PR makes `build/mvn` check first for a Maven installed by the `--force` option.
      
      According to the comments, this PR aims to do the following:
      - Detect the maven version from `pom.xml`.
      - Install maven if there is no or old maven.
      - Remove `--force` option.
      
      ## How was this patch tested?
      
      Manual.
      
      ```bash
      $ ./build/mvn --force clean
      $ ./dev/lint-java
      Using `mvn` from path: /Users/dongjoon/spark/build/apache-maven-3.3.9/bin/mvn
      ...
      $ rm -rf ./build/apache-maven-3.3.9/
      $ ./dev/lint-java
      Using `mvn` from path: /usr/local/bin/mvn
      ```
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12631 from dongjoon-hyun/SPARK-14867.
      f405de87
    •
      [MINOR][BUILD] Enable RAT checking on `LZ4BlockInputStream.java`. · c5443560
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      Since `LZ4BlockInputStream.java` is not licensed to the Apache Software Foundation (ASF), the Apache License header of that file has not been monitored until now.
      This PR aims to enable RAT checking on `LZ4BlockInputStream.java` by removing it from `dev/.rat-excludes`.
      This will prevent accidental removal of the Apache License header from that file.
      
      ## How was this patch tested?
      
      Pass the Jenkins tests (Specifically, RAT check stage).
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12677 from dongjoon-hyun/minor_rat_exclusion_file.
      c5443560
  11. Apr 25, 2016
    •
      [SPARK-14721][SQL] Remove HiveContext (part 2) · 3c5e65c3
      Andrew Or authored
      ## What changes were proposed in this pull request?
      
      This removes the class `HiveContext` itself along with all code usages associated with it. The bulk of the work was already done in #12485. This is mainly just code cleanup and actually removing the class.
      
      Note: A couple of things will break after this patch. These will be fixed separately.
      - the python HiveContext
      - all the documentation / comments referencing HiveContext
      - there will be no more HiveContext in the REPL (fixed by #12589)
      
      ## How was this patch tested?
      
      No change in functionality.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #12585 from andrewor14/delete-hive-context.
      3c5e65c3
  12. Apr 24, 2016
    •
      [SPARK-14868][BUILD] Enable NewLineAtEofChecker in checkstyle and fix lint-java errors · d34d6503
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      Spark enforces the `NewLineAtEofChecker` rule for Scala via Scalastyle, and most Java code also complies with the rule. This PR enforces the same rule, `NewlineAtEndOfFile`, explicitly via Checkstyle. It also fixes lint-java errors introduced since SPARK-14465. The items are as follows:
      
      - Adds a new line at the end of the files (19 files)
      - Fixes 25 lint-java errors (12 RedundantModifier, 6 **ArrayTypeStyle**, 2 LineLength, 2 UnusedImports, 2 RegexpSingleline, 1 ModifierOrder)
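      As a side note, the condition the `NewlineAtEndOfFile` rule enforces is easy to check with plain JDK code. The following is an illustrative sketch only, not checkstyle's implementation:

      ```java
      import java.io.IOException;
      import java.nio.file.Files;
      import java.nio.file.Path;

      // Illustrative sketch, not checkstyle's implementation: a file satisfies
      // NewlineAtEndOfFile when its last byte is a line terminator.
      public class NewlineAtEof {
          public static boolean endsWithNewline(Path file) throws IOException {
              byte[] bytes = Files.readAllBytes(file);
              return bytes.length > 0 && bytes[bytes.length - 1] == '\n';
          }
      }
      ```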
      
      ## How was this patch tested?
      
      After the Jenkins test succeeds, `dev/lint-java` should pass. (Currently, Jenkins does not run lint-java.)
      ```bash
      $ dev/lint-java
      Using `mvn` from path: /usr/local/bin/mvn
      Checkstyle checks passed.
      ```
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12632 from dongjoon-hyun/SPARK-14868.
      d34d6503
  13. Apr 22, 2016
    •
      [SPARK-14807] Create a compatibility module · 7dde1da9
      Yin Huai authored
      ## What changes were proposed in this pull request?
      
      This PR creates a compatibility module in sql (called `hive-1-x-compatibility`), which will host HiveContext in Spark 2.0 (moving HiveContext to here will be done separately). This module is not included in assembly because only users who still want to access HiveContext need it.
      
      ## How was this patch tested?
      I manually tested `sbt/sbt -Phive package` and `mvn -Phive package -DskipTests`.
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #12580 from yhuai/compatibility.
      7dde1da9
  14. Apr 21, 2016
  15. Apr 17, 2016
    •
      [SPARK-13904][SCHEDULER] Add support for pluggable cluster manager · af1f4da7
      Hemant Bhanawat authored
      ## What changes were proposed in this pull request?
      
      This commit adds support for a pluggable cluster manager, and also allows a cluster manager to clean up tasks without taking the parent process down.
      
      To plug in a new external cluster manager, the ExternalClusterManager trait should be implemented. It returns the task scheduler and backend scheduler that will be used by SparkContext to schedule tasks. An external cluster manager is registered using the java.util.ServiceLoader mechanism (the same mechanism used to register data sources like parquet, json, jdbc, etc.), which allows implementations of the ExternalClusterManager interface to be auto-loaded.
      
      Currently, when a driver fails, executors exit using System.exit. This does not bode well for cluster managers that would like to reuse the parent process of an executor. Hence, this patch:

        1. Moves System.exit into a function that can be overridden in subclasses of CoarseGrainedExecutorBackend.
        2. Adds the ability to kill all running tasks in an executor.
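      The `java.util.ServiceLoader` registration described above can be sketched as follows; `ExternalClusterManager` here is a hypothetical stand-in interface for illustration, not Spark's actual trait:

      ```java
      import java.util.ServiceLoader;

      // Hypothetical stand-in for Spark's ExternalClusterManager trait, shown
      // only to illustrate the java.util.ServiceLoader discovery mechanism.
      public class ClusterManagerLoader {
          public interface ExternalClusterManager {
              boolean canCreate(String masterUrl);
          }

          // Implementations are discovered from files named
          // META-INF/services/<fully.qualified.InterfaceName> on the classpath,
          // each listing the implementing classes to auto-load.
          public static ServiceLoader<ExternalClusterManager> load() {
              return ServiceLoader.load(ExternalClusterManager.class);
          }
      }
      ```

      With no `META-INF/services` entry on the classpath, the returned loader is simply empty, so plugging in a new manager becomes purely a packaging concern.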
      
      ## How was this patch tested?
      ExternalClusterManagerSuite.scala was added to test this patch.
      
      Author: Hemant Bhanawat <hemant@snappydata.io>
      
      Closes #11723 from hbhanawat/pluggableScheduler.
      af1f4da7
  16. Apr 11, 2016
    •
      [SPARK-14462][ML][MLLIB] Add the mllib-local build to maven pom · efaf7d18
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      In order to separate the linear algebra and vector/matrix classes into a standalone jar, we need to set up the build first. This PR will create a new jar called mllib-local with minimal dependencies.

      The previous PR was failing the build because of a `spark-core:test` dependency, and it was reverted. In this PR, `FunSuite` with `// scalastyle:ignore funsuite` is used in the mllib-local tests, similar to the sketch module.
      
      Thanks.
      
      ## How was this patch tested?
      
      Unit tests
      
      mengxr tedyu holdenk
      
      Author: DB Tsai <dbt@netflix.com>
      
      Closes #12298 from dbtsai/dbtsai-mllib-local-build-fix.
      efaf7d18
  17. Apr 09, 2016
    • Xiangrui Meng's avatar
      415446cc
    •
      [SPARK-14462][ML][MLLIB] add the mllib-local build to maven pom · 1598d11b
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      In order to separate the linear algebra and vector/matrix classes into a standalone jar, we need to set up the build first. This PR will create a new jar called mllib-local with minimal dependencies. The test scope will still depend on spark-core and spark-core-test in order to use the common utilities, but the runtime will avoid any platform dependency. A couple of platform-independent classes will be moved to this package to demonstrate how this works.
      
      ## How was this patch tested?
      
      Unit tests
      
      Author: DB Tsai <dbt@netflix.com>
      
      Closes #12241 from dbtsai/dbtsai-mllib-local-build.
      1598d11b
  18. Apr 08, 2016
    •
      [SPARK-11416][BUILD] Update to Chill 0.8.0 & Kryo 3.0.3 · 906eef4c
      Josh Rosen authored
      This patch upgrades Chill to 0.8.0 and Kryo to 3.0.3. While we'll likely need to bump these dependencies again before Spark 2.0 (due to SPARK-14221 / https://github.com/twitter/chill/issues/252), I wanted to get the bulk of the Kryo 2 -> Kryo 3 migration done now in order to figure out whether there are any unexpected surprises.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #12076 from JoshRosen/kryo3.
      906eef4c
    •
      [SPARK-14103][SQL] Parse unescaped quotes in CSV data source. · 725b860e
      hyukjinkwon authored
      ## What changes were proposed in this pull request?
      
      This PR resolves a problem when parsing unescaped quotes in input data. For example, currently the data below:
      
      ```
      "a"b,ccc,ddd
      e,f,g
      ```
      
      produces the data below:
      
      - **Before**
      
      ```bash
      ["a"b,ccc,ddd[\n]e,f,g]  <- as a value.
      ```
      
      - **After**
      
      ```bash
      ["a"b], [ccc], [ddd]
      [e], [f], [g]
      ```
      
      This PR bumps up the Univocity parser's version. This was fixed in `2.0.2`, https://github.com/uniVocity/univocity-parsers/issues/60.
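      A toy sketch of the "after" behavior (this is not Univocity's code, and it ignores escaped `""` quotes): a quote simply toggles quoted mode and is kept in the output, so an unescaped quote can no longer swallow the rest of the input:

      ```java
      import java.util.ArrayList;
      import java.util.List;

      // Toy sketch, not Univocity's parser: tokenize one CSV line so that a
      // quote toggles quoted mode and is kept literally, matching the "after"
      // example above. Escaped ("") quotes are deliberately not handled.
      public class NaiveCsv {
          public static List<String> parseLine(String line) {
              List<String> fields = new ArrayList<>();
              StringBuilder cur = new StringBuilder();
              boolean inQuotes = false;
              for (char c : line.toCharArray()) {
                  if (c == '"') {
                      inQuotes = !inQuotes;        // toggle; keep the quote
                      cur.append(c);
                  } else if (c == ',' && !inQuotes) {
                      fields.add(cur.toString());  // field boundary outside quotes
                      cur.setLength(0);
                  } else {
                      cur.append(c);
                  }
              }
              fields.add(cur.toString());
              return fields;
          }
      }
      ```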
      
      ## How was this patch tested?
      
      Unit tests in `CSVSuite` and `sbt/sbt scalastyle`.
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      
      Closes #12226 from HyukjinKwon/SPARK-14103-quote.
      725b860e
  19. Apr 04, 2016
    •
      [SPARK-13579][BUILD] Stop building the main Spark assembly. · 24d7d2e4
      Marcelo Vanzin authored
      This change modifies the "assembly/" module to just copy needed
      dependencies to its build directory, and modifies the packaging
      script to pick those up (and remove duplicate jars packaged in the
      examples module).
      
      I also made some minor adjustments to dependencies to remove some
      test jars from the final packaging, and remove jars that conflict with each
      other when packaged separately (e.g. servlet api).
      
      Also note that this change restores guava in applications' classpaths, even
      though it's still shaded inside Spark. This is now needed for the Hadoop
      libraries that are packaged with Spark, which now are not processed by
      the shade plugin.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #11796 from vanzin/SPARK-13579.
      24d7d2e4
  20. Apr 01, 2016
    •
      [SPARK-13825][CORE] Upgrade to Scala 2.11.8 · c16a3968
      Jacek Laskowski authored
      ## What changes were proposed in this pull request?
      
      Upgrade to 2.11.8 (from the current 2.11.7)
      
      ## How was this patch tested?
      
      A manual build
      
      Author: Jacek Laskowski <jacek@japila.pl>
      
      Closes #11681 from jaceklaskowski/SPARK-13825-scala-2_11_8.
      c16a3968
  21. Mar 31, 2016
    •
      [SPARK-14277][CORE] Upgrade Snappy Java to 1.1.2.4 · 8de201ba
      Sital Kedia authored
      ## What changes were proposed in this pull request?
      
      Upgrade snappy to 1.1.2.4 to improve snappy read/write performance.
      
      ## How was this patch tested?
      
      Tested by running a job on the cluster and saw 7.5% cpu savings after this change.
      
      Author: Sital Kedia <skedia@fb.com>
      
      Closes #12096 from sitalkedia/snappyRelease.
      8de201ba
    •
      [SPARK-14211][SQL] Remove ANTLR3 based parser · a9b93e07
      Herman van Hovell authored
      ### What changes were proposed in this pull request?
      
      This PR removes the ANTLR3 based parser, and moves the new ANTLR4 based parser into the `org.apache.spark.sql.catalyst.parser` package.
      
      ### How was this patch tested?
      
      Existing unit tests.
      
      cc rxin andrewor14 yhuai
      
      Author: Herman van Hovell <hvanhovell@questtec.nl>
      
      Closes #12071 from hvanhovell/SPARK-14211.
      a9b93e07
  22. Mar 28, 2016
    •
      [SPARK-13713][SQL] Migrate parser from ANTLR3 to ANTLR4 · 600c0b69
      Herman van Hovell authored
      ### What changes were proposed in this pull request?
      The current ANTLR3 parser is quite complex to maintain and suffers from code blow-ups. This PR introduces a new parser that is based on ANTLR4.
      
      This parser is based on [Presto's SQL parser](https://github.com/facebook/presto/blob/master/presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4). The current implementation can parse and create Catalyst and SQL plans. Large parts of the HiveQl DDL and some of the DML functionality are currently missing; the plan is to add these in follow-up PRs.
      
      This PR is a work in progress, and work needs to be done in the following areas:
      
      - [x] Error handling should be improved.
      - [x] Documentation should be improved.
      - [x] Multi-Insert needs to be tested.
      - [ ] Naming and package locations.
      
      ### How was this patch tested?
      
      Catalyst and SQL unit tests.
      
      Author: Herman van Hovell <hvanhovell@questtec.nl>
      
      Closes #11557 from hvanhovell/ngParser.
      600c0b69
  23. Mar 25, 2016
    •
      [SPARK-14073][STREAMING][TEST-MAVEN] Move flume back to Spark · 24587ce4
      Shixiong Zhu authored
      ## What changes were proposed in this pull request?
      
      This PR moves flume back to Spark as per the discussion on the dev mailing list.
      
      ## How was this patch tested?
      
      Existing Jenkins tests.
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #11895 from zsxwing/move-flume-back.
      24587ce4
    •
      [SPARK-13887][PYTHON][TRIVIAL][BUILD] Make lint-python script fail fast · 55a60576
      Holden Karau authored
      ## What changes were proposed in this pull request?
      
      Change the lint-python script to stop on the first error rather than accumulating errors, so it's clearer why we failed (requested by rxin). Also, while in the file, remove the commented-out code.
      
      ## How was this patch tested?
      
      Manually ran lint-python script with & without pep8 errors locally and verified expected results.
      
      Author: Holden Karau <holden@us.ibm.com>
      
      Closes #11898 from holdenk/SPARK-13887-pylint-fast-fail.
      55a60576
  24. Mar 23, 2016
    •
      [SPARK-14074][SPARKR] Specify commit sha1 ID when using install_github to install intr package. · 7d117501
      Sun Rui authored
      ## What changes were proposed in this pull request?
      
      In dev/lint-r.R, `install_github` makes our builds depend on an unstable source. This may cause unexpected test failures and break the build. This PR adds a specific commit sha1 ID to the `install_github` call to get a stable source.
      
      ## How was this patch tested?
      dev/lint-r
      
      Author: Sun Rui <rui.sun@intel.com>
      
      Closes #11913 from sun-rui/SPARK-14074.
      7d117501
  25. Mar 21, 2016
    •
      [SPARK-14011][CORE][SQL] Enable `LineLength` Java checkstyle rule · 20fd2541
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      The [Spark Coding Style Guide](https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide) has a 100-character limit on lines, but it has been disabled for Java since 11/09/15. This PR enables the **LineLength** checkstyle rule again. To help with that, this also introduces **RedundantImport** and **RedundantModifier**. The following is the diff on `checkstyle.xml`.
      
      ```xml
      -        <!-- TODO: 11/09/15 disabled - the lengths are currently > 100 in many places -->
      -        <!--
               <module name="LineLength">
                   <property name="max" value="100"/>
                   <property name="ignorePattern" value="^package.*|^import.*|a href|href|http://|https://|ftp://"/>
               </module>
      -        -->
               <module name="NoLineWrap"/>
               <module name="EmptyBlock">
                   <property name="option" value="TEXT"/>
       @@ -167,5 +164,7 @@
               </module>
               <module name="CommentsIndentation"/>
               <module name="UnusedImports"/>
      +        <module name="RedundantImport"/>
      +        <module name="RedundantModifier"/>
      ```
      
      ## How was this patch tested?
      
      Currently, `lint-java` is disabled in Jenkins, so this needs a manual test.
      After the Jenkins tests pass, `dev/lint-java` should also pass locally.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11831 from dongjoon-hyun/SPARK-14011.
      20fd2541
  26. Mar 17, 2016
    •
      [SPARK-13948] MiMa check should catch if the visibility changes to private · 82066a16
      Josh Rosen authored
      MiMa excludes are currently generated using both the current Spark version's classes and Spark 1.2.0's classes, but this doesn't make sense: we should only be ignoring classes which were `private` in the previous Spark version, not classes which became private in the current version.
      
      This patch updates `dev/mima` to only generate excludes with respect to the previous artifacts that MiMa checks against. It also updates `MimaBuild` so that `excludeClass` only applies directly to the class being excluded and not to its companion object (since a class and its companion object can have different accessibility).
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #11774 from JoshRosen/SPARK-13948.
      82066a16
  27. Mar 15, 2016
    •
      [SPARK-13576][BUILD] Don't create assembly for examples. · 48978abf
      Marcelo Vanzin authored
      As part of the goal to stop creating assemblies in Spark, this change
      modifies the mvn and sbt builds to not create an assembly for examples.
      
      Instead, dependencies are copied to the build directory (under
      target/scala-xx/jars), and in the final archive, into the "examples/jars"
      directory.
      
      To avoid having to deal too much with Windows batch files, I made examples
      run through the launcher library; the spark-submit launcher now has a
      special mode to run examples, which adds all the necessary jars to the
      spark-submit command line, and replaces the bash and batch scripts that
      were used to run examples. The scripts are now just a thin wrapper around
      spark-submit; another advantage is that now all spark-submit options are
      supported.
      
      There are a few glitches; in the mvn build, a lot of duplicated dependencies
      get copied, because they are promoted to "compile" scope due to extra
      dependencies in the examples module (such as HBase). In the sbt build,
      all dependencies are copied, because there doesn't seem to be an easy
      way to filter things.
      
      I plan to clean some of this up when the rest of the tasks are finished.
      When the main assembly is replaced with jars, we can remove duplicate jars
      from the examples directory during packaging.
      
      Tested by running SparkPi in: maven build, sbt build, dist created by
      make-distribution.sh.
      
      Finally: note that running the "assembly" target in sbt doesn't build
      the examples anymore. You need to run "package" for that.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #11452 from vanzin/SPARK-13576.
      48978abf