  1. Jan 29, 2015
    • Revert "[WIP] [SPARK-3996]: Shade Jetty in Spark deliverables" · d2071e8f
      Patrick Wendell authored
      This reverts commit f240fe39.
      d2071e8f
    • [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables · f240fe39
      Patrick Wendell authored
      This patch piggy-backs on vanzin's work to simplify the Guava shading,
      and adds Jetty as a shaded library in Spark. Other than adding Jetty,
      it consolidates the `<artifactSet>`s into the root pom. I found it was
      a bit easier to follow that way, since you don't need to look into
      child poms to find out the specific artifact sets included in shading.
      
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #4252 from pwendell/jetty and squashes the following commits:
      
      19f0710 [Patrick Wendell] More code review feedback
      961452d [Patrick Wendell] Responding to feedback from Marcello
      6df25ca [Patrick Wendell] [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables
      f240fe39
  2. Jan 25, 2015
  3. Jan 19, 2015
    • [SPARK-4504][Examples] fix run-example failure if multiple assembly jars exist · 74de94ea
      Venkata Ramana Gollamudi authored
      Fix the run-example script to fail fast with a useful error message if
      multiple example assembly JARs are present.
      
      Author: Venkata Ramana Gollamudi <ramana.gollamudi@huawei.com>
      
      Closes #3377 from gvramana/run-example_fails and squashes the following commits:
      
      fa7f481 [Venkata Ramana Gollamudi] Fixed review comments, avoiding ls output scanning.
      6aa1ab7 [Venkata Ramana Gollamudi] Fix run-examples script error during multiple jars
      74de94ea
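      A minimal sketch of the fail-fast check described above (variable names are illustrative, not the script's actual ones); per the squashed commits, the real fix also avoids scanning ls output, which find sidesteps:
      ```
      # Fail fast if more than one examples assembly jar is present.
      num_jars=$(find "$EXAMPLES_DIR/target" -name 'spark-examples*.jar' | wc -l)
      if [ "$num_jars" -gt 1 ]; then
        echo "Found multiple Spark examples assembly jars in $EXAMPLES_DIR/target." 1>&2
        echo "Please remove all but one jar." 1>&2
        exit 1
      fi
      ```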
    • [SPARK-5088] Use spark-class for running executors directly · 4a4f9ccb
      Jongyoul Lee authored
      Author: Jongyoul Lee <jongyoul@gmail.com>
      
      Closes #3897 from jongyoul/SPARK-5088 and squashes the following commits:
      
      8232aa8 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Added a listenerBus for fixing test cases
      932289f [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Rebased from master
      613cb47 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Fixed code if spark.executor.uri doesn't have any value - Added test cases
      ff57bda [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Adjusted orders of import
      97e4bd4 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Changed command for using spark-class directly - Delete sbin/spark-executor and moved some codes into spark-class' case statement
      4a4f9ccb
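      Per the last squashed commit, sbin/spark-executor is deleted and Mesos launches the executor backend through spark-class. A sketch of the resulting invocation (the backend class name is taken from the Spark codebase of that era):
      ```
      # Instead of the removed sbin/spark-executor wrapper, invoke the Mesos
      # executor backend directly through the generic launcher.
      "$SPARK_HOME/bin/spark-class" org.apache.spark.executor.MesosExecutorBackend
      ```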
  4. Jan 16, 2015
  5. Jan 09, 2015
    • [SPARK-4990][Deploy]to find default properties file, search SPARK_CONF_DIR first · 8782eb99
      WangTaoTheTonic authored
      https://issues.apache.org/jira/browse/SPARK-4990
      
      Author: WangTaoTheTonic <barneystinson@aliyun.com>
      Author: WangTao <barneystinson@aliyun.com>
      
      Closes #3823 from WangTaoTheTonic/SPARK-4990 and squashes the following commits:
      
      133c43e [WangTao] Update spark-submit2.cmd
      b1ab402 [WangTao] Update spark-submit
      4cc7f34 [WangTaoTheTonic] rebase
      55300bc [WangTaoTheTonic] use export to make it global
      d8d3cb7 [WangTaoTheTonic] remove blank line
      07b9ebf [WangTaoTheTonic] check SPARK_CONF_DIR instead of checking properties file
      c5a85eb [WangTaoTheTonic] to find default properties file, search SPARK_CONF_DIR first
      8782eb99
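      A minimal sketch of the lookup order this change gives the launcher scripts, assuming the usual variable names:
      ```
      # Prefer SPARK_CONF_DIR when set; fall back to SPARK_HOME/conf.
      DEFAULT_CONF_DIR="${SPARK_CONF_DIR:-$SPARK_HOME/conf}"
      PROPERTIES_FILE="$DEFAULT_CONF_DIR/spark-defaults.conf"
      ```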
  6. Jan 08, 2015
    • [SPARK-4048] Enhance and extend hadoop-provided profile. · 48cecf67
      Marcelo Vanzin authored
      This change does a few things to make the hadoop-provided profile more useful:
      
      - Create new profiles for other libraries / services that might be provided by the infrastructure
      - Simplify and fix the poms so that the profiles are only activated while building assemblies.
      - Fix tests so that they're able to run when the profiles are activated
      - Add a new env variable to be used by distributions that use these profiles to provide the runtime
        classpath for Spark jobs and daemons.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #2982 from vanzin/SPARK-4048 and squashes the following commits:
      
      82eb688 [Marcelo Vanzin] Add a comment.
      eb228c0 [Marcelo Vanzin] Fix borked merge.
      4e38f4e [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
      9ef79a3 [Marcelo Vanzin] Alternative way to propagate test classpath to child processes.
      371ebee [Marcelo Vanzin] Review feedback.
      52f366d [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
      83099fc [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
      7377e7b [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
      322f882 [Marcelo Vanzin] Fix merge fail.
      f24e9e7 [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
      8b00b6a [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
      9640503 [Marcelo Vanzin] Cleanup child process log message.
      115fde5 [Marcelo Vanzin] Simplify a comment (and make it consistent with another pom).
      e3ab2da [Marcelo Vanzin] Fix hive-thriftserver profile.
      7820d58 [Marcelo Vanzin] Fix CliSuite with provided profiles.
      1be73d4 [Marcelo Vanzin] Restore flume-provided profile.
      d1399ed [Marcelo Vanzin] Restore jetty dependency.
      82a54b9 [Marcelo Vanzin] Remove unused profile.
      5c54a25 [Marcelo Vanzin] Fix HiveThriftServer2Suite with *-provided profiles.
      1fc4d0b [Marcelo Vanzin] Update dependencies for hive-thriftserver.
      f7b3bbe [Marcelo Vanzin] Add snappy to hadoop-provided list.
      9e4e001 [Marcelo Vanzin] Remove duplicate hive profile.
      d928d62 [Marcelo Vanzin] Redirect child stderr to parent's log.
      4d67469 [Marcelo Vanzin] Propagate SPARK_DIST_CLASSPATH on Yarn.
      417d90e [Marcelo Vanzin] Introduce "SPARK_DIST_CLASSPATH".
      2f95f0d [Marcelo Vanzin] Propagate classpath to child processes during testing.
      1adf91c [Marcelo Vanzin] Re-enable maven-install-plugin for a few projects.
      284dda6 [Marcelo Vanzin] Rework the "hadoop-provided" profile, add new ones.
      48cecf67
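      A usage sketch for the provided profiles and the new variable (417d90e introduces SPARK_DIST_CLASSPATH); the make-distribution.sh location is an assumption from Spark 1.x layout, and hadoop classpath is the standard way to obtain the Hadoop jars:
      ```
      # Build a distribution that leaves Hadoop out of the assembly, then point
      # Spark at the cluster's own Hadoop jars at runtime.
      ./make-distribution.sh -Phadoop-provided -Phive
      export SPARK_DIST_CLASSPATH="$(hadoop classpath)"
      ```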
    • [SPARK-5130][Deploy]Take yarn-cluster as cluster mode in spark-submit · 0760787d
      WangTaoTheTonic authored
      https://issues.apache.org/jira/browse/SPARK-5130
      
      Author: WangTaoTheTonic <barneystinson@aliyun.com>
      
      Closes #3929 from WangTaoTheTonic/SPARK-5130 and squashes the following commits:
      
      c490648 [WangTaoTheTonic] take yarn-cluster as cluster mode in spark-submit
      0760787d
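      After this change the two invocations below are treated equivalently (sketch; the application class and jar are placeholders):
      ```
      # "yarn-cluster" now implies cluster deploy mode:
      ./bin/spark-submit --master yarn-cluster --class my.Main my-app.jar
      ./bin/spark-submit --master yarn --deploy-mode cluster --class my.Main my-app.jar
      ```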
  7. Dec 19, 2014
  8. Dec 10, 2014
    • [SPARK-4793] [Deploy] ensure .jar at end of line · e230da18
      Daoyuan Wang authored
      Sometimes I switch between different versions and do not want to rebuild Spark, so I rename assembly.jar to .jar.bak, but it is still picked up by `compute-classpath.sh`.
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #3641 from adrian-wang/jar and squashes the following commits:
      
      45cbfd0 [Daoyuan Wang] ensure .jar at end of line
      e230da18
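      The fix is anchoring the pattern so renamed files such as assembly.jar.bak no longer match; a sketch:
      ```
      # The trailing "$" ensures the name ends in .jar, so "assembly.jar.bak"
      # is not picked up when building the classpath.
      ls "$assembly_folder" | grep "spark-assembly.*hadoop.*\.jar$"
      ```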
    • [SPARK-4161]Spark shell class path is not correctly set if... · 742e7093
      GuoQiang Li authored
      [SPARK-4161]Spark shell class path is not correctly set if "spark.driver.extraClassPath" is set in defaults.conf
      
      Author: GuoQiang Li <witgo@qq.com>
      
      Closes #3050 from witgo/SPARK-4161 and squashes the following commits:
      
      abb6fa4 [GuoQiang Li] move usejavacp opt to spark-shell
      89e39e7 [GuoQiang Li] review commit
      c2a6f04 [GuoQiang Li] Spark shell class path is not correctly set if "spark.driver.extraClassPath" is set in defaults.conf
      742e7093
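      Per the first squashed commit, the fix moves the usejavacp option into spark-shell itself; a sketch of that line, assuming the SPARK_SUBMIT_OPTS variable the launcher scripts use:
      ```
      # Force the Scala REPL to use the Java classpath, so classes from
      # spark.driver.extraClassPath are visible in the shell.
      export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dscala.usejavacp=true"
      ```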
  9. Dec 04, 2014
  10. Nov 30, 2014
    • [SPARK-4623] Add some error information if using spark-sql in yarn-cluster mode · aea7a997
      carlmartin authored
      When spark-sql is used in yarn-cluster mode, print an error message, just as the Spark shell does in yarn-cluster mode.
      
      Author: carlmartin <carlmartinmax@gmail.com>
      Author: huangzhaowei <carlmartinmax@gmail.com>
      
      Closes #3479 from SaintBacchus/sparkSqlShell and squashes the following commits:
      
      35829a9 [carlmartin] improve the description of comment
      e6c1eb7 [carlmartin] add a comment in bin/spark-sql to remind user who wants to change the class
      f1c5c8d [carlmartin] Merge branch 'master' into sparkSqlShell
      8e112c5 [huangzhaowei] singular form
      ec957bc [carlmartin] Add the some error infomation if using spark-sql in yarn-cluster mode
      7bcecc2 [carlmartin] Merge branch 'master' of https://github.com/apache/spark into codereview
      4fad75a [carlmartin] Add the Error infomation using spark-sql in yarn-cluster mode
      aea7a997
  11. Nov 18, 2014
    • [SPARK-4017] show progress bar in console · e34f38ff
      Davies Liu authored
      The progress bar will look like this:
      
      ![1___spark_job__85_250_finished__4_are_running___java_](https://cloud.githubusercontent.com/assets/40902/4854813/a02f44ac-6099-11e4-9060-7c73a73151d6.png)
      
      In the right corner, the numbers are: finished tasks, running tasks, total tasks.
      
      After the stage has finished, it will disappear.
      
      The progress bar is only shown if the logging level is WARN or higher (progress in the title is still shown regardless); it can be turned off via spark.driver.showConsoleProgress.
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #3029 from davies/progress and squashes the following commits:
      
      95336d5 [Davies Liu] Merge branch 'master' of github.com:apache/spark into progress
      fc49ac8 [Davies Liu] address commentse
      2e90f75 [Davies Liu] show multiple stages in same time
      0081bcc [Davies Liu] address comments
      38c42f1 [Davies Liu] fix tests
      ab87958 [Davies Liu] disable progress bar during tests
      30ac852 [Davies Liu] re-implement progress bar
      b3f34e5 [Davies Liu] Merge branch 'master' of github.com:apache/spark into progress
      6fd30ff [Davies Liu] show progress bar if no task finished in 500ms
      e4e7344 [Davies Liu] refactor
      e1f524d [Davies Liu] revert unnecessary change
      a60477c [Davies Liu] Merge branch 'master' of github.com:apache/spark into progress
      5cae3f2 [Davies Liu] fix style
      ea49fe0 [Davies Liu] address comments
      bc53d99 [Davies Liu] refactor
      e6bb189 [Davies Liu] fix logging in sparkshell
      7e7d4e7 [Davies Liu] address commments
      5df26bb [Davies Liu] fix style
      9e42208 [Davies Liu] show progress bar in console and title
      e34f38ff
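      A usage sketch for turning the bar off, using the property named in the message above:
      ```
      # Disable the console progress bar for one session.
      ./bin/spark-shell --conf spark.driver.showConsoleProgress=false
      ```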
  12. Nov 14, 2014
    • [SPARK-4415] [PySpark] JVM should exit after Python exit · 7fe08b43
      Davies Liu authored
      When the JVM is started from a Python process, it should exit once its stdin is closed.
      
      test: add spark.driver.memory in conf/spark-defaults.conf
      
      ```
      daviesdm:~/work/spark$ cat conf/spark-defaults.conf
      spark.driver.memory       8g
      daviesdm:~/work/spark$ bin/pyspark
      >>> quit
      daviesdm:~/work/spark$ jps
      4931 Jps
      286
      daviesdm:~/work/spark$ python wc.py
      943738
      0.719928026199
      daviesdm:~/work/spark$ jps
      286
      4990 Jps
      ```
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #3274 from davies/exit and squashes the following commits:
      
      df0e524 [Davies Liu] address comments
      ce8599c [Davies Liu] address comments
      050651f [Davies Liu] JVM should exit after Python exit
      7fe08b43
  13. Nov 11, 2014
    • Support cross building for Scala 2.11 · daaca14c
      Prashant Sharma authored
      Let's give this another go using a version of Hive that shades its JLine dependency.
      
      Author: Prashant Sharma <prashant.s@imaginea.com>
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #3159 from pwendell/scala-2.11-prashant and squashes the following commits:
      
      e93aa3e [Patrick Wendell] Restoring -Phive-thriftserver profile and cleaning up build script.
      f65d17d [Patrick Wendell] Fixing build issue due to merge conflict
      a8c41eb [Patrick Wendell] Reverting dev/run-tests back to master state.
      7a6eb18 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into scala-2.11-prashant
      583aa07 [Prashant Sharma] REVERT ME: removed hive thirftserver
      3680e58 [Prashant Sharma] Revert "REVERT ME: Temporarily removing some Cli tests."
      935fb47 [Prashant Sharma] Revert "Fixed by disabling a few tests temporarily."
      925e90f [Prashant Sharma] Fixed by disabling a few tests temporarily.
      2fffed3 [Prashant Sharma] Exclude groovy from sbt build, and also provide a way for such instances in future.
      8bd4e40 [Prashant Sharma] Switched to gmaven plus, it fixes random failures observer with its predecessor gmaven.
      5272ce5 [Prashant Sharma] SPARK_SCALA_VERSION related bugs.
      2121071 [Patrick Wendell] Migrating version detection to PySpark
      b1ed44d [Patrick Wendell] REVERT ME: Temporarily removing some Cli tests.
      1743a73 [Patrick Wendell] Removing decimal test that doesn't work with Scala 2.11
      f5cad4e [Patrick Wendell] Add Scala 2.11 docs
      210d7e1 [Patrick Wendell] Revert "Testing new Hive version with shaded jline"
      48518ce [Patrick Wendell] Remove association of Hive and Thriftserver profiles.
      e9d0a06 [Patrick Wendell] Revert "Enable thritfserver for Scala 2.10 only"
      67ec364 [Patrick Wendell] Guard building of thriftserver around Scala 2.10 check
      8502c23 [Patrick Wendell] Enable thritfserver for Scala 2.10 only
      e22b104 [Patrick Wendell] Small fix in pom file
      ec402ab [Patrick Wendell] Various fixes
      0be5a9d [Patrick Wendell] Testing new Hive version with shaded jline
      4eaec65 [Prashant Sharma] Changed scripts to ignore target.
      5167bea [Prashant Sharma] small correction
      a4fcac6 [Prashant Sharma] Run against scala 2.11 on jenkins.
      80285f4 [Prashant Sharma] MAven equivalent of setting spark.executor.extraClasspath during tests.
      034b369 [Prashant Sharma] Setting test jars on executor classpath during tests from sbt.
      d4874cb [Prashant Sharma] Fixed Python Runner suite. null check should be first case in scala 2.11.
      6f50f13 [Prashant Sharma] Fixed build after rebasing with master. We should use ${scala.binary.version} instead of just 2.10
      e56ca9d [Prashant Sharma] Print an error if build for 2.10 and 2.11 is spotted.
      937c0b8 [Prashant Sharma] SCALA_VERSION -> SPARK_SCALA_VERSION
      cb059b0 [Prashant Sharma] Code review
      0476e5e [Prashant Sharma] Scala 2.11 support with repl and all build changes.
      daaca14c
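      A build sketch for the cross-building this enables; the helper script and profile flag are taken from the Spark build docs of this period, so treat both as assumptions for this exact commit:
      ```
      # Switch the poms to Scala 2.11, then build against it.
      ./dev/change-version-to-2.11.sh
      mvn -Dscala-2.11 -DskipTests clean package
      ```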
  14. Oct 31, 2014
    • [SPARK-3870] EOL character enforcement · 55ab7770
      Kousuke Saruta authored
      We have shell scripts and Windows batch files, so we should enforce proper EOL characters.
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #2726 from sarutak/eol-enforcement and squashes the following commits:
      
      9748c3f [Kousuke Saruta] Fixed make.bat
      252de89 [Kousuke Saruta] Removed extra characters from make.bat
      5b81c00 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into eol-enforcement
      8633ed2 [Kousuke Saruta] merge branch 'master' of git://git.apache.org/spark into eol-enforcement
      5d630d8 [Kousuke Saruta] Merged
      ba10797 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into eol-enforcement
      7407515 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into eol-enforcement
      772fd4e [Kousuke Saruta] Normized EOL character in make.bat and compute-classpath.cmd
      ac7f873 [Kousuke Saruta] Added an entry for .gitattributes to .rat-excludes
      1570e77 [Kousuke Saruta] Added .gitattributes
      55ab7770
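      The enforcement lives in the new .gitattributes (1570e77); a sketch of entries along these lines, written from the shell:
      ```
      # Batch files get CRLF, shell scripts get LF, regardless of committer platform.
      printf '%s\n' '*.bat text eol=crlf' '*.cmd text eol=crlf' '*.sh text eol=lf' >> .gitattributes
      ```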
  15. Oct 30, 2014
    • [SPARK-1720][SPARK-1719] use LD_LIBRARY_PATH instead of -Djava.library.path · cd739bd7
      GuoQiang Li authored
      - [X] Standalone
      - [X] YARN
      - [X] Mesos
      - [X]  Mac OS X
      - [X] Linux
      - [ ]  Windows
      
      This is an alternative implementation of #1031
      
      Author: GuoQiang Li <witgo@qq.com>
      
      Closes #2711 from witgo/SPARK-1719 and squashes the following commits:
      
      c7b26f6 [GuoQiang Li] review commits
      4488e41 [GuoQiang Li] Refactoring CommandUtils
      a444094 [GuoQiang Li] review commits
      40c0b4a [GuoQiang Li] Add buildLocalCommand method
      c1a0ddd [GuoQiang Li] fix comments
      156ce88 [GuoQiang Li] review commit
      38aa377 [GuoQiang Li] Refactor CommandUtils.scala
      4269e00 [GuoQiang Li] Refactor SparkSubmitDriverBootstrapper.scala
      7a1d634 [GuoQiang Li] use LD_LIBRARY_PATH instead of -Djava.library.path
      cd739bd7
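      A sketch of the substitution (the path is illustrative):
      ```
      # Before: the launcher passed -Djava.library.path=/opt/hadoop/lib/native to the JVM.
      # After: the same directory is exported, so child processes inherit it too.
      export LD_LIBRARY_PATH="/opt/hadoop/lib/native:$LD_LIBRARY_PATH"
      ```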
  16. Oct 28, 2014
    • [SPARK-4065] Add check for IPython on Windows · 2f254dac
      Michael Griffiths authored
      This change employs logic similar to the bash launcher (pyspark) to check
      if IPYTHON=1, and if so launch ipython with the options in IPYTHON_OPTS.
      The fix assumes that ipython is available on the system Path and can
      be invoked with a plain "ipython" command.
      
      Author: Michael Griffiths <msjgriffiths@gmail.com>
      
      Closes #2910 from msjgriffiths/pyspark-windows and squashes the following commits:
      
      ef34678 [Michael Griffiths] Change build message to comply with [SPARK-3775]
      361e3d8 [Michael Griffiths] [SPARK-4065] Add check for IPython on Windows
      9ce72d1 [Michael Griffiths] [SPARK-4065] Add check for IPython on Windows
      2f254dac
  17. Oct 14, 2014
    • [SPARK-3943] Some scripts bin\*.cmd pollutes environment variables in Windows · 66af8e25
      Masayoshi TSUZUKI authored
      Modified the scripts so they no longer pollute environment variables:
      the main logic moves from `XXX.cmd` into `XXX2.cmd`, and `XXX.cmd` invokes `XXX2.cmd` via the cmd command.
      `pyspark.cmd` and `spark-class.cmd` already work this way, but `spark-shell.cmd`, `spark-submit.cmd` and `/python/docs/make.bat` did not.
      
      Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
      
      Closes #2797 from tsudukim/feature/SPARK-3943 and squashes the following commits:
      
      b397a7d [Masayoshi TSUZUKI] [SPARK-3943] Some scripts bin\*.cmd pollutes environment variables in Windows
      66af8e25
    • [SPARK-3869] ./bin/spark-class miss Java version with _JAVA_OPTIONS set · 7b4f39f6
      cocoatomo authored
      When the _JAVA_OPTIONS environment variable is set, the command "java -version" outputs a message like "Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8".
      ./bin/spark-class reads the Java version from the first line of the "java -version" output, so it misreads the Java version when _JAVA_OPTIONS is set.
      
      Author: cocoatomo <cocoatomo77@gmail.com>
      
      Closes #2725 from cocoatomo/issues/3869-mistake-java-version and squashes the following commits:
      
      f894ebd [cocoatomo] [SPARK-3869] ./bin/spark-class miss Java version with _JAVA_OPTIONS set
      7b4f39f6
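      A sketch of a version check that survives _JAVA_OPTIONS, matching the line that actually carries the version rather than line one:
      ```
      # With _JAVA_OPTIONS set, line 1 is "Picked up _JAVA_OPTIONS: ...", so
      # select the line containing "version" before extracting the quoted number.
      java_version=$(java -version 2>&1 | grep -i version | awk -F '"' '{print $2}')
      echo "$java_version"
      ```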
  18. Oct 09, 2014
    • [SPARK-3772] Allow `ipython` to be used by Pyspark workers; IPython support improvements: · 4e9b551a
      Josh Rosen authored
      This pull request addresses a few issues related to PySpark's IPython support:
      
      - Fix the remaining uses of the '-u' flag, which IPython doesn't support (see SPARK-3772).
      - Change PYSPARK_PYTHON_OPTS to PYSPARK_DRIVER_PYTHON_OPTS, so that the old name is reserved in case we ever want to allow the worker Python options to be customized (this variable was introduced in #2554 and hasn't landed in a release yet, so this doesn't break any compatibility).
      - Introduce a PYSPARK_DRIVER_PYTHON option that allows the driver to use `ipython` while the workers use a different Python version.
      - Attempt to use Python 2.7 by default if PYSPARK_PYTHON is not specified.
      - Retain the old semantics for IPYTHON=1 and IPYTHON_OPTS (to avoid breaking existing example programs).
      
      There are more details in a block comment in `bin/pyspark`.
      
      Author: Josh Rosen <joshrosen@apache.org>
      
      Closes #2651 from JoshRosen/SPARK-3772 and squashes the following commits:
      
      7b8eb86 [Josh Rosen] More changes to PySpark python executable configuration:
      c4f5778 [Josh Rosen] [SPARK-3772] Allow ipython to be used by Pyspark workers; IPython fixes:
      4e9b551a
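      A usage sketch combining the variables described above:
      ```
      # IPython drives the shell; workers keep a plain interpreter.
      PYSPARK_DRIVER_PYTHON=ipython \
      PYSPARK_DRIVER_PYTHON_OPTS="notebook" \
      PYSPARK_PYTHON=python2.7 \
      ./bin/pyspark
      ```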
  19. Oct 07, 2014
    • [SPARK-3808] PySpark fails to start in Windows · 12e2551e
      Masayoshi TSUZUKI authored
      Fixed a syntax error in the *.cmd scripts.
      
      Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
      
      Closes #2669 from tsudukim/feature/SPARK-3808 and squashes the following commits:
      
      7f804e6 [Masayoshi TSUZUKI] [SPARK-3808] PySpark fails to start in Windows
      12e2551e
  20. Oct 03, 2014
    • [SPARK-3774] typo comment in bin/utils.sh · e5566e05
      Masayoshi TSUZUKI authored
      Fixed a typo in a comment in bin/utils.sh.
      
      Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
      
      Closes #2639 from tsudukim/feature/SPARK-3774 and squashes the following commits:
      
      707b779 [Masayoshi TSUZUKI] [SPARK-3774] typo comment in bin/utils.sh
      e5566e05
    • [SPARK-3775] Not suitable error message in spark-shell.cmd · 358d7ffd
      Masayoshi TSUZUKI authored
      Reworded some unsuitable error messages in bin\*.cmd.
      
      Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
      
      Closes #2640 from tsudukim/feature/SPARK-3775 and squashes the following commits:
      
      3458afb [Masayoshi TSUZUKI] [SPARK-3775] Not suitable error message in spark-shell.cmd
      358d7ffd
    • SPARK-2058: Overriding SPARK_HOME/conf with SPARK_CONF_DIR · f0811f92
      EugenCepoi authored
      Update of PR #997.
      
      With this PR, setting SPARK_CONF_DIR overrides SPARK_HOME/conf (not only spark-defaults.conf and spark-env).
      
      Author: EugenCepoi <cepoi.eugen@gmail.com>
      
      Closes #2481 from EugenCepoi/SPARK-2058 and squashes the following commits:
      
      0bb32c2 [EugenCepoi] use orElse orNull and fixing trailing percent in compute-classpath.cmd
      77f35d7 [EugenCepoi] SPARK-2058: Overriding SPARK_HOME/conf with SPARK_CONF_DIR
      f0811f92
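      A usage sketch; with this PR, everything under the directory (not just spark-defaults.conf and spark-env.sh) takes precedence over SPARK_HOME/conf:
      ```
      # Point Spark at an external configuration directory.
      export SPARK_CONF_DIR=/etc/spark/conf
      ./bin/spark-shell
      ```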
  21. Oct 02, 2014
    • [SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset · 5b4a5b1a
      cocoatomo authored
      ### Problem
      
      The section "Using the shell" in Spark Programming Guide (https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell) says that we can run pyspark REPL through IPython.
      But a folloing command does not run IPython but a default Python executable.
      
      ```
      $ IPYTHON=1 ./bin/pyspark
      Python 2.7.8 (default, Jul  2 2014, 10:14:46)
      ...
      ```
      
      As of commit b235e013, the spark/bin/pyspark script decides which executable and options to use in the following way.
      
      1. if PYSPARK_PYTHON unset
         * → defaulting to "python"
      2. if IPYTHON_OPTS set
         * → set IPYTHON "1"
      3. if a Python script is passed to ./bin/pyspark → run it with ./bin/spark-submit
         * out of this issue's scope
      4. if IPYTHON set as "1"
         * → execute $PYSPARK_PYTHON (default: ipython) with arguments $IPYTHON_OPTS
         * otherwise execute $PYSPARK_PYTHON
      
      Therefore, when PYSPARK_PYTHON is unset, python is executed even though IPYTHON is "1".
      In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have no effect on deciding which command to use.
      
      PYSPARK_PYTHON | IPYTHON_OPTS | IPYTHON | resulting command | expected command
      ---- | ---- | ----- | ----- | -----
      (unset → defaults to python) | (unset) | (unset) | python | (same)
      (unset → defaults to python) | (unset) | 1 | python | ipython
      (unset → defaults to python) | an_option | (unset → set to 1) | python an_option | ipython an_option
      (unset → defaults to python) | an_option | 1 | python an_option | ipython an_option
      ipython | (unset) | (unset) | ipython | (same)
      ipython | (unset) | 1 | ipython | (same)
      ipython | an_option | (unset → set to 1) | ipython an_option | (same)
      ipython | an_option | 1 | ipython an_option | (same)
      
      ### Suggestion
      
      The pyspark script should first determine whether the user wants to run IPython or another executable.
      
      1. if IPYTHON_OPTS set
         * set IPYTHON "1"
      2. if IPYTHON is set to "1"
         * PYSPARK_PYTHON defaults to "ipython" if not set
      3. PYSPARK_PYTHON defaults to "python" if not set
      
      See the pull request for the detailed modifications.
      
      Author: cocoatomo <cocoatomo77@gmail.com>
      
      Closes #2554 from cocoatomo/issues/cannot-run-ipython-without-options and squashes the following commits:
      
      d2a9b06 [cocoatomo] [SPARK-3706][PySpark] Use PYTHONUNBUFFERED environment variable instead of -u option
      264114c [cocoatomo] [SPARK-3706][PySpark] Remove the sentence about deprecated environment variables
      42e02d5 [cocoatomo] [SPARK-3706][PySpark] Replace environment variables used to customize execution of PySpark REPL
      10d56fb [cocoatomo] [SPARK-3706][PySpark] Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
      5b4a5b1a
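      A sketch of the suggested resolution order in shell form (a simplification of what the PR actually does):
      ```
      # 1. IPYTHON_OPTS implies IPYTHON=1.
      if [ -n "$IPYTHON_OPTS" ]; then
        IPYTHON=1
      fi
      # 2./3. Pick the default interpreter from IPYTHON, respecting any override.
      if [ "$IPYTHON" = "1" ]; then
        PYSPARK_PYTHON="${PYSPARK_PYTHON:-ipython}"
      else
        PYSPARK_PYTHON="${PYSPARK_PYTHON:-python}"
      fi
      exec "$PYSPARK_PYTHON" $IPYTHON_OPTS
      ```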
  22. Oct 01, 2014
  23. Sep 18, 2014
  24. Sep 15, 2014
  25. Sep 12, 2014
    • [SPARK-3217] Add Guava to classpath when SPARK_PREPEND_CLASSES is set. · af258382
      Marcelo Vanzin authored
      When that option is used, the compiled classes from the build directory
      are prepended to the classpath. Now that we avoid packaging Guava, that
      means we have classes referencing the original Guava location in the app's
      classpath, so errors happen.
      
      For that case, add Guava manually to the classpath.
      
      Note: if Spark is compiled with "-Phadoop-provided", it's tricky to
      make things work with SPARK_PREPEND_CLASSES, because you need to add
      the Hadoop classpath using SPARK_CLASSPATH and that means the older
      Hadoop Guava overrides the newer one Spark needs. So someone using
      SPARK_PREPEND_CLASSES needs to remember to not use that profile.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #2141 from vanzin/SPARK-3217 and squashes the following commits:
      
      b967324 [Marcelo Vanzin] [SPARK-3217] Add Guava to classpath when SPARK_PREPEND_CLASSES is set.
      af258382
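      A usage sketch for the developer workflow this supports:
      ```
      # Run against freshly compiled classes from the build tree; with this
      # change Guava is added to the classpath as well.
      export SPARK_PREPEND_CLASSES=true
      ./bin/spark-shell
      ```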
  26. Sep 08, 2014
    • SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within. · e16a8e7d
      Prashant Sharma authored
      ...
      
      Tested! TBH, it isn't a great idea to have a directory with spaces in it, because Emacs doesn't like it, then Hadoop doesn't like it, and so on...
      
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #2229 from ScrapCodes/SPARK-3337/quoting-shell-scripts and squashes the following commits:
      
      d4ad660 [Prashant Sharma] SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.
      e16a8e7d
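      The pattern, sketched on an illustrative launcher line: every expansion is double-quoted so paths containing spaces are not split into separate words:
      ```
      SPARK_HOME="/opt/spark dist"              # a path with a space in it
      RUNNER="${JAVA_HOME:-/usr}/bin/java"
      CLASSPATH="$SPARK_HOME/conf:$SPARK_HOME/lib/*"
      exec "$RUNNER" -cp "$CLASSPATH" "$@"      # quoted, so the space survives
      ```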
  27. Sep 05, 2014
  28. Aug 28, 2014
    • [HOTFIX] Wait for EOF only for the PySpark shell · dafe3434
      Andrew Or authored
      In `SparkSubmitDriverBootstrapper`, we wait for the parent process to send us an `EOF` before finishing the application. This is applicable for the PySpark shell because we terminate the application the same way. However if we run a python application, for instance, the JVM actually never exits unless it receives a manual EOF from the user. This is causing a few tests to timeout.
      
      We only need to do this for the PySpark shell because Spark submit runs as a python subprocess only in this case. Thus, the normal Spark shell doesn't need to go through this case even though it is also a REPL.
      
      Thanks davies for reporting this.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2170 from andrewor14/bootstrap-hotfix and squashes the following commits:
      
      42963f5 [Andrew Or] Do not wait for EOF unless this is the pyspark shell
      dafe3434
  29. Aug 27, 2014
    • SPARK-3265 Allow using custom ipython executable with pyspark · f38fab97
      Rob O'Dwyer authored
      Although you can make pyspark use ipython with `IPYTHON=1`, and also change the python executable with `PYSPARK_PYTHON=...`, you can't use both at the same time because it hardcodes the default ipython script.
      
      This change makes it use the `PYSPARK_PYTHON` variable if present, falling back to the default python, similarly to how the default python executable is handled.
      
      So you can use a custom ipython like so:
      `PYSPARK_PYTHON=./anaconda/bin/ipython IPYTHON_OPTS="notebook" pyspark`
      
      Author: Rob O'Dwyer <odwyerrob@gmail.com>
      
      Closes #2167 from robbles/patch-1 and squashes the following commits:
      
      d98e8a9 [Rob O'Dwyer] Allow using custom ipython executable with pyspark
      f38fab97
    • [SPARK-3167] Handle special driver configs in Windows · 7557c4cf
      Andrew Or authored
      This is an effort to bring the Windows scripts up to speed after recent splashing changes in #1845.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2129 from andrewor14/windows-config and squashes the following commits:
      
      881a8f0 [Andrew Or] Add reference to Windows taskkill
      92e6047 [Andrew Or] Update a few comments (minor)
      22b1acd [Andrew Or] Fix style again (minor)
      afcffea [Andrew Or] Fix style (minor)
      72004c2 [Andrew Or] Actually respect --driver-java-options
      803218b [Andrew Or] Actually respect SPARK_*_CLASSPATH
      eeb34a0 [Andrew Or] Update outdated comment (minor)
      35caecc [Andrew Or] In Windows, actually kill Java processes on exit
      f97daa2 [Andrew Or] Fix Windows spark shell stdin issue
      83ebe60 [Andrew Or] Parse special driver configs in Windows (broken)
      7557c4cf
  30. Aug 26, 2014
    • [SPARK-2964] [SQL] Remove duplicated code from spark-sql and start-thriftserver.sh · faeb9c0e
      Cheng Lian authored
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #1886 from sarutak/SPARK-2964 and squashes the following commits:
      
      8ef8751 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-2964
      26e7c95 [Kousuke Saruta] Revert "Shorten timeout to more reasonable value"
      ffb68fa [Kousuke Saruta] Modified spark-sql and start-thriftserver.sh to use bin/utils.sh
      8c6f658 [Kousuke Saruta] Merge branch 'spark-3026' of https://github.com/liancheng/spark into SPARK-2964
      81b43a8 [Cheng Lian] Shorten timeout to more reasonable value
      a89e66d [Cheng Lian] Fixed command line options quotation in scripts
      9c894d3 [Cheng Lian] Fixed bin/spark-sql -S option typo
      be4736b [Cheng Lian] Report better error message when running JDBC/CLI without hive-thriftserver profile enabled
      faeb9c0e
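      Per ffb68fa, both scripts now source bin/utils.sh instead of carrying their own copies of the option parsing; a sketch (the helper name is taken from bin/utils.sh of that era, so treat it as an assumption):
      ```
      # Shared option handling instead of duplicated parsing logic: splits the
      # arguments into SUBMISSION_OPTS and APPLICATION_OPTS for spark-submit.
      . "$SPARK_HOME/bin/utils.sh"
      gatherSparkSubmitOpts "$@"
      ```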
    • [SPARK-3225]Typo in script · 2ffd3290
      WangTao authored
      use_conf_dir => user_conf_dir in load-spark-env.sh.
      
      Author: WangTao <barneystinson@aliyun.com>
      
      Closes #1926 from WangTaoTheTonic/TypoInScript and squashes the following commits:
      
      0c104ad [WangTao] Typo in script
      2ffd3290