  1. Jul 11, 2016
    • Reynold Xin's avatar
      [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT · ffcb6e05
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      After SPARK-16476 (committed earlier today as #14128), we can finally bump the version number.
      
      ## How was this patch tested?
      N/A
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #14130 from rxin/SPARK-16477.
      ffcb6e05
  2. Jul 01, 2016
  3. Jun 29, 2016
    • jerryshao's avatar
      [SPARK-15990][YARN] Add rolling log aggregation support for Spark on yarn · 272a2f78
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      YARN has supported rolling log aggregation since 2.6. Previously, logs were only aggregated to HDFS after the application finished, which is quite painful for long-running applications like Spark Streaming or the thriftserver. Out-of-disk problems can also occur when a log file grows too large. So this proposes adding rolling log aggregation support for Spark on YARN.
      
      One limitation is that log4j must be configured to use a file appender; Spark itself uses a console appender by default, under which the log file will not be recreated once it is removed after aggregation. But lots of production users will have changed their log4j configuration from the default, so this should not be a big problem.
      
      ## How was this patch tested?
      
      Manually verified with Hadoop 2.7.1.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #13712 from jerryshao/SPARK-15990.
      272a2f78
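The include/exclude selection described above can be sketched as follows. This is a minimal illustration, not the patch's actual code, and the two config key names are assumptions for the example:

```java
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: aggregate a rolled log file only when it matches the
// include pattern and not the exclude pattern (config key names assumed).
public class RolledLogPolicy {
    static final String INCLUDE_KEY = "spark.yarn.rolledLog.includePattern"; // assumed name
    static final String EXCLUDE_KEY = "spark.yarn.rolledLog.excludePattern"; // assumed name

    /** True when fileName matches the include pattern and not the exclude pattern. */
    public static boolean shouldAggregate(Map<String, String> conf, String fileName) {
        String include = conf.get(INCLUDE_KEY);
        String exclude = conf.get(EXCLUDE_KEY);
        boolean included = include == null || Pattern.matches(include, fileName);
        boolean excluded = exclude != null && Pattern.matches(exclude, fileName);
        return included && !excluded;
    }
}
```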
  4. Jun 24, 2016
    • peng.zhang's avatar
      [SPARK-16125][YARN] Fix not test yarn cluster mode correctly in YarnClusterSuite · f4fd7432
      peng.zhang authored
      ## What changes were proposed in this pull request?
      
      Since SPARK-13220 (Deprecate "yarn-client" and "yarn-cluster"), YarnClusterSuite has not tested "yarn cluster" mode correctly.
      This pull request fixes it.
      
      ## How was this patch tested?
      Unit test
      
      
      Author: peng.zhang <peng.zhang@xiaomi.com>
      
      Closes #13836 from renozhang/SPARK-16125-test-yarn-cluster-mode.
      f4fd7432
  5. Jun 23, 2016
    • Ryan Blue's avatar
      [SPARK-13723][YARN] Change behavior of --num-executors with dynamic allocation. · 738f134b
      Ryan Blue authored
      ## What changes were proposed in this pull request?
      
      This changes the behavior of --num-executors and spark.executor.instances when using dynamic allocation. Instead of turning dynamic allocation off, it uses the value for the initial number of executors.
      
      This change was discussed on [SPARK-13723](https://issues.apache.org/jira/browse/SPARK-13723). I highly recommend making it while we can still change the behavior for 2.0.0. In practice, the 1.x behavior causes unexpected results for users (it is not clear that it disables dynamic allocation) and wastes cluster resources because users rarely notice the log message.
      
      ## How was this patch tested?
      
      This patch updates tests and adds a test for Utils.getDynamicAllocationInitialExecutors.
      
      Author: Ryan Blue <blue@apache.org>
      
      Closes #13338 from rdblue/SPARK-13723-num-executors-with-dynamic-allocation.
      738f134b
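The new semantics can be approximated as follows: with dynamic allocation enabled, `--num-executors` (`spark.executor.instances`) now feeds into the *initial* executor count instead of disabling dynamic allocation. The real logic lives in `Utils.getDynamicAllocationInitialExecutors`; this sketch simply takes the max of the relevant settings:

```java
// Illustrative approximation of Utils.getDynamicAllocationInitialExecutors:
// the initial count is the largest of the dynamic-allocation minimum, the
// dynamic-allocation initial setting, and spark.executor.instances.
public class InitialExecutors {
    public static int getInitialExecutors(int minExecutors,
                                          int initialExecutors,
                                          int numExecutors) {
        return Math.max(Math.max(minExecutors, initialExecutors), numExecutors);
    }
}
```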
    • Ryan Blue's avatar
      [SPARK-15725][YARN] Ensure ApplicationMaster sleeps for the min interval. · a410814c
      Ryan Blue authored
      ## What changes were proposed in this pull request?
      
      Update `ApplicationMaster` to sleep for at least the minimum allocation interval before calling `allocateResources`. This prevents the overloading of the `YarnAllocator` that was happening because the thread is triggered when an executor is killed and its connections die. In YARN, this prevents the app from overloading the allocator and becoming unstable.
      
      ## How was this patch tested?
      
      Tested that this allows an app to recover instead of hanging. It is still possible for the YarnAllocator to be overwhelmed by requests, but this prevents the issue for the most common cause.
      
      Author: Ryan Blue <blue@apache.org>
      
      Closes #13482 from rdblue/SPARK-15725-am-sleep-work-around.
      a410814c
    • Peter Ableda's avatar
      [SPARK-16138] Try to cancel executor requests only if we have at least 1 · 5bf2889b
      Peter Ableda authored
      ## What changes were proposed in this pull request?
      Added an additional check to the if statement so that executor requests are canceled only when at least one request is pending.
      
      ## How was this patch tested?
      I built and deployed to an internal cluster to observe the behaviour. After the change the invalid logging is gone:
      
      ```
      16/06/22 08:46:36 INFO yarn.YarnAllocator: Driver requested a total number of 1 executor(s).
      16/06/22 08:46:36 INFO yarn.YarnAllocator: Canceling requests for 1 executor container(s) to have a new desired total 1 executors.
      16/06/22 08:46:36 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s).
      16/06/22 08:47:36 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1.
      ```
      
      Author: Peter Ableda <abledapeter@gmail.com>
      
      Closes #13850 from peterableda/patch-2.
      5bf2889b
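The guard being added can be sketched like this. The method name and counters are hypothetical; the real change is a condition inside `YarnAllocator`:

```java
// Hypothetical sketch of the added guard: compute how many outstanding
// container requests to cancel, and cancel (and log) only when there is at
// least one pending request — avoiding the misleading
// "Canceling requests for N executor container(s)" message seen above.
public class CancelGuard {
    /** Returns the number of pending requests to cancel; 0 means do (and log) nothing. */
    public static int requestsToCancel(int pendingRequests, int runningExecutors, int targetTotal) {
        int excess = pendingRequests + runningExecutors - targetTotal;
        // Additional check: only cancel if we actually have pending requests.
        if (excess > 0 && pendingRequests > 0) {
            return Math.min(excess, pendingRequests);
        }
        return 0;
    }
}
```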
  6. Jun 21, 2016
  7. Jun 15, 2016
  8. Jun 14, 2016
    • Sean Owen's avatar
      [MINOR] Clean up several build warnings, mostly due to internal use of old accumulators · 6151d264
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Another PR to clean up recent build warnings. This particularly cleans up several instances of the old accumulator API usage in tests that are straightforward to update. I think this qualifies as "minor".
      
      ## How was this patch tested?
      
      Jenkins
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #13642 from srowen/BuildWarnings.
      6151d264
  9. Jun 13, 2016
  10. Jun 09, 2016
  11. Jun 03, 2016
    • Subroto Sanyal's avatar
      [SPARK-15754][YARN] Not letting the credentials containing hdfs delegation... · 61d729ab
      Subroto Sanyal authored
      [SPARK-15754][YARN] Not letting the credentials containing hdfs delegation tokens to be added in current user credential.
      
      ## What changes were proposed in this pull request?
      The credentials are no longer added to the credentials of UserGroupInformation.getCurrentUser(). Further, if the client can log in using a keytab, the updateDelegationToken thread is not started on the client.
      
      ## How was this patch tested?
      ran dev/run-tests
      
      Author: Subroto Sanyal <ssanyal@datameer.com>
      
      Closes #13499 from subrotosanyal/SPARK-15754-save-ugi-from-changing.
      61d729ab
  12. May 29, 2016
    • Sean Owen's avatar
      [MINOR] Resolve a number of miscellaneous build warnings · ce1572d1
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      This change resolves a number of build warnings that have accumulated, before 2.x. It does not address a large number of deprecation warnings, especially related to the Accumulator API. That will happen separately.
      
      ## How was this patch tested?
      
      Jenkins
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #13377 from srowen/BuildWarnings.
      ce1572d1
  13. May 26, 2016
  14. May 24, 2016
    • Marcelo Vanzin's avatar
      [SPARK-15405][YARN] Remove unnecessary upload of config archive. · a313a5ae
      Marcelo Vanzin authored
      We only need one copy of it. The client code that was uploading the
      second copy just needs to be modified to update the metadata in the
      cache, so that the AM knows where to find the configuration.
      
      Tested by running app on YARN and verifying in the logs only one archive
      is uploaded.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #13232 from vanzin/SPARK-15405.
      a313a5ae
  15. May 20, 2016
    • tedyu's avatar
      [SPARK-15273] YarnSparkHadoopUtil#getOutOfMemoryErrorArgument should respect... · 06c9f520
      tedyu authored
      [SPARK-15273] YarnSparkHadoopUtil#getOutOfMemoryErrorArgument should respect OnOutOfMemoryError parameter given by user
      
      ## What changes were proposed in this pull request?
      
      As Nirav reported in this thread:
      http://search-hadoop.com/m/q3RTtdF3yNLMd7u
      
      YarnSparkHadoopUtil#getOutOfMemoryErrorArgument previously specified 'kill %p' unconditionally.
      We should respect the parameter given by user.
      
      ## How was this patch tested?
      
      Existing tests
      
      Author: tedyu <yuzhihong@gmail.com>
      
      Closes #13057 from tedyu/master.
      06c9f520
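The fix described above can be sketched as follows. This is not the actual `YarnSparkHadoopUtil` code, just an illustration of the precedence rule:

```java
// Sketch: only add the default 'kill %p' handler when the user has not
// supplied their own -XX:OnOutOfMemoryError option in their JVM options.
public class OomArgument {
    public static String oomArgument(String userJavaOpts) {
        if (userJavaOpts != null && userJavaOpts.contains("-XX:OnOutOfMemoryError")) {
            return ""; // respect the handler given by the user
        }
        return "-XX:OnOutOfMemoryError='kill %p'"; // previous unconditional default
    }
}
```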
  16. May 17, 2016
  17. May 13, 2016
  18. May 12, 2016
    • bomeng's avatar
      [SPARK-14897][SQL] upgrade to jetty 9.2.16 · 81bf8708
      bomeng authored
      ## What changes were proposed in this pull request?
      
      Since Jetty 8 is EOL (end of life) and has critical security issue [http://www.securityweek.com/critical-vulnerability-found-jetty-web-server], I think upgrading to 9 is necessary. I am using latest 9.2 since 9.3 requires Java 8+.
      
      `javax.servlet` and `derby` were also upgraded since Jetty 9.2 needs the corresponding versions.
      
      ## How was this patch tested?
      
      Manual test and current test cases should cover it.
      
      Author: bomeng <bmeng@us.ibm.com>
      
      Closes #12916 from bomeng/SPARK-14897.
      81bf8708
  19. May 10, 2016
    • Marcelo Vanzin's avatar
      [SPARK-11249][LAUNCHER] Throw error if app resource is not provided. · 0b9cae42
      Marcelo Vanzin authored
      Without this, the code would build an invalid spark-submit command line,
      and a more cryptic error would be presented to the user. Also, expose
      a constant that allows users to set a dummy resource in cases where
      they don't need an actual resource file; for backwards compatibility,
      that uses the same "spark-internal" resource that Spark itself uses.
      
      Tested via unit tests, run-example, spark-shell, and running the
      thrift server with mixed spark and hive command line arguments.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #12909 from vanzin/SPARK-11249.
      0b9cae42
    • jerryshao's avatar
      [SPARK-14963][YARN] Using recoveryPath if NM recovery is enabled · aab99d31
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      Since Hadoop 2.5, the YARN NodeManager supports NM recovery, which uses a recovery path for auxiliary services such as spark_shuffle and mapreduce_shuffle. This change uses that path instead of the NM local dir when NM recovery is enabled.
      
      ## How was this patch tested?
      
      Unit test + local test.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #12994 from jerryshao/SPARK-14963.
      aab99d31
  20. May 05, 2016
    • Jacek Laskowski's avatar
      [SPARK-15152][DOC][MINOR] Scaladoc and Code style Improvements · bbb77734
      Jacek Laskowski authored
      ## What changes were proposed in this pull request?
      
      Minor doc and code style fixes
      
      ## How was this patch tested?
      
      local build
      
      Author: Jacek Laskowski <jacek@japila.pl>
      
      Closes #12928 from jaceklaskowski/SPARK-15152.
      bbb77734
    • mcheah's avatar
      [SPARK-12154] Upgrade to Jersey 2 · b7fdc23c
      mcheah authored
      ## What changes were proposed in this pull request?
      
      Replace com.sun.jersey with org.glassfish.jersey. Changes to the Spark Web UI code were required to compile. The changes were relatively standard Jersey migration things.
      
      ## How was this patch tested?
      
      I did a manual test for the standalone web APIs. Although I didn't test the functionality of the security filter itself, the code that changed non-trivially is how we actually register the filter. I attached a debugger to the Spark master and verified that the SecurityFilter code is indeed invoked upon hitting /api/v1/applications.
      
      Author: mcheah <mcheah@palantir.com>
      
      Closes #12715 from mccheah/feature/upgrade-jersey.
      b7fdc23c
  21. May 04, 2016
    • Dhruve Ashar's avatar
      [SPARK-4224][CORE][YARN] Support group acls · a4564774
      Dhruve Ashar authored
      ## What changes were proposed in this pull request?
      Currently only a list of users can be specified for view and modify acls. This change enables a group of admins/devs/users to be provisioned for viewing and modifying Spark jobs.
      
      **Changes Proposed in the fix**
      Three new corresponding config entries have been added where the user can specify the groups to be given access.
      
      ```
      spark.admin.acls.groups
      spark.modify.acls.groups
      spark.ui.view.acls.groups
      ```
      
      New config entries were added because specifying the users and groups explicitly is a better and cleaner way compared to specifying them in the existing config entry using a delimiter.
      
      A generic trait has been introduced to provide the user-to-group mapping, which makes it pluggable to support a variety of mapping protocols - similar to the one used in Hadoop. A default Unix-shell-based implementation has been provided.
      Custom user to group mapping protocol can be specified and configured by the entry ```spark.user.groups.mapping```
      
      **How the patch was Tested**
      We ran different spark jobs setting the config entries in combinations of admin, modify and ui acls. For modify acls we tried killing the job stages from the ui and using yarn commands. For view acls we tried accessing the UI tabs and the logs. Headless accounts were used to launch these jobs and different users tried to modify and view the jobs to ensure that the groups mapping applied correctly.
      
      Additional Unit tests have been added without modifying the existing ones. These test for different ways of setting the acls through configuration and/or API and validate the expected behavior.
      
      Author: Dhruve Ashar <dhruveashar@gmail.com>
      
      Closes #12760 from dhruve/impr/SPARK-4224.
      a4564774
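The check described above can be sketched as follows. The pluggable mapping is modeled as a plain function here; in the patch it is the trait configured via `spark.user.groups.mapping`, with a Unix-shell-based default:

```java
import java.util.Set;
import java.util.function.Function;

// Illustrative sketch of group-based acl checking with a pluggable
// user-to-group mapping (the lambda passed in stands in for the trait
// configured through spark.user.groups.mapping).
public class GroupAcls {
    public static boolean hasAccess(String user,
                                    Set<String> aclUsers,
                                    Set<String> aclGroups,
                                    Function<String, Set<String>> userToGroups) {
        if (aclUsers.contains(user)) {
            return true; // pre-existing per-user acls still apply
        }
        // New behavior: also grant access when any of the user's groups is listed.
        for (String group : userToGroups.apply(user)) {
            if (aclGroups.contains(group)) {
                return true;
            }
        }
        return false;
    }
}
```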
  22. Apr 28, 2016
  23. Apr 27, 2016
    • Hemant Bhanawat's avatar
      [SPARK-14729][SCHEDULER] Refactored YARN scheduler creation code to use newly... · e4d439c8
      Hemant Bhanawat authored
      [SPARK-14729][SCHEDULER] Refactored YARN scheduler creation code to use newly added ExternalClusterManager
      
      ## What changes were proposed in this pull request?
      With the addition of the ExternalClusterManager (ECM) interface in PR #11723, any cluster manager can now be integrated with Spark. It was suggested in the ExternalClusterManager PR that one of the existing cluster managers should start using the new interface to ensure that the API is correct. Ideally, all the existing cluster managers should eventually use the ECM interface, but as a first step YARN will now use it. This PR refactors the YARN code from the SparkContext.createTaskScheduler function into a YarnClusterManager that implements the ECM interface.
      
      ## How was this patch tested?
      Since this is refactoring, no new tests have been added. Existing tests have been run. Basic manual testing with YARN was done too.
      
      Author: Hemant Bhanawat <hemant@snappydata.io>
      
      Closes #12641 from hbhanawat/yarnClusterMgr.
      e4d439c8
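The pluggable lookup that the refactoring enables can be sketched like this. The interface and method names below are simplified stand-ins, not the real ECM API:

```java
import java.util.List;
import java.util.Optional;

// Illustrative sketch of the ExternalClusterManager idea (names simplified):
// each manager declares which master URLs it can handle, and the context picks
// the first matching one instead of hard-coding YARN in createTaskScheduler.
interface ClusterManager {
    boolean canCreate(String masterUrl);
}

public class ClusterManagerRegistry {
    public static Optional<ClusterManager> find(List<ClusterManager> managers, String masterUrl) {
        return managers.stream().filter(m -> m.canCreate(masterUrl)).findFirst();
    }
}
```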
  24. Apr 26, 2016
    • Azeem Jiva's avatar
      [SPARK-14756][CORE] Use parseLong instead of valueOf · de6e6334
      Azeem Jiva authored
      ## What changes were proposed in this pull request?
      
      Use Long.parseLong, which returns a primitive.
      Using a series of append() calls avoids the creation of an extra StringBuilder.
      
      ## How was this patch tested?
      
      Unit tests
      
      Author: Azeem Jiva <azeemj@gmail.com>
      
      Closes #12520 from javawithjiva/minor.
      de6e6334
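Both points of the change, illustrated in a minimal sketch (not the patched Spark code):

```java
// Long.parseLong returns a primitive long, avoiding the boxing that
// Long.valueOf performs; chained append() calls reuse a single StringBuilder
// instead of creating intermediate String objects.
public class ParseLongDemo {
    public static long parse(String s) {
        return Long.parseLong(s); // primitive long; Long.valueOf(s) would box
    }

    public static String join(String a, String b, String c) {
        // One builder, a series of appends, no intermediate concatenations.
        return new StringBuilder().append(a).append(b).append(c).toString();
    }
}
```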
  25. Apr 25, 2016
  26. Apr 22, 2016
    • Reynold Xin's avatar
      [SPARK-10001] Consolidate Signaling and SignalLogger. · c089c6f4
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      This is a follow-up to #12557, with the following changes:
      
      1. Fixes some of the style issues.
       2. Merges Signaling and SignalLogger into a new class called SignalUtils. It was pretty confusing to have Signaling and Signal in one file, and it was also confusing to have one class named Signaling and another named SignalLogger.
      3. Made logging registration idempotent.
      
      ## How was this patch tested?
      N/A.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #12605 from rxin/SPARK-10001.
      c089c6f4
    • Joan's avatar
      [SPARK-6429] Implement hashCode and equals together · bf95b8da
      Joan authored
      ## What changes were proposed in this pull request?
      
      Implement some `hashCode` and `equals` methods together in order to enable the scalastyle rule.
      This is a first batch; I will continue to implement them, but I wanted to know your thoughts.
      
      Author: Joan <joan@goyeau.com>
      
      Closes #12157 from joan38/SPARK-6429-HashCode-Equals.
      bf95b8da
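The pattern the scalastyle rule enforces, shown in a small sketch (the class below is hypothetical, not from the PR): whenever `equals` is overridden, `hashCode` is overridden consistently, so equal objects always hash to the same value.

```java
import java.util.Objects;

// Hypothetical example class: equals and hashCode implemented together,
// derived from the same fields, so the equals/hashCode contract holds.
public final class BlockId {
    private final String name;
    private final int index;

    public BlockId(String name, int index) {
        this.name = name;
        this.index = index;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BlockId)) return false;
        BlockId other = (BlockId) o;
        return index == other.index && Objects.equals(name, other.name);
    }

    @Override public int hashCode() {
        return Objects.hash(name, index); // must agree with equals
    }
}
```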
  27. Apr 20, 2016
    • Marcelo Vanzin's avatar
      [SPARK-14602][YARN] Use SparkConf to propagate the list of cached files. · f47dbf27
      Marcelo Vanzin authored
      This change avoids using the environment to pass this information, since
      with many jars it's easy to hit limits on certain OSes. Instead, it encodes
      the information into the Spark configuration propagated to the AM.
      
      The first problem that needed to be solved is a chicken & egg issue: the
      config file is distributed using the cache, and it needs to contain information
      about the files that are being distributed. To solve that, the code now treats
      the config archive especially, and uses slightly different code to distribute
      it, so that only its cache path needs to be saved to the config file.
      
      The second problem is that the extra information would show up in the Web UI,
      which made the environment tab even more noisy than it already is when lots
      of jars are listed. This is solved by two changes: the list of cached files
      is now read only once in the AM, and propagated down to the ExecutorRunnable
      code (which actually sends the list to the NMs when starting containers). The
      second change is to unset those config entries after the list is read, so that
      the SparkContext never sees them.
      
      Tested with both client and cluster mode by running "run-example SparkPi". This
      uploads a whole lot of files when run from a build dir (instead of a distribution,
      where the list is cleaned up), and I verified that the configs do not show
      up in the UI.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #12487 from vanzin/SPARK-14602.
      f47dbf27
  28. Apr 19, 2016
    • Lianhui Wang's avatar
      [SPARK-14705][YARN] support Multiple FileSystem for YARN STAGING DIR · 4514aebd
      Lianhui Wang authored
      ## What changes were proposed in this pull request?
      SPARK-13063 made the Spark YARN staging dir configurable, but it only supports the default FileSystem. With many clusters, a different FileSystem may be needed for each cluster.
      
      ## How was this patch tested?
      I have tested it successfully with following commands:
      MASTER=yarn-client ./bin/spark-shell --conf spark.yarn.stagingDir=hdfs:namenode2/temp
      $SPARK_HOME/bin/spark-submit --conf spark.yarn.stagingDir=hdfs:namenode2/temp
      
      cc tgravescs vanzin andrewor14
      
      Author: Lianhui Wang <lianhuiwang09@gmail.com>
      
      Closes #12473 from lianhuiwang/SPARK-14705.
      4514aebd
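The resolution order implied by the change can be sketched as follows. This is an illustration only; the real code resolves Hadoop `Path`/`FileSystem` objects rather than plain strings:

```java
// Hypothetical sketch: an explicitly configured spark.yarn.stagingDir URI
// (possibly on a different FileSystem than the default) wins; otherwise fall
// back to the user's home directory on the default FileSystem.
public class StagingDir {
    public static String resolve(String configuredStagingDir, String defaultFsHomeDir) {
        return (configuredStagingDir != null && !configuredStagingDir.isEmpty())
            ? configuredStagingDir
            : defaultFsHomeDir;
    }
}
```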
  29. Apr 18, 2016
    • jerryshao's avatar
      [SPARK-14423][YARN] Avoid same name files added to distributed cache again · d6fb485d
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      In the current implementation of assembly-free Spark deployment, jars under `assembly/target/scala-xxx/jars` are uploaded to the distributed cache by default. There is a chance that these jars' names will conflict with the names of jars specified in `--jars`, which will cause an exception when starting the application:
      
      ```
      client token: N/A
      	 diagnostics: Application application_1459907402325_0004 failed 2 times due to AM Container for appattempt_1459907402325_0004_000002 exited with  exitCode: -1000
      For more detailed output, check application tracking page:http://hw12100.local:8088/proxy/application_1459907402325_0004/Then, click on links to logs of each attempt.
      Diagnostics: Resource hdfs://localhost:8020/user/sshao/.sparkStaging/application_1459907402325_0004/avro-mapred-1.7.7-hadoop2.jar changed on src filesystem (expected 1459909780508, was 1459909782590
      java.io.IOException: Resource hdfs://localhost:8020/user/sshao/.sparkStaging/application_1459907402325_0004/avro-mapred-1.7.7-hadoop2.jar changed on src filesystem (expected 1459909780508, was 1459909782590
      	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
      	at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
      	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
      	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
      	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      ```
      
      So this change checks file names to avoid uploading files with the same name again.
      
      ## How was this patch tested?
      
      Unit tests and a manual integration test were done locally.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #12203 from jerryshao/SPARK-14423.
      d6fb485d
  30. Apr 12, 2016
  31. Apr 07, 2016
    • Dhruve Ashar's avatar
      [SPARK-12384] Enables spark-clients to set the min(-Xms) and max(*.memory config) j… · 033d8081
      Dhruve Ashar authored
      ## What changes were proposed in this pull request?
      
      Currently Spark clients are started with the same memory setting for -Xms and -Xmx, leading to reserving unnecessarily high amounts of memory.
      This behavior is changed so that the clients can now specify an initial heap size using extraJavaOptions in the config for the driver, executor and AM individually.
      Note that only -Xms can be provided through this config option; if the client wants to set the max size (-Xmx), this has to be done via the *.memory configuration knobs which are currently supported.
      
      ## How was this patch tested?
      
      Monitored executor and YARN logs in debug mode to verify the commands through which they are launched in client and cluster mode. The driver memory was verified locally using jps -v. Setting the -Xmx parameter in extraJavaOptions raises an exception with an informative message.
      
      Author: Dhruve Ashar <dhruveashar@gmail.com>
      
      Closes #12115 from dhruve/impr/SPARK-12384.
      033d8081
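The validation described above can be sketched like this. This is not Spark's actual code, just an illustration of the rule: -Xms may pass through extraJavaOptions, but -Xmx must come from the *.memory settings:

```java
// Sketch of the described check: reject a user-supplied max heap (-Xmx) in
// extraJavaOptions with an informative error, while allowing -Xms through.
public class JavaOptsCheck {
    public static void validate(String extraJavaOptions) {
        if (extraJavaOptions != null && extraJavaOptions.contains("-Xmx")) {
            throw new IllegalArgumentException(
                "Not allowed to specify max heap (-Xmx) through extraJavaOptions; "
                + "use the corresponding *.memory configuration instead.");
        }
    }
}
```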
  32. Apr 06, 2016
    • Marcelo Vanzin's avatar
      [SPARK-14134][CORE] Change the package name used for shading classes. · 21d5ca12
      Marcelo Vanzin authored
      The current package name uses a dash, which is a little weird but seemed
      to work. That is, until a new test tried to mock a class that references
      one of those shaded types, and then things started failing.
      
      Most changes are just noise to fix the logging configs.
      
      For reference, SPARK-8815 also raised this issue, although at the time it
      did not cause any issues in Spark, so it was not addressed.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #11941 from vanzin/SPARK-14134.
      21d5ca12