  1. Jul 17, 2015
    • [SPARK-8851] [YARN] In Client mode, make sure the client logs in and updates tokens · c043a3e9
      Hari Shreedharan authored
      On the client side, the flow is SparkSubmit -> SparkContext -> yarn/Client. Since the yarn client only gets a cloned config and the staging dir is set there, it is not really possible to do re-logins in the SparkContext. So do the initial login in SparkSubmit, and do re-logins as we do now in the AM; in this specific context the Client behaves like an executor and reads the credentials file to update the tokens. This way, even if the streaming context is started up from a checkpoint, it is fine since we have already logged in from SparkSubmit itself.
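The executor-style update described above boils down to noticing that the AM has written a newer credentials file and reloading tokens from it. A minimal sketch of that check (names are hypothetical, not the actual Spark API):

```scala
// Illustrative sketch only: detect a newer credentials file so tokens can be reloaded,
// the same way an executor-side token updater would. Names here are hypothetical.
import java.nio.file.{Files, Path, Paths}

def shouldReload(credentialsFile: Path, lastReloadedMtime: Long): Boolean =
  Files.exists(credentialsFile) &&
    Files.getLastModifiedTime(credentialsFile).toMillis > lastReloadedMtime
```

In the real change the reload itself reads delegation tokens out of the file; this only shows the polling condition.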
      
      Author: Hari Shreedharan <hshreedharan@apache.org>
      
      Closes #7394 from harishreedharan/yarn-client-login and squashes the following commits:
      
      9a2166f [Hari Shreedharan] make it possible to use command line args and config parameters together.
      de08f57 [Hari Shreedharan] Fix import order.
      5c4fa63 [Hari Shreedharan] Add a comment explaining what is being done in YarnClientSchedulerBackend.
      c872caa [Hari Shreedharan] Fix typo in log message.
      2c80540 [Hari Shreedharan] Move token renewal to YarnClientSchedulerBackend.
      0c48ac2 [Hari Shreedharan] Remove direct use of ExecutorDelegationTokenUpdater in Client.
      26f8bfa [Hari Shreedharan] [SPARK-8851][YARN] In Client mode, make sure the client logs in and updates tokens.
      58b1969 [Hari Shreedharan] Simple attempt 1.
      c043a3e9
  2. Jul 16, 2015
  3. Jul 14, 2015
    • [SPARK-8962] Add Scalastyle rule to ban direct use of Class.forName; fix existing uses · 11e5c372
      Josh Rosen authored
      This pull request adds a Scalastyle regex rule which fails the style check if `Class.forName` is used directly.  `Class.forName` always loads classes from the default / system classloader, but in a majority of cases, we should be using Spark's own `Utils.classForName` instead, which tries to load classes from the current thread's context classloader and falls back to the classloader which loaded Spark when the context classloader is not defined.
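The classloader fallback described above can be sketched in a few lines. This is a simplified illustration of the lookup order, not the exact signature of Spark's `Utils.classForName`:

```scala
// Simplified sketch of a context-classloader-first lookup, in the spirit of
// Utils.classForName: prefer the thread's context classloader, fall back otherwise.
def classForName(className: String): Class[_] = {
  val loader = Option(Thread.currentThread.getContextClassLoader)
    .getOrElse(ClassLoader.getSystemClassLoader) // real Spark falls back to its own loader
  Class.forName(className, true, loader)
}
```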
      
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #7350 from JoshRosen/ban-Class.forName and squashes the following commits:
      
      e3e96f7 [Josh Rosen] Merge remote-tracking branch 'origin/master' into ban-Class.forName
      c0b7885 [Josh Rosen] Hopefully fix the last two cases
      d707ba7 [Josh Rosen] Fix uses of Class.forName that I missed in my first cleanup pass
      046470d [Josh Rosen] Merge remote-tracking branch 'origin/master' into ban-Class.forName
      62882ee [Josh Rosen] Fix uses of Class.forName or add exclusion.
      d9abade [Josh Rosen] Add stylechecker rule to ban uses of Class.forName
      11e5c372
  4. Jul 10, 2015
    • [SPARK-7977] [BUILD] Disallowing println · e14b545d
      Jonathan Alter authored
      Author: Jonathan Alter <jonalter@users.noreply.github.com>
      
      Closes #7093 from jonalter/SPARK-7977 and squashes the following commits:
      
      ccd44cc [Jonathan Alter] Changed println to log in ThreadingSuite
      7fcac3e [Jonathan Alter] Reverting to println in ThreadingSuite
      10724b6 [Jonathan Alter] Changing some printlns to logs in tests
      eeec1e7 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      0b1dcb4 [Jonathan Alter] More println cleanup
      aedaf80 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      925fd98 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      0c16fa3 [Jonathan Alter] Replacing some printlns with logs
      45c7e05 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      5c8e283 [Jonathan Alter] Allowing println in audit-release examples
      5b50da1 [Jonathan Alter] Allowing printlns in example files
      ca4b477 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      83ab635 [Jonathan Alter] Fixing new printlns
      54b131f [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      1cd8a81 [Jonathan Alter] Removing some unnecessary comments and printlns
      b837c3a [Jonathan Alter] Disallowing println
      e14b545d
  5. Jul 08, 2015
  6. Jul 02, 2015
    • [SPARK-8687] [YARN] Fix bug: Executor can't fetch the new set configuration in yarn-client · 1b0c8e61
      huangzhaowei authored
      Spark initializes the properties in `CoarseGrainedSchedulerBackend.start`:
      ```scala
          // TODO (prashant) send conf instead of properties
          driverEndpoint = rpcEnv.setupEndpoint(
            CoarseGrainedSchedulerBackend.ENDPOINT_NAME, new DriverEndpoint(rpcEnv, properties))
      ```
      The YARN logic then sets some configuration afterwards, but those updates never make it into this `properties` snapshot, so the `Executor` never sees them.
      
      [Jira](https://issues.apache.org/jira/browse/SPARK-8687)
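The snapshot problem is easy to reproduce in plain Scala (this is an illustration of the bug's shape, not actual Spark code):

```scala
// Illustration: an endpoint constructed with a snapshot of properties never sees
// keys that are set afterwards, which is the bug described above.
import scala.collection.mutable.ArrayBuffer

class DriverEndpointLike(props: Seq[(String, String)]) {
  val conf: Map[String, String] = props.toMap // frozen at construction time
}

val live = ArrayBuffer("spark.app.name" -> "demo")
val endpoint = new DriverEndpointLike(live.toSeq) // snapshot taken here
live += ("spark.driver.appUIAddress" -> "http://host:4040") // set later by YARN logic
// endpoint.conf still lacks the key added after construction
```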
      
      Author: huangzhaowei <carlmartinmax@gmail.com>
      
      Closes #7066 from SaintBacchus/SPARK-8687 and squashes the following commits:
      
      1de4f48 [huangzhaowei] Ensure all necessary properties have already been set before startup ExecutorLaucher
      1b0c8e61
    • [SPARK-3071] Increase default driver memory · 3697232b
      Ilya Ganelin authored
      I've updated default values in comments, documentation, and in the command line builder to be 1g based on comments in the JIRA. I've also updated most usages to point at a single variable defined in the Utils.scala and JavaUtils.java files. This wasn't possible in all cases (R, shell scripts etc.) but usage in most code is now pointing at the same place.
      
      Please let me know if I've missed anything.
      
      Will the spark-shell use the value within the command line builder during instantiation?
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      
      Closes #7132 from ilganeli/SPARK-3071 and squashes the following commits:
      
      4074164 [Ilya Ganelin] String fix
      271610b [Ilya Ganelin] Merge branch 'SPARK-3071' of github.com:ilganeli/spark into SPARK-3071
      273b6e9 [Ilya Ganelin] Test fix
      fd67721 [Ilya Ganelin] Update JavaUtils.java
      26cc177 [Ilya Ganelin] test fix
      e5db35d [Ilya Ganelin] Fixed test failure
      39732a1 [Ilya Ganelin] merge fix
      a6f7deb [Ilya Ganelin] Created default value for DRIVER MEM in Utils that's now used in almost all locations instead of setting manually in each
      09ad698 [Ilya Ganelin] Update SubmitRestProtocolSuite.scala
      19b6f25 [Ilya Ganelin] Missed one doc update
      2698a3d [Ilya Ganelin] Updated default value for driver memory
      3697232b
    • [SPARK-8688] [YARN] Bug fix: disable the cache fs to gain the HDFS connection. · 646366b5
      huangzhaowei authored
      If `fs.hdfs.impl.disable.cache` is `false` (the default), `FileSystem` will return the cached `DFSClient`, which uses the old token.
      [AMDelegationTokenRenewer](https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala#L196)
      ```scala
          val credentials = UserGroupInformation.getCurrentUser.getCredentials
          credentials.writeTokenStorageFile(tempTokenPath, discachedConfiguration)
      ```
      Although the `credentials` hold the new token, the cached client still uses the old one. So it's better to set `fs.hdfs.impl.disable.cache` to `true` to avoid token expiration.
      
      [Jira](https://issues.apache.org/jira/browse/SPARK-8688)
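The caching pitfall can be shown in plain Scala (this models the shape of Hadoop's `FileSystem` cache, not the actual Hadoop API):

```scala
// Illustration of the caching pitfall: a cache keyed by scheme hands back the first
// client it built, so a later lookup still carries the old token unless caching is off.
import scala.collection.mutable

case class Client(token: String)
val cache = mutable.Map.empty[String, Client]

def getClient(scheme: String, token: String, disableCache: Boolean): Client =
  if (disableCache) Client(token)                        // always build a fresh client
  else cache.getOrElseUpdate(scheme, Client(token))      // reuse whatever was built first

val first = getClient("hdfs", "token-1", disableCache = false)
val stale = getClient("hdfs", "token-2", disableCache = false) // still carries token-1
val fresh = getClient("hdfs", "token-2", disableCache = true)  // new client, new token
```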
      
      Author: huangzhaowei <carlmartinmax@gmail.com>
      
      Closes #7069 from SaintBacchus/SPARK-8688 and squashes the following commits:
      
      f94cd0b [huangzhaowei] modify function parameter
      8fb9eb9 [huangzhaowei] explicit  the comment
      0cd55c9 [huangzhaowei] Rename function name to be an accurate one
      cf776a1 [huangzhaowei] [SPARK-8688][YARN]Bug fix: disable the cache fs to gain the HDFS connection.
      646366b5
    • [SPARK-8754] [YARN] YarnClientSchedulerBackend doesn't stop gracefully in failure conditions · 792fcd80
      Devaraj K authored
      In YarnClientSchedulerBackend.stop(), added a check for monitorThread.
      
      Author: Devaraj K <devaraj@apache.org>
      
      Closes #7153 from devaraj-kavali/master and squashes the following commits:
      
      66be9ad [Devaraj K] https://issues.apache.org/jira/browse/SPARK-8754 YarnClientSchedulerBackend doesn't stop gracefully in failure conditions
      792fcd80
  7. Jun 28, 2015
    • [SPARK-8683] [BUILD] Depend on mockito-core instead of mockito-all · f5100451
      Josh Rosen authored
      Spark's tests currently depend on `mockito-all`, which bundles Hamcrest and Objenesis classes. Instead, it should depend on `mockito-core`, which declares those libraries as Maven dependencies. This is necessary in order to fix a dependency conflict that leads to a NoSuchMethodError when using certain Hamcrest matchers.
      
      See https://github.com/mockito/mockito/wiki/Declaring-mockito-dependency for more details.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #7061 from JoshRosen/mockito-core-instead-of-all and squashes the following commits:
      
      70eccbe [Josh Rosen] Depend on mockito-core instead of mockito-all.
      f5100451
  8. Jun 26, 2015
    • [SPARK-8302] Support heterogeneous cluster install paths on YARN. · 37bf76a2
      Marcelo Vanzin authored
      Some users have Hadoop installations on different paths across
      their cluster. Currently, that makes it hard to set up some
      configuration in Spark since that requires hardcoding paths to
      jar files or native libraries, which wouldn't work on such a cluster.
      
      This change introduces a couple of YARN-specific configurations
      that instruct the backend to replace certain paths when launching
      remote processes. That way, if the configuration says the Spark
      jar is in "/spark/spark.jar", and also says that "/spark" should be
      replaced with "{{SPARK_INSTALL_DIR}}", YARN will start containers
      in the NMs with "{{SPARK_INSTALL_DIR}}/spark.jar" as the location
      of the jar.
      
      Coupled with YARN's environment whitelist (which allows certain
      env variables to be exposed to containers), this allows users to
      support such heterogeneous environments, as long as a single
      replacement is enough. (Otherwise, this feature would need to be
      extended to support multiple path replacements.)
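The single-prefix substitution described above is straightforward; a sketch (the replacement pair would come from the new YARN-specific configs, whose exact names are in the docs added by this change):

```scala
// Sketch of the path substitution: replace a local prefix with an env-var placeholder
// that YARN expands on each NodeManager, so mixed install paths still resolve.
def replacePrefix(path: String, from: String, to: String): String =
  if (path.startsWith(from)) to + path.stripPrefix(from) else path

val remote = replacePrefix("/spark/spark.jar", "/spark", "{{SPARK_INSTALL_DIR}}")
// remote == "{{SPARK_INSTALL_DIR}}/spark.jar"
```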
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #6752 from vanzin/SPARK-8302 and squashes the following commits:
      
      4bff8d4 [Marcelo Vanzin] Add docs, rename configs.
      0aa2a02 [Marcelo Vanzin] Only do replacement for paths that need it.
      2e9cc9d [Marcelo Vanzin] Style.
      a5e1f68 [Marcelo Vanzin] [SPARK-8302] Support heterogeneous cluster install paths on YARN.
      37bf76a2
  9. Jun 19, 2015
    • [SPARK-8387] [FOLLOWUP] [WEBUI] Update driver log URL to show only 4096 bytes · 54557f35
      Carson Wang authored
      This follows up #6834, updating the driver log URL as well for consistency.
      
      Author: Carson Wang <carson.wang@intel.com>
      
      Closes #6878 from carsonwang/logUrl and squashes the following commits:
      
      13be948 [Carson Wang] update log URL in YarnClusterSuite
      a0004f4 [Carson Wang] Update driver log URL to show only 4096 bytes
      54557f35
  10. Jun 16, 2015
  11. Jun 10, 2015
    • [SPARK-8273] Driver hangs up when yarn shutdown in client mode · 5014d0ed
      WangTaoTheTonic authored
      In client mode, if YARN is shut down while a Spark application is running, the application hangs after several retries (default: 30) because the exception thrown by YarnClientImpl cannot be caught at a higher level. We should exit in that case so the user is aware of it.
      
      The exception we want to catch is [here](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java#L122), and I tried to fix it by referring to [MR](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientServiceDelegate.java#L320).
      
      Author: WangTaoTheTonic <wangtao111@huawei.com>
      
      Closes #6717 from WangTaoTheTonic/SPARK-8273 and squashes the following commits:
      
      28752d6 [WangTaoTheTonic] catch the throwed exception
      5014d0ed
    • [SPARK-5479] [YARN] Handle --py-files correctly in YARN. · 38112905
      Marcelo Vanzin authored
      The bug description is a little misleading: the actual issue is that
      .py files are not handled correctly when distributed by YARN. They're
      added to "spark.submit.pyFiles", which, when processed by context.py,
      explicitly whitelists certain extensions (see PACKAGE_EXTENSIONS),
      and that does not include .py files.
      
      On top of that, archives were not handled at all! They made it to the
      driver's python path, but never made it to executors, since the mechanism
      used to propagate their location (spark.submit.pyFiles) only works on
      the driver side.
      
      So, instead, ignore "spark.submit.pyFiles" and just build PYTHONPATH
      correctly for both driver and executors. Individual .py files are
      placed in a subdirectory of the container's local dir in the cluster,
      which is then added to the python path. Archives are added directly.
      
      The change, as a side effect, ends up solving the symptom described
      in the bug. The issue was not that the files were not being distributed,
      but that they were never made visible to the python application
      running under Spark.
      
      Also included is a proper unit test for running python on YARN, which
      broke in several different ways with the previous code.
      
      A short walkthrough of the changes:
      - SparkSubmit does not try to be smart about how YARN handles python
        files anymore. It just passes down the configs to the YARN client
        code.
      - The YARN client distributes python files and archives differently,
        placing the files in a subdirectory.
      - The YARN client now sets PYTHONPATH for the processes it launches;
        to properly handle different locations, it uses YARN's support for
        embedding env variables, so to avoid YARN expanding those at the
        wrong time, SparkConf is now propagated to the AM using a conf file
        instead of command line options.
      - Because the Client initialization code is a maze of implicit
        dependencies, some code needed to be moved around to make sure
        all needed state was available when the code ran.
      - The pyspark tests in YarnClusterSuite now actually distribute and try
        to use both a python file and an archive containing a different python
        module. Also added a yarn-client tests for completeness.
      - I cleaned up some of the code around distributing files to YARN, to
        avoid adding more copied & pasted code to handle the new files being
        distributed.
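The PYTHONPATH assembly described in the walkthrough can be sketched as follows. The subdirectory name and archive names below are assumptions for illustration; the `{{PWD}}` placeholder is YARN's env-var embedding, expanded inside each container:

```scala
// Hypothetical sketch: individual .py files land in a subdirectory of the container's
// local dir, archives are added directly, and {{PWD}} is left for YARN to expand.
val pyFilesDir = "__pyfiles__"                 // assumed subdirectory name
val archives   = Seq("pyspark.zip", "lib.zip") // assumed localized archives
val entries    = s"{{PWD}}/$pyFilesDir" +: archives.map(a => s"{{PWD}}/$a")
val pythonPath = entries.mkString(":")
// "{{PWD}}/__pyfiles__:{{PWD}}/pyspark.zip:{{PWD}}/lib.zip"
```

Because the placeholder must survive until container launch, the description notes that SparkConf is shipped to the AM via a conf file rather than command-line options, which would be expanded too early.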
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #6360 from vanzin/SPARK-5479 and squashes the following commits:
      
      bcaf7e6 [Marcelo Vanzin] Feedback.
      c47501f [Marcelo Vanzin] Fix yarn-client mode.
      46b1d0c [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
      c743778 [Marcelo Vanzin] Only pyspark cares about python archives.
      c8e5a82 [Marcelo Vanzin] Actually run pyspark in client mode.
      705571d [Marcelo Vanzin] Move some code to the YARN module.
      1dd4d0c [Marcelo Vanzin] Review feedback.
      71ee736 [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
      220358b [Marcelo Vanzin] Scalastyle.
      cdbb990 [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
      7fe3cd4 [Marcelo Vanzin] No need to distribute primary file to executors.
      09045f1 [Marcelo Vanzin] Style.
      943cbf4 [Marcelo Vanzin] [SPARK-5479] [yarn] Handle --py-files correctly in YARN.
      38112905
  12. Jun 08, 2015
    • [SPARK-7705] [YARN] Cleanup of .sparkStaging directory fails if application is killed · eacd4a92
      linweizhong authored
      As I have tested, if we cancel or kill the app, the final status may be undefined, killed, or succeeded, so clean up the staging directory when the AppMaster exits with any final application status.
      
      Author: linweizhong <linweizhong@huawei.com>
      
      Closes #6409 from Sephiroth-Lin/SPARK-7705 and squashes the following commits:
      
      3a5a0a5 [linweizhong] Update
      83dc274 [linweizhong] Update
      923d44d [linweizhong] Update
      0dd7c2d [linweizhong] Update
      b76a102 [linweizhong] Update code style
      7846b69 [linweizhong] Update
      bd6cf0d [linweizhong] Refactor
      aed9f18 [linweizhong] Clean up stagingDir when launch app on yarn
      95595c3 [linweizhong] Cleanup of .sparkStaging directory when AppMaster exit at any final application status
      eacd4a92
  13. Jun 06, 2015
    • [SPARK-8136] [YARN] Fix flakiness in YarnClusterSuite. · ed2cc3ee
      Hari Shreedharan authored
      Instead of actually downloading the logs, just verify that the logs link is actually
      a URL and is in the expected format.
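A shape-only check of that kind might look like the following (the predicate here is an assumption for illustration; the actual test also checks the expected URL format):

```scala
// Sketch: verify the logs link parses as an http(s) URL instead of downloading it,
// which avoids flakiness from fetching logs in the test.
def looksLikeLogUrl(link: String): Boolean =
  scala.util.Try(java.net.URI.create(link)).toOption
    .exists(u => u.getScheme == "http" || u.getScheme == "https")
```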
      
      Author: Hari Shreedharan <hshreedharan@apache.org>
      
      Closes #6680 from harishreedharan/simplify-am-log-tests and squashes the following commits:
      
      3183aeb [Hari Shreedharan] Remove check for hostname which can fail on machines with several hostnames. Removed some unused imports.
      50d69a7 [Hari Shreedharan] [SPARK-8136][YARN] Fix flakiness in YarnClusterSuite.
      ed2cc3ee
  14. Jun 03, 2015
    • [SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout · 1d8669f1
      zsxwing authored
      Some places forget to call `assert` to check the return value of `AsynchronousListenerBus.waitUntilEmpty`. Instead of adding `assert` in these places, I think it's better to make `AsynchronousListenerBus.waitUntilEmpty` throw `TimeoutException`.
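The changed contract can be sketched generically: instead of returning a boolean the caller may forget to assert, the wait throws on timeout (this is a simplified stand-in, not the actual AsynchronousListenerBus code):

```scala
// Minimal sketch of the new contract: poll until the bus is empty, or throw
// TimeoutException, so forgetting to check the result can no longer hide a failure.
import java.util.concurrent.TimeoutException

def waitUntilEmpty(isEmpty: () => Boolean, timeoutMillis: Long): Unit = {
  val deadline = System.currentTimeMillis + timeoutMillis
  while (!isEmpty()) {
    if (System.currentTimeMillis > deadline)
      throw new TimeoutException("listener bus queue still non-empty")
    Thread.sleep(10)
  }
}
```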
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #6550 from zsxwing/SPARK-8001 and squashes the following commits:
      
      607674a [zsxwing] Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout
      1d8669f1
    • [SPARK-8059] [YARN] Wake up allocation thread when new requests arrive. · aa40c442
      Marcelo Vanzin authored
      This should help reduce latency for new executor allocations.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #6600 from vanzin/SPARK-8059 and squashes the following commits:
      
      8387a3a [Marcelo Vanzin] [SPARK-8059] [yarn] Wake up allocation thread when new requests arrive.
      aa40c442
    • [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0 · 2c4d550e
      Patrick Wendell authored
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #6328 from pwendell/spark-1.5-update and squashes the following commits:
      
      2f42d02 [Patrick Wendell] A few more excludes
      4bebcf0 [Patrick Wendell] Update to RC4
      61aaf46 [Patrick Wendell] Using new release candidate
      55f1610 [Patrick Wendell] Another exclude
      04b4f04 [Patrick Wendell] More issues with transient 1.4 changes
      36f549b [Patrick Wendell] [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0
      2c4d550e
  15. May 31, 2015
    • [SPARK-3850] Trim trailing spaces for examples/streaming/yarn. · 564bc11e
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6530 from rxin/trim-whitespace-1 and squashes the following commits:
      
      7b7b3a0 [Reynold Xin] Reset again.
      dc14597 [Reynold Xin] Reset scalastyle.
      cd556c4 [Reynold Xin] YARN, Kinesis, Flume.
      4223fe1 [Reynold Xin] [SPARK-3850] Trim trailing spaces for examples/streaming.
      564bc11e
  16. May 29, 2015
    • [SPARK-7558] Demarcate tests in unit-tests.log · 9eb222c1
      Andrew Or authored
      Right now `unit-tests.log` are not of much value because we can't tell where the test boundaries are easily. This patch adds log statements before and after each test to outline the test boundaries, e.g.:
      
      ```
      ===== TEST OUTPUT FOR o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' =====
      
      15/05/27 12:36:39.596 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO SparkContext: Starting job: count at KryoSerializerSuite.scala:230
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Got job 3 (count at KryoSerializerSuite.scala:230) with 4 output partitions (allowLocal=false)
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Final stage: ResultStage 3(count at KryoSerializerSuite.scala:230)
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Parents of final stage: List()
      15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Missing parents: List()
      15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Submitting ResultStage 3 (ParallelCollectionRDD[5] at parallelize at KryoSerializerSuite.scala:230), which has no missing parents
      
      ...
      
      15/05/27 12:36:39.624 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO DAGScheduler: Job 3 finished: count at KryoSerializerSuite.scala:230, took 0.028563 s
      15/05/27 12:36:39.625 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO KryoSerializerSuite:
      
      ***** FINISHED o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' *****
      
      ...
      ```
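The banner logging can be sketched as a small wrapper (the real change lives in a shared SparkFunSuite base class; this stand-in just shows the before/after logging around a test body):

```scala
// Stripped-down sketch of the demarcation: log a banner before and after each test
// so test boundaries are visible in unit-tests.log.
def withTestBanners(suiteAndTest: String)(body: => Unit)(log: String => Unit): Unit = {
  log(s"\n\n===== TEST OUTPUT FOR $suiteAndTest =====\n")
  try body
  finally log(s"\n***** FINISHED $suiteAndTest *****\n")
}
```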
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6441 from andrewor14/demarcate-tests and squashes the following commits:
      
      879b060 [Andrew Or] Fix compile after rebase
      d622af7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      017c8ba [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      7790b6c [Andrew Or] Fix tests after logical merge conflict
      c7460c0 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      c43ffc4 [Andrew Or] Fix tests?
      8882581 [Andrew Or] Fix tests
      ee22cda [Andrew Or] Fix log message
      fa9450e [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      12d1e1b [Andrew Or] Various whitespace changes (minor)
      69cbb24 [Andrew Or] Make all test suites extend SparkFunSuite instead of FunSuite
      bbce12e [Andrew Or] Fix manual things that cannot be covered through automation
      da0b12f [Andrew Or] Add core tests as dependencies in all modules
      f7d29ce [Andrew Or] Introduce base abstract class for all test suites
      9eb222c1
    • [SPARK-7524] [SPARK-7846] add configs for keytab and principal, pass these two... · a51b133d
      WangTaoTheTonic authored
      [SPARK-7524] [SPARK-7846] add configs for keytab and principal, pass these two configs with different way in different modes
      
      * Spark now supports long-running services by updating tokens for the NameNode, but it only accepts these parameters in "--k=v" form, which is not very convenient. This patch adds spark.* configs that can come from the properties file and system properties.
      
      * The --principal and --keytab options are passed to the client, but when we start the Thrift server or spark-shell they are also passed into the main class (org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 and org.apache.spark.repl.Main).
      In these two main classes, the arguments are processed by third-party libraries, which leads to errors such as "Invalid option: --principal" or "Unrecognised option: --principal".
      We should pass these command args in a different form, e.g. as system properties.
      
      Author: WangTaoTheTonic <wangtao111@huawei.com>
      
      Closes #6051 from WangTaoTheTonic/SPARK-7524 and squashes the following commits:
      
      e65699a [WangTaoTheTonic] change logic to loadEnvironments
      ebd9ea0 [WangTaoTheTonic] merge master
      ecfe43a [WangTaoTheTonic] pass keytab and principal seperately in different mode
      33a7f40 [WangTaoTheTonic] expand the use of the current configs
      08bb4e8 [WangTaoTheTonic] fix wrong cite
      73afa64 [WangTaoTheTonic] add configs for keytab and principal, move originals to internal
      a51b133d
    • [SPARK-7929] Turn whitespace checker on for more token types. · 97a60cf7
      Reynold Xin authored
      This is the last batch of changes to complete SPARK-7929.
      
      Previous related PRs:
      https://github.com/apache/spark/pull/6480
      https://github.com/apache/spark/pull/6478
      https://github.com/apache/spark/pull/6477
      https://github.com/apache/spark/pull/6476
      https://github.com/apache/spark/pull/6475
      https://github.com/apache/spark/pull/6474
      https://github.com/apache/spark/pull/6473
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6487 from rxin/whitespace-lint and squashes the following commits:
      
      b33d43d [Reynold Xin] [SPARK-7929] Turn whitespace checker on for more token types.
      97a60cf7
  17. May 26, 2015
    • [SPARK-6602] [CORE] Remove some places in core that calling SparkEnv.actorSystem · 9f742241
      zsxwing authored
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #6333 from zsxwing/remove-actor-system-usage and squashes the following commits:
      
      f125aa6 [zsxwing] Fix YarnAllocatorSuite
      ceadcf6 [zsxwing] Change the "port" parameter type of "AkkaUtils.address" to "int"; update ApplicationMaster and YarnAllocator to get the driverUrl from RpcEnv
      3239380 [zsxwing] Remove some places in core that calling SparkEnv.actorSystem
      9f742241
  18. May 21, 2015
    • [SPARK-7657] [YARN] Add driver logs links in application UI, in cluster mode. · 956c4c91
      Hari Shreedharan authored
      This PR adds the URLs to the driver logs to `SparkListenerApplicationStarted` event, which is later used by the `ExecutorsListener` to populate the URLs to the driver logs in its own state. This info is then used when the UI is rendered to display links to the logs.
      
      Author: Hari Shreedharan <hshreedharan@apache.org>
      
      Closes #6166 from harishreedharan/am-log-link and squashes the following commits:
      
      943fc4f [Hari Shreedharan] Merge remote-tracking branch 'asf/master' into am-log-link
      9e5c04b [Hari Shreedharan] Merge remote-tracking branch 'asf/master' into am-log-link
      b3f9b9d [Hari Shreedharan] Updated comment based on feedback.
      0840a95 [Hari Shreedharan] Move the result and sc.stop back to original location, minor import changes.
      537a2f7 [Hari Shreedharan] Add test to ensure the log urls are populated and valid.
      4033725 [Hari Shreedharan] Adding comments explaining how node reports are used to get the log urls.
      6c5c285 [Hari Shreedharan] Import order.
      346f4ea [Hari Shreedharan] Review feedback fixes.
      629c1dc [Hari Shreedharan] Cleanup.
      99fb1a3 [Hari Shreedharan] Send the log urls in App start event, to ensure that other listeners are not affected.
      c0de336 [Hari Shreedharan] Ensure new unit test cleans up after itself.
      50cdae3 [Hari Shreedharan] Added unit test, made the approach generic.
      402e8e4 [Hari Shreedharan] Use `NodeReport` to get the URL for the logs. Also, make the environment variables generic so other cluster managers can use them as well.
      1cf338f [Hari Shreedharan] [SPARK-7657][YARN] Add driver link in application UI, in cluster mode.
      956c4c91
    • [SPARK-7775] YARN AM negative sleep exception · 15680aee
      Andrew Or authored
      ```
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
      SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
      Exception in thread "Reporter" java.lang.IllegalArgumentException: timeout value is negative
        at java.lang.Thread.sleep(Native Method)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:356)
      ```
      This kills the reporter thread. This is caused by #6082 (merged into master branch only).
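The fix amounts to never handing `Thread.sleep` a negative value. A sketch of the clamping idea (variable names are hypothetical):

```scala
// Sketch of the fix idea: clamp the computed interval into [0, maxWait] so
// Thread.sleep never receives a negative argument and the reporter thread survives.
def safeSleepInterval(nextAllocationInterval: Long, maxWait: Long): Long =
  math.max(0L, math.min(nextAllocationInterval, maxWait))
```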
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6305 from andrewor14/yarn-negative-sleep and squashes the following commits:
      
      b970770 [Andrew Or] Use existing cap
      56d6e5e [Andrew Or] Avoid negative sleep
      15680aee
  19. May 20, 2015
    • [SPARK-7533] [YARN] Decrease spacing between AM-RM heartbeats. · 3ddf051e
      ehnalis authored
      Added faster RM heartbeats while container allocations are pending, with multiplicative back-off.
      Also updated the related documentation.
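Multiplicative back-off here means: heartbeat quickly while allocations are pending, doubling the interval each round until it reaches the regular cadence. A sketch with assumed constants (the actual config names and defaults are in the patch):

```scala
// Illustration of multiplicative back-off between AM-RM heartbeats. All values and
// names here are assumptions for the sketch, not Spark's actual configs.
def nextHeartbeatInterval(current: Long, pendingAllocations: Boolean,
                          initial: Long, regular: Long): Long =
  if (!pendingAllocations) regular                        // nothing pending: regular cadence
  else math.min(math.max(current * 2, initial), regular)  // double, capped at the regular interval
```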
      
      Author: ehnalis <zoltan.zvara@gmail.com>
      
      Closes #6082 from ehnalis/yarn and squashes the following commits:
      
      a1d2101 [ehnalis] MIss-spell fixed.
      90f8ba4 [ehnalis] Changed default HB values.
      6120295 [ehnalis] Removed the bug, when allocation heartbeat would not start from initial value.
      08bac63 [ehnalis] Refined style, grammar, removed duplicated code.
      073d283 [ehnalis] [SPARK-7533] [YARN] Decrease spacing between AM-RM heartbeats.
      d4408c9 [ehnalis] [SPARK-7533] [YARN] Decrease spacing between AM-RM heartbeats.
      3ddf051e
  20. May 15, 2015
    • [SPARK-7503] [YARN] Resources in .sparkStaging directory can't be cleaned up on error · c64ff803
      Kousuke Saruta authored
      When we run applications on YARN in cluster mode, uploaded resources in the .sparkStaging directory can't be cleaned up when uploading local resources fails.
      
      You can see this issue by running following command.
      ```
      bin/spark-submit --master yarn --deploy-mode cluster --class <someClassName> <non-existing-jar>
      ```
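Per the squashed commits, the fix wraps submission in try/catch so the staging directory is deleted on error. A generic sketch of that wrapper (names hypothetical):

```scala
// Sketch of the try/catch wrapper described in the commits: if submission fails,
// run the staging-dir cleanup before rethrowing so no orphaned files remain.
def submitWithCleanup[A](submit: () => A)(cleanupStagingDir: () => Unit): A =
  try submit()
  catch { case e: Throwable => cleanupStagingDir(); throw e }
```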
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #6026 from sarutak/delete-uploaded-resources-on-error and squashes the following commits:
      
      caef9f4 [Kousuke Saruta] Fixed style
      882f921 [Kousuke Saruta] Wrapped Client#submitApplication with try/catch blocks in order to delete resources on error
      1786ca4 [Kousuke Saruta] Merge branch 'master' of https://github.com/apache/spark into delete-uploaded-resources-on-error
      f61071b [Kousuke Saruta] Fixed cleanup problem
      c64ff803
  21. May 14, 2015
    • [SPARK-7249] Updated Hadoop dependencies due to inconsistency in the versions · 7fb715de
      FavioVazquez authored
Updated the Hadoop dependencies due to an inconsistency in the versions. The global properties are now the ones used by the hadoop-2.2 profile, and that profile was emptied out but kept for backwards-compatibility reasons.

These changes were proposed by vanzin following the previous pull request https://github.com/apache/spark/pull/5783, which did not fix the problem correctly.

Please let me know if this is the correct way of doing this; vanzin's comments are in the pull request mentioned above.
      
      Author: FavioVazquez <favio.vazquezp@gmail.com>
      
      Closes #5786 from FavioVazquez/update-hadoop-dependencies and squashes the following commits:
      
      11670e5 [FavioVazquez] - Added missing instance of -Phadoop-2.2 in create-release.sh
      379f50d [FavioVazquez] - Added instances of -Phadoop-2.2 in create-release.sh, run-tests, scalastyle and building-spark.md - Reconstructed docs to not ask users to rely on default behavior
      3f9249d [FavioVazquez] Merge branch 'master' of https://github.com/apache/spark into update-hadoop-dependencies
      31bdafa [FavioVazquez] - Added missing instances in -Phadoop-1 in create-release.sh, run-tests and in the building-spark documentation
      cbb93e8 [FavioVazquez] - Added comment related to SPARK-3710 about  hadoop-yarn-server-tests in Hadoop 2.2 that fails to pull some needed dependencies
      83dc332 [FavioVazquez] - Cleaned up the main POM concerning the yarn profile - Erased hadoop-2.2 profile from yarn/pom.xml and its content was integrated into yarn/pom.xml
      93f7624 [FavioVazquez] - Deleted unnecessary comments and <activation> tag on the YARN profile in the main POM
      668d126 [FavioVazquez] - Moved <dependencies> <activation> and <properties> sections of the hadoop-2.2 profile in the YARN POM to the YARN profile in the root POM - Erased unnecessary hadoop-2.2 profile from the YARN POM
      fda6a51 [FavioVazquez] - Updated hadoop1 releases in create-release.sh  due to changes in the default hadoop version set - Erased unnecessary instance of -Dyarn.version=2.2.0 in create-release.sh - Prettify comment in yarn/pom.xml
      0470587 [FavioVazquez] - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in create-release.sh - Updated how the releases are made in the create-release.sh no that the default hadoop version is the 2.2.0 - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in scalastyle - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in run-tests - Better example given in the hadoop-third-party-distributions.md now that the default hadoop version is 2.2.0
      a650779 [FavioVazquez] - Default value of avro.mapred.classifier has been set to hadoop2 in pom.xml - Cleaned up hadoop-2.3 and 2.4 profiles due to change in the default set in avro.mapred.classifier in pom.xml
      199f40b [FavioVazquez] - Erased unnecessary CDH5-specific note in docs/building-spark.md - Remove example of instance -Phadoop-2.2 -Dhadoop.version=2.2.0 in docs/building-spark.md - Enabled hadoop-2.2 profile when the Hadoop version is 2.2.0, which is now the default .Added comment in the yarn/pom.xml to specify that.
      88a8b88 [FavioVazquez] - Simplified Hadoop profiles due to new setting of global properties in the pom.xml file - Added comment to specify that the hadoop-2.2 profile is now the default hadoop profile in the pom.xml file - Erased hadoop-2.2 from related hadoop profiles now that is a no-op in the make-distribution.sh file
      70b8344 [FavioVazquez] - Fixed typo in the make-distribution.sh file and added hadoop-1 in the Related profiles
      287fa2f [FavioVazquez] - Updated documentation about specifying the hadoop version in building-spark. Now is clear that Spark will build against Hadoop 2.2.0 by default. - Added Cloudera CDH 5.3.3 without MapReduce example in the building-spark doc.
      1354292 [FavioVazquez] - Fixed hadoop-1 version to match jenkins build profile in hadoop1.0 tests and documentation
      6b4bfaf [FavioVazquez] - Cleanup in hadoop-2.x profiles since they contained mostly redundant stuff.
      7e9955d [FavioVazquez] - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons
      660decc [FavioVazquez] - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons
      ec91ce3 [FavioVazquez] - Updated protobuf-java version of com.google.protobuf dependancy to fix blocking error when connecting to HDFS via the Hadoop Cloudera HDFS CDH5 (fix for 2.5.0-cdh5.3.3 version)
      7fb715de
  22. May 11, 2015
    • Sandy Ryza's avatar
      [SPARK-6470] [YARN] Add support for YARN node labels. · 82fee9d9
      Sandy Ryza authored
This is difficult to write a test for because it relies on the latest version of YARN, but I verified manually that the patch passes the label expression along on that version and that containers are launched successfully.
      
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #5242 from sryza/sandy-spark-6470 and squashes the following commits:
      
      6af87b9 [Sandy Ryza] Change info to warning
      6e22d99 [Sandy Ryza] [YARN] SPARK-6470.  Add support for YARN node labels.
      82fee9d9
  23. May 08, 2015
    • Ashwin Shankar's avatar
      [SPARK-7451] [YARN] Preemption of executors is counted as failure causing Spark job to fail · b6c797b0
      Ashwin Shankar authored
Added a check to handle the container exit status in the preemption scenario: log an INFO message in such cases and move on.
      andrewor14
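The check amounts to treating a preemption exit status as a non-failure. A minimal sketch follows; the exit-code constants mirror YARN's `ContainerExitStatus` but the value for PREEMPTED (-102) is an assumption to verify against your YARN version.

```python
# Exit codes mirroring YARN's ContainerExitStatus; the PREEMPTED
# value (-102) is assumed here, not taken from the patch itself.
SUCCESS = 0
PREEMPTED = -102

def counts_as_failure(exit_status):
    """Preemption is the cluster reclaiming resources, not an application
    error, so it must not count toward the executor-failure limit that
    would otherwise abort the whole job."""
    return exit_status not in (SUCCESS, PREEMPTED)
```

With this predicate, a preempted executor is logged and skipped rather than added to the failure count.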
      
      Author: Ashwin Shankar <ashankar@netflix.com>
      
      Closes #5993 from ashwinshankar77/SPARK-7451 and squashes the following commits:
      
      90900cf [Ashwin Shankar] Fix log info message
      cf8b6cf [Ashwin Shankar] Stop counting preemption of executors as failure
      b6c797b0
    • Lianhui Wang's avatar
      [SPARK-6869] [PYSPARK] Add pyspark archives path to PYTHONPATH · ebff7327
      Lianhui Wang authored
Based on https://github.com/apache/spark/pull/5478, which provides a PYSPARK_ARCHIVES_PATH env variable. With this PR, we just need to export PYSPARK_ARCHIVES_PATH=/user/spark/pyspark.zip,/user/spark/python/lib/py4j-0.8.2.1-src.zip in conf/spark-env.sh when PySpark is not installed on each node of YARN. I ran Python applications successfully in both yarn-client and yarn-cluster mode with this PR.
andrewor14 sryza Sephiroth-Lin Can you take a look at this? Thanks.
      
      Author: Lianhui Wang <lianhuiwang09@gmail.com>
      
      Closes #5580 from lianhuiwang/SPARK-6869 and squashes the following commits:
      
      66ffa43 [Lianhui Wang] Update Client.scala
      c2ad0f9 [Lianhui Wang] Update Client.scala
      1c8f664 [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
      008850a [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
      f0b4ed8 [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
      150907b [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
      20402cd [Lianhui Wang] use ZipEntry
      9d87c3f [Lianhui Wang] update scala style
      e7bd971 [Lianhui Wang] address vanzin's comments
      4b8a3ed [Lianhui Wang] use pyArchivesEnvOpt
      e6b573b [Lianhui Wang] address vanzin's comments
      f11f84a [Lianhui Wang] zip pyspark archives
      5192cca [Lianhui Wang] update import path
      3b1e4c8 [Lianhui Wang] address tgravescs's comments
      9396346 [Lianhui Wang] put zip to make-distribution.sh
      0d2baf7 [Lianhui Wang] update import paths
      e0179be [Lianhui Wang] add zip pyspark archives in build or sparksubmit
      31e8e06 [Lianhui Wang] update code style
      9f31dac [Lianhui Wang] update code and add comments
      f72987c [Lianhui Wang] add archives path to PYTHONPATH
      ebff7327
  24. May 05, 2015
    • shekhar.bansal's avatar
      [SPARK-6653] [YARN] New config to specify port for sparkYarnAM actor system · fc8feaa8
      shekhar.bansal authored
      Author: shekhar.bansal <shekhar.bansal@guavus.com>
      
      Closes #5719 from zuxqoj/master and squashes the following commits:
      
      5574ff7 [shekhar.bansal] [SPARK-6653][yarn] New config to specify port for sparkYarnAM actor system
      5117258 [shekhar.bansal] [SPARK-6653][yarn] New config to specify port for sparkYarnAM actor system
      9de5330 [shekhar.bansal] [SPARK-6653][yarn] New config to specify port for sparkYarnAM actor system
      456a592 [shekhar.bansal] [SPARK-6653][yarn] New configuration property to specify port for sparkYarnAM actor system
      803e93e [shekhar.bansal] [SPARK-6653][yarn] New configuration property to specify port for sparkYarnAM actor system
      fc8feaa8
  25. May 01, 2015
    • Hari Shreedharan's avatar
      [SPARK-5342] [YARN] Allow long running Spark apps to run on secure YARN/HDFS · b1f4ca82
      Hari Shreedharan authored
      Take 2. Does the same thing as #4688, but fixes Hadoop-1 build.
      
      Author: Hari Shreedharan <hshreedharan@apache.org>
      
      Closes #5823 from harishreedharan/kerberos-longrunning and squashes the following commits:
      
      3c86bba [Hari Shreedharan] Import fixes. Import postfixOps explicitly.
      4d04301 [Hari Shreedharan] Minor formatting fixes.
      b5e7a72 [Hari Shreedharan] Remove reflection, use a method in SparkHadoopUtil to update the token renewer.
      7bff6e9 [Hari Shreedharan] Make sure all required classes are present in the jar. Fix import order.
      e851f70 [Hari Shreedharan] Move the ExecutorDelegationTokenRenewer to yarn module. Use reflection to use it.
      36eb8a9 [Hari Shreedharan] Change the renewal interval config param. Fix a bunch of comments.
      611923a [Hari Shreedharan] Make sure the namenodes are listed correctly for creating tokens.
      09fe224 [Hari Shreedharan] Use token.renew to get token's renewal interval rather than using hdfs-site.xml
      6963bbc [Hari Shreedharan] Schedule renewal in AM before starting user class. Else, a restarted AM cannot access HDFS if the user class tries to.
      072659e [Hari Shreedharan] Fix build failure caused by thread factory getting moved to ThreadUtils.
      f041dd3 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      42eead4 [Hari Shreedharan] Remove RPC part. Refactor and move methods around, use renewal interval rather than max lifetime to create new tokens.
      ebb36f5 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      bc083e3 [Hari Shreedharan] Overload RegisteredExecutor to send tokens. Minor doc updates.
      7b19643 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      8a4f268 [Hari Shreedharan] Added docs in the security guide. Changed some code to ensure that the renewer objects are created only if required.
      e800c8b [Hari Shreedharan] Restore original RegisteredExecutor message, and send new tokens via NewTokens message.
      0e9507e [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      7f1bc58 [Hari Shreedharan] Minor fixes, cleanup.
      bcd11f9 [Hari Shreedharan] Refactor AM and Executor token update code into separate classes, also send tokens via akka on executor startup.
      f74303c [Hari Shreedharan] Move the new logic into specialized classes. Add cleanup for old credentials files.
      2f9975c [Hari Shreedharan] Ensure new tokens are written out immediately on AM restart. Also, pikc up the latest suffix from HDFS if the AM is restarted.
      61b2b27 [Hari Shreedharan] Account for AM restarts by making sure lastSuffix is read from the files on HDFS.
      62c45ce [Hari Shreedharan] Relogin from keytab periodically.
      fa233bd [Hari Shreedharan] Adding logging, fixing minor formatting and ordering issues.
      42813b4 [Hari Shreedharan] Remove utils.sh, which was re-added due to merge with master.
      0de27ee [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      55522e3 [Hari Shreedharan] Fix failure caused by Preconditions ambiguity.
      9ef5f1b [Hari Shreedharan] Added explanation of how the credentials refresh works, some other minor fixes.
      f4fd711 [Hari Shreedharan] Fix SparkConf usage.
      2debcea [Hari Shreedharan] Change the file structure for credentials files. I will push a followup patch which adds a cleanup mechanism for old credentials files. The credentials files are small and few enough for it to cause issues on HDFS.
      af6d5f0 [Hari Shreedharan] Cleaning up files where changes weren't required.
      f0f54cb [Hari Shreedharan] Be more defensive when updating the credentials file.
      f6954da [Hari Shreedharan] Got rid of Akka communication to renew, instead the executors check a known file's modification time to read the credentials.
      5c11c3e [Hari Shreedharan] Move tests to YarnSparkHadoopUtil to fix compile issues.
      b4cb917 [Hari Shreedharan] Send keytab to AM via DistributedCache rather than directly via HDFS
      0985b4e [Hari Shreedharan] Write tokens to HDFS and read them back when required, rather than sending them over the wire.
      d79b2b9 [Hari Shreedharan] Make sure correct credentials are passed to FileSystem#addDelegationTokens()
      8c6928a [Hari Shreedharan] Fix issue caused by direct creation of Actor object.
      fb27f46 [Hari Shreedharan] Make sure principal and keytab are set before CoarseGrainedSchedulerBackend is started. Also schedule re-logins in CoarseGrainedSchedulerBackend#start()
      41efde0 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      d282d7a [Hari Shreedharan] Fix ClientSuite to set YARN mode, so that the correct class is used in tests.
      bcfc374 [Hari Shreedharan] Fix Hadoop-1 build by adding no-op methods in SparkHadoopUtil, with impl in YarnSparkHadoopUtil.
      f8fe694 [Hari Shreedharan] Handle None if keytab-login is not scheduled.
      2b0d745 [Hari Shreedharan] [SPARK-5342][YARN] Allow long running Spark apps to run on secure YARN/HDFS.
      ccba5bc [Hari Shreedharan] WIP: More changes wrt kerberos
      77914dd [Hari Shreedharan] WIP: Add kerberos principal and keytab to YARN client.
      b1f4ca82
    • Marcelo Vanzin's avatar
      [SPARK-7281] [YARN] Add option to set AM's lib path in client mode. · 7b5dd3e3
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5813 from vanzin/SPARK-7281 and squashes the following commits:
      
      1cb6f42 [Marcelo Vanzin] [SPARK-7281] [yarn] Add option to set AM's lib path in client mode.
      7b5dd3e3
    • Nishkam Ravi's avatar
      [SPARK-7213] [YARN] Check for read permissions before copying a Hadoop config file · f53a4882
      Nishkam Ravi authored
      Author: Nishkam Ravi <nravi@cloudera.com>
      Author: nishkamravi2 <nishkamravi@gmail.com>
      Author: nravi <nravi@c1704.halxg.cloudera.com>
      
      Closes #5760 from nishkamravi2/master_nravi and squashes the following commits:
      
      eaa13b5 [nishkamravi2] Update Client.scala
      981afd2 [Nishkam Ravi] Check for read permission before initiating copy
      1b81383 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      0f1abd0 [nishkamravi2] Update Utils.scala
      474e3bf [nishkamravi2] Update DiskBlockManager.scala
      97c383e [nishkamravi2] Update Utils.scala
      8691e0c [Nishkam Ravi] Add a try/catch block around Utils.removeShutdownHook
      2be1e76 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      1c13b79 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      bad4349 [nishkamravi2] Update Main.java
      36a6f87 [Nishkam Ravi] Minor changes and bug fixes
      b7f4ae7 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      4a45d6a [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      458af39 [Nishkam Ravi] Locate the jar using getLocation, obviates the need to pass assembly path as an argument
      d9658d6 [Nishkam Ravi] Changes for SPARK-6406
      ccdc334 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      3faa7a4 [Nishkam Ravi] Launcher library changes (SPARK-6406)
      345206a [Nishkam Ravi] spark-class merge Merge branch 'master_nravi' of https://github.com/nishkamravi2/spark into master_nravi
      ac58975 [Nishkam Ravi] spark-class changes
      06bfeb0 [nishkamravi2] Update spark-class
      35af990 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      32c3ab3 [nishkamravi2] Update AbstractCommandBuilder.java
      4bd4489 [nishkamravi2] Update AbstractCommandBuilder.java
      746f35b [Nishkam Ravi] "hadoop" string in the assembly name should not be mandatory (everywhere else in spark we mandate spark-assembly*hadoop*.jar)
      bfe96e0 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      ee902fa [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      d453197 [nishkamravi2] Update NewHadoopRDD.scala
      6f41a1d [nishkamravi2] Update NewHadoopRDD.scala
      0ce2c32 [nishkamravi2] Update HadoopRDD.scala
      f7e33c2 [Nishkam Ravi] Merge branch 'master_nravi' of https://github.com/nishkamravi2/spark into master_nravi
      ba1eb8b [Nishkam Ravi] Try-catch block around the two occurrences of removeShutDownHook. Deletion of semi-redundant occurrences of expensive operation inShutDown.
      71d0e17 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      494d8c0 [nishkamravi2] Update DiskBlockManager.scala
      3c5ddba [nishkamravi2] Update DiskBlockManager.scala
      f0d12de [Nishkam Ravi] Workaround for IllegalStateException caused by recent changes to BlockManager.stop
      79ea8b4 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      b446edc [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      5c9a4cb [nishkamravi2] Update TaskSetManagerSuite.scala
      535295a [nishkamravi2] Update TaskSetManager.scala
      3e1b616 [Nishkam Ravi] Modify test for maxResultSize
      9f6583e [Nishkam Ravi] Changes to maxResultSize code (improve error message and add condition to check if maxResultSize > 0)
      5f8f9ed [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      636a9ff [nishkamravi2] Update YarnAllocator.scala
      8f76c8b [Nishkam Ravi] Doc change for yarn memory overhead
      35daa64 [Nishkam Ravi] Slight change in the doc for yarn memory overhead
      5ac2ec1 [Nishkam Ravi] Remove out
      dac1047 [Nishkam Ravi] Additional documentation for yarn memory overhead issue
      42c2c3d [Nishkam Ravi] Additional changes for yarn memory overhead issue
      362da5e [Nishkam Ravi] Additional changes for yarn memory overhead
      c726bd9 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      f00fa31 [Nishkam Ravi] Improving logging for AM memoryOverhead
      1cf2d1e [nishkamravi2] Update YarnAllocator.scala
      ebcde10 [Nishkam Ravi] Modify default YARN memory_overhead-- from an additive constant to a multiplier (redone to resolve merge conflicts)
      2e69f11 [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark into master_nravi
      efd688a [Nishkam Ravi] Merge branch 'master' of https://github.com/apache/spark
      2b630f9 [nravi] Accept memory input as "30g", "512M" instead of an int value, to be consistent with rest of Spark
      3bf8fad [nravi] Merge branch 'master' of https://github.com/apache/spark
      5423a03 [nravi] Merge branch 'master' of https://github.com/apache/spark
      eb663ca [nravi] Merge branch 'master' of https://github.com/apache/spark
      df2aeb1 [nravi] Improved fix for ConcurrentModificationIssue (Spark-1097, Hadoop-10456)
      6b840f0 [nravi] Undo the fix for SPARK-1758 (the problem is fixed)
      5108700 [nravi] Fix in Spark for the Concurrent thread modification issue (SPARK-1097, HADOOP-10456)
      681b36f [nravi] Fix for SPARK-1758: failing test org.apache.spark.JavaAPISuite.wholeTextFiles
      f53a4882
    • Marcelo Vanzin's avatar
      [SPARK-4705] Handle multiple app attempts event logs, history server. · 3052f491
      Marcelo Vanzin authored
      This change modifies the event logging listener to write the logs for different application
      attempts to different files. The attempt ID is set by the scheduler backend, so as long
      as the backend returns that ID to SparkContext, things should work. Currently, the
      YARN backend does that.
      
      The history server was also modified to model multiple attempts per application. Each
      attempt has its own UI and a separate row in the listing table, so that users can look at
      all the attempts separately. The UI "adapts" itself to avoid showing attempt-specific info
      when all the applications being shown have a single attempt.
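The per-attempt model described above can be sketched as a map from application ID to its attempts, sorted so the listing shows the latest attempt first; the class and field names here are illustrative, not the history server's actual types.

```python
from collections import defaultdict

class HistoryModel:
    """Each application owns a list of attempts; an attempt carries its
    own (optional) attempt ID and end time for sorting in the listing."""
    def __init__(self):
        self.apps = defaultdict(list)  # app_id -> [(attempt_id, end_time)]

    def add_attempt(self, app_id, attempt_id, end_time):
        self.apps[app_id].append((attempt_id, end_time))
        # Newest attempt first, matching the history server listing order.
        self.apps[app_id].sort(key=lambda a: a[1], reverse=True)

    def single_attempt_only(self):
        """The UI 'adapts': attempt-specific columns can be hidden when
        every application being shown has exactly one attempt."""
        return all(len(a) == 1 for a in self.apps.values())

m = HistoryModel()
m.add_attempt("app-1", "attempt-1", end_time=100)
m.add_attempt("app-1", "attempt-2", end_time=200)
```

Using `Option[String]`-style optional attempt IDs (here, `None`) keeps backends that don't report attempts, like standalone mode, working unchanged.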
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      Author: twinkle sachdeva <twinkle@kite.ggn.in.guavus.com>
      Author: twinkle.sachdeva <twinkle.sachdeva@guavus.com>
      Author: twinkle sachdeva <twinkle.sachdeva@guavus.com>
      
      Closes #5432 from vanzin/SPARK-4705 and squashes the following commits:
      
      7e289fa [Marcelo Vanzin] Review feedback.
      f66dcc5 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
      bc885b7 [Marcelo Vanzin] Review feedback.
      76a3651 [Marcelo Vanzin] Fix log cleaner, add test.
      7c381ec [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
      1aa309d [Marcelo Vanzin] Improve sorting of app attempts.
      2ad77e7 [Marcelo Vanzin] Missed a reference to the old property name.
      9d59d92 [Marcelo Vanzin] Scalastyle...
      d5a9c37 [Marcelo Vanzin] Update JsonProtocol test, make property name consistent.
      ba34b69 [Marcelo Vanzin] Use Option[String] for attempt id.
      f1cb9b3 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
      c14ec19 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
      9092d39 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
      86de638 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
      07446c6 [Marcelo Vanzin] Disable striping for app id / name when multiple attempts exist.
      9092af5 [Marcelo Vanzin] Fix HistoryServer test.
      3a14503 [Marcelo Vanzin] Argh scalastyle.
      657ec18 [Marcelo Vanzin] Fix yarn history URL, app links.
      c3e0a82 [Marcelo Vanzin] Move app name to app info, more UI fixes.
      ce5ee5d [Marcelo Vanzin] Misc UI, test, style fixes.
      cbe8bba [Marcelo Vanzin] Attempt ID in listener event should be an option.
      88b1de8 [Marcelo Vanzin] Add a test for apps with multiple attempts.
      3245aa2 [Marcelo Vanzin] Make app attempts part of the history server model.
      5fd5c6f [Marcelo Vanzin] Fix my broken rebase.
      318525a [twinkle.sachdeva] SPARK-4705: 1) moved from directory structure to single file, as per the master branch. 2) Added the attempt id inside the SparkListenerApplicationStart, to make the info available independent of directory structure. 3) Changes in History Server to render the UI as per the snaphot II
      6b2e521 [twinkle sachdeva] SPARK-4705 Incorporating the review comments regarding formatting, will do the rest of the changes after this
      4c1fc26 [twinkle sachdeva] SPARK-4705 Incorporating the review comments regarding formatting, will do the rest of the changes after this
      0eb7722 [twinkle sachdeva] SPARK-4705: Doing cherry-pick of fix into master
      3052f491
  26. Apr 30, 2015
    • Hari Shreedharan's avatar
      [SPARK-5342] [YARN] Allow long running Spark apps to run on secure YARN/HDFS · 6c65da6b
      Hari Shreedharan authored
      Current Spark apps running on Secure YARN/HDFS would not be able to write data
      to HDFS after 7 days, since delegation tokens cannot be renewed beyond that. This
      means Spark Streaming apps will not be able to run on Secure YARN.
      
      This commit adds basic functionality to fix this issue. In this patch:
- new parameters are added - principal and keytab, which can be used to log in to a KDC
- the client logs in, and then gets tokens to start the AM
- the keytab is copied to the staging directory
- the AM waits until 60% of the token lifetime has elapsed, then logs in again using the keytab
- after each subsequent 60% of the lifetime, new tokens are created and sent to the executors
      
      Currently, to avoid complicating the architecture, we set the keytab and principal in the
      SparkHadoopUtil singleton, and schedule a login. Once the login is completed, a callback is scheduled.
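The 60% rule from the list above can be sketched as scheduling the next keytab login at 60% of the remaining token lifetime; a minimal sketch, not Spark's actual scheduling code.

```python
RENEWAL_FRACTION = 0.6  # relogin/renew after 60% of the token lifetime

def next_renewal_time(issue_time, expiry_time):
    """Schedule the next keytab login (and token refresh) well before
    expiry, so executors always pick up fresh tokens in time."""
    return issue_time + RENEWAL_FRACTION * (expiry_time - issue_time)

# Tokens issued at t=0 that expire at t=1000 trigger renewal at t=600.
```

Renewing well short of expiry leaves headroom for the refreshed credentials file to reach the executors before the old tokens lapse.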
      
This is being posted so I can gather feedback on the general implementation.
      
      There are currently a bunch of things to do:
      - [x] logging
      - [x] testing - I plan to manually test this soon. If you have ideas of how to add unit tests, comment.
      - [x] add code to ensure that if these params are set in non-YARN cluster mode, we complain
      - [x] documentation
      - [x] Have the executors request for credentials from the AM, so that retries are possible.
      
      Author: Hari Shreedharan <hshreedharan@apache.org>
      
      Closes #4688 from harishreedharan/kerberos-longrunning and squashes the following commits:
      
      36eb8a9 [Hari Shreedharan] Change the renewal interval config param. Fix a bunch of comments.
      611923a [Hari Shreedharan] Make sure the namenodes are listed correctly for creating tokens.
      09fe224 [Hari Shreedharan] Use token.renew to get token's renewal interval rather than using hdfs-site.xml
      6963bbc [Hari Shreedharan] Schedule renewal in AM before starting user class. Else, a restarted AM cannot access HDFS if the user class tries to.
      072659e [Hari Shreedharan] Fix build failure caused by thread factory getting moved to ThreadUtils.
      f041dd3 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      42eead4 [Hari Shreedharan] Remove RPC part. Refactor and move methods around, use renewal interval rather than max lifetime to create new tokens.
      ebb36f5 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      bc083e3 [Hari Shreedharan] Overload RegisteredExecutor to send tokens. Minor doc updates.
      7b19643 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      8a4f268 [Hari Shreedharan] Added docs in the security guide. Changed some code to ensure that the renewer objects are created only if required.
      e800c8b [Hari Shreedharan] Restore original RegisteredExecutor message, and send new tokens via NewTokens message.
      0e9507e [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      7f1bc58 [Hari Shreedharan] Minor fixes, cleanup.
      bcd11f9 [Hari Shreedharan] Refactor AM and Executor token update code into separate classes, also send tokens via akka on executor startup.
      f74303c [Hari Shreedharan] Move the new logic into specialized classes. Add cleanup for old credentials files.
      2f9975c [Hari Shreedharan] Ensure new tokens are written out immediately on AM restart. Also, pikc up the latest suffix from HDFS if the AM is restarted.
      61b2b27 [Hari Shreedharan] Account for AM restarts by making sure lastSuffix is read from the files on HDFS.
      62c45ce [Hari Shreedharan] Relogin from keytab periodically.
      fa233bd [Hari Shreedharan] Adding logging, fixing minor formatting and ordering issues.
      42813b4 [Hari Shreedharan] Remove utils.sh, which was re-added due to merge with master.
      0de27ee [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      55522e3 [Hari Shreedharan] Fix failure caused by Preconditions ambiguity.
      9ef5f1b [Hari Shreedharan] Added explanation of how the credentials refresh works, some other minor fixes.
      f4fd711 [Hari Shreedharan] Fix SparkConf usage.
      2debcea [Hari Shreedharan] Change the file structure for credentials files. I will push a followup patch which adds a cleanup mechanism for old credentials files. The credentials files are small and few enough for it to cause issues on HDFS.
      af6d5f0 [Hari Shreedharan] Cleaning up files where changes weren't required.
      f0f54cb [Hari Shreedharan] Be more defensive when updating the credentials file.
      f6954da [Hari Shreedharan] Got rid of Akka communication to renew, instead the executors check a known file's modification time to read the credentials.
      5c11c3e [Hari Shreedharan] Move tests to YarnSparkHadoopUtil to fix compile issues.
      b4cb917 [Hari Shreedharan] Send keytab to AM via DistributedCache rather than directly via HDFS
      0985b4e [Hari Shreedharan] Write tokens to HDFS and read them back when required, rather than sending them over the wire.
      d79b2b9 [Hari Shreedharan] Make sure correct credentials are passed to FileSystem#addDelegationTokens()
      8c6928a [Hari Shreedharan] Fix issue caused by direct creation of Actor object.
      fb27f46 [Hari Shreedharan] Make sure principal and keytab are set before CoarseGrainedSchedulerBackend is started. Also schedule re-logins in CoarseGrainedSchedulerBackend#start()
      41efde0 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
      d282d7a [Hari Shreedharan] Fix ClientSuite to set YARN mode, so that the correct class is used in tests.
      bcfc374 [Hari Shreedharan] Fix Hadoop-1 build by adding no-op methods in SparkHadoopUtil, with impl in YarnSparkHadoopUtil.
      f8fe694 [Hari Shreedharan] Handle None if keytab-login is not scheduled.
      2b0d745 [Hari Shreedharan] [SPARK-5342][YARN] Allow long running Spark apps to run on secure YARN/HDFS.
      ccba5bc [Hari Shreedharan] WIP: More changes wrt kerberos
      77914dd [Hari Shreedharan] WIP: Add kerberos principal and keytab to YARN client.
      6c65da6b