    [SPARK-5342] [YARN] Allow long running Spark apps to run on secure YARN/HDFS · b1f4ca82
    Hari Shreedharan authored
    Take 2. Does the same thing as #4688, but fixes Hadoop-1 build.
    
    Author: Hari Shreedharan <hshreedharan@apache.org>
    
    Closes #5823 from harishreedharan/kerberos-longrunning and squashes the following commits:
    
    3c86bba [Hari Shreedharan] Import fixes. Import postfixOps explicitly.
    4d04301 [Hari Shreedharan] Minor formatting fixes.
    b5e7a72 [Hari Shreedharan] Remove reflection, use a method in SparkHadoopUtil to update the token renewer.
    7bff6e9 [Hari Shreedharan] Make sure all required classes are present in the jar. Fix import order.
    e851f70 [Hari Shreedharan] Move the ExecutorDelegationTokenRenewer to yarn module. Use reflection to use it.
    36eb8a9 [Hari Shreedharan] Change the renewal interval config param. Fix a bunch of comments.
    611923a [Hari Shreedharan] Make sure the namenodes are listed correctly for creating tokens.
    09fe224 [Hari Shreedharan] Use token.renew to get token's renewal interval rather than using hdfs-site.xml
    6963bbc [Hari Shreedharan] Schedule renewal in AM before starting user class. Else, a restarted AM cannot access HDFS if the user class tries to.
    072659e [Hari Shreedharan] Fix build failure caused by thread factory getting moved to ThreadUtils.
    f041dd3 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
    42eead4 [Hari Shreedharan] Remove RPC part. Refactor and move methods around, use renewal interval rather than max lifetime to create new tokens.
    ebb36f5 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
    bc083e3 [Hari Shreedharan] Overload RegisteredExecutor to send tokens. Minor doc updates.
    7b19643 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
    8a4f268 [Hari Shreedharan] Added docs in the security guide. Changed some code to ensure that the renewer objects are created only if required.
    e800c8b [Hari Shreedharan] Restore original RegisteredExecutor message, and send new tokens via NewTokens message.
    0e9507e [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
    7f1bc58 [Hari Shreedharan] Minor fixes, cleanup.
    bcd11f9 [Hari Shreedharan] Refactor AM and Executor token update code into separate classes, also send tokens via akka on executor startup.
    f74303c [Hari Shreedharan] Move the new logic into specialized classes. Add cleanup for old credentials files.
    2f9975c [Hari Shreedharan] Ensure new tokens are written out immediately on AM restart. Also, pick up the latest suffix from HDFS if the AM is restarted.
    61b2b27 [Hari Shreedharan] Account for AM restarts by making sure lastSuffix is read from the files on HDFS.
    62c45ce [Hari Shreedharan] Relogin from keytab periodically.
    fa233bd [Hari Shreedharan] Adding logging, fixing minor formatting and ordering issues.
    42813b4 [Hari Shreedharan] Remove utils.sh, which was re-added due to merge with master.
    0de27ee [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
    55522e3 [Hari Shreedharan] Fix failure caused by Preconditions ambiguity.
    9ef5f1b [Hari Shreedharan] Added explanation of how the credentials refresh works, some other minor fixes.
    f4fd711 [Hari Shreedharan] Fix SparkConf usage.
    2debcea [Hari Shreedharan] Change the file structure for credentials files. I will push a followup patch which adds a cleanup mechanism for old credentials files. The credentials files are small and few enough that they should not cause issues on HDFS.
    af6d5f0 [Hari Shreedharan] Cleaning up files where changes weren't required.
    f0f54cb [Hari Shreedharan] Be more defensive when updating the credentials file.
    f6954da [Hari Shreedharan] Got rid of Akka communication to renew, instead the executors check a known file's modification time to read the credentials.
    5c11c3e [Hari Shreedharan] Move tests to YarnSparkHadoopUtil to fix compile issues.
    b4cb917 [Hari Shreedharan] Send keytab to AM via DistributedCache rather than directly via HDFS
    0985b4e [Hari Shreedharan] Write tokens to HDFS and read them back when required, rather than sending them over the wire.
    d79b2b9 [Hari Shreedharan] Make sure correct credentials are passed to FileSystem#addDelegationTokens()
    8c6928a [Hari Shreedharan] Fix issue caused by direct creation of Actor object.
    fb27f46 [Hari Shreedharan] Make sure principal and keytab are set before CoarseGrainedSchedulerBackend is started. Also schedule re-logins in CoarseGrainedSchedulerBackend#start()
    41efde0 [Hari Shreedharan] Merge branch 'master' into kerberos-longrunning
    d282d7a [Hari Shreedharan] Fix ClientSuite to set YARN mode, so that the correct class is used in tests.
    bcfc374 [Hari Shreedharan] Fix Hadoop-1 build by adding no-op methods in SparkHadoopUtil, with impl in YarnSparkHadoopUtil.
    f8fe694 [Hari Shreedharan] Handle None if keytab-login is not scheduled.
    2b0d745 [Hari Shreedharan] [SPARK-5342][YARN] Allow long running Spark apps to run on secure YARN/HDFS.
    ccba5bc [Hari Shreedharan] WIP: More changes wrt kerberos
    77914dd [Hari Shreedharan] WIP: Add kerberos principal and keytab to YARN client.
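
Several of the commits above (61b2b27, 2f9975c) describe recovering the last credentials-file suffix from the files already on HDFS when the AM restarts. The following is only a minimal sketch of that idea in Python, not the code from this change: the `<prefix>-<integer>` naming scheme and the helper names `latest_suffix` and `next_file_name` are assumptions made for illustration, and a plain list of names stands in for an HDFS directory listing.

```python
import re
from typing import Iterable, Optional

# Assumed (hypothetical) naming scheme: "<prefix>-<integer suffix>".
_SUFFIX_RE = re.compile(r"^(?P<prefix>.+)-(?P<suffix>\d+)$")


def latest_suffix(file_names: Iterable[str], prefix: str) -> Optional[int]:
    """Return the largest integer suffix among files named '<prefix>-<n>'.

    Returns None when no file matches, e.g. on the very first start.
    """
    best: Optional[int] = None
    for name in file_names:
        match = _SUFFIX_RE.match(name)
        if match and match.group("prefix") == prefix:
            suffix = int(match.group("suffix"))
            if best is None or suffix > best:
                best = suffix
    return best


def next_file_name(file_names: Iterable[str], prefix: str) -> str:
    """Name the next credentials file, continuing after the last one found."""
    last = latest_suffix(file_names, prefix)
    return f"{prefix}-{(last if last is not None else 0) + 1}"
```

Because the counter is derived from the listing rather than held in memory, a restarted process continues numbering where the previous one stopped instead of overwriting an older file.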