    a18d6371
    [SPARK-20434][YARN][CORE] Move Hadoop delegation token code from yarn to core · a18d6371
    Michael Gummelt authored
    ## What changes were proposed in this pull request?
    
    Move Hadoop delegation token code from `spark-yarn` to `spark-core` so that other schedulers (such as Mesos) may use it.  To avoid exposing Hadoop interfaces in `spark-core`, the new Hadoop delegation token classes are kept private.  To provide backward compatibility, and to allow YARN users to continue to load their own delegation token providers via Java service loading, the old YARN interfaces, as well as the client code that uses them, have been retained.
    
    Summary:
    - Move the registered `yarn.security.ServiceCredentialProvider` classes from `spark-yarn` to `spark-core`, into a new, private hierarchy under `HadoopDelegationTokenProvider`.  Client code in `HadoopDelegationTokenManager` now loads credentials from a whitelist of three providers (`HadoopFSDelegationTokenProvider`, `HiveDelegationTokenProvider`, `HBaseDelegationTokenProvider`) instead of service loading, which means that users cannot implement their own delegation token providers in `spark-core`, as they can in the `spark-yarn` module.
    
    - Keep the `yarn.security.ServiceCredentialProvider` interface for backward compatibility, and so that YARN users can continue to implement their own delegation token providers.  Client code in YARN now fetches tokens via the new `YARNHadoopDelegationTokenManager` class, which fetches tokens from the core providers through `HadoopDelegationTokenManager` and from service-loaded `yarn.security.ServiceCredentialProvider` implementations.
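
The whitelist approach in the first bullet can be illustrated with a minimal Java sketch. Every name below (`TokenProviderSketch`, `CoreTokenManagerSketch`, the service-name strings, the string "tokens") is a hypothetical stand-in and not the API introduced by this patch; the only point is that the provider list is fixed in code, so there is no plug-in point for user-supplied providers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the private provider interface (not Spark's real signature).
interface TokenProviderSketch {
    String serviceName();
    String obtainToken(); // placeholder: a real provider would fetch a Hadoop delegation token
}

// Hypothetical manager: the provider list is hard-coded, so users cannot add their own.
final class CoreTokenManagerSketch {
    private final List<TokenProviderSketch> providers;

    private CoreTokenManagerSketch(List<TokenProviderSketch> providers) {
        this.providers = providers;
    }

    // Builds the manager from a fixed whitelist; the names are illustrative placeholders.
    static CoreTokenManagerSketch withBuiltIns() {
        List<TokenProviderSketch> fixed = new ArrayList<>();
        for (String name : new String[] {"hadoopfs", "hive", "hbase"}) {
            fixed.add(stub(name));
        }
        return new CoreTokenManagerSketch(fixed);
    }

    // Creates a stub provider that returns a fake token string.
    private static TokenProviderSketch stub(final String name) {
        return new TokenProviderSketch() {
            @Override public String serviceName() { return name; }
            @Override public String obtainToken() { return "token-for-" + name; }
        };
    }

    // Lists the whitelisted services in registration order.
    List<String> serviceNames() {
        List<String> names = new ArrayList<>();
        for (TokenProviderSketch p : providers) {
            names.add(p.serviceName());
        }
        return names;
    }

    // Asks every whitelisted provider for a token, keyed by service name.
    Map<String, String> obtainTokens() {
        Map<String, String> tokens = new LinkedHashMap<>();
        for (TokenProviderSketch p : providers) {
            tokens.put(p.serviceName(), p.obtainToken());
        }
        return tokens;
    }
}
```

Because the constructor is private and the list is built inside `withBuiltIns()`, adding a provider in this sketch requires changing the manager itself, which mirrors the trade-off described above.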
    
    Old Hierarchy:
    
    ```
    yarn.security.ServiceCredentialProvider (service loaded)
      HadoopFSCredentialProvider
      HiveCredentialProvider
      HBaseCredentialProvider
    yarn.security.ConfigurableCredentialManager
    ```
    
    New Hierarchy:
    
    ```
    HadoopDelegationTokenManager
    HadoopDelegationTokenProvider (not service loaded)
      HadoopFSDelegationTokenProvider
      HiveDelegationTokenProvider
      HBaseDelegationTokenProvider
    
    yarn.security.ServiceCredentialProvider (service loaded)
    yarn.security.YARNHadoopDelegationTokenManager
    ```
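
As a rough illustration of how the YARN-side manager in the new hierarchy could combine both sources, here is a minimal Java sketch. All type names and service-name strings are hypothetical stand-ins, not the classes from this patch; the only real API used is `java.util.ServiceLoader`, which discovers implementations registered under `META-INF/services/` on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical stand-in for the service-loaded YARN provider interface.
interface PluggableCredentialProviderSketch {
    String serviceName();
}

// Hypothetical stand-in for the core manager with its fixed provider list.
final class CoreManagerSketch {
    List<String> serviceNames() {
        List<String> names = new ArrayList<>();
        names.add("hadoopfs");
        names.add("hive");
        names.add("hbase");
        return names;
    }
}

// Hypothetical YARN-side manager: core providers plus user-supplied plug-ins.
final class YarnManagerSketch {
    private final CoreManagerSketch core = new CoreManagerSketch();
    private final List<PluggableCredentialProviderSketch> plugins;

    // Takes an explicit plug-in list, which keeps the class easy to exercise in tests.
    YarnManagerSketch(List<PluggableCredentialProviderSketch> plugins) {
        this.plugins = new ArrayList<>(plugins);
    }

    // Discovers plug-ins via Java service loading; yields none if nothing is registered.
    static YarnManagerSketch fromClasspath() {
        List<PluggableCredentialProviderSketch> found = new ArrayList<>();
        for (PluggableCredentialProviderSketch p
                : ServiceLoader.load(PluggableCredentialProviderSketch.class)) {
            found.add(p);
        }
        return new YarnManagerSketch(found);
    }

    // Core (whitelisted) services first, then any service-loaded ones.
    List<String> serviceNames() {
        List<String> names = new ArrayList<>(core.serviceNames());
        for (PluggableCredentialProviderSketch p : plugins) {
            names.add(p.serviceName());
        }
        return names;
    }
}
```

In this sketch the core list stays closed while the YARN wrapper remains extensible, which is the split the two hierarchies describe.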
    ## How was this patch tested?
    
    unit tests
    
    Author: Michael Gummelt <mgummelt@mesosphere.io>
    Author: Dr. Stefan Schimanski <sttts@mesosphere.io>
    
    Closes #17723 from mgummelt/SPARK-20434-refactor-kerberos.