    [SPARK-13577][YARN] Allow Spark jar to be multiple jars, archive. (commit 07f1c544)
    Marcelo Vanzin authored
    In preparation for the demise of assemblies, this change allows the
    YARN backend to use multiple jars and globs as the "Spark jar". The
    config option has been renamed to "spark.yarn.jars" to reflect that.
    
    A second option "spark.yarn.archive" was also added; if set, this
    takes precedence and uploads an archive expected to contain the jar
    files with the Spark code and its dependencies.
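
    For illustration, a minimal sketch of how these options might be set
    programmatically. The property names come from this change; the paths
    and the use of SparkConf here are assumptions, not something the
    commit prescribes:

        import org.apache.spark.SparkConf

        // spark.yarn.jars takes a comma-separated list of jar locations;
        // entries may be globs. The hdfs:// paths below are hypothetical.
        val conf = new SparkConf()
          .set("spark.yarn.jars", "hdfs:///user/spark/jars/*.jar")

        // If spark.yarn.archive is also set, it takes precedence and should
        // point to an archive containing the Spark jars and dependencies.
        conf.set("spark.yarn.archive", "hdfs:///user/spark/spark-libs.zip")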
    
    Existing deployments should mostly keep working. This change drops
    support for the "SPARK_JAR" environment variable, and when no
    configuration is set it no longer falls back to "jarOfClass"; instead
    it looks for files under SPARK_HOME. This should be fine, since
    "jarOfClass" probably wouldn't work unless you were using spark-submit
    anyway.
    
    Tested with the unit tests and by trying the different config options
    on a YARN cluster.
    
    Author: Marcelo Vanzin <vanzin@cloudera.com>
    
    Closes #11500 from vanzin/SPARK-13577.