Skip to content
Snippets Groups Projects
Commit 1b98fa2e authored by jerryshao's avatar jerryshao Committed by Marcelo Vanzin
Browse files

[YARN][DOC][MINOR] Remove several obsolete env variables and update the doc

## What changes were proposed in this pull request?

Remove several obsolete env variables not supported for Spark on YARN now, also updates the docs to include several changes with 2.0.

## How was this patch tested?

N/A

CC vanzin tgravescs

Author: jerryshao <sshao@hortonworks.com>

Closes #13296 from jerryshao/yarn-doc.
parent 623aae59
No related branches found
No related tags found
No related merge requests found
...@@ -40,10 +40,6 @@ ...@@ -40,10 +40,6 @@
# - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: 1). # - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: 1).
# - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G) # - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G) # - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_YARN_APP_NAME, The name of your application (Default: Spark)
# - SPARK_YARN_QUEUE, The hadoop queue to use for allocation requests (Default: 'default')
# - SPARK_YARN_DIST_FILES, Comma separated list of files to be distributed with the job.
# - SPARK_YARN_DIST_ARCHIVES, Comma separated list of archives to be distributed with the job.
# Options for the daemons used in the standalone deploy mode # Options for the daemons used in the standalone deploy mode
# - SPARK_MASTER_IP, to bind the master to a different IP address or hostname # - SPARK_MASTER_IP, to bind the master to a different IP address or hostname
......
...@@ -60,6 +60,8 @@ Running Spark on YARN requires a binary distribution of Spark which is built wit ...@@ -60,6 +60,8 @@ Running Spark on YARN requires a binary distribution of Spark which is built wit
Binary distributions can be downloaded from the [downloads page](http://spark.apache.org/downloads.html) of the project website. Binary distributions can be downloaded from the [downloads page](http://spark.apache.org/downloads.html) of the project website.
To build Spark yourself, refer to [Building Spark](building-spark.html). To build Spark yourself, refer to [Building Spark](building-spark.html).
To make Spark runtime jars accessible from YARN side, you can specify `spark.yarn.archive` or `spark.yarn.jars`. For details please refer to [Spark Properties](running-on-yarn.html#spark-properties). If neither `spark.yarn.archive` nor `spark.yarn.jars` is specified, Spark will create a zip file with all jars under `$SPARK_HOME/jars` and upload it to the distributed cache.
# Configuration # Configuration
Most of the configs are the same for Spark on YARN as for other deployment modes. See the [configuration page](configuration.html) for more information on those. These are configs that are specific to Spark on YARN. Most of the configs are the same for Spark on YARN as for other deployment modes. See the [configuration page](configuration.html) for more information on those. These are configs that are specific to Spark on YARN.
...@@ -99,6 +101,8 @@ to the same log file). ...@@ -99,6 +101,8 @@ to the same log file).
If you need a reference to the proper location to put log files in the YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j.properties`. For example, `log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility. If you need a reference to the proper location to put log files in the YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j.properties`. For example, `log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility.
To use a custom metrics.properties for the application master and executors, update the `$SPARK_CONF_DIR/metrics.properties` file. It will automatically be uploaded with other configurations, so you don't need to specify it manually with `--files`.
#### Spark Properties #### Spark Properties
<table class="table"> <table class="table">
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment