Skip to content
Snippets Groups Projects
Commit 1b98fa2e authored by jerryshao's avatar jerryshao Committed by Marcelo Vanzin
Browse files

[YARN][DOC][MINOR] Remove several obsolete env variables and update the doc

## What changes were proposed in this pull request?

Remove several obsolete env variables not supported for Spark on YARN now, also updates the docs to include several changes with 2.0.

## How was this patch tested?

N/A

CC vanzin tgravescs

Author: jerryshao <sshao@hortonworks.com>

Closes #13296 from jerryshao/yarn-doc.
parent 623aae59
No related branches found
No related tags found
No related merge requests found
......@@ -40,10 +40,6 @@
# - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: 1).
# - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_YARN_APP_NAME, The name of your application (Default: Spark)
# - SPARK_YARN_QUEUE, The hadoop queue to use for allocation requests (Default: 'default')
# - SPARK_YARN_DIST_FILES, Comma separated list of files to be distributed with the job.
# - SPARK_YARN_DIST_ARCHIVES, Comma separated list of archives to be distributed with the job.
# Options for the daemons used in the standalone deploy mode
# - SPARK_MASTER_IP, to bind the master to a different IP address or hostname
......
......@@ -60,6 +60,8 @@ Running Spark on YARN requires a binary distribution of Spark which is built wit
Binary distributions can be downloaded from the [downloads page](http://spark.apache.org/downloads.html) of the project website.
To build Spark yourself, refer to [Building Spark](building-spark.html).
To make Spark runtime jars accessible from YARN side, you can specify `spark.yarn.archive` or `spark.yarn.jars`. For details please refer to [Spark Properties](running-on-yarn.html#spark-properties). If neither `spark.yarn.archive` nor `spark.yarn.jars` is specified, Spark will create a zip file with all jars under `$SPARK_HOME/jars` and upload it to the distributed cache.
# Configuration
Most of the configs are the same for Spark on YARN as for other deployment modes. See the [configuration page](configuration.html) for more information on those. These are configs that are specific to Spark on YARN.
......@@ -99,6 +101,8 @@ to the same log file).
If you need a reference to the proper location to put log files in the YARN so that YARN can properly display and aggregate them, use `spark.yarn.app.container.log.dir` in your `log4j.properties`. For example, `log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log`. For streaming applications, configuring `RollingFileAppender` and setting file location to YARN's log directory will avoid disk overflow caused by large log files, and logs can be accessed using YARN's log utility.
To use a custom metrics.properties for the application master and executors, update the `$SPARK_CONF_DIR/metrics.properties` file. It will automatically be uploaded with other configurations, so you don't need to specify it manually with `--files`.
#### Spark Properties
<table class="table">
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment