Skip to content
Snippets Groups Projects
Commit 05c2214b authored by Christophe Préaud's avatar Christophe Préaud Committed by Andrew Or
Browse files

[SPARK-6469] Improving documentation on YARN local directories usage

Clarify the local directories usage in YARN

Author: Christophe Préaud <christophe.preaud@kelkoo.com>

Closes #5165 from preaudc/yarn-doc-local-dirs and squashes the following commits:

6912b90 [Christophe Préaud] Fix some formatting issues.
4fa8ec2 [Christophe Préaud] Merge remote-tracking branch 'upstream/master' into yarn-doc-local-dirs
eaaf519 [Christophe Préaud] Clarify the local directories usage in YARN
436fb7d [Christophe Préaud] Revert "Clarify the local directories usage in YARN"
876ae5e [Christophe Préaud] Clarify the local directories usage in YARN
608dbfa [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
a49a2ce [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
9ba89ca [Christophe Préaud] Ensure that files are fetched atomically
54419ae [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
c6a5590 [Christophe Préaud] Revert commit 8ea871f8130b2490f1bad7374a819bf56f0ccbbd
7456a33 [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
8ea871f [Christophe Préaud] Ensure that files are fetched atomically
parent dd907d1a
No related branches found
No related tags found
No related merge requests found
......@@ -274,6 +274,6 @@ If you need a reference to the proper location to put log files in the YARN so t
# Important notes
- Whether core requests are honored in scheduling decisions depends on which scheduler is in use and how it is configured.
- The local directories used by Spark executors will be the local directories configured for YARN (Hadoop YARN config `yarn.nodemanager.local-dirs`). If the user specifies `spark.local.dir`, it will be ignored.
- In `yarn-cluster` mode, the local directories used by the Spark executors and the Spark driver will be the local directories configured for YARN (Hadoop YARN config `yarn.nodemanager.local-dirs`). If the user specifies `spark.local.dir`, it will be ignored. In `yarn-client` mode, the Spark executors will use the local directories configured for YARN while the Spark driver will use those defined in `spark.local.dir`. This is because the Spark driver does not run on the YARN cluster in `yarn-client` mode, only the Spark executors do.
- The `--files` and `--archives` options support specifying file names with the # similar to Hadoop. For example you can specify: `--files localtest.txt#appSees.txt` and this will upload the file you have locally named localtest.txt into HDFS but this will be linked to by the name `appSees.txt`, and your application should use the name as `appSees.txt` to reference it when running on YARN.
- The `--jars` option allows the `SparkContext.addJar` function to work if you are using it with local files and running in `yarn-cluster` mode. It does not need to be used if you are using it with HDFS, HTTP, HTTPS, or FTP files.
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment