Commit 12a0784a authored by Andrew Or

[SPARK-11667] Update dynamic allocation docs to reflect supported cluster managers

Author: Andrew Or <andrew@databricks.com>

Closes #9637 from andrewor14/update-da-docs.
parent cf38fc75
@@ -56,36 +56,32 @@ provide another approach to share RDDs.
 
 ## Dynamic Resource Allocation
 
-Spark 1.2 introduces the ability to dynamically scale the set of cluster resources allocated to
-your application up and down based on the workload. This means that your application may give
-resources back to the cluster if they are no longer used and request them again later when there
-is demand. This feature is particularly useful if multiple applications share resources in your
-Spark cluster. If a subset of the resources allocated to an application becomes idle, it can be
-returned to the cluster's pool of resources and acquired by other applications. In Spark, dynamic
-resource allocation is performed on the granularity of the executor and can be enabled through
-`spark.dynamicAllocation.enabled`.
+Spark provides a mechanism to dynamically adjust the resources your application occupies based
+on the workload. This means that your application may give resources back to the cluster if they
+are no longer used and request them again later when there is demand. This feature is particularly
+useful if multiple applications share resources in your Spark cluster.
 
-This feature is currently disabled by default and available only on [YARN](running-on-yarn.html).
-A future release will extend this to [standalone mode](spark-standalone.html) and
-[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes). Note that although Spark on
-Mesos already has a similar notion of dynamic resource sharing in fine-grained mode, enabling
-dynamic allocation allows your Mesos application to take advantage of coarse-grained low-latency
-scheduling while sharing cluster resources efficiently.
+This feature is disabled by default and available on all coarse-grained cluster managers, i.e.
+[standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html), and
+[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes).
 
 ### Configuration and Setup
 
-All configurations used by this feature live under the `spark.dynamicAllocation.*` namespace.
-To enable this feature, your application must set `spark.dynamicAllocation.enabled` to `true`.
-Other relevant configurations are described on the
-[configurations page](configuration.html#dynamic-allocation) and in the subsequent sections in
-detail.
+There are two requirements for using this feature. First, your application must set
+`spark.dynamicAllocation.enabled` to `true`. Second, you must set up an *external shuffle service*
+on each worker node in the same cluster and set `spark.shuffle.service.enabled` to true in your
+application. The purpose of the external shuffle service is to allow executors to be removed
+without deleting shuffle files written by them (more detail described
+[below](job-scheduling.html#graceful-decommission-of-executors)). The way to set up this service
+varies across cluster managers:
 
-Additionally, your application must use an external shuffle service. The purpose of the service is
-to preserve the shuffle files written by executors so the executors can be safely removed (more
-detail described [below](job-scheduling.html#graceful-decommission-of-executors)). To enable
-this service, set `spark.shuffle.service.enabled` to `true`. In YARN, this external shuffle service
-is implemented in `org.apache.spark.yarn.network.YarnShuffleService` that runs in each `NodeManager`
-in your cluster. To start this service, follow these steps:
+In standalone mode, simply start your workers with `spark.shuffle.service.enabled` set to `true`.
+
+In Mesos coarse-grained mode, run `$SPARK_HOME/sbin/start-mesos-shuffle-service.sh` on all
+slave nodes with `spark.shuffle.service.enabled` set to `true`. For instance, you may do so
+through Marathon.
+
+In YARN mode, start the shuffle service on each `NodeManager` as follows:
 
 1. Build Spark with the [YARN profile](building-spark.html). Skip this step if you are using a
 pre-packaged distribution.
@@ -95,10 +91,13 @@ pre-packaged distribution.
 
 2. Add this jar to the classpath of all `NodeManager`s in your cluster.
 3. In the `yarn-site.xml` on each node, add `spark_shuffle` to `yarn.nodemanager.aux-services`,
 then set `yarn.nodemanager.aux-services.spark_shuffle.class` to
-`org.apache.spark.network.yarn.YarnShuffleService`. Additionally, set all relevant
-`spark.shuffle.service.*` [configurations](configuration.html).
+`org.apache.spark.network.yarn.YarnShuffleService` and `spark.shuffle.service.enabled` to true.
 4. Restart all `NodeManager`s in your cluster.
 
+All other relevant configurations are optional and under the `spark.dynamicAllocation.*` and
+`spark.shuffle.service.*` namespaces. For more detail, see the
+[configurations page](configuration.html#dynamic-allocation).
+
 ### Resource Allocation Policy
 
 At a high level, Spark should relinquish executors when they are no longer used and acquire
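The application-side half of the new requirements can be satisfied entirely at submit time. A minimal sketch using the standard `spark-submit` `--conf` flags (the master URL and application jar are placeholders):

```bash
# Enable dynamic allocation plus the external shuffle service for a
# single application; these are the two settings the updated docs
# name as requirements.
$SPARK_HOME/bin/spark-submit \
  --master spark://master:7077 \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  your-app.jar
```

The same pair of settings can instead live in `conf/spark-defaults.conf` or be set on a `SparkConf` before the `SparkContext` is created.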
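For the worker-side service in standalone and Mesos modes, a sketch under stated assumptions: only the `spark.shuffle.service.enabled` property and the `start-mesos-shuffle-service.sh` script come from the diff; passing worker daemon options through `SPARK_WORKER_OPTS` in `conf/spark-env.sh` and the `start-slave.sh` invocation are assumptions about a typical standalone setup:

```bash
# Standalone mode: start each worker with the shuffle service enabled.
# (Assumption: worker daemons read SPARK_WORKER_OPTS from spark-env.sh.)
echo 'SPARK_WORKER_OPTS="-Dspark.shuffle.service.enabled=true"' \
  >> "$SPARK_HOME/conf/spark-env.sh"
"$SPARK_HOME/sbin/start-slave.sh" spark://master:7077   # placeholder master URL

# Mesos coarse-grained mode: run the script named in the diff on every
# slave node, e.g. as a long-running task through Marathon.
"$SPARK_HOME/sbin/start-mesos-shuffle-service.sh"
```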
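Step 3 of the YARN instructions corresponds to a `yarn-site.xml` fragment along these lines. The property names and the `YarnShuffleService` class are taken from the diff; the pre-existing `mapreduce_shuffle` entry is an assumption about a typical Hadoop installation:

```xml
<!-- yarn-site.xml on each NodeManager (sketch) -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <!-- keep any services already configured; mapreduce_shuffle is a
       typical example, assumed here -->
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <!-- step 3 in the diff also sets this flag to true -->
  <name>spark.shuffle.service.enabled</name>
  <value>true</value>
</property>
```

After editing the file, step 4 applies: restart every `NodeManager` so the auxiliary service is picked up.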
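Finally, the added paragraph about optional `spark.dynamicAllocation.*` and `spark.shuffle.service.*` settings can be made concrete with a `conf/spark-defaults.conf` sketch. The property names below are documented on the linked configurations page; the values are illustrative only:

```
# Illustrative tuning for dynamic allocation; see
# configuration.html#dynamic-allocation for the full list and defaults.
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true
spark.dynamicAllocation.minExecutors         2
spark.dynamicAllocation.maxExecutors         20
spark.dynamicAllocation.executorIdleTimeout  60s
```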