Commit 9c041990 authored by Michael Gummelt, committed by Reynold Xin

[MESOS] expand coarse-grained mode docs

## What changes were proposed in this pull request?

docs

## How was this patch tested?

viewed the docs in github

Author: Michael Gummelt <mgummelt@mesosphere.io>

Closes #14059 from mgummelt/coarse-grained.
parent a8f89df3
@@ -180,30 +180,53 @@ Note that jars or python files that are passed to spark-submit should be URIs re
# Mesos Run Modes
Spark can run over Mesos in two modes: "coarse-grained" (default) and "fine-grained".
The "coarse-grained" mode will launch only *one* long-running Spark task on each Mesos
machine, and dynamically schedule its own "mini-tasks" within it. The benefit is much lower startup
overhead, but at the cost of reserving the Mesos resources for the complete duration of the
application.
Coarse-grained is the default mode. You can also set the `spark.mesos.coarse` property to true
to turn it on explicitly in [SparkConf](configuration.html#spark-properties):
{% highlight scala %}
conf.set("spark.mesos.coarse", "true")
{% endhighlight %}
In addition, for coarse-grained mode, you can control the maximum number of resources Spark will
acquire. By default, it will acquire *all* cores in the cluster (that get offered by Mesos), which
only makes sense if you run just one application at a time. You can cap the maximum number of cores
using `conf.set("spark.cores.max", "10")` (for example).
In "fine-grained" mode, each Spark task runs as a separate Mesos task. This allows
multiple instances of Spark (and other frameworks) to share machines at a very fine granularity,
where each application gets more or fewer machines as it ramps up and down, but it comes with an
additional overhead in launching each task. This mode may be inappropriate for low-latency
requirements like interactive queries or serving web requests.
Spark can run over Mesos in two modes: "coarse-grained" (default) and
"fine-grained".
## Coarse-Grained
In "coarse-grained" mode, each Spark executor runs as a single Mesos
task. Spark executors are sized according to the following
configuration variables:
* Executor memory: `spark.executor.memory`
* Executor cores: `spark.executor.cores`
* Number of executors: `spark.cores.max`/`spark.executor.cores`
Please see the [Spark Configuration](configuration.html) page for
details and default values.
Executors are brought up eagerly when the application starts, until
`spark.cores.max` is reached. If you don't set `spark.cores.max`, the
Spark application will reserve all resources offered to it by Mesos,
so be sure to set this variable in any sort of multi-tenant cluster,
including one that runs multiple concurrent Spark applications.
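For example, a minimal sketch of sizing coarse-grained executors
(the values and the master URL below are illustrative, not defaults):
{% highlight scala %}
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative values only -- see the Spark Configuration page for defaults.
val conf = new SparkConf()
  .setMaster("mesos://master-host:5050")  // placeholder Mesos master URL
  .setAppName("CoarseGrainedSizingExample")
  .set("spark.executor.memory", "4g")     // memory per executor
  .set("spark.executor.cores", "4")       // cores per executor
  .set("spark.cores.max", "16")           // total cores for the application
// Number of executors = spark.cores.max / spark.executor.cores = 16 / 4 = 4
val sc = new SparkContext(conf)
{% endhighlight %}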
The scheduler will start executors round-robin on the offers Mesos
gives it, but there are no spread guarantees, as Mesos does not
provide such guarantees on the offer stream.
The benefit of coarse-grained mode is much lower startup overhead, at
the cost of reserving Mesos resources for the complete duration of
the application. To configure your job to dynamically adjust to its
resource requirements, look into
[Dynamic Allocation](#dynamic-resource-allocation-with-mesos).
## Fine-Grained
In "fine-grained" mode, each Spark task inside the Spark executor runs
as a separate Mesos task. This allows multiple instances of Spark (and
other frameworks) to share cores at a very fine granularity, where
each application gets more or fewer cores as it ramps up and down, but
it comes with an additional overhead in launching each task. This mode
may be inappropriate for low-latency requirements like interactive
queries or serving web requests.
Note that while Spark tasks in fine-grained mode will relinquish
cores as they terminate, they will not relinquish memory, as the JVM
does not give memory back to the Operating System. Nor will executors
terminate when they are idle.
To run in fine-grained mode, set the `spark.mesos.coarse` property to false in your
[SparkConf](configuration.html#spark-properties):
@@ -212,7 +235,9 @@ To run in fine-grained mode, set the `spark.mesos.coarse` property to false in y
conf.set("spark.mesos.coarse", "false")
{% endhighlight %}
You may also make use of `spark.mesos.constraints` to set attribute based constraints on mesos resource offers. By default, all resource offers will be accepted.
You may also make use of `spark.mesos.constraints` to set
attribute-based constraints on Mesos resource offers. By default, all
resource offers will be accepted.
{% highlight scala %}
conf.set("spark.mesos.constraints", "os:centos7;us-east-1:false")
@@ -246,7 +271,7 @@ In either case, HDFS runs separately from Hadoop MapReduce, without being schedu
# Dynamic Resource Allocation with Mesos
Mesos supports dynamic allocation only with coarse-grain mode, which can resize the number of
Mesos supports dynamic allocation only with coarse-grained mode, which can resize the number of
executors based on statistics of the application. For general information,
see [Dynamic Resource Allocation](job-scheduling.html#dynamic-resource-allocation).
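As a rough sketch, and assuming an external shuffle service is
running on each agent node, dynamic allocation can be enabled in
[SparkConf](configuration.html#spark-properties) with:
{% highlight scala %}
// Sketch only: dynamic allocation relies on the external shuffle service.
conf.set("spark.dynamicAllocation.enabled", "true")
conf.set("spark.shuffle.service.enabled", "true")
{% endhighlight %}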