Commit c6f4e704 authored by Sandy Ryza, committed by Patrick Wendell

SPARK-4230. Doc for spark.default.parallelism is incorrect

Author: Sandy Ryza <sandy@cloudera.com>

Closes #3107 from sryza/sandy-spark-4230 and squashes the following commits:

37a1d19 [Sandy Ryza] Clear up a couple things
34d53de [Sandy Ryza] SPARK-4230. Doc for spark.default.parallelism is incorrect
parent c5db8e2c
@@ -562,6 +562,9 @@ Apart from these, the following properties are also available, and may be useful
 <tr>
 <td><code>spark.default.parallelism</code></td>
 <td>
+For distributed shuffle operations like <code>reduceByKey</code> and <code>join</code>, the
+largest number of partitions in a parent RDD. For operations like <code>parallelize</code>
+with no parent RDDs, it depends on the cluster manager:
 <ul>
 <li>Local mode: number of cores on the local machine</li>
 <li>Mesos fine grained mode: 8</li>
@@ -569,8 +572,8 @@ Apart from these, the following properties are also available, and may be useful
 </ul>
 </td>
 <td>
-Default number of tasks to use across the cluster for distributed shuffle operations
-(<code>groupByKey</code>, <code>reduceByKey</code>, etc) when not set by user.
+Default number of partitions in RDDs returned by transformations like <code>join</code>,
+<code>reduceByKey</code>, and <code>parallelize</code> when not set by user.
 </td>
 </tr>
 <tr>
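The diff above only changes documentation wording, but the behavior it describes can be checked directly. Below is a minimal, hypothetical Scala sketch (not part of this commit) showing how spark.default.parallelism determines partition counts for parallelize and for reduceByKey when no explicit partition count is passed; the app name "ParallelismDemo", the local master, and the value 6 are made up for illustration.

// Minimal sketch (not part of this commit): how spark.default.parallelism
// feeds into partition counts. Assumes a local deployment; the app name
// "ParallelismDemo" and the value 6 are arbitrary.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // pair-RDD implicits on older Spark versions

object ParallelismDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("ParallelismDemo")
      .set("spark.default.parallelism", "6")
    val sc = new SparkContext(conf)

    // parallelize has no parent RDD, so with no numSlices argument it uses
    // spark.default.parallelism (here: 6).
    val nums = sc.parallelize(1 to 100)
    println(s"parallelize partitions: ${nums.partitions.length}")

    // reduceByKey with no explicit numPartitions also picks up
    // spark.default.parallelism when the property is set; if it were unset,
    // it would fall back to the largest parent RDD's partition count.
    val counts = nums.map(x => (x % 10, 1)).reduceByKey(_ + _)
    println(s"reduceByKey partitions: ${counts.partitions.length}")

    sc.stop()
  }
}

If spark.default.parallelism is left unset, reduceByKey instead falls back to the largest number of partitions among its parent RDDs, which is exactly the distinction the corrected documentation text spells out.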