Commit 342278c0 authored by Sean Owen's avatar Sean Owen Committed by Yin Huai

[SPARK-16320][DOC] Document G1 heap region's effect on spark 2.0 vs 1.6

## What changes were proposed in this pull request?

Collect the GC discussion in one section, and document findings about the G1 GC heap region size.

## How was this patch tested?

Jekyll doc build

Author: Sean Owen <sowen@cloudera.com>

Closes #14732 from srowen/SPARK-16320.
parent 209e1b3c
@@ -122,21 +122,8 @@ large records.
 `R` is the storage space within `M` where cached blocks are immune to being evicted by execution.
 The value of `spark.memory.fraction` should be set in order to fit this amount of heap space
-comfortably within the JVM's old or "tenured" generation. Otherwise, when much of this space is
-used for caching and execution, the tenured generation will be full, which causes the JVM to
-significantly increase the time spent in garbage collection. See the
-<a href="https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/sizing.html">Java GC sizing documentation</a>
-for more information.
-The tenured generation's size is controlled by the JVM's `NewRatio` parameter, which defaults to 2,
-meaning that the tenured generation is 2 times the size of the new generation (the rest of the heap).
-So, by default, the tenured generation occupies 2/3, or about 0.66, of the heap. A value of
-0.6 for `spark.memory.fraction` keeps storage and execution memory within the old generation with
-room to spare. If `spark.memory.fraction` is increased to, say, 0.8, then `NewRatio` may have to
-increase to 6 or more.
-`NewRatio` is set as a JVM flag for executors, which means adding
-`spark.executor.extraJavaOptions=-XX:NewRatio=x` to a Spark job's configuration.
+comfortably within the JVM's old or "tenured" generation. See the discussion of advanced GC
+tuning below for details.
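
As a concrete illustration of the relationship described above, here is a minimal sketch (not part of the patch itself; the 0.6 fraction and `NewRatio=2` values are assumptions, chosen so that the tenured generation, 2/3 of the heap, comfortably contains Spark's managed memory):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch: keep Spark's managed memory below the share of the
// heap that the tenured generation occupies. With NewRatio=2 (a common
// JVM default), OldGen is 2/3 of the heap, which comfortably holds
// spark.memory.fraction = 0.6.
val conf = new SparkConf()
  .setAppName("gc-tuning-sketch")
  .set("spark.memory.fraction", "0.6")
  // Executor JVM flags must be set before the executors launch:
  .set("spark.executor.extraJavaOptions", "-XX:NewRatio=2")
val sc = new SparkContext(conf)
```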
 
 ## Determining Memory Consumption
@@ -217,14 +204,22 @@ temporary objects created during task execution. Some steps which may be useful
 * Check if there are too many garbage collections by collecting GC stats. If a full GC is invoked multiple times
   before a task completes, it means that there isn't enough memory available for executing tasks.
-* In the GC stats that are printed, if the OldGen is close to being full, reduce the amount of
-  memory used for caching by lowering `spark.memory.storageFraction`; it is better to cache fewer
-  objects than to slow down task execution!
 * If there are too many minor collections but not many major GCs, allocating more memory for Eden would help. You
   can set the size of the Eden to be an over-estimate of how much memory each task will need. If the size of Eden
   is determined to be `E`, then you can set the size of the Young generation using the option `-Xmn=4/3*E`. (The
   scaling up by 4/3 is to account for space used by survivor regions as well; see the worked sketch after this list.)
+* In the GC stats that are printed, if the OldGen is close to being full, reduce the amount of
+  memory used for caching by lowering `spark.memory.fraction`; it is better to cache fewer
+  objects than to slow down task execution. Alternatively, consider decreasing the size of
+  the Young generation. This means lowering `-Xmn` if you've set it as above. If not, try changing the
+  value of the JVM's `NewRatio` parameter. Many JVMs default this to 2, meaning that the Old generation
+  occupies 2/3 of the heap. It should be large enough that this fraction exceeds `spark.memory.fraction`.
+* Try the G1GC garbage collector with `-XX:+UseG1GC`. It can improve performance in some situations where
+  garbage collection is a bottleneck. Note that with large executor heap sizes, it may be important to
+  increase the [G1 region size](https://blogs.oracle.com/g1gc/entry/g1_gc_tuning_a_case)
+  with `-XX:G1HeapRegionSize` (see the example after this list).
 * As an example, if your task is reading data from HDFS, the amount of memory used by the task can be estimated using
   the size of the data block read from HDFS. Note that the size of a decompressed block is often 2 or 3 times the
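
As a worked version of the Eden-sizing rule in the list above, here is a small sketch. The inputs are assumptions for illustration (a 128 MB HDFS block decompressing to roughly 3x its size, four concurrent tasks per executor), not values from the patch:

```scala
// Hypothetical -Xmn sizing, following the 4/3 rule above.
val blockMb         = 128L                          // HDFS block size (assumed)
val perTaskMb       = blockMb * 3                   // ~3x after decompression
val concurrentTasks = 4                             // cores per executor (assumed)
val edenMb          = perTaskMb * concurrentTasks   // over-estimate of Eden, E
val youngGenMb      = edenMb * 4 / 3                // 4/3 * E covers survivor space
println(s"-Xmn${youngGenMb}m")                      // prints: -Xmn2048m
```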
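
Similarly, a sketch of enabling G1 with a larger region size for executors; the 16m value is illustrative (the JVM accepts power-of-two region sizes between 1m and 32m), not a recommendation from this patch:

```scala
import org.apache.spark.SparkConf

// Sketch: enable G1GC and enlarge its region size for a large executor heap.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
       "-XX:+UseG1GC -XX:G1HeapRegionSize=16m")
```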
@@ -237,6 +232,9 @@ Our experience suggests that the effect of GC tuning depends on your application
 There are [many more tuning options](http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html) described online,
 but at a high level, managing how frequently full GC takes place can help in reducing the overhead.
+GC tuning flags for executors can be specified by setting `spark.executor.extraJavaOptions` in
+a job's configuration.
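
For example, a sketch of doing so; the GC-logging flags shown are a common pre-Java 9 way to collect the GC stats mentioned earlier, and are an illustrative choice rather than part of this patch:

```scala
import org.apache.spark.SparkConf

// Sketch: pass GC flags to executors through the job's configuration.
// -verbose:gc and the Print* flags (valid on pre-Java 9 JVMs) emit the
// GC stats discussed above.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
       "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
```

The same value can equivalently be passed on the command line via `spark-submit --conf`.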
 # Other Considerations
 
 ## Level of Parallelism