Commit 0b96d85c authored by Patrick Wendell

Merge pull request #399 from pwendell/consolidate-off

Disable shuffle file consolidation by default

After running various performance tests for the 0.9 release, this still seems to have performance issues even on XFS. So let's keep this off-by-default for 0.9 and users can experiment with it depending on their disk configurations.
parents 0ab505a2 2802cc80
@@ -64,7 +64,7 @@ class ShuffleBlockManager(blockManager: BlockManager) {
   // Turning off shuffle file consolidation causes all shuffle Blocks to get their own file.
   // TODO: Remove this once the shuffle file consolidation feature is stable.
   val consolidateShuffleFiles =
-    conf.getBoolean("spark.shuffle.consolidateFiles", true)
+    conf.getBoolean("spark.shuffle.consolidateFiles", false)
   private val bufferSize = conf.getInt("spark.shuffle.file.buffer.kb", 100) * 1024
......
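The second argument to `getBoolean` in the hunk above is the fallback used when the key is not set, so this one-word change flips the behavior only for users who never configured the property. The sketch below models that lookup with a hypothetical `SimpleConf` class (a stand-in for illustration, not Spark's `SparkConf`), assuming values are stored as strings and parsed with a plain string-to-boolean conversion.

```scala
// Hypothetical stand-in for a configuration object; not Spark's SparkConf.
final class SimpleConf(settings: Map[String, String]) {
  // Returns the parsed value when the key is set, otherwise the supplied default.
  def getBoolean(key: String, default: Boolean): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(default)
}

object ConsolidationDefaultSketch {
  val Key = "spark.shuffle.consolidateFiles"

  // Same shape as the lookup in the hunk, using the new default of false.
  def consolidate(conf: SimpleConf): Boolean = conf.getBoolean(Key, false)

  def main(args: Array[String]): Unit = {
    val unset = new SimpleConf(Map.empty)             // user never set the key
    val optedIn = new SimpleConf(Map(Key -> "true"))  // user enabled it explicitly
    println(consolidate(unset))
    println(consolidate(optedIn))
  }
}
```

An explicit setting always wins over the default, so existing deployments that already set the property to "true" are unaffected by this commit.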
@@ -382,7 +382,7 @@ Apart from these, the following properties are also available, and may be useful
 <tr>
   <td>spark.shuffle.consolidateFiles</td>
-  <td>true</td>
+  <td>false</td>
   <td>
     If set to "true", consolidates intermediate files created during a shuffle. Creating fewer files can improve filesystem performance for shuffles with large numbers of reduce tasks. It is recommended to set this to "true" when using ext4 or xfs filesystems. On ext3, this option might degrade performance on machines with many (>8) cores due to filesystem limitations.
   </td>
......
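With the default now off, users who want to experiment with consolidation on their own disk configuration have to opt in explicitly. A minimal sketch of doing so, assuming Spark 0.9 or later is on the classpath and the application builds its own `SparkConf`; the object name `ConsolidateOptIn` is illustrative only.

```scala
import org.apache.spark.SparkConf

object ConsolidateOptIn {
  def main(args: Array[String]): Unit = {
    // Explicitly enable consolidation, overriding the new default of false.
    val conf = new SparkConf()
      .set("spark.shuffle.consolidateFiles", "true")

    // Read the value back the same way ShuffleBlockManager does.
    println(conf.getBoolean("spark.shuffle.consolidateFiles", false))
  }
}
```

The same `conf` would then be passed to the `SparkContext` constructor; whether consolidation helps is workload and filesystem dependent, as the commit message notes.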