Memory-optimized shuffle file consolidation
The per-block overhead of shuffle file consolidation has been reduced from more than 300 bytes to 8 bytes (one primitive Long). In a profiler test with 1 million shuffle blocks, the net overhead was ~8,400,000 bytes (about 8.4 bytes per block). The memory-optimized implementation incurs extra CPU overhead, but in this test the shuffle phase ran only about 2% slower and the reduce phase ran 40% faster than without any shuffle file consolidation.
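To illustrate why storing one primitive Long per block is so compact, here is a minimal sketch of a growable array of unboxed Longs. It is not the `PrimitiveVector` class added in this commit: the class name `LongVectorSketch`, its growth policy, and the idea that the stored Long is a file offset are all assumptions made for illustration.

```scala
// Hypothetical sketch, not the PrimitiveVector class from this commit.
// A growable array of unboxed Longs: each stored value occupies 8 bytes
// in the backing array, with no per-element object header or pointer.
final class LongVectorSketch(initialCapacity: Int = 64) {
  require(initialCapacity > 0, "initialCapacity must be positive")

  private var data = new Array[Long](initialCapacity)
  private var count = 0

  /** Number of values appended so far. */
  def size: Int = count

  /** Returns the value at `index`, which must be in [0, size). */
  def apply(index: Int): Long = {
    require(index >= 0 && index < count, s"index $index out of range")
    data(index)
  }

  /** Appends a value, doubling the backing array when it is full. */
  def +=(value: Long): Unit = {
    if (count == data.length) {
      data = java.util.Arrays.copyOf(data, data.length * 2)
    }
    data(count) = value
    count += 1
  }
}

object LongVectorSketchDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: the single Long kept per shuffle block is its start offset
    // within a consolidated file. The commit message does not say this.
    val offsets = new LongVectorSketch()
    var offset = 0L
    for (blockSize <- Seq(100L, 250L, 75L)) {
      offsets += offset
      offset += blockSize
    }
    println(offsets.size) // prints 3
    println(offsets(2))   // prints 350
  }
}
```

A boxed collection such as `ArrayBuffer[java.lang.Long]` would add an object header and a reference per element, which is the kind of per-block cost a primitive array avoids.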
Showing
- core/src/main/scala/org/apache/spark/storage/BlockManager.scala (3 additions, 7 deletions)
- core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala (7 additions, 8 deletions)
- core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala (11 additions, 43 deletions)
- core/src/main/scala/org/apache/spark/storage/DiskStore.scala (2 additions, 2 deletions)
- core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala (196 additions, 16 deletions)
- core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala (1 addition, 1 deletion)
- core/src/main/scala/org/apache/spark/util/PrimitiveVector.scala (48 additions, 0 deletions)
- core/src/test/scala/org/apache/spark/storage/DiskBlockManagerSuite.scala (80 additions, 0 deletions)