[SPARK-21923][CORE] Avoid calling reserveUnrollMemoryForThisTask for every record
## What changes were proposed in this pull request?

When Spark persists data to off-heap (Unsafe) memory, we call `MemoryStore.putIteratorAsBytes`, which needs to synchronize on the `memoryManager` for every record written. This per-record synchronization is unnecessary: we can request more memory at a time and only go back to the memory manager once the current reservation is exhausted, which reduces contention.

## How was this patch tested?

Test case (with 1 executor, 20 cores):

```scala
val start = System.currentTimeMillis()
val data = sc.parallelize(0 until Integer.MAX_VALUE, 100)
  .persist(StorageLevel.OFF_HEAP)
  .count()
println(System.currentTimeMillis() - start)
```

Test results (wall-clock time in ms, five runs each):

| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
| --- | --- | --- | --- | --- | --- |
| before | 27647 | 29108 | 28591 | 28264 | 27232 |
| after | 26868 | 26358 | 27767 | 26653 | 26693 |

Author: Xianyang Liu <xianyang.liu@intel.com>

Closes #19135 from ConeyLiu/memorystore.
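To make the idea concrete, here is a minimal sketch of the chunked-reservation pattern the patch applies. The class and method names, the 1 MB initial chunk, and the 1.5 growth factor are illustrative assumptions, not the actual `MemoryStore` code; `reserveUnrollMemory` stands in for `MemoryStore.reserveUnrollMemoryForThisTask`, which in Spark synchronizes on the `MemoryManager`.

```scala
// Sketch only: illustrates per-record vs. chunked unroll-memory reservation.
// Names, chunk size, and growth factor are assumptions, not Spark's code.
class UnrollWriter(initialChunk: Long, growthFactor: Double) {
  var syncCalls: Int = 0            // how often we hit the "memory manager"
  private var reserved: Long = 0L   // bytes reserved so far
  private var used: Long = 0L       // bytes consumed by records

  // Stand-in for MemoryStore.reserveUnrollMemoryForThisTask; in this
  // sketch the reservation always succeeds.
  private def reserveUnrollMemory(bytes: Long): Boolean = synchronized {
    syncCalls += 1
    reserved += bytes
    true
  }

  // Before the patch: one synchronized reservation per record.
  def writePerRecord(recordSize: Long): Unit = {
    reserveUnrollMemory(recordSize)
    used += recordSize
  }

  // After the patch: reserve a geometrically growing chunk only when the
  // current reservation is exhausted, so a task makes O(log n)
  // reservations instead of O(n).
  def writeChunked(recordSize: Long): Unit = {
    if (used + recordSize > reserved) {
      val request =
        math.max(recordSize,
          math.max(initialChunk, (reserved * (growthFactor - 1)).toLong))
      reserveUnrollMemory(request)
    }
    used += recordSize
  }
}

object UnrollWriterDemo extends App {
  val perRecord = new UnrollWriter(1024 * 1024, 1.5)
  (1 to 100000).foreach(_ => perRecord.writePerRecord(16))
  val chunked = new UnrollWriter(1024 * 1024, 1.5)
  (1 to 100000).foreach(_ => chunked.writeChunked(16))
  println(s"per-record reservations: ${perRecord.syncCalls}") // 100000
  println(s"chunked reservations:    ${chunked.syncCalls}")   // 2 here
}
```

The trade-off is that a chunked writer may hold somewhat more unroll memory than the records strictly need at any moment, in exchange for far fewer synchronized calls on the shared memory manager.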
Showing 2 changed files:

- core/src/main/scala/org/apache/spark/internal/config/package.scala (15 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala (14 additions, 4 deletions)