-
- Downloads
[SPARK-3885] Provide mechanism to remove accumulators once they are no longer used
Instead of storing a strong reference to accumulators, I've replaced this with a weak reference and updated any code that uses these accumulators to check whether the reference resolves before using the accumulator. A weak reference will be cleared when there is no longer an existing copy of the variable versus using a soft reference in which case accumulators would only be cleared when the GC explicitly ran out of memory. Author: Ilya Ganelin <ilya.ganelin@capitalone.com> Closes #4021 from ilganeli/SPARK-3885 and squashes the following commits: 4ba9575 [Ilya Ganelin] Fixed error in test suite 8510943 [Ilya Ganelin] Extra code bb76ef0 [Ilya Ganelin] File deleted somehow 283a333 [Ilya Ganelin] Added cleanup method for accumulators to remove stale references within Accumulators.original to accumulators that are now out of scope 345fd4f [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885 7485a82 [Ilya Ganelin] Fixed build error c8e0f2b [Ilya Ganelin] Added working test for accumulator garbage collection 94ce754 [Ilya Ganelin] Still not being properly garbage collected 8722b63 [Ilya Ganelin] Fixing gc test 7414a9c [Ilya Ganelin] Added test for accumulator garbage collection 18d62ec [Ilya Ganelin] Updated to throw Exception when accessing a GCd accumulator 9a81928 [Ilya Ganelin] Reverting permissions changes 28f705c [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885 b820ab4b [Ilya Ganelin] reset d78f4bf [Ilya Ganelin] Removed obsolete comment 0746e61 [Ilya Ganelin] Updated DAGSchedulerSUite to fix bug 3350852 [Ilya Ganelin] Updated DAGScheduler and Suite to correctly use new implementation of WeakRef Accumulator storage c49066a [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885 cbb9023 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885 a77d11b [Ilya Ganelin] Updated Accumulators class to store weak references instead of strong references to allow garbage collection of old accumulators
Showing
- core/src/main/scala/org/apache/spark/Accumulators.scala 28 additions, 8 deletionscore/src/main/scala/org/apache/spark/Accumulators.scala
- core/src/main/scala/org/apache/spark/ContextCleaner.scala 20 additions, 0 deletionscore/src/main/scala/org/apache/spark/ContextCleaner.scala
- core/src/main/scala/org/apache/spark/SparkContext.scala 21 additions, 7 deletionscore/src/main/scala/org/apache/spark/SparkContext.scala
- core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 9 additions, 1 deletion.../main/scala/org/apache/spark/scheduler/DAGScheduler.scala
- core/src/test/scala/org/apache/spark/AccumulatorSuite.scala 20 additions, 0 deletionscore/src/test/scala/org/apache/spark/AccumulatorSuite.scala
- core/src/test/scala/org/apache/spark/ContextCleanerSuite.scala 4 additions, 0 deletions...src/test/scala/org/apache/spark/ContextCleanerSuite.scala
- core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala 5 additions, 1 deletion.../scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
Loading
Please register or sign in to comment