Commit bc87cc41 authored by Davies Liu, committed by Josh Rosen

[SPARK-3731] [PySpark] fix memory leak in PythonRDD

The parent.getOrCompute() call of PythonRDD is executed in a separate thread, so that thread should release the memory it reserved for shuffle and unrolling when it finishes (in a finally block).

Author: Davies Liu <davies.liu@gmail.com>

Closes #2668 from davies/leak and squashes the following commits:

ae98be2 [Davies Liu] fix memory leak in PythonRDD
parent 65503296
@@ -247,6 +247,11 @@ private[spark] class PythonRDD(
             // will kill the whole executor (see org.apache.spark.executor.Executor).
             _exception = e
             worker.shutdownOutput()
+        } finally {
+          // Release memory used by this thread for shuffles
+          env.shuffleMemoryManager.releaseMemoryForThisThread()
+          // Release memory used by this thread for unrolling blocks
+          env.blockManager.memoryStore.releaseUnrollMemoryForThisThread()
         }
       }
     }
......
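
To illustrate why the release belongs in a finally block, here is a minimal, self-contained sketch. It does not use Spark's classes: ThreadMemoryTracker, LeakDemo and runInWriterThread are hypothetical names invented for this example, standing in for a manager that accounts for reserved memory per thread. The point is only that a helper thread which reserves memory under its own thread ID must release it itself, even when its body throws, because no other thread will do so on its behalf.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a per-thread memory manager (not Spark's actual class).
class ThreadMemoryTracker {
  // Maps thread ID -> bytes currently reserved by that thread.
  private val reserved = mutable.HashMap[Long, Long]()

  def reserve(bytes: Long): Unit = synchronized {
    val id = Thread.currentThread().getId
    reserved(id) = reserved.getOrElse(id, 0L) + bytes
  }

  // Drops everything reserved by the calling thread.
  def releaseForThisThread(): Unit = synchronized {
    reserved.remove(Thread.currentThread().getId)
  }

  def totalReserved: Long = synchronized { reserved.values.sum }
}

object LeakDemo {
  // Runs `body` on a separate thread and returns once that thread has finished.
  def runInWriterThread(tracker: ThreadMemoryTracker)(body: => Unit): Unit = {
    val t = new Thread(new Runnable {
      override def run(): Unit = {
        try {
          body
        } catch {
          case _: Exception => // swallowed here; real code would record it
        } finally {
          // Without this, the entry for this thread would stay in the tracker.
          tracker.releaseForThisThread()
        }
      }
    })
    t.start()
    t.join()
  }

  def main(args: Array[String]): Unit = {
    val tracker = new ThreadMemoryTracker
    runInWriterThread(tracker) {
      tracker.reserve(1024L)
      throw new RuntimeException("simulated failure")
    }
    println(tracker.totalReserved) // prints 0
  }
}
```

If the finally clause is removed from this sketch, the 1024 bytes stay attributed to a thread that has already exited, which is the kind of leak the commit message describes.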