Skip to content
Snippets Groups Projects
Commit b6738520 authored by Guillaume Poulin's avatar Guillaume Poulin Committed by Reynold Xin
Browse files

[SPARK-12678][CORE] MapPartitionsRDD clearDependencies

MapPartitionsRDD was keeping a reference to `prev` after a call to
`clearDependencies` which could lead to memory leak.

Author: Guillaume Poulin <poulin.guillaume@gmail.com>

Closes #10623 from gpoulin/map_partition_deps.
parent 174e72ce
No related branches found
No related tags found
No related merge requests found
......@@ -25,7 +25,7 @@ import org.apache.spark.{Partition, TaskContext}
* An RDD that applies the provided function to every partition of the parent RDD.
*/
private[spark] class MapPartitionsRDD[U: ClassTag, T: ClassTag](
prev: RDD[T],
var prev: RDD[T],
f: (TaskContext, Int, Iterator[T]) => Iterator[U], // (TaskContext, partition index, iterator)
preservesPartitioning: Boolean = false)
extends RDD[U](prev) {
......@@ -36,4 +36,9 @@ private[spark] class MapPartitionsRDD[U: ClassTag, T: ClassTag](
override def compute(split: Partition, context: TaskContext): Iterator[U] =
f(context, split.index, firstParent[T].iterator(split, context))
override def clearDependencies() {
super.clearDependencies()
prev = null
}
}
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment