Commit 4ea23db0 authored by Patrick Wendell

SPARK-1019: pyspark RDD take() throws an NPE

Author: Patrick Wendell <pwendell@gmail.com>

Closes #112 from pwendell/pyspark-take and squashes the following commits:

daae80e [Patrick Wendell] SPARK-1019: pyspark RDD take() throws an NPE
parent 6bd2eaa4
@@ -46,6 +46,7 @@ class TaskContext(
}
def executeOnCompleteCallbacks() {
-onCompleteCallbacks.foreach{_()}
+// Process complete callbacks in the reverse order of registration
+onCompleteCallbacks.reverse.foreach{_()}
}
}
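The effect of the hunk above can be sketched in isolation. The following is a minimal illustration, not the Spark source: `CallbackRegistry` and `ReverseOrderDemo` are hypothetical names, and storing the callbacks in an `ArrayBuffer` is an assumption. It only shows that the most recently registered callback now runs first.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for TaskContext; only the callback list is modelled.
class CallbackRegistry {
  // Assumption: callbacks are kept in registration order in a buffer.
  private val onCompleteCallbacks = new ArrayBuffer[() => Unit]

  def addOnCompleteCallback(f: () => Unit): Unit = onCompleteCallbacks += f

  // Run callbacks last-registered-first, mirroring the patched method.
  def executeOnCompleteCallbacks(): Unit = onCompleteCallbacks.reverse.foreach(_())
}

object ReverseOrderDemo {
  // Registers two callbacks and returns the order in which they actually ran.
  def run(): List[String] = {
    val log = new ArrayBuffer[String]
    val registry = new CallbackRegistry
    registry.addOnCompleteCallback(() => log += "close input")  // registered first
    registry.addOnCompleteCallback(() => log += "stop reader")  // registered second
    registry.executeOnCompleteCallbacks()
    log.toList
  }
}
```

With reverse ordering, a callback registered later (for example by a downstream consumer) gets to run before the callbacks of whatever it depends on.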
@@ -100,6 +100,14 @@ private[spark] class PythonRDD[T: ClassTag](
}
}.start()
+/*
+ * Partial fix for SPARK-1019: Attempts to stop reading the input stream since
+ * other completion callbacks might invalidate the input. Because interruption
+ * is not synchronous this still leaves a potential race where the interruption is
+ * processed only after the stream becomes invalid.
+ */
+context.addOnCompleteCallback(() => context.interrupted = true)
// Return an iterator that read lines from the process's stdout
val stream = new DataInputStream(new BufferedInputStream(worker.getInputStream, bufferSize))
val stdoutIterator = new Iterator[Array[Byte]] {
......
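The comment in the second hunk describes interruption as asynchronous: setting the flag does not stop a reader until the reader next checks it. The sketch below illustrates that cooperative pattern under stated assumptions. `InterruptFlag`, `FlagCheckingIterator`, and `InterruptDemo` are hypothetical names, and polling the flag in `hasNext` is an assumption about how a reader might consume it, not a description of Spark's implementation.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for the task context's interruption flag.
class InterruptFlag {
  @volatile var interrupted: Boolean = false
}

// Wraps an iterator and stops yielding once the flag is observed as set.
// The flag is checked only in hasNext, so a read already in progress is not
// cancelled; that gap is the kind of race the commit comment warns about.
class FlagCheckingIterator[A](flag: InterruptFlag, underlying: Iterator[A])
    extends Iterator[A] {
  def hasNext: Boolean = !flag.interrupted && underlying.hasNext
  def next(): A = underlying.next()
}

object InterruptDemo {
  // Consumes elements until the flag is set, and returns what was read.
  def run(): List[Int] = {
    val flag = new InterruptFlag
    val it = new FlagCheckingIterator(flag, Iterator(1, 2, 3, 4))
    val seen = ListBuffer[Int]()
    while (it.hasNext) {
      val x = it.next()
      seen += x
      // Plays the role of the completion callback setting the flag.
      if (x == 2) flag.interrupted = true
    }
    seen.toList
  }
}
```

Element 2 is still processed in full even though the flag is set during its handling; the reader only stops at the next `hasNext` check. In a multi-threaded setting the same delay means the input may already be invalid by the time the flag is seen, which is why the commit calls this a partial fix.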