[SPARK-17451][CORE] CoarseGrainedExecutorBackend should inform driver before self-kill
## What changes were proposed in this pull request?

Jira: https://issues.apache.org/jira/browse/SPARK-17451

`CoarseGrainedExecutorBackend` exits the JVM in some failure cases. This is not a problem in itself, but the driver UI captures no specific reason for the exit. This PR adds functionality to `exitExecutor` to notify the driver that the executor is exiting.

## How was this patch tested?

Ran the change in a test environment and took down the shuffle service before the executor could register with it. In the driver logs, where the job failure reason is reported (i.e. `Job aborted due to stage ...`), the correct reason now appears:

Before: `ExecutorLostFailure (executor ZZZZZZZZZ exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.`

After: `ExecutorLostFailure (executor ZZZZZZZZZ exited caused by one of the running tasks) Reason: Unable to create executor due to java.util.concurrent.TimeoutException: Timeout waiting for task.`

Author: Tejas Patil <tejasp@fb.com>

Closes #15013 from tejasapatil/SPARK-17451_inform_driver.
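The notify-before-exit idea described above can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `DriverRef`, `RecordingDriver`, `ExecutorBackendSketch`, the simplified `RemoveExecutor` message, the `notifyDriver` flag, and the injected `exit` callback are all hypothetical stand-ins chosen so the example runs without Spark. A real backend would talk to the driver over RPC and end in `System.exit`.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical message type; a simplified stand-in, not Spark's own class.
final case class RemoveExecutor(executorId: String, reason: String)

// Hypothetical handle to the driver (assumption: one-way send).
trait DriverRef {
  def send(msg: RemoveExecutor): Unit
}

// Test double that records what the driver would have received.
final class RecordingDriver extends DriverRef {
  val received: ArrayBuffer[RemoveExecutor] = ArrayBuffer.empty
  def send(msg: RemoveExecutor): Unit = received.append(msg)
}

// Sketch of an executor backend. `exit` is injected so the example can run
// without terminating the JVM; production code would pass System.exit.
final class ExecutorBackendSketch(
    executorId: String,
    driver: Option[DriverRef],
    exit: Int => Unit) {

  def exitExecutor(code: Int, reason: String, notifyDriver: Boolean = true): Unit = {
    Console.err.println(s"Executor self-exiting due to: $reason")
    if (notifyDriver) {
      // Best effort: a failed notification must not prevent the exit.
      driver.foreach { d =>
        try d.send(RemoveExecutor(executorId, reason))
        catch {
          case NonFatal(e) =>
            Console.err.println(s"Unable to notify the driver: ${e.getMessage}")
        }
      }
    }
    exit(code)
  }
}

object Demo {
  // Returns the messages seen by the driver and the recorded exit code.
  def run(): (List[RemoveExecutor], Option[Int]) = {
    val driver = new RecordingDriver
    var exitCode: Option[Int] = None
    val backend =
      new ExecutorBackendSketch("exec-1", Some(driver), c => exitCode = Some(c))
    backend.exitExecutor(1, "Unable to create executor")
    (driver.received.toList, exitCode)
  }
}
```

In this sketch the notification is sent before the exit callback runs, so the driver side can record the executor's own reason instead of falling back to a generic "RPC client disassociated" message. Whether the real patch sends the message synchronously or asynchronously is not stated in the description above.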
Showing
- core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala: 20 additions, 6 deletions
- core/src/main/scala/org/apache/spark/storage/BlockManager.scala: 3 additions, 0 deletions