[SPARK-13522][CORE] Executor should kill itself when it's unable to heartbeat to driver more than N times

## What changes were proposed in this pull request?

Sometimes a network disconnection event is not triggered, because of race conditions that we may not have anticipated. The executor then keeps sending heartbeats to the driver and never exits. This PR adds a new configuration, `spark.executor.heartbeat.maxFailures`, that kills the executor when it is unable to heartbeat to the driver more than `spark.executor.heartbeat.maxFailures` times.

## How was this patch tested?

Unit tests.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #11401 from zsxwing/SPARK-13522.
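The failure-counting idea described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this patch: `HeartbeatMonitor`, `reportHeartbeat`, and `onGiveUp` are hypothetical names, the `maxFailures` argument stands in for the value of `spark.executor.heartbeat.maxFailures`, and two behaviours are assumptions made for the sketch: the counter resets after a successful heartbeat, and the executor gives up as soon as the count reaches the limit.

```scala
import scala.util.control.NonFatal

// Hypothetical helper (not Spark's actual class): counts heartbeat failures
// and invokes a callback once the configured limit is reached.
final class HeartbeatMonitor(maxFailures: Int, onGiveUp: () => Unit) {
  private var failures = 0

  // Number of failures since the last successful heartbeat.
  def consecutiveFailures: Int = failures

  // Runs one heartbeat attempt. `send` is expected to throw on failure.
  def reportHeartbeat(send: () => Unit): Unit = {
    try {
      send()
      // Assumption: a successful heartbeat resets the counter.
      failures = 0
    } catch {
      case NonFatal(_) =>
        failures += 1
        // Assumption: give up once the count reaches the limit.
        if (failures >= maxFailures) onGiveUp()
    }
  }
}
```

In a real executor the `onGiveUp` callback would presumably terminate the process with a dedicated exit code (the patch touches `ExecutorExitCode.scala`); passing it in as a function keeps the sketch testable without calling `System.exit`.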
Showing 2 changed files
- core/src/main/scala/org/apache/spark/executor/Executor.scala: 21 additions, 1 deletion
- core/src/main/scala/org/apache/spark/executor/ExecutorExitCode.scala: 8 additions, 0 deletions