[SPARK-17675][CORE] Expand Blacklist for TaskSets
## What changes were proposed in this pull request?

This is a step along the way to SPARK-8425. To enable incremental review, the first step proposed here is to expand the blacklisting within tasksets. In particular, this will enable blacklisting for:

* (task, executor) pairs (this already exists via an undocumented config)
* (task, node)
* (taskset, executor)
* (taskset, node)

Adding (task, node) is critical to making Spark tolerant of a single bad disk in a cluster, without requiring careful tuning of "spark.task.maxFailures". The other additions are also important to avoid many misleading task failures and long scheduling delays when there is one bad node on a large cluster.

Note that some of the code changes here aren't really required for just this change. They put pieces in place for SPARK-8425 even though they are not used yet (e.g. the `BlacklistTracker` helper is a little out of place, `TaskSetBlacklist` holds onto a little more info than it needs to for just this change, and `ExecutorFailuresInTaskSet` is more complex than it needs to be).

## How was this patch tested?

Added unit tests; ran tests via Jenkins.

Author: Imran Rashid <irashid@cloudera.com>
Author: mwws <wei.mao@intel.com>

Closes #15249 from squito/taskset_blacklist_only.
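To make the four blacklisting levels concrete, here is a minimal, self-contained sketch of how per-taskset failure tracking could be organized. It is not the code from this patch: the class `TaskSetBlacklistSketch`, the `Limits` case class, its field names, and its default thresholds are all hypothetical and chosen only for illustration. The counting rules (attempts per task on an executor or node, distinct failed tasks per executor, blacklisted executors per node) are likewise an assumption about one reasonable way to implement the levels listed above.

```scala
import scala.collection.mutable

// Hypothetical thresholds; names and defaults are illustrative assumptions.
final case class Limits(
    maxTaskAttemptsPerExecutor: Int = 1,
    maxTaskAttemptsPerNode: Int = 2,
    maxFailedTasksPerExecutor: Int = 2,
    maxFailedExecutorsPerNode: Int = 2)

// Simplified per-taskset tracker (a sketch, not the TaskSetBlacklist class in the patch).
final class TaskSetBlacklistSketch(limits: Limits) {
  // executor id -> (task index -> number of failed attempts)
  private val failuresByExec = mutable.Map.empty[String, mutable.Map[Int, Int]]
  // node -> executors on that node that have had at least one failure
  private val execsByNode = mutable.Map.empty[String, mutable.Set[String]]

  def recordFailure(node: String, exec: String, task: Int): Unit = {
    val perTask = failuresByExec.getOrElseUpdate(exec, mutable.Map.empty[Int, Int])
    perTask(task) = perTask.getOrElse(task, 0) + 1
    execsByNode.getOrElseUpdate(node, mutable.Set.empty[String]) += exec
  }

  private def attempts(exec: String, task: Int): Int =
    failuresByExec.get(exec).flatMap(_.get(task)).getOrElse(0)

  // (task, executor): too many failed attempts of this task on this executor.
  def isExecutorBlacklistedForTask(exec: String, task: Int): Boolean =
    attempts(exec, task) >= limits.maxTaskAttemptsPerExecutor

  // (task, node): too many failed attempts of this task across the node's executors.
  def isNodeBlacklistedForTask(node: String, task: Int): Boolean = {
    val execs = execsByNode.getOrElse(node, mutable.Set.empty[String])
    execs.toSeq.map(e => attempts(e, task)).sum >= limits.maxTaskAttemptsPerNode
  }

  // (taskset, executor): too many distinct tasks have failed on this executor.
  def isExecutorBlacklistedForTaskSet(exec: String): Boolean =
    failuresByExec.get(exec).map(_.size).getOrElse(0) >= limits.maxFailedTasksPerExecutor

  // (taskset, node): too many executors on this node are blacklisted for the taskset.
  def isNodeBlacklistedForTaskSet(node: String): Boolean = {
    val execs = execsByNode.getOrElse(node, mutable.Set.empty[String])
    execs.count(e => isExecutorBlacklistedForTaskSet(e)) >= limits.maxFailedExecutorsPerNode
  }
}
```

With these assumed thresholds, a single failure of task 0 on one executor keeps that task off that executor, while a second failure of the same task on another executor of the same host keeps the task off the whole host. That second rule is what lets a job route around one bad disk without raising "spark.task.maxFailures".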
Changed files:

- core/src/main/scala/org/apache/spark/SparkConf.scala (3 additions, 1 deletion)
- core/src/main/scala/org/apache/spark/TaskEndReason.scala (11 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/internal/config/package.scala (45 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala (114 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/scheduler/ExecutorFailuresInTaskSet.scala (50 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala (15 additions, 16 deletions)
- core/src/main/scala/org/apache/spark/scheduler/TaskSetBlacklist.scala (128 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala (144 additions, 132 deletions)
- core/src/test/scala/org/apache/spark/scheduler/BlacklistIntegrationSuite.scala (27 additions, 25 deletions)
- core/src/test/scala/org/apache/spark/scheduler/BlacklistTrackerSuite.scala (81 additions, 0 deletions)
- core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala (2 additions, 2 deletions)
- core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala (10 additions, 12 deletions)
- core/src/test/scala/org/apache/spark/scheduler/TaskSetBlacklistSuite.scala (163 additions, 0 deletions)
- core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala (123 additions, 8 deletions)
- core/src/test/scala/org/apache/spark/serializer/KryoSerializerDistributedSuite.scala (3 additions, 1 deletion)
- docs/configuration.md (43 additions, 0 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLListenerSuite.scala (2 additions, 1 deletion)