Commit 51be6336 authored by jinxing, committed by Kay Ousterhout

[SPARK-19777] Scan runningTasksSet when checking for speculatable tasks in TaskSetManager.

## What changes were proposed in this pull request?

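When checking for speculatable tasks in `TaskSetManager`, scan only `runningTasksSet` instead of every entry in `taskInfos`. Only a task that is still running can be a candidate for speculation, so the check no longer visits tasks that have already finished.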

## How was this patch tested?
Existing tests.

Author: jinxing <jinxing6042@126.com>

Closes #17111 from jinxing64/SPARK-19777.
parent db0ddce5
@@ -906,8 +906,6 @@ private[spark] class TaskSetManager(
    * Check for tasks to be speculated and return true if there are any. This is called periodically
    * by the TaskScheduler.
    *
-   * TODO: To make this scale to large jobs, we need to maintain a list of running tasks, so that
-   * we don't scan the whole task set. It might also help to make this sorted by launch time.
    */
   override def checkSpeculatableTasks(minTimeToSpeculation: Int): Boolean = {
     // Can't speculate if we only have one task, and no need to speculate if the task set is a
@@ -927,7 +925,8 @@ private[spark] class TaskSetManager(
       // TODO: Threshold should also look at standard deviation of task durations and have a lower
       // bound based on that.
       logDebug("Task length threshold for speculation: " + threshold)
-      for ((tid, info) <- taskInfos) {
+      for (tid <- runningTasksSet) {
+        val info = taskInfos(tid)
         val index = info.index
         if (!successful(index) && copiesRunning(index) == 1 && info.timeRunning(time) > threshold &&
           !speculatableTasks.contains(index)) {
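
The effect of the changed loop can be sketched outside Spark as follows. This is a minimal, self-contained illustration, not the actual `TaskSetManager` code: `TaskInfoSketch`, `SpeculationSketch`, and `findSpeculatable` are hypothetical names, the collection types are assumptions, and the body of the `if` (which the diff above elides) is assumed to record the task index as speculatable.

```scala
import scala.collection.mutable

// Hypothetical stand-in for Spark's TaskInfo; only the fields the check reads.
final case class TaskInfoSketch(index: Int, launchTime: Long) {
  def timeRunning(now: Long): Long = now - launchTime
}

object SpeculationSketch {
  /** Visits only the running task IDs and returns the indices newly marked speculatable. */
  def findSpeculatable(
      taskInfos: collection.Map[Long, TaskInfoSketch],
      runningTasksSet: collection.Set[Long],
      successful: Array[Boolean],
      copiesRunning: Array[Int],
      speculatableTasks: mutable.Set[Int],
      now: Long,
      threshold: Double): Seq[Int] = {
    val found = mutable.ArrayBuffer.empty[Int]
    // Same shape as the patched loop: iterate running IDs, then look up the info.
    for (tid <- runningTasksSet) {
      val info = taskInfos(tid)
      val index = info.index
      if (!successful(index) && copiesRunning(index) == 1 &&
          info.timeRunning(now) > threshold && !speculatableTasks.contains(index)) {
        // Assumption: the elided body marks the index as speculatable.
        speculatableTasks += index
        found += index
      }
    }
    found.toSeq
  }

  def main(args: Array[String]): Unit = {
    val taskInfos = Map(
      0L -> TaskInfoSketch(index = 0, launchTime = 0L),   // already finished
      1L -> TaskInfoSketch(index = 1, launchTime = 0L),   // running for a long time
      2L -> TaskInfoSketch(index = 2, launchTime = 900L)) // running, launched recently
    val running = Set(1L, 2L) // the finished task 0 is never visited
    val speculatable = mutable.HashSet.empty[Int]
    val found = findSpeculatable(
      taskInfos, running,
      successful = Array(true, false, false),
      copiesRunning = Array(0, 1, 1),
      speculatableTasks = speculatable,
      now = 1000L, threshold = 500.0)
    println(s"Newly speculatable indices: ${found.mkString(", ")}")
  }
}
```

With this shape, the cost of one check grows with the number of running tasks rather than with the number of tasks ever launched in the task set, which appears to be the scaling concern raised by the removed TODO.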