Commit 21fd12cb authored by Matei Zaharia

[SPARK-9852] Let reduce tasks fetch multiple map output partitions

This makes two changes:

- Allow reduce tasks to fetch multiple map output partitions -- this is a pretty small change to HashShuffleFetcher
- Move shuffle locality computation out of DAGScheduler and into ShuffledRDD / MapOutputTracker; this was needed because the code in DAGScheduler wouldn't work for RDDs that fetch multiple map output partitions from each reduce task

I also added an AdaptiveSchedulingSuite that creates RDDs depending on multiple map output partitions.
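
To illustrate why locality has to account for more than one partition, here is a minimal conceptual sketch in Python. It is not Spark's code: the function name `preferred_hosts`, the `(host, sizes)` input layout, and the 0.2 threshold are all assumptions made for illustration. It sums the bytes each host holds for a reduce task that reads the partition range `[start, end)` and keeps the hosts holding at least a given fraction of the total.

```python
from collections import defaultdict


def preferred_hosts(map_outputs, start, end, fraction=0.2):
    """Return hosts holding at least `fraction` of a reduce task's input.

    map_outputs: list of (host, sizes) pairs, one per map task, where
                 sizes[i] is the number of bytes that map task wrote for
                 reduce partition i (hypothetical layout).
    start, end:  the reduce task reads map output partitions [start, end).
    fraction:    illustrative threshold, not taken from the commit.
    """
    bytes_by_host = defaultdict(int)
    total = 0
    for host, sizes in map_outputs:
        # A reduce task may read several partitions from each map output,
        # so sum over the whole range rather than a single index.
        chunk = sum(sizes[start:end])
        bytes_by_host[host] += chunk
        total += chunk
    if total == 0:
        return []  # nothing to fetch, so no host is preferred
    return sorted(h for h, b in bytes_by_host.items() if b / total >= fraction)


if __name__ == "__main__":
    outputs = [("a", [10, 10, 0]), ("b", [0, 5, 100]), ("a", [10, 0, 0])]
    print(preferred_hosts(outputs, 0, 2))
    print(preferred_hosts(outputs, 1, 3))
```

A computation keyed on a single reduce partition index would miss data in the rest of the range, which is the situation the second change above addresses by moving the logic next to the code that knows which partitions each task reads.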

Author: Matei Zaharia <matei@databricks.com>

Closes #8844 from mateiz/spark-9852.
parent 8023242e
Showing with 306 additions and 124 deletions