Commit 9933836c authored by Matei Zaharia

Merge pull request #647 from jerryshao/master

Reduce ZippedPartitionsRDD's getPreferredLocations complexity from O(2^2n) to O(2^n)
parents 2ab311f4 1e9269c3
@@ -53,14 +53,10 @@ abstract class ZippedPartitionsBaseRDD[V: ClassManifest](
     val exactMatchLocations = exactMatchPreferredLocations.reduce((x, y) => x.intersect(y))
     // Remove exact match and then do host local match.
-    val otherNodePreferredLocations = rddSplitZip.map(x => {
-      x._1.preferredLocations(x._2).map(hostPort => {
-        val host = Utils.parseHostPort(hostPort)._1
-        if (exactMatchLocations.contains(host)) null else host
-      }).filter(_ != null)
-    })
-    val otherNodeLocalLocations = otherNodePreferredLocations.reduce((x, y) => x.intersect(y))
+    val exactMatchHosts = exactMatchLocations.map(Utils.parseHostPort(_)._1)
+    val matchPreferredHosts = exactMatchPreferredLocations.map(locs => locs.map(Utils.parseHostPort(_)._1))
+      .reduce((x, y) => x.intersect(y))
+    val otherNodeLocalLocations = matchPreferredHosts.filter { s => !exactMatchHosts.contains(s) }
     otherNodeLocalLocations ++ exactMatchLocations
   }
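
To make the effect of the added lines easier to follow, here is a minimal standalone sketch of the same two-tier computation outside of Spark. It is an illustration under stated assumptions, not the code from this commit: `LocalitySketch`, `preferredLocations`, and the `perRddLocations` parameter are hypothetical names, and the local `parseHostPort` is only a simplified stand-in for `Utils.parseHostPort` that assumes locations are written as `host:port`.

```scala
object LocalitySketch {
  // Hypothetical stand-in for Utils.parseHostPort: splits "host:port" into
  // (host, port), returning port 0 when the string carries no port.
  def parseHostPort(hostPort: String): (String, Int) = {
    val idx = hostPort.lastIndexOf(':')
    if (idx < 0) (hostPort, 0)
    else (hostPort.substring(0, idx), hostPort.substring(idx + 1).toInt)
  }

  // perRddLocations(i) holds the preferred locations reported for the
  // partition of the i-th zipped RDD. Assumed non-empty, since reduce
  // throws on an empty collection.
  def preferredLocations(perRddLocations: Seq[Seq[String]]): Seq[String] = {
    // Exact match: location strings that every RDD lists verbatim.
    val exactMatchLocations = perRddLocations.reduce((x, y) => x.intersect(y))

    // Host-local match: hosts that every RDD lists, ignoring the port.
    val exactMatchHosts = exactMatchLocations.map(parseHostPort(_)._1)
    val matchPreferredHosts = perRddLocations
      .map(locs => locs.map(parseHostPort(_)._1))
      .reduce((x, y) => x.intersect(y))

    // Keep only the hosts not already covered by an exact match.
    val otherNodeLocalLocations =
      matchPreferredHosts.filter(h => !exactMatchHosts.contains(h))

    otherNodeLocalLocations ++ exactMatchLocations
  }
}
```

For example, with `Seq(Seq("host1:7077", "host2:7077"), Seq("host1:7077", "host2:7078"))` the sketch returns `Seq("host2", "host1:7077")`: `host1:7077` is an exact match, while `host2` appears in both lists only at the host level. The hosts are derived from the already computed per-RDD location lists, mirroring how the added lines reuse `exactMatchPreferredLocations` instead of calling `preferredLocations` a second time.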