Performance improvements to shuffle operations: in particular, preserve
RDD partitioning in more cases where it's possible, and use iterators instead of materializing collections when doing joins.
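
The two ideas in this message can be illustrated with a minimal, Spark-free sketch. Everything below is hypothetical and written for illustration only: `IteratorJoinSketch`, `Partitioned`, `HashPartitioner`, and `joinValues` are stand-in names, not the code changed in this commit. The sketch assumes that a value-only transformation leaves keys untouched (so an existing partitioner remains valid), and that a per-key join can produce its pairs lazily from an iterator instead of first building a collection.

```scala
// Hypothetical stand-alone sketch; these types are NOT Spark's classes.
object IteratorJoinSketch {

  // Toy hash partitioner: maps a key to a partition index in [0, numPartitions).
  final case class HashPartitioner(numPartitions: Int) {
    def getPartition(key: Any): Int = {
      val mod = key.hashCode % numPartitions
      if (mod < 0) mod + numPartitions else mod // keep the index non-negative
    }
  }

  // Toy key-value dataset that remembers how (and whether) it is partitioned.
  final case class Partitioned[K, V](
      partitioner: Option[HashPartitioner],
      data: Seq[(K, V)]) {

    // Only values change, so keys stay in the same partitions:
    // the partitioner can be carried over to the result.
    def mapValues[U](f: V => U): Partitioned[K, U] =
      Partitioned(partitioner, data.map { case (k, v) => (k, f(v)) })

    // Keys may change, so the old partitioning can no longer be assumed:
    // the partitioner is dropped.
    def map[K2, V2](f: ((K, V)) => (K2, V2)): Partitioned[K2, V2] =
      Partitioned(None, data.map(f))
  }

  // Per-key join of two value groups. Pairs are produced on demand by the
  // returned iterator; no intermediate buffer of all pairs is built.
  def joinValues[V, W](vs: Seq[V], ws: Seq[W]): Iterator[(V, W)] =
    for (v <- vs.iterator; w <- ws.iterator) yield (v, w)
}
```

In this sketch, a dataset that keeps its partitioner after `mapValues` could be joined with another dataset using the same partitioner without moving data again, which is the kind of saving the message describes for shuffle operations.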
Showing 6 changed files
- bagel/src/main/scala/spark/bagel/Bagel.scala: 1 addition, 2 deletions
- bagel/src/main/scala/spark/bagel/examples/WikipediaPageRankStandalone.scala: 0 additions, 1 deletion
- core/src/main/scala/spark/PairRDDFunctions.scala: 100 additions, 87 deletions
- core/src/main/scala/spark/Partitioner.scala: 4 additions, 3 deletions
- core/src/main/scala/spark/RDD.scala: 2 additions, 2 deletions
- core/src/test/scala/spark/PartitioningSuite.scala: 101 additions, 0 deletions