-
- Downloads
[SPARK-10599] [MLLIB] Lower communication for block matrix multiplication
This PR aims to decrease communication costs in BlockMatrix multiplication in two ways: - Simulate the multiplication on the driver, and figure out which blocks actually need to be shuffled - Send the block once to a partition, and join inside the partition rather than sending multiple copies to the same partition **NOTE**: One important note is that right now, the old behavior of checking for multiple blocks with the same index is lost. This is not hard to add, but is a little more expensive than how it was. Initial benchmarking showed promising results (look below), however I did hit some `FileNotFound` exceptions with the new implementation after the shuffle. Size A: 1e5 x 1e5 Size B: 1e5 x 1e5 Block Sizes: 1024 x 1024 Sparsity: 0.01 Old implementation: 1m 13s New implementation: 9s cc avulanov Would you be interested in helping me benchmark this? I used your code from the mailing list (which you sent about 3 months ago?), and the old implementation didn't even run, but the new implementation completed in 268s in a 120 GB / 16 core cluster Author: Burak Yavuz <brkyvz@gmail.com> Closes #8757 from brkyvz/opt-bmm.
Showing
- mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala 58 additions, 22 deletions...g/apache/spark/mllib/linalg/distributed/BlockMatrix.scala
- mllib/src/test/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrixSuite.scala 18 additions, 0 deletions...che/spark/mllib/linalg/distributed/BlockMatrixSuite.scala
Loading
Please register or sign in to comment