[SPARK-4188] [Core] Perform network-level retry of shuffle file fetches
This adds a RetryingBlockFetcher to the NettyBlockTransferService which is wrapped around our typical OneForOneBlockFetcher, adding retry logic in the event of an IOException. This sort of retry allows us to avoid marking an entire executor as failed due to garbage collection or high network load.

TODO:
- [x] unit tests
- [x] put in ExternalShuffleClient too

Author: Aaron Davidson <aaron@databricks.com>

Closes #3101 from aarondav/retry and squashes the following commits:

72a2a32 [Aaron Davidson] Add that we should remove the condition around the retry thingy
c7fd107 [Aaron Davidson] Fix unit tests
e80e4c2 [Aaron Davidson] Address initial comments
6f594cd [Aaron Davidson] Fix unit test
05ff43c [Aaron Davidson] Add to external shuffle client and add unit test
66e5a24 [Aaron Davidson] [SPARK-4238] [Core] Perform network-level retry of shuffle file fetches
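
The retry-on-IOException idea described in the commit message can be illustrated with a minimal sketch. This is not the RetryingBlockFetcher added by this commit: the class name `RetrySketch`, the method `fetchWithRetry`, and the parameters `maxRetries` and `retryWaitMs` are hypothetical, and the sketch is synchronous for brevity, whereas the real fetcher may well be structured differently.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical, simplified illustration; not the class from this commit. */
final class RetrySketch {
  private RetrySketch() {}

  /**
   * Runs the given fetch, retrying up to maxRetries additional times when it
   * fails with an IOException. Any other exception propagates immediately.
   */
  static <T> T fetchWithRetry(Callable<T> fetch, int maxRetries, long retryWaitMs)
      throws Exception {
    int retriesUsed = 0;
    while (true) {
      try {
        return fetch.call();
      } catch (IOException e) {
        if (retriesUsed >= maxRetries) {
          throw e; // retries exhausted: surface the last network failure
        }
        retriesUsed++;
        Thread.sleep(retryWaitMs); // give a GC pause or network spike time to clear
      }
    }
  }
}
```

Under these assumptions, a fetch that fails transiently with an IOException is retried after a short wait rather than reported as a failure on the first error, which is the behavior the commit message credits with avoiding spurious executor failures.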
Changed files:
- core/src/main/scala/org/apache/spark/network/netty/NettyBlockTransferService.scala (17 additions, 4 deletions)
- network/common/src/main/java/org/apache/spark/network/client/TransportClient.java (14 additions, 2 deletions)
- network/common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java (7 additions, 6 deletions)
- network/common/src/main/java/org/apache/spark/network/client/TransportResponseHandler.java (2 additions, 1 deletion)
- network/common/src/main/java/org/apache/spark/network/protocol/MessageEncoder.java (1 addition, 1 deletion)
- network/common/src/main/java/org/apache/spark/network/server/TransportServer.java (6 additions, 2 deletions)
- network/common/src/main/java/org/apache/spark/network/util/NettyUtils.java (9 additions, 5 deletions)
- network/common/src/main/java/org/apache/spark/network/util/TransportConf.java (17 additions, 0 deletions)
- network/common/src/test/java/org/apache/spark/network/TransportClientFactorySuite.java (4 additions, 3 deletions)
- network/shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleClient.java (24 additions, 7 deletions)
- network/shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java (5 additions, 4 deletions)
- network/shuffle/src/main/java/org/apache/spark/network/shuffle/RetryingBlockFetcher.java (234 additions, 0 deletions)
- network/shuffle/src/test/java/org/apache/spark/network/sasl/SaslIntegrationSuite.java (2 additions, 2 deletions)
- network/shuffle/src/test/java/org/apache/spark/network/shuffle/ExternalShuffleIntegrationSuite.java (12 additions, 6 deletions)
- network/shuffle/src/test/java/org/apache/spark/network/shuffle/ExternalShuffleSecuritySuite.java (4 additions, 2 deletions)
- network/shuffle/src/test/java/org/apache/spark/network/shuffle/RetryingBlockFetcherSuite.java (310 additions, 0 deletions)