[SPARK-17321][YARN] Avoid writing shuffle metadata to disk if NM recovery is disabled
In the current code, if NM recovery is not enabled, `YarnShuffleService` writes shuffle metadata to NM local dir-1. If this local dir-1 is on a bad disk, `YarnShuffleService` fails to start. To solve this, when NM recovery is not enabled Spark no longer persists data into leveldb. The YARN shuffle service can still serve requests in that case, but it loses the ability to recover. This is acceptable because an NM failure kills the containers as well as the applications.

Tested on a local cluster with NM recovery off and on to check whether the folder is created. A MiniCluster UT isn't added because in MiniCluster the NM always sets its port to 0, whereas NM recovery requires a non-ephemeral port.

Author: jerryshao <sshao@hortonworks.com>

Closes #19032 from jerryshao/SPARK-17321.

Change-Id: I8f2fe73d175e2ad2c4e380caede3873e0192d027
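The behavior described in the commit message can be illustrated with a minimal, standalone sketch. This is not the code from `YarnShuffleService.java`: the class `RecoverySketch`, the method `resolveStateFile`, and the file name `state.ldb` are hypothetical names chosen for illustration, and the sketch assumes that a disabled NM recovery is represented by a `null` recovery path.

```java
import java.io.File;
import java.util.Optional;

// Illustrative sketch only; names here are hypothetical, not Spark's API.
public class RecoverySketch {

  /**
   * Decide where (if anywhere) shuffle metadata should be persisted.
   * Assumption: recoveryPath is null when NM recovery is disabled.
   */
  public static Optional<File> resolveStateFile(File recoveryPath, String dbName) {
    if (recoveryPath == null) {
      // Recovery disabled: keep state in memory only, touch no local dir.
      return Optional.empty();
    }
    // Recovery enabled: place the state file under the recovery directory.
    return Optional.of(new File(recoveryPath, dbName));
  }

  public static void main(String[] args) {
    // Recovery off: no file is resolved, so nothing would be written to disk.
    System.out.println(resolveStateFile(null, "state.ldb").isPresent());
    // Recovery on: a file under the (hypothetical) recovery directory is resolved.
    System.out.println(resolveStateFile(new File("nm-recovery"), "state.ldb").isPresent());
  }
}
```

Returning an empty `Optional` makes the "no persistence" case explicit to callers, so a service built this way would start even when every candidate local directory is unusable, at the cost of losing state on restart.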
Showing
- common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java: 39 additions, 43 deletions
- resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnShuffleIntegrationSuite.scala: 25 additions, 8 deletions
- resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala: 22 additions, 10 deletions