Skip to content
Snippets Groups Projects
Commit 5fde6616 authored by huangzhaowei's avatar huangzhaowei Committed by Thomas Graves
Browse files

[YARN][SPARK-4929] Bug fix: fix the yarn-client code to support HA

Nowadays, yarn-client will exit directly when the HA change happens no matter how many times the am should retry.
The reason may be that the default final status only considerred the sys.exit, and the yarn-client HA cann't benefit from this.
So we should distinct the default final status between client and cluster, because the SUCCEEDED status may cause the HA failed in client mode and UNDEFINED may cause the error reporter in cluster when using sys.exit.

Author: huangzhaowei <carlmartinmax@gmail.com>

Closes #3771 from SaintBacchus/YarnHA and squashes the following commits:

c02bfcc [huangzhaowei] Improve the comment of the funciton 'getDefaultFinalStatus'
0e69924 [huangzhaowei] Bug fix: fix the yarn-client code to support HA
parent e21acc19
No related branches found
No related tags found
No related merge requests found
......@@ -60,7 +60,7 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
@volatile private var exitCode = 0
@volatile private var unregistered = false
@volatile private var finished = false
@volatile private var finalStatus = FinalApplicationStatus.SUCCEEDED
@volatile private var finalStatus = getDefaultFinalStatus
@volatile private var finalMsg: String = ""
@volatile private var userClassThread: Thread = _
......@@ -152,6 +152,20 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
exitCode
}
/**
* Set the default final application status for client mode to UNDEFINED to handle
* if YARN HA restarts the application so that it properly retries. Set the final
* status to SUCCEEDED in cluster mode to handle if the user calls System.exit
* from the application code.
*/
final def getDefaultFinalStatus() = {
if (isDriver) {
FinalApplicationStatus.SUCCEEDED
} else {
FinalApplicationStatus.UNDEFINED
}
}
/**
* unregister is used to completely unregister the application from the ResourceManager.
* This means the ResourceManager will not retry the application attempt on your behalf if
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment