Commit b8a18719 authored by Sandy Ryza, committed by Thomas Graves

SPARK-1053. Don't require SPARK_YARN_APP_JAR

It looks like this just requires taking out the checks.

I verified that, with the patch, I was able to run spark-shell through yarn without setting the environment variable.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #553 from sryza/sandy-spark-1053 and squashes the following commits:

b037676 [Sandy Ryza] SPARK-1053.  Don't require SPARK_YARN_APP_JAR
parent c852201c
@@ -99,13 +99,12 @@ With this mode, your application is actually run on the remote machine where the
 ## Launch spark application with yarn-client mode.
 
-With yarn-client mode, the application will be launched locally. Just like running application or spark-shell on Local / Mesos / Standalone mode. The launch method is also the similar with them, just make sure that when you need to specify a master url, use "yarn-client" instead. And you also need to export the env value for SPARK_JAR and SPARK_YARN_APP_JAR
+With yarn-client mode, the application will be launched locally. Just like running application or spark-shell on Local / Mesos / Standalone mode. The launch method is also the similar with them, just make sure that when you need to specify a master url, use "yarn-client" instead. And you also need to export the env value for SPARK_JAR.
 
 Configuration in yarn-client mode:
 
 In order to tune worker core/number/memory etc. You need to export environment variables or add them to the spark configuration file (./conf/spark_env.sh). The following are the list of options.
 
-* `SPARK_YARN_APP_JAR`, Path to your application's JAR file (required)
 * `SPARK_WORKER_INSTANCES`, Number of workers to start (Default: 2)
 * `SPARK_WORKER_CORES`, Number of cores for the workers (Default: 1).
 * `SPARK_WORKER_MEMORY`, Memory per Worker (e.g. 1000M, 2G) (Default: 1G)
@@ -118,12 +117,11 @@ In order to tune worker core/number/memory etc. You need to export environment v
 For example:
 
     SPARK_JAR=./assembly/target/scala-{{site.SCALA_BINARY_VERSION}}/spark-assembly-{{site.SPARK_VERSION}}-hadoop2.0.5-alpha.jar \
-    SPARK_YARN_APP_JAR=examples/target/scala-{{site.SCALA_BINARY_VERSION}}/spark-examples-assembly-{{site.SPARK_VERSION}}.jar \
     ./bin/run-example org.apache.spark.examples.SparkPi yarn-client
 
 or
 
     SPARK_JAR=./assembly/target/scala-{{site.SCALA_BINARY_VERSION}}/spark-assembly-{{site.SPARK_VERSION}}-hadoop2.0.5-alpha.jar \
-    SPARK_YARN_APP_JAR=examples/target/scala-{{site.SCALA_BINARY_VERSION}}/spark-examples-assembly-{{site.SPARK_VERSION}}.jar \
     MASTER=yarn-client ./bin/spark-shell
...
@@ -108,7 +108,7 @@ class ClientArguments(val args: Array[String], val sparkConf: SparkConf) {
           args = tail
 
         case Nil =>
-          if (userJar == null || userClass == null) {
+          if (userClass == null) {
             printUsageAndExit(1)
           }
@@ -129,7 +129,7 @@ class ClientArguments(val args: Array[String], val sparkConf: SparkConf) {
     System.err.println(
       "Usage: org.apache.spark.deploy.yarn.Client [options] \n" +
       "Options:\n" +
-      "  --jar JAR_PATH             Path to your application's JAR file (required)\n" +
+      "  --jar JAR_PATH             Path to your application's JAR file (required in yarn-standalone mode)\n" +
      "  --class CLASS_NAME         Name of your application's main class (required)\n" +
      "  --args ARGS                Arguments to be passed to your application's main class.\n" +
      "                             Mutliple invocations are possible, each will be passed in order.\n" +
...
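After this change, `--jar` is optional at parse time while `--class` stays required. A minimal sketch of the relaxed behavior, assuming a simplified standalone parser (`Parsed`, `parse`, and the error text are illustrative names for this sketch, not part of the patch):

    // Hypothetical, simplified reproduction of the relaxed check; this is not
    // the real ClientArguments class.
    case class Parsed(userJar: String = null, userClass: String = null)

    def parse(args: List[String], acc: Parsed = Parsed()): Parsed = args match {
      case "--jar" :: value :: tail   => parse(tail, acc.copy(userJar = value))
      case "--class" :: value :: tail => parse(tail, acc.copy(userClass = value))
      case Nil =>
        // Before the patch, userJar == null here also triggered the usage exit.
        if (acc.userClass == null) sys.error("Error: You must specify a user class!")
        acc
      case _ :: tail => parse(tail, acc)
    }

For example, `parse(List("--class", "my.Main"))` now succeeds with `userJar` left null.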
@@ -68,7 +68,8 @@ trait ClientBase extends Logging {
   def validateArgs() = {
     Map(
       (System.getenv("SPARK_JAR") == null) -> "Error: You must set SPARK_JAR environment variable!",
-      (args.userJar == null) -> "Error: You must specify a user jar!",
+      ((args.userJar == null && args.amClass == classOf[ApplicationMaster].getName) ->
+        "Error: You must specify a user jar when running in standalone mode!"),
       (args.userClass == null) -> "Error: You must specify a user class!",
       (args.numWorkers <= 0) -> "Error: You must specify at least 1 worker!",
       (args.amMemory <= YarnAllocationHandler.MEMORY_OVERHEAD) -> ("Error: AM memory size must be" +
...
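ClientBase now ties the jar requirement to the requested AM class: yarn-standalone (ApplicationMaster) still demands a user jar, yarn-client (WorkerLauncher) does not. Below is a hedged sketch of the condition-to-message validation style under simplified inputs; `standalone` stands in for the `args.amClass` comparison, and a `Seq` of pairs is used because a literal `Map` keyed by `Boolean` would collapse entries with equal keys:

    // Illustrative sketch only; not the real ClientBase.validateArgs().
    def validate(userJar: String, userClass: String, standalone: Boolean): Unit = {
      val checks = Seq(
        (userJar == null && standalone) ->
          "Error: You must specify a user jar when running in standalone mode!",
        (userClass == null) -> "Error: You must specify a user class!"
      )
      // Fail on the first condition that evaluates to true.
      checks.foreach { case (failed, msg) => if (failed) throw new IllegalArgumentException(msg) }
    }

With this shape, `validate(null, "my.Main", standalone = false)` passes, matching the new yarn-client behavior.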
@@ -44,10 +44,6 @@ private[spark] class YarnClientSchedulerBackend(
   override def start() {
     super.start()
 
-    val userJar = System.getenv("SPARK_YARN_APP_JAR")
-    if (userJar == null)
-      throw new SparkException("env SPARK_YARN_APP_JAR is not set")
-
     val driverHost = conf.get("spark.driver.host")
     val driverPort = conf.get("spark.driver.port")
     val hostport = driverHost + ":" + driverPort
@@ -55,7 +51,7 @@ private[spark] class YarnClientSchedulerBackend(
     val argsArrayBuf = new ArrayBuffer[String]()
     argsArrayBuf += (
       "--class", "notused",
-      "--jar", userJar,
+      "--jar", null,
       "--args", hostport,
       "--master-class", "org.apache.spark.deploy.yarn.WorkerLauncher"
     )
...
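Taken together, the pieces line up: YarnClientSchedulerBackend passes `--jar` with a null value, ClientArguments no longer rejects the missing jar, and ClientBase skips the jar check because the AM class is WorkerLauncher rather than ApplicationMaster. A sketch of the argument assembly, assuming a placeholder hostport (the real code reads spark.driver.host and spark.driver.port from SparkConf):

    import scala.collection.mutable.ArrayBuffer

    // Simplified stand-in for YarnClientSchedulerBackend.start() after the patch.
    val hostport = "localhost:7077"  // placeholder for the driver host:port
    val argsArrayBuf = new ArrayBuffer[String]()
    argsArrayBuf += (
      "--class", "notused",
      "--jar", null,    // SPARK_YARN_APP_JAR is no longer read or required
      "--args", hostport,
      "--master-class", "org.apache.spark.deploy.yarn.WorkerLauncher"
    )
    // The null jar is acceptable downstream because the jar check is now
    // conditional on the AM class, and WorkerLauncher is requested here.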