[SPARK-5338] [MESOS] Add cluster mode support for Mesos
This patch adds cluster mode support for Spark on Mesos. It introduces a new Mesos framework dedicated to launching new apps/drivers. To use it, run the spark-submit script with the --master flag pointing to the cluster mode REST interface instead of the Mesos master.

Example:

./bin/spark-submit --deploy-mode cluster --class org.apache.spark.examples.SparkPi --master mesos://10.0.0.206:8077 --executor-memory 1G --total-executor-cores 100 examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar 30

This patch also abstracts the StandaloneRestServer so that the REST endpoints can have different implementations.

Features of the cluster mode in this PR:
- Supports supervise mode, where the scheduler keeps trying to reschedule an exited job.
- Adds a new UI for the cluster mode scheduler that shows all running jobs, finished jobs, and supervised jobs waiting to be retried.
- Supports state persistence to ZooKeeper (ZK), so that when the cluster scheduler fails over it can pick up all queued and running jobs.

Author: Timothy Chen <tnachen@gmail.com>
Author: Luc Bourlier <luc.bourlier@typesafe.com>

Closes #5144 from tnachen/mesos_cluster_mode and squashes the following commits:

069e946 [Timothy Chen] Fix rebase.
e24b512 [Timothy Chen] Persist submitted driver.
390c491 [Timothy Chen] Fix zk conf key for mesos zk engine.
e324ac1 [Timothy Chen] Fix merge.
fd5259d [Timothy Chen] Address review comments.
1553230 [Timothy Chen] Address review comments.
c6c6b73 [Timothy Chen] Pass spark properties to mesos cluster tasks.
f7d8046 [Timothy Chen] Change app name to spark cluster.
17f93a2 [Timothy Chen] Fix head of line blocking in scheduling drivers.
6ff8e5c [Timothy Chen] Address comments and add logging.
df355cd [Timothy Chen] Add metrics to mesos cluster scheduler.
20f7284 [Timothy Chen] Address review comments
7252612 [Timothy Chen] Fix tests.
a46ad66 [Timothy Chen] Allow zk cli param override.
920fc4b [Timothy Chen] Fix scala style issues.
862b5b5 [Timothy Chen] Support asking driver status when it's retrying.
7f214c2 [Timothy Chen] Fix RetryState visibility
e0f33f7 [Timothy Chen] Add supervise support and persist retries.
371ce65 [Timothy Chen] Handle cluster mode recovery and state persistence.
3d4dfa1 [Luc Bourlier] Adds support to kill submissions
febfaba [Timothy Chen] Bound the finished drivers in memory
543a98d [Timothy Chen] Schedule multiple jobs
6887e5e [Timothy Chen] Support looking at SPARK_EXECUTOR_URI env variable in schedulers
8ec76bc [Timothy Chen] Fix Mesos dispatcher UI.
d57d77d [Timothy Chen] Add documentation
825afa0 [Luc Bourlier] Supports more spark-submit parameters
b8e7181 [Luc Bourlier] Adds a shutdown latch to keep the deamon running
0fa7780 [Luc Bourlier] Launch task through the mesos scheduler
5b7a12b [Timothy Chen] WIP: Making a cluster mode a mesos framework.
4b2f5ef [Timothy Chen] Specify user jar in command to be replaced with local.
e775001 [Timothy Chen] Support fetching remote uris in driver runner.
7179495 [Timothy Chen] Change Driver page output and add logging
880bc27 [Timothy Chen] Add Mesos Cluster UI to display driver results
9986731 [Timothy Chen] Kill drivers when shutdown
67cbc18 [Timothy Chen] Rename StandaloneRestClient to RestClient and add sbin scripts
e3facdd [Timothy Chen] Add Mesos Cluster dispatcher
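The supervise behaviour described in the commit message (a scheduler that keeps rescheduling an exited job and tracks jobs waiting to be retried) might be sketched roughly as follows. This is an illustration only, not the code in this patch: the names `Driver`, `RetryState`, and `SuperviseTracker` are hypothetical, and the doubling wait time with an upper bound is an assumed policy that the commit message does not specify.

```scala
import scala.collection.mutable

// Hypothetical types for illustration; they are not taken from the patch.
final case class Driver(id: String, supervise: Boolean)
final case class RetryState(retries: Int, waitSeconds: Int)

// Tracks exited drivers: supervised failures are queued for relaunch,
// everything else is recorded as finished.
final class SuperviseTracker(maxWaitSeconds: Int = 60) {
  private val retryState = mutable.Map.empty[String, RetryState]
  private val waitingForRetry = mutable.Queue.empty[Driver]
  private val finished = mutable.ArrayBuffer.empty[String]

  def onDriverExit(driver: Driver, succeeded: Boolean): Unit = {
    if (succeeded || !driver.supervise) {
      // Terminal outcome: drop any retry bookkeeping and mark as finished.
      retryState.remove(driver.id)
      finished += driver.id
    } else {
      // Assumed policy: double the wait on each failure, capped at maxWaitSeconds.
      val next = retryState.get(driver.id) match {
        case Some(prev) =>
          RetryState(prev.retries + 1, math.min(prev.waitSeconds * 2, maxWaitSeconds))
        case None =>
          RetryState(1, 1)
      }
      retryState(driver.id) = next
      waitingForRetry.enqueue(driver)
    }
  }

  // Returns the next driver that should be relaunched, if any.
  def nextToRelaunch(): Option[Driver] =
    if (waitingForRetry.isEmpty) None else Some(waitingForRetry.dequeue())

  def retries(id: String): Int = retryState.get(id).map(_.retries).getOrElse(0)
  def waitSeconds(id: String): Int = retryState.get(id).map(_.waitSeconds).getOrElse(0)
  def finishedIds: List[String] = finished.toList
}
```

In this sketch a caller would invoke `onDriverExit` whenever a driver task terminates and poll `nextToRelaunch()` when new resources become available. A real scheduler would also persist the retry state (the patch mentions ZooKeeper for this) and honour the wait time before relaunching; both are omitted here.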
Showing
- core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala: 1 addition, 1 deletion
- core/src/main/scala/org/apache/spark/deploy/SparkCuratorUtil.scala: 6 additions, 4 deletions
- core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala: 30 additions, 18 deletions
- core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala: 7 additions, 4 deletions
- core/src/main/scala/org/apache/spark/deploy/master/Master.scala: 1 addition, 1 deletion
- core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperLeaderElectionAgent.scala: 1 addition, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperPersistenceEngine.scala: 1 addition, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcher.scala: 116 additions, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcherArguments.scala: 101 additions, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/mesos/MesosDriverDescription.scala: 65 additions, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterPage.scala: 114 additions, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterUI.scala: 48 additions, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala: 22 additions, 13 deletions
- core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala: 318 additions, 0 deletions
- core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala: 52 additions, 292 deletions
- core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolRequest.scala: 1 addition, 1 deletion
- core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolResponse.scala: 3 additions, 3 deletions
- core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala: 158 additions, 0 deletions
- core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala: 19 additions, 63 deletions
- core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterPersistenceEngine.scala: 134 additions, 0 deletions