  1. Nov 21, 2013
  2. Nov 19, 2013
  3. Nov 17, 2013
  4. Nov 16, 2013
    • Matei Zaharia's avatar
      Merge pull request #178 from hsaputra/simplecleanupcode · 1b5b3583
      Matei Zaharia authored
      Simple cleanup on Spark's Scala code
      
      Simple cleanup on Spark's Scala code while testing some modules:
      -) Remove some unused imports as I found them
      -) Remove ";" in the import statements
      -) Remove () at the end of calls to methods such as size that have no side effects.
      1b5b3583
  5. Nov 15, 2013
    • Henry Saputra's avatar
      Simple cleanup on Spark's Scala code while testing core and yarn modules: · c33f8020
      Henry Saputra authored
      -) Remove some unused imports as I found them
      -) Remove ";" in the import statements
      -) Remove () at the end of calls to methods such as size that have no side effects.
      c33f8020
    • Matei Zaharia's avatar
      Merge pull request #173 from kayousterhout/scheduler_hang · 96e0fb46
      Matei Zaharia authored
      Fix bug where scheduler could hang after task failure.
      
      When a task fails, we need to call reviveOffers() so that the
      task can be rescheduled on a different machine. In the current code,
      the state in ClusterTaskSetManager indicating which tasks are
      pending may be updated after revive offers is called (there's a
      race condition here), so when revive offers is called, the task set
      manager does not yet realize that there are failed tasks that need
      to be relaunched.
      
      This isn't currently unit tested but will be once my pull request for
      merging the cluster and local schedulers goes in -- at which point
      many more of the unit tests will exercise the code paths through
      the cluster scheduler (currently the failure test suite uses the local
      scheduler, which is why we didn't see this bug before).
      96e0fb46
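The ordering problem described in the scheduler_hang fix can be illustrated with a small, deterministic Python sketch. It is not Spark's code: `TaskSetManager`, `Scheduler`, and every method name below are hypothetical stand-ins, and the real issue is a race between threads rather than a fixed call order. The sketch only shows why reviving offers before the pending-task state is updated leaves a failed task unscheduled.

```python
class TaskSetManager:
    """Hypothetical stand-in for ClusterTaskSetManager: tracks pending tasks."""

    def __init__(self):
        self.pending = []

    def mark_failed(self, task_id):
        # Record that the task must be relaunched.
        self.pending.append(task_id)

    def resource_offer(self):
        # Hand out the next pending task, or None if there is nothing to run.
        return self.pending.pop(0) if self.pending else None


class Scheduler:
    """Hypothetical scheduler that launches whatever the manager offers."""

    def __init__(self):
        self.manager = TaskSetManager()
        self.launched = []

    def revive_offers(self):
        # Drain every task the manager currently reports as pending.
        task = self.manager.resource_offer()
        while task is not None:
            self.launched.append(task)
            task = self.manager.resource_offer()

    def on_failure_stale_state(self, task_id):
        # Bad ordering: offers are revived while the manager still
        # believes nothing is pending, so the failed task is missed.
        self.revive_offers()
        self.manager.mark_failed(task_id)

    def on_failure_state_first(self, task_id):
        # Safe ordering: the pending state is updated before reviving.
        self.manager.mark_failed(task_id)
        self.revive_offers()
```

With the stale ordering the failed task stays in `pending` until some later event revives offers again, which is the hang the commit message describes. Whether the actual patch enforces exactly this ordering is not stated in the message.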
  6. Nov 14, 2013
  7. Nov 13, 2013
    • Matei Zaharia's avatar
      Merge pull request #159 from liancheng/dagscheduler-actor-refine · 2054c61a
      Matei Zaharia authored
      Migrate the daemon thread started by DAGScheduler to Akka actor
      
      `DAGScheduler` uses an event queue and a daemon thread that polls it to process events sent to the `DAGScheduler`.  This is a classic actor use case.  By migrating this thread to an Akka actor, we may benefit from both cleaner code and better performance (the context-switching cost of an Akka actor is much lower than that of a native thread).
      
      But things become a little complicated when taking existing test code into consideration.
      
      Code in `DAGSchedulerSuite` is somewhat tightly coupled with `DAGScheduler`, and directly calls `DAGScheduler.processEvent` instead of posting event messages to `DAGScheduler`.  To minimize code changes, I chose to let the actor delegate messages to `processEvent`.  Maybe this doesn't follow conventional actor usage, but I tried to keep it evidently correct.
      
      Another tricky part is that, since `DAGScheduler` depends on the `ActorSystem` provided by its field `env`, `env` cannot be null.  But the `dagScheduler` field created in `DAGSchedulerSuite.before` was given a null `env`.  What's more, `BlockManager.blockIdsToBlockManagers` checks whether `env` is null to determine whether to run the production code or the test code (bad smell here, huh?).  I went through all callers of `BlockManager.blockIdsToBlockManagers`, and made sure that if `env != null` holds, then `blockManagerMaster == null` must also hold.  That's the logic behind `BlockManager.scala` [line 896](https://github.com/liancheng/incubator-spark/compare/dagscheduler-actor-refine?expand=1#diff-2b643ea78c1add0381754b1f47eec132L896).
      
      Finally, since `DAGScheduler` instances are always `start()`ed after creation, I removed the `start()` method and now start the `eventProcessActor` within the constructor.
      2054c61a
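The delegation pattern described in the dagscheduler-actor-refine merge can be sketched in Python as follows. This is a thread-plus-queue mailbox, not an Akka actor, and `MiniScheduler`, `post`, and `stop` are hypothetical names chosen for illustration. It shows two points from the message: the event loop only forwards messages to `process_event`, so tests can still call that method directly, and the loop is started in the constructor rather than through a separate `start()` method.

```python
import queue
import threading


class MiniScheduler:
    """Hypothetical scheduler whose mailbox loop delegates to process_event."""

    _STOP = object()  # sentinel that tells the loop to exit

    def __init__(self):
        self.handled = []
        self._mailbox = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        # Started in the constructor; there is no separate start() method.
        self._worker.start()

    def post(self, event):
        # Asynchronous entry point used by production callers.
        self._mailbox.put(event)

    def process_event(self, event):
        # All real handling lives here, so tests may call it synchronously.
        self.handled.append(event)

    def _run(self):
        # The loop does nothing but forward each message to process_event.
        while True:
            event = self._mailbox.get()
            if event is self._STOP:
                return
            self.process_event(event)

    def stop(self):
        # Queue the sentinel behind any pending events, then wait for the loop.
        self._mailbox.put(self._STOP)
        self._worker.join()
```

Because the queue is first-in, first-out and `stop()` joins the worker, every event posted before `stop()` has been handled, in order, by the time `stop()` returns.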
    • Matei Zaharia's avatar
      Merge pull request #165 from NathanHowell/kerberos-master · 9290e5bc
      Matei Zaharia authored
      spark-assembly.jar fails to authenticate with YARN ResourceManager
      
      The META-INF/services/ sbt MergeStrategy was discarding support for Kerberos, among others. This pull request changes to a merge strategy similar to sbt-assembly's default. I've also included an update to sbt-assembly 0.9.2, a minor fix to its zip file handling.
      9290e5bc
    • Ahir Reddy's avatar
      Write Spark UI url to driver file on HDFS · 0ea1f8b2
      Ahir Reddy authored
      0ea1f8b2
    • Matei Zaharia's avatar
      Merge pull request #166 from ahirreddy/simr-spark-ui · 39af914b
      Matei Zaharia authored
      SIMR Backend Scheduler will now write Spark UI URL to HDFS, which is to ...
      
      ...be retrieved by SIMR clients
      39af914b
  8. Nov 12, 2013
    • Matei Zaharia's avatar
      Merge pull request #137 from tgravescs/sparkYarnJarsHdfsRebase · f49ea28d
      Matei Zaharia authored
      Allow spark on yarn to be run from HDFS.
      
      Allows the spark.jar, app.jar, and log4j.properties to be put into HDFS.  Allows you to specify the files on a different HDFS cluster, and it will copy them over. It makes sure permissions are correct and puts things into the public distributed cache so they can be reused among users if their permissions are appropriate.  Also adds a bit of error handling for missing arguments.
      f49ea28d
    • Matei Zaharia's avatar
      Merge pull request #153 from ankurdave/stop-spot-cluster · 87f2f4e5
      Matei Zaharia authored
      Enable stopping and starting a spot cluster
      
      Clusters launched using `--spot-price` contain an on-demand master and spot slaves. Because EC2 does not support stopping spot instances, the spark-ec2 script previously could only destroy such clusters.
      
      This pull request makes it possible to stop and restart a spot cluster.
      * The `stop` command works as expected for a spot cluster: the master is stopped and the slaves are terminated.
      * To start a stopped spot cluster, the user must invoke `launch --use-existing-master`. This launches fresh spot slaves but resumes the existing master.
      87f2f4e5
    • Matei Zaharia's avatar
      Merge pull request #160 from xiajunluan/JIRA-923 · b8bf04a0
      Matei Zaharia authored
      Fix bug JIRA-923
      
      Fix column sort issue in UI for JIRA-923.
      https://spark-project.atlassian.net/browse/SPARK-923
      
      Conflicts:
      	core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
      	core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala
      b8bf04a0
    • Ahir Reddy's avatar
      SIMR Backend Scheduler will now write Spark UI URL to HDFS, which is to be... · ccb099e8
      Ahir Reddy authored
      SIMR Backend Scheduler will now write Spark UI URL to HDFS, which is to be retrieved by SIMR clients
      ccb099e8
    • Nathan Howell's avatar
      Upgrade to sbt-assembly 0.9.2 · 48eac0bc
      Nathan Howell authored
      48eac0bc
    • Nathan Howell's avatar
      spark-assembly.jar fails to authenticate with YARN ResourceManager · 23146a67
      Nathan Howell authored
      sbt-assembly is set up to pick the first META-INF/services/org.apache.hadoop.security.SecurityInfo file instead of merging them. This causes Kerberos authentication to fail, which manifests itself in the "info:null" debug log statements:
      
          DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null
          DEBUG SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null
          ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
          DEBUG UserGroupInformation: PrivilegedAction as:foo@BAR (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:583)
          WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
          ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      
      Previously, this file contained just a single class:
      
      $ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo
      Archive:  assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar
        inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo
      
          org.apache.hadoop.security.AnnotatedSecurityInfo
      
      And now has the full list of classes:
      
      $ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo
      Archive:  assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar
        inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo
      
          org.apache.hadoop.security.AnnotatedSecurityInfo
          org.apache.hadoop.mapreduce.v2.app.MRClientSecurityInfo
          org.apache.hadoop.mapreduce.v2.security.client.ClientHSSecurityInfo
          org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo
          org.apache.hadoop.yarn.security.ContainerManagerSecurityInfo
          org.apache.hadoop.yarn.security.SchedulerSecurityInfo
          org.apache.hadoop.yarn.security.admin.AdminSecurityInfo
          org.apache.hadoop.yarn.server.RMNMSecurityInfoClass
      23146a67
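The difference between keeping the first service file and merging all of them can be sketched in Python. This is an illustration of the two behaviours only, not sbt-assembly's implementation: `pick_first` and `merge_distinct_lines` are hypothetical helpers, and the grouping of class names into two input files is an assumption made for the example (the class names themselves come from the listing above).

```python
def pick_first(files):
    """Keep only the first copy of the service file; later copies are dropped."""
    return files[0]


def merge_distinct_lines(files):
    """Concatenate all copies, keeping each provider class once, in first-seen order."""
    merged = []
    for lines in files:
        for line in lines:
            name = line.strip()
            if name and name not in merged:
                merged.append(name)
    return merged


# Assumed split of providers across two dependency jars, for illustration only.
first_jar = ["org.apache.hadoop.security.AnnotatedSecurityInfo"]
second_jar = [
    "org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo",
    "org.apache.hadoop.yarn.security.SchedulerSecurityInfo",
]

kept_first = pick_first([first_jar, second_jar])
kept_all = merge_distinct_lines([first_jar, second_jar])
```

With `pick_first`, the YARN providers never reach the assembled file, which matches the single-class listing shown before the fix. With `merge_distinct_lines`, every provider from every input file is preserved.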
    • Matei Zaharia's avatar
      Merge pull request #164 from tdas/kafka-fix · dfd1ebc2
      Matei Zaharia authored
      Made block generator thread safe to fix Kafka bug.
      
      This is a very important bug fix. Data could be, and was being, lost in the Kafka input stream due to this.
      dfd1ebc2
    • Tathagata Das's avatar
      7ccbbdac
  9. Nov 11, 2013