  1. Jan 09, 2014
  2. Jan 08, 2014
  3. Jan 07, 2014
    • Patrick Wendell's avatar
      Merge pull request #336 from liancheng/akka-remote-lookup · f5f12dc2
      Patrick Wendell authored
      Get rid of `Either[ActorRef, ActorSelection]`
      
      In this pull request, instead of returning an `Either[ActorRef, ActorSelection]`, `registerOrLookup` blocks while identifying the remote actor to obtain an `ActorRef`, and throws an exception if the remote actor doesn't exist or the lookup times out (the timeout is configured by `spark.akka.lookupTimeout`).  This function is only called when a `SparkEnv` is constructed (when instantiating a driver or executor), so the blocking call is considered acceptable.  Executor-side `ActorSelection`s/`ActorRef`s to the driver-side `MapOutputTrackerMasterActor` and `BlockManagerMasterActor` are affected by this pull request.
      
      `ActorSelection` is dangerous and should be used with care.  It is only completely safe to send messages via an `ActorSelection` when the remote actor is stateless, so that the actor incarnation is irrelevant.  However, as @ScrapCodes pointed out in the comments below, an executor exits immediately once its connection to the driver is lost, so `ActorSelection`s are not harmful in this scenario.  This pull request is therefore mostly a code style patch.
      f5f12dc2
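The blocking lookup described in this commit message can be sketched in Python as follows. This is only an analogue of the idea (block for a reference, raise if the actor is missing or the lookup times out), not the Akka or Spark code; `lookup_blocking`, `ActorLookupError`, and the `identify` callable are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as LookupTimeout


class ActorLookupError(Exception):
    """Raised when the remote actor is missing or the lookup times out."""


def lookup_blocking(identify, timeout_seconds):
    """Block until `identify()` yields a reference, or raise ActorLookupError.

    `identify` is a hypothetical callable standing in for the remote
    identification request; it returns a reference, or None if no actor exists.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(identify)
        try:
            # Wait for the answer, but never longer than the configured timeout.
            ref = future.result(timeout=timeout_seconds)
        except LookupTimeout:
            raise ActorLookupError("lookup timed out") from None
    finally:
        # Do not wait for a still-running lookup when leaving the function.
        pool.shutdown(wait=False)
    if ref is None:
        raise ActorLookupError("remote actor does not exist")
    return ref
```

Because the caller either receives a concrete reference or an exception, no `Either`-style return type is needed; the cost is a blocking call, which the message argues is acceptable at construction time.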
    • Matei Zaharia's avatar
      Merge pull request #327 from lucarosellini/master · 11891e68
      Matei Zaharia authored
      Added ‘-i’ command line option to Spark REPL
      
      We had to create a new implementation of both scala.tools.nsc.CompilerCommand and scala.tools.nsc.Settings, because using scala.tools.nsc.GenericRunnerSettings would bring in other options (-howtorun, -save and -execute) which don’t make sense in Spark.
      Any new Spark-specific command line option can now be added to the org.apache.spark.repl.SparkRunnerSettings class.
      
      Loading a script from the command line should behave the same as loading it with the “:load” command inside the shell, so the script must be loaded once the SparkContext is available. That is why we had to move the call to ‘loadfiles(settings)’ _after_ the call to postInitialization(). This still doesn’t work if ‘isAsync = true’.
      11891e68
    • Matei Zaharia's avatar
      Merge pull request #354 from hsaputra/addasfheadertosbt · 7d0aac91
      Matei Zaharia authored
      Add ASF header to the new sbt script.
      
      7d0aac91
    • Matei Zaharia's avatar
      Merge pull request #350 from mateiz/standalone-limit · d75dc428
      Matei Zaharia authored
      Add way to limit default # of cores used by apps in standalone mode
      
      Also documents the spark.deploy.spreadOut option, and fixes a config option that had a dash in its name.
      d75dc428
    • Hossein Falaki's avatar
      Fixed merge conflict · 46cb980a
      Hossein Falaki authored
      46cb980a
    • Henry Saputra's avatar
      Add ASF header to the new sbt script. · 226b58ad
      Henry Saputra authored
      226b58ad
    • Patrick Wendell's avatar
      Merge pull request #352 from markhamstra/oldArch · 61674bca
      Patrick Wendell authored
      Don't leave os.arch unset after BlockManagerSuite
      
      Recent SparkConf changes meant that BlockManagerSuite was now leaving the os.arch system property unset.  That's a problem for any subsequent tests that rely on having a valid os.arch.  This is the case for CompressionCodecSuite in the usual Maven build test order, even though it usually isn't the case for the sbt build.
      61674bca
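The general pattern behind this fix (a test that changes process-wide state must put it back afterwards) can be sketched in Python. This is an analogue only: an environment variable stands in for the JVM system property, and `preserved_env` and `DEMO_ARCH` are hypothetical names, not anything from the Spark test suite.

```python
import os
from contextlib import contextmanager


@contextmanager
def preserved_env(name):
    """Restore environment variable `name` to its prior state on exit."""
    missing = object()  # sentinel: distinguishes "unset" from any real value
    saved = os.environ.get(name, missing)
    try:
        yield
    finally:
        if saved is missing:
            # The variable did not exist before, so remove it again.
            os.environ.pop(name, None)
        else:
            # The variable existed before, so put the old value back.
            os.environ[name] = saved
```

A test body wrapped in `with preserved_env("DEMO_ARCH"):` may set or delete the variable freely; later tests still see the original value regardless of execution order.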
    • Patrick Wendell's avatar
      Merge pull request #328 from falaki/MatrixFactorizationModel-fix · b2e690f8
      Patrick Wendell authored
      SPARK-1012: DAGScheduler Exception Fix
      
      Added a predict method to MatrixFactorizationModel to enable bulk prediction. This method takes an RDD[(Int, Int)] of users and products and returns an RDD with one Rating element for each element in the input RDD.
      
      Also added Python bindings for the new bulk prediction method to address SPARK-1011.
      
      This is ready to be merged now.
      b2e690f8
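The shape of the bulk prediction described above can be sketched in plain Python, with lists standing in for RDDs. This is not the MLlib implementation: `predict_all`, the feature dictionaries, and the `Rating` tuple here are hypothetical stand-ins, and the sketch assumes the usual matrix factorization rule that a predicted rating is the dot product of the user and product factor vectors.

```python
from collections import namedtuple

# Hypothetical stand-in for a rating record: (user, product, rating).
Rating = namedtuple("Rating", ["user", "product", "rating"])


def predict_all(user_features, product_features, user_product_pairs):
    """Return one Rating per (user, product) pair in the input.

    `user_features` and `product_features` map integer ids to factor
    vectors (lists of floats) of equal length.
    """
    ratings = []
    for user, product in user_product_pairs:
        u = user_features[user]
        p = product_features[product]
        # Predicted rating: dot product of the two factor vectors.
        score = sum(a * b for a, b in zip(u, p))
        ratings.append(Rating(user, product, score))
    return ratings
```

The output has exactly one element per input pair, in the same order, which mirrors the "one Rating element for each element in the input" contract stated in the message.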
    • Mark Hamstra's avatar
      Fix BlockManagerSuite#after · 86ed1ad2
      Mark Hamstra authored
      86ed1ad2
    • Matei Zaharia's avatar
      Address review comments · 2c421749
      Matei Zaharia authored
      2c421749
    • Patrick Wendell's avatar
      Merge pull request #351 from pwendell/maven-fix · 6ccf8ce7
      Patrick Wendell authored
      Add log4j exclusion rule to maven.
      
      To make this work I had to rename the defaults file. Otherwise
      maven's pattern matching rules included it when trying to match
      other log4j.properties files.
      
      I also fixed a bug in the existing maven build where two
      <transformers> tags were present in assembly/pom.xml
      such that one overwrote the other.
      6ccf8ce7
    • Hossein Falaki's avatar
    • Matei Zaharia's avatar
      Fix unit test compilation · 044c8ad3
      Matei Zaharia authored
      044c8ad3
    • Patrick Wendell's avatar
      Add log4j exclusion rule to maven. · e688e112
      Patrick Wendell authored
      To make this work I had to rename the defaults file. Otherwise
      maven's pattern matching rules included it when trying to match
      other log4j.properties files.
      
      I also fixed a bug in the existing maven build where two
      <transformers> tags were present in assembly/pom.xml
      such that one overwrote the other.
      e688e112
    • Matei Zaharia's avatar
      Add way to limit default # of cores used by applications on standalone mode · d8bcc8e9
      Matei Zaharia authored
      Also documents the spark.deploy.spreadOut option.
      d8bcc8e9
    • Reynold Xin's avatar
      Merge pull request #337 from yinxusen/mllib-16-bugfix · 7d5fa175
      Reynold Xin authored
      Mllib 16 bugfix
      
      Bug fix: https://spark-project.atlassian.net/browse/MLLIB-16
      
      Hi, I fixed the bug and added a test suite for `GradientDescent`. There are 2 checks in the test case. First, the final loss must be lower than the initial one. Second, the loss sequence should trend downward, i.e., at least 80% of iterations must have a lower loss than the iteration before them.
      
      Thanks!
      7d5fa175
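The two checks described in this message can be sketched in Python as below. This is a minimal illustration of the stated criteria, not the actual Scala test suite; `loss_trend_ok` and its parameter names are hypothetical.

```python
def loss_trend_ok(losses, min_decreasing_fraction=0.8):
    """Return True when a loss history passes both checks.

    `losses` is the sequence of loss values, one per iteration,
    starting with the initial loss.
    """
    if len(losses) < 2:
        raise ValueError("need at least two loss values")
    # Check 1: the final loss must be lower than the initial one.
    final_is_lower = losses[-1] < losses[0]
    # Check 2: count iterations whose loss is lower than the previous one.
    steps = list(zip(losses, losses[1:]))
    decreasing = sum(1 for prev, cur in steps if cur < prev)
    mostly_decreasing = decreasing / len(steps) >= min_decreasing_fraction
    return final_is_lower and mostly_decreasing
```

Allowing up to 20% of iterations to increase the loss keeps the check tolerant of the noise that stochastic updates introduce, while still rejecting a run that does not converge.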
    • Reynold Xin's avatar
      Merge pull request #349 from CodingCat/support-worker_dir · 71fc1135
      Reynold Xin authored
      add the comments about SPARK_WORKER_DIR
      
      This environment variable seems to have been forgotten.
      
      In many cases we need to set this variable; for example, on EC2 we have to move the large application log files from EBS to ephemeral storage.
      71fc1135
    • Tathagata Das's avatar
    • CodingCat's avatar
      add the comments about SPARK_WORKER_DIR · 3633172e
      CodingCat authored
      this env variable seems to be forgotten …
      3633172e