- Jan 07, 2014
-
Patrick Wendell authored
Get rid of `Either[ActorRef, ActorSelection]` In this pull request, instead of returning an `Either[ActorRef, ActorSelection]`, `registerOrLookup` resolves the remote actor with a blocking call to obtain an `ActorRef`, or throws an exception if the remote actor doesn't exist or the lookup times out (configured by `spark.akka.lookupTimeout`). This function is only called when a `SparkEnv` is constructed (instantiating the driver or an executor), so the blocking call is considered acceptable. Executor-side `ActorSelection`s/`ActorRef`s to the driver-side `MapOutputTrackerMasterActor` and `BlockManagerMasterActor` are affected by this pull request. `ActorSelection` is dangerous and should be used with care. It's only absolutely safe to send messages via an `ActorSelection` when the remote actor is stateless, so that actor incarnation is irrelevant. But as pointed out by @ScrapCodes in the comments below, since the executor exits immediately once the connection to the driver is lost, `ActorSelection`s are not harmful in this scenario. So this pull request is mostly a code style patch.
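For illustration, a minimal sketch of such a blocking lookup in plain Akka, assuming a version where `ActorSelection.resolveOne` is available (the function name and timeout handling here are illustrative, not the actual Spark code):

    import scala.concurrent.Await
    import scala.concurrent.duration.FiniteDuration
    import akka.actor.{ActorRef, ActorSystem}

    // Resolve a remote actor path to a concrete ActorRef up front, failing fast
    // instead of passing an Either[ActorRef, ActorSelection] around.
    def lookupActor(system: ActorSystem, url: String, timeout: FiniteDuration): ActorRef = {
      val selection = system.actorSelection(url)
      // resolveOne identifies the actor behind the selection; the future fails
      // if no actor lives at that path or the lookup times out.
      Await.result(selection.resolveOne(timeout), timeout)
    }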
-
Matei Zaharia authored
Added ‘-i’ command line option to Spark REPL We had to create a new implementation of both scala.tools.nsc.CompilerCommand and scala.tools.nsc.Settings, because using scala.tools.nsc.GenericRunnerSettings would bring in other options (-howtorun, -save and -execute) which don’t make sense in Spark. Any new Spark-specific command line option can now be added to the org.apache.spark.repl.SparkRunnerSettings class. Since the behavior of loading a script from the command line should be the same as loading it using the “:load” command inside the shell, the script should be loaded when the SparkContext is available; that’s why we had to move the call to ‘loadfiles(settings)’ _after_ the call to postInitialization(). This still doesn’t work if ‘isAsync = true’.
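For example, a hypothetical invocation (the script file name is illustrative):

    ./bin/spark-shell -i init.scala

This should behave the same as starting the shell and typing “:load init.scala” once the SparkContext is available.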
-
Matei Zaharia authored
Add ASF header to the new sbt script.
-
Matei Zaharia authored
Add way to limit default # of cores used by apps in standalone mode. Also documents the spark.deploy.spreadOut option, and fixes a config option that had a dash in its name.
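As a hedged illustration of how these standalone-mode settings fit together (the property name spark.deploy.defaultCores is an assumption of this sketch; values are examples only):

    import org.apache.spark.SparkConf

    // Sketch: configuration read by the standalone Master.
    val masterConf = new SparkConf()
      .set("spark.deploy.defaultCores", "4") // assumed name for the default per-app core cap
      .set("spark.deploy.spreadOut", "true") // spread an app's cores across workers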
-
Henry Saputra authored
-
Patrick Wendell authored
Don't leave os.arch unset after BlockManagerSuite Recent SparkConf changes meant that BlockManagerSuite was now leaving the os.arch system property unset. That's a problem for any subsequent tests that rely on having a valid os.arch. This is true for CompressionCodecSuite in the usual Maven build test order, even though it isn't usually true for the sbt build.
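A minimal sketch of the save-and-restore pattern such a suite needs (the override value and test body are illustrative):

    // Save os.arch before the test mutates it, and always restore it afterwards,
    // so later suites (e.g. CompressionCodecSuite) still see a valid value.
    val savedArch = System.getProperty("os.arch")
    try {
      System.setProperty("os.arch", "amd64") // test-specific override
      // ... assertions that depend on os.arch ...
    } finally {
      if (savedArch != null) System.setProperty("os.arch", savedArch)
      else System.clearProperty("os.arch")
    }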
-
Patrick Wendell authored
SPARK-1012: DAGScheduler Exception Fix Added a predict method to MatrixFactorizationModel to enable bulk prediction. This method takes an RDD[(Int, Int)] of users and products and returns an RDD with one Rating element per element in the input RDD. Also added Python bindings for the new bulk prediction methods to address the SPARK-1011 issue. This is ready to be merged now.
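A hedged usage sketch of the bulk-prediction method described above (the wrapper function is hypothetical):

    import org.apache.spark.rdd.RDD
    import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}

    // Given a trained model and (user, product) pairs, predict all ratings at once.
    def bulkPredict(model: MatrixFactorizationModel,
                    userProducts: RDD[(Int, Int)]): RDD[Rating] =
      model.predict(userProducts) // one Rating per input pair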
-
Mark Hamstra authored
-
Matei Zaharia authored
-
Patrick Wendell authored
Add log4j exclusion rule to maven. To make this work I had to rename the defaults file; otherwise Maven's pattern matching rules included it when trying to match other log4j.properties files. I also fixed a bug in the existing Maven build where two <transformers> tags were present in assembly/pom.xml such that one overwrote the other.
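To illustrate the pom bug (a sketch, not the actual assembly/pom.xml; the transformer choices are assumptions): when two <transformers> blocks appear in one shade-plugin configuration, Maven binds only the last, silently dropping the first.

    <configuration>
      <!-- This first block is lost... -->
      <transformers>
        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
      </transformers>
      <!-- ...because this second block overwrites it. -->
      <transformers>
        <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
          <resource>reference.conf</resource>
        </transformer>
      </transformers>
    </configuration>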
-
Hossein Falaki authored
-
Matei Zaharia authored
-
Patrick Wendell authored
-
Matei Zaharia authored
-
Reynold Xin authored
MLlib-16 bugfix Bug fix: https://spark-project.atlassian.net/browse/MLLIB-16 Hi, I fixed the bug and added a test suite for `GradientDescent`. There are two checks in the test case. First, the final loss must be lower than the initial one. Second, the trend of the loss sequence should be decreasing, i.e., at least 80% of the iterations have lower losses than their prior iterations. Thanks!
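A minimal sketch of those two checks (the loss history and helper name are hypothetical):

    // losses(i) is the training loss after iteration i, collected by the test.
    def checkConvergence(losses: Seq[Double]): Unit = {
      // Check 1: the final loss must be lower than the initial one.
      assert(losses.last < losses.head)
      // Check 2: at least 80% of iterations improve on their predecessor.
      val improved = losses.sliding(2).count { case Seq(prev, cur) => cur < prev }
      assert(improved.toDouble / (losses.size - 1) >= 0.8)
    }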
-
Reynold Xin authored
add the comments about SPARK_WORKER_DIR This env variable seems to be forgotten. In many cases we need to set this variable; e.g., on EC2, we have to move the large application log files from EBS to the ephemeral storage.
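For example (the path is illustrative), in conf/spark-env.sh on each worker:

    # Point the worker's scratch space (application logs, jars) at instance-local disk
    export SPARK_WORKER_DIR=/mnt/spark/work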
-
CodingCat authored
-
Reynold Xin authored
Suggested small changes to Java code for slightly more standard style, encapsulation, and in some cases performance. Sorry if this is too abrupt or not a welcome set of changes, but I thought I'd see if I could contribute a little. I'm a Java developer just getting seriously into Spark, so I thought I'd suggest a number of small changes to the couple of Java parts of the code to make them a little tighter, more standard, and even a bit faster. Feel free to take all, some, or none of this. Happy to explain any of it.
-
Reynold Xin authored
spark -> org.apache.spark Changed the package name spark to org.apache.spark, which was missing in some of the files.
-
Sean Owen authored
-
Patrick Wendell authored
Conf improvements There are two new features: 1. Allow users to set arbitrary Akka configurations via the Spark conf. 2. Allow the configuration to be printed in logs for diagnosis.
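A hedged sketch of how feature 1 might look from the user's side (the exact passthrough keys are assumptions of this sketch):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // spark.akka.* settings are forwarded to the underlying Akka configuration
      .set("spark.akka.frameSize", "64")  // Akka-related Spark setting
      .set("spark.akka.askTimeout", "30") // assumed passthrough example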
-
Luca Rosellini authored
-
Reynold Xin authored
Add a script to download sbt if not present on the system As per the discussion on the dev mailing list, this script will use the system sbt if present, or otherwise attempt to install the sbt launcher. The fallback error message in the event that it fails instructs the user to install sbt. While the URLs it fetches from aren't controlled by the Spark project directly, they are stable and the current authoritative sources.
-
Holden Karau authored
-
Prashant Sharma authored
-
Prashant Sharma authored
-
Prashant Sharma authored
-
Holden Karau authored
-
prabeesh authored
-
- Jan 06, 2014
-
Patrick Wendell authored
Update stop-slaves.sh The most recent version has changed the directory structure, but this script "sbin/stop-all.sh" wasn't changed accordingly. This mistake means "sbin/stop-all.sh" can't stop the slave nodes.
-
sproblvem authored
-
Patrick Wendell authored
Fix test breaking downstream builds This wasn't detected in the pull-request builder because it manually sets SPARK_HOME. I'm going to change that (it shouldn't do this) to make it like the other builds.
-
Patrick Wendell authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Patrick Wendell authored
Made Java options apply during tests so that they become self-explanatory.
-
Patrick Wendell authored
SPARK-1005 Ning upgrade
-
Patrick Wendell authored
Clarify spark.cores.max in docs It controls the number of cores across the cluster, not on a per-machine basis.
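For instance (the value is illustrative):

    import org.apache.spark.SparkConf

    // Caps the app at 16 cores total across the whole cluster,
    // e.g. 4 cores on each of 4 workers, not 16 per machine.
    val conf = new SparkConf().set("spark.cores.max", "16")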
-
Patrick Wendell authored
Change protobuf version for yarn alpha back to 2.4.1 The Maven build for yarn-alpha uses the wrong protobuf version, and hence the generated assembly jar doesn't work with Hadoop 0.23. Removing the setting from the yarn-alpha profile, since the default protobuf version is 2.4.1 at the top of the pom file.
-