- Jan 07, 2014
-
Patrick Wendell authored
Get rid of `Either[ActorRef, ActorSelection]` In this pull request, instead of returning an `Either[ActorRef, ActorSelection]`, `registerOrLookup` resolves the remote actor with a blocking call to obtain an `ActorRef`, or throws an exception if the remote actor doesn't exist or the lookup times out (configured by `spark.akka.lookupTimeout`). This function is only called when a `SparkEnv` is constructed (instantiating the driver or an executor), so the blocking call is considered acceptable. Executor-side `ActorSelection`s/`ActorRef`s to the driver-side `MapOutputTrackerMasterActor` and `BlockManagerMasterActor` are affected by this pull request. `ActorSelection` is dangerous and should be used with care. It's only absolutely safe to send messages via an `ActorSelection` when the remote actor is stateless, so that actor incarnation is irrelevant. But as pointed out by @ScrapCodes in the comments below, since the executor exits immediately once the connection to the driver is lost, `ActorSelection`s are not harmful in this scenario. So this pull request is mostly a code style patch.
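For illustration, a minimal sketch of such a blocking lookup in plain Akka, assuming a version where `ActorSelection.resolveOne` is available (the function name and timeout handling here are illustrative, not the actual Spark code):

    import scala.concurrent.Await
    import scala.concurrent.duration.FiniteDuration
    import akka.actor.{ActorRef, ActorSystem}

    // Resolve a remote actor path to a concrete ActorRef up front, failing fast
    // instead of passing an Either[ActorRef, ActorSelection] around.
    def lookupActor(system: ActorSystem, url: String, timeout: FiniteDuration): ActorRef = {
      val selection = system.actorSelection(url)
      // resolveOne identifies the actor behind the selection; the future fails
      // if no actor lives at that path or the lookup times out.
      Await.result(selection.resolveOne(timeout), timeout)
    }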
-
Matei Zaharia authored
Added ‘-i’ command line option to Spark REPL We had to create a new implementation of both scala.tools.nsc.CompilerCommand and scala.tools.nsc.Settings, because using scala.tools.nsc.GenericRunnerSettings would bring in other options (-howtorun, -save and -execute) which don’t make sense in Spark. Any new Spark-specific command line option can now be added to the org.apache.spark.repl.SparkRunnerSettings class. Since the behavior of loading a script from the command line should be the same as loading it using the “:load” command inside the shell, the script should be loaded when the SparkContext is available; that’s why we had to move the call to ‘loadfiles(settings)’ _after_ the call to postInitialization(). This still doesn’t work if ‘isAsync = true’.
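For example, a hypothetical invocation (the script file name is illustrative):

    ./bin/spark-shell -i init.scala

This should behave the same as starting the shell and typing “:load init.scala” once the SparkContext is available.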
-
Matei Zaharia authored
Add ASF header to the new sbt script.
-
Matei Zaharia authored
Add way to limit default # of cores used by apps in standalone mode. Also documents the spark.deploy.spreadOut option, and fixes a config option that had a dash in its name.
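As a hedged illustration of how these standalone-mode settings fit together (the property name spark.deploy.defaultCores is an assumption of this sketch; values are examples only):

    import org.apache.spark.SparkConf

    // Sketch: configuration read by the standalone Master.
    val masterConf = new SparkConf()
      .set("spark.deploy.defaultCores", "4") // assumed name for the default per-app core cap
      .set("spark.deploy.spreadOut", "true") // spread an app's cores across workers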
-
Henry Saputra authored
-
Patrick Wendell authored
Don't leave os.arch unset after BlockManagerSuite Recent SparkConf changes meant that BlockManagerSuite was now leaving the os.arch system property unset. That's a problem for any subsequent tests that rely on having a valid os.arch. This is true for CompressionCodecSuite in the usual Maven build test order, even though it isn't usually true for the sbt build.
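A minimal sketch of the save-and-restore pattern such a suite needs (the override value and test body are illustrative):

    // Save os.arch before the test mutates it, and always restore it afterwards,
    // so later suites (e.g. CompressionCodecSuite) still see a valid value.
    val savedArch = System.getProperty("os.arch")
    try {
      System.setProperty("os.arch", "amd64") // test-specific override
      // ... assertions that depend on os.arch ...
    } finally {
      if (savedArch != null) System.setProperty("os.arch", savedArch)
      else System.clearProperty("os.arch")
    }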
-
Patrick Wendell authored
SPARK-1012: DAGScheduler Exception Fix Added a predict method to MatrixFactorizationModel to enable bulk prediction. This method takes an RDD[(Int, Int)] of users and products and returns an RDD with one Rating element per element in the input RDD. Also added Python bindings for the new bulk prediction methods to address the SPARK-1011 issue. This is ready to be merged now.
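A hedged usage sketch of the bulk-prediction method described above (the wrapper function is hypothetical):

    import org.apache.spark.rdd.RDD
    import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}

    // Given a trained model and (user, product) pairs, predict all ratings at once.
    def bulkPredict(model: MatrixFactorizationModel,
                    userProducts: RDD[(Int, Int)]): RDD[Rating] =
      model.predict(userProducts) // one Rating per input pair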
-
Mark Hamstra authored
-
Matei Zaharia authored
-
Patrick Wendell authored
Add log4j exclusion rule to maven. To make this work I had to rename the defaults file; otherwise Maven's pattern matching rules included it when trying to match other log4j.properties files. I also fixed a bug in the existing Maven build where two <transformers> tags were present in assembly/pom.xml such that one overwrote the other.
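To illustrate the pom bug (a sketch, not the actual assembly/pom.xml; the transformer choices are assumptions): when two <transformers> blocks appear in one shade-plugin configuration, Maven binds only the last, silently dropping the first.

    <configuration>
      <!-- This first block is lost... -->
      <transformers>
        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
      </transformers>
      <!-- ...because this second block overwrites it. -->
      <transformers>
        <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
          <resource>reference.conf</resource>
        </transformer>
      </transformers>
    </configuration>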
-
Hossein Falaki authored
-
Matei Zaharia authored
-
Patrick Wendell authored
-
Matei Zaharia authored
-
Reynold Xin authored
MLlib-16 bugfix Bug fix: https://spark-project.atlassian.net/browse/MLLIB-16 Hi, I fixed the bug and added a test suite for `GradientDescent`. There are two checks in the test case. First, the final loss must be lower than the initial one. Second, the trend of the loss sequence should be decreasing, i.e., at least 80% of the iterations have lower losses than their prior iterations. Thanks!
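A minimal sketch of those two checks (the loss history and helper name are hypothetical):

    // losses(i) is the training loss after iteration i, collected by the test.
    def checkConvergence(losses: Seq[Double]): Unit = {
      // Check 1: the final loss must be lower than the initial one.
      assert(losses.last < losses.head)
      // Check 2: at least 80% of iterations improve on their predecessor.
      val improved = losses.sliding(2).count { case Seq(prev, cur) => cur < prev }
      assert(improved.toDouble / (losses.size - 1) >= 0.8)
    }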
-
Reynold Xin authored
add the comments about SPARK_WORKER_DIR This env variable seems to be forgotten. In many cases we need to set this variable; e.g., on EC2, we have to move the large application log files from EBS to the ephemeral storage.
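For example (the path is illustrative), in conf/spark-env.sh on each worker:

    # Point the worker's scratch space (application logs, jars) at instance-local disk
    export SPARK_WORKER_DIR=/mnt/spark/work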
-
CodingCat authored
-
Reynold Xin authored
Suggested small changes to Java code for slightly more standard style, encapsulation, and in some cases performance. Sorry if this is too abrupt or not a welcome set of changes, but I thought I'd see if I could contribute a little. I'm a Java developer just getting seriously into Spark, so I thought I'd suggest a number of small changes to the couple of Java parts of the code to make them a little tighter, more standard, and even a bit faster. Feel free to take all, some, or none of this. Happy to explain any of it.
-
Reynold Xin authored
spark -> org.apache.spark Changed the package name spark to org.apache.spark, which was missing in some of the files.
-
Sean Owen authored
-
Patrick Wendell authored
Conf improvements There are two new features: 1. Allow users to set arbitrary Akka configurations via the Spark conf. 2. Allow the configuration to be printed in logs for diagnosis.
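A hedged sketch of how feature 1 might look from the user's side (the exact passthrough keys are assumptions of this sketch):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // spark.akka.* settings are forwarded to the underlying Akka configuration
      .set("spark.akka.frameSize", "64")  // Akka-related Spark setting
      .set("spark.akka.askTimeout", "30") // assumed passthrough example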
-
Luca Rosellini authored
-
Reynold Xin authored
Add a script to download sbt if not present on the system As per the discussion on the dev mailing list, this script will use the system sbt if present, or otherwise attempt to install the sbt launcher. The fallback error message in the event that it fails instructs the user to install sbt. While the URLs it fetches from aren't controlled by the Spark project directly, they are stable and the current authoritative sources.
-
Holden Karau authored
-
Prashant Sharma authored
-
Prashant Sharma authored
-
Prashant Sharma authored
-
Holden Karau authored
-
prabeesh authored
-
- Jan 06, 2014
-
Patrick Wendell authored
Update stop-slaves.sh The most recent version has changed the directory structure, but this script "sbin/stop-all.sh" wasn't changed accordingly. This mistake means "sbin/stop-all.sh" can't stop the slave nodes.
-
sproblvem authored
-
Patrick Wendell authored
Fix test breaking downstream builds This wasn't detected in the pull-request builder because it manually sets SPARK_HOME. I'm going to change that (it shouldn't do this) to make it like the other builds.
-
Patrick Wendell authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Patrick Wendell authored
Made Java options apply during tests so that they become self-explanatory.
-
Patrick Wendell authored
SPARK-1005 Ning upgrade
-
Patrick Wendell authored
Clarify spark.cores.max in docs It controls the number of cores across the cluster, not on a per-machine basis.
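For instance (the value is illustrative):

    import org.apache.spark.SparkConf

    // Caps the app at 16 cores total across the whole cluster,
    // e.g. 4 cores on each of 4 workers, not 16 per machine.
    val conf = new SparkConf().set("spark.cores.max", "16")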
-
Patrick Wendell authored
Change protobuf version for yarn alpha back to 2.4.1 The Maven build for yarn-alpha uses the wrong protobuf version, and hence the generated assembly jar doesn't work with Hadoop 0.23. Removing the setting from the yarn-alpha profile, since the default protobuf version is 2.4.1 at the top of the pom file.
-