Commits · 440e531a5e7720c42f0c53ce98425b63b4194b7b · cs525-sp18-g07 / spark

Dec 19, 2013

Merge pull request #278 from MLnick/java-python-tostring · 440e531a

Matei Zaharia authored 11 years ago

Add toString to Java RDD, and __repr__ to Python RDD

Addresses [SPARK-992](https://spark-project.atlassian.net/browse/SPARK-992)
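
A minimal sketch of the Python side of this change, assuming the wrapper simply delegates to the Java object's `toString`. The class names `FakeJavaRDD` and `SketchRDD`, and the sample description string, are hypothetical and are not taken from PySpark.

```python
class FakeJavaRDD(object):
    """Hypothetical stand-in for the JVM-side RDD handle."""

    def toString(self):
        return "ParallelCollectionRDD[0] at parallelize at <console>:12"


class SketchRDD(object):
    """Hypothetical Python wrapper; not PySpark's actual RDD class."""

    def __init__(self, jrdd):
        self._jrdd = jrdd

    def __repr__(self):
        # Assumption: reuse the Java-side description so both APIs print alike.
        return self._jrdd.toString()


rdd = SketchRDD(FakeJavaRDD())
print(repr(rdd))
```

With a `__repr__` like this, typing the variable name in an interactive shell shows the RDD description instead of a bare object address.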

440e531a

Add toString to Java RDD, and __repr__ to Python RDD · a76f5341
Nick Pentreath authored 11 years ago

a76f5341

Merge pull request #183 from aarondav/spark-959 · d8d3f3e6

Reynold Xin authored 11 years ago

[SPARK-959] Explicitly depend on org.eclipse.jetty.orbit jar

Without this, in some cases, Ivy attempts to download the wrong file and fails, stopping the whole build. See [bug](https://spark-project.atlassian.net/browse/SPARK-959) for more details.

Note that this may not be the best solution, as I do not understand the root cause of why this only happens for some people. However, it is reported to work.

d8d3f3e6

[SPARK-959] Explicitly depend on org.eclipse.jetty.orbit jar · eaf6a269

Aaron Davidson authored 11 years ago

Without this, in some cases, Ivy attempts to download the wrong file
and fails, stopping the whole build. See bug for more details.

(This is probably also the beginning of the slow death of our
recently prettified dependencies. Form follows function.)

eaf6a269

Merge pull request #247 from aarondav/minor · bfba5323

Reynold Xin authored 11 years ago

Increase spark.akka.askTimeout default to 30 seconds

In experimental clusters we've observed that a 10 second timeout was insufficient, despite having a low number of nodes and a relatively small workload (16 nodes, <1.5 TB data). This would cause an entire job to fail at the beginning of the reduce phase.
There is no particular reason for this value to be small as a timeout should only occur in an exceptional situation.

Also centralized the reading of spark.akka.askTimeout to AkkaUtils (surely this can later be cleaned up to use Typesafe).

Finally, deleted some lurking implicits. If anyone can think of a reason they should still be there, please let me know.
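
The idea of reading the setting in one place can be sketched as follows. This is an illustration in Python rather than the Scala code in AkkaUtils; the helper name `ask_timeout_seconds` and the use of a plain dict for settings are assumptions made for the example.

```python
DEFAULT_ASK_TIMEOUT_SECONDS = 30  # default named in the commit message


def ask_timeout_seconds(properties):
    """Read spark.akka.askTimeout from a dict of settings, in seconds."""
    return int(properties.get("spark.akka.askTimeout", DEFAULT_ASK_TIMEOUT_SECONDS))


print(ask_timeout_seconds({}))                               # 30
print(ask_timeout_seconds({"spark.akka.askTimeout": "10"}))  # 10
```

Keeping the default in a single helper means callers cannot drift apart on what the fallback value is.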

bfba5323

Dec 18, 2013

In experimental clusters we've observed that a 10 second timeout was insufficient, · 293a0af5

Aaron Davidson authored 11 years ago

despite having a low number of nodes and a relatively small workload (16 nodes, <1.5 TB data).
This would cause an entire job to fail at the beginning of the reduce phase.
There is no particular reason for this value to be small as a timeout should only occur
in an exceptional situation.

Also centralized the reading of spark.akka.askTimeout to AkkaUtils (surely this can later
be cleaned up to use Typesafe).

Finally, deleted some lurking implicits. If anyone can think of a reason they should still
be there, please let me know.

293a0af5

Merge pull request #267 from JoshRosen/cygwin · c64a53a4

Reynold Xin authored 11 years ago

Fix Cygwin support in several scripts.

This allows the spark-shell, spark-class, run-example, make-distribution.sh,
and ./bin/start-* scripts to work under Cygwin. Note that this doesn't
support PySpark under Cygwin, since that requires many additional `cygpath`
calls from within Python and will be non-trivial to implement.

This PR was inspired by, and subsumes, #253 (so close #253 after this is merged).

c64a53a4

Merge pull request #274 from azuryy/master · 5ea18727

Reynold Xin authored 11 years ago

Fixed the example link in the Scala programming guide.

The old link was no longer accessible, so I changed it to the new one.

5ea18727

Changed the example links in the scala-programming-guide · ad8ce014
fengdong authored 11 years ago

ad8ce014

Merge pull request #273 from rxin/top · f4effb37

Reynold Xin authored 11 years ago

Fixed a performance problem in RDD.top and BoundedPriorityQueue

The `size` method in BoundedPriorityQueue was traversing the entire queue to calculate the size, which made insertion slow.

This should also cherry-pick cleanly into branch-0.8.
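
For illustration, a bounded priority queue whose size check is constant-time can be sketched like this. It is a Python sketch built on `heapq`, not Spark's Scala implementation; only the class name is borrowed from the commit message, and the method names are assumptions.

```python
import heapq


class BoundedPriorityQueue(object):
    """Keeps only the `max_size` largest items seen so far (sketch only)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._heap = []  # min-heap: the smallest retained item sits at index 0

    def __len__(self):
        # len() on a list is O(1); no traversal of the queue is needed.
        return len(self._heap)

    def add(self, item):
        if len(self) < self.max_size:
            heapq.heappush(self._heap, item)
        elif self._heap and item > self._heap[0]:
            # Replace the current minimum in one O(log n) step.
            heapq.heapreplace(self._heap, item)

    def items(self):
        return sorted(self._heap, reverse=True)


queue = BoundedPriorityQueue(3)
for value in [5, 1, 9, 3, 7, 8]:
    queue.add(value)
print(queue.items())  # [9, 8, 7]
```

Because every insertion consults the size, a size check that walks the whole queue turns each insertion into a linear-time operation, which is the cost this fix removes.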

f4effb37

Dec 17, 2013

Fixed the example link. · ddebaf82
fengdong authored 11 years ago

ddebaf82

Fixed a performance problem in RDD.top and BoundedPriorityQueue (size in... · 9a6864d0

Reynold Xin authored 11 years ago

Fixed a performance problem in RDD.top and BoundedPriorityQueue (`size` in BoundedPriorityQueue was traversing the entire queue to calculate the size, resulting in bad insertion performance).

9a6864d0

Merge pull request #268 from pwendell/shaded-protobuf · 7a8169be

Patrick Wendell authored 11 years ago

Add support for 2.2. to master (via shaded jars)

This patch does a few related things. NOTE: This may not compile correctly for ~24 hours until artifacts fully propagate to Maven Central.

1. Uses shaded versions of akka/protobuf. For more information on how these versions were prepared, see [1].

2. Brings the `new-yarn` project up-to-date with the changes for Akka 2.2.3.

3. Some clean-up of the build now that we don't have to switch akka groups for different YARN versions.

[1]
https://github.com/pwendell/spark-utils/tree/933a309ef85c22643e8e4b5e365652101c4e95de/shaded-protobuf

7a8169be

One other fix · 10c0ffa1
Patrick Wendell authored 11 years ago

10c0ffa1

Clean-up · c1c0f809
Patrick Wendell authored 11 years ago

c1c0f809

Dec 16, 2013

Cleanup · c1fec898
Patrick Wendell authored 11 years ago

c1fec898

Removing extra code in new yarn · 24f8220d
Patrick Wendell authored 11 years ago

24f8220d

Remove trailing slashes from repository specifications. · ceb013f8

Patrick Wendell authored 11 years ago

The correct format is to not have a trailing slash.

For me this caused non-deterministic failures due to issues fetching
certain artifacts. The issue was that some of the maven caches would
fail to fetch the artifact (due to the way that the artifact
path was concatenated with the repository) and this short-circuited
the download process in a silent way. Here is what the log output
looked like:

    Downloading: http://repo.maven.apache.org/maven2/org/spark-project/akka/akka-remote_2.10/2.2.3-shaded-protobuf/akka-remote_2.10-2.2.3-shaded-protobuf.pom
    [WARNING] The POM for org.spark-project.akka:akka-remote_2.10:jar:2.2.3-shaded-protobuf is missing, no dependency information available

This was pretty brutal to debug since there was no error message
anywhere and the path *looks* correct as reported by the Maven log.
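
One plausible reading of the failure, sketched under the assumption that the repository URL and the artifact path were joined with an extra `/`: a trailing slash on the repository then yields a `//` in the request path. The host `repo.example.org` and the helper `join_repo_path` are hypothetical; the artifact path is the one from the log above.

```python
ARTIFACT_PATH = ("org/spark-project/akka/akka-remote_2.10/2.2.3-shaded-protobuf/"
                 "akka-remote_2.10-2.2.3-shaded-protobuf.pom")


def join_repo_path(repo_url, artifact_path):
    """Join a repository base URL and an artifact path with exactly one slash."""
    return repo_url.rstrip("/") + "/" + artifact_path.lstrip("/")


repo_with_slash = "http://repo.example.org/maven2/"  # hypothetical repository URL

naive = repo_with_slash + "/" + ARTIFACT_PATH        # contains "maven2//org/..."
normalized = join_repo_path(repo_with_slash, ARTIFACT_PATH)

print(naive)
print(normalized)
```

A cache or proxy that treats `maven2//org/...` as a different resource would report the artifact as missing even though the logged URL looks right at a glance.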

ceb013f8

Attempt with extra repositories · c6f95e60
Patrick Wendell authored 11 years ago

c6f95e60

Merge pull request #270 from ewencp/really-force-ssh-pseudo-tty-master · 964a3b69

Patrick Wendell authored 11 years ago

Force pseudo-tty allocation in spark-ec2 script.

ssh commands need the -t argument repeated twice if there is no local
tty, e.g. if the process running spark-ec2 uses nohup and the parent
process exits.

Without this change, if you run the script this way (e.g. using nohup from a cron job), it will fail setting up the nodes because some of the ssh commands complain about missing ttys and then fail.

(This version is for the master branch. I've filed a separate request for the 0.8 since changes to the script caused the patches to be different.)
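
A minimal sketch of building such an ssh invocation as an argument list. The helper name `ssh_command` and its parameters are assumptions for illustration and are not taken from the spark-ec2 script; the doubled `-t` is the standard ssh way to force tty allocation when there is no local tty.

```python
def ssh_command(host, remote_command, identity_file=None, user="root"):
    """Build an argv list for ssh that forces pseudo-tty allocation."""
    args = ["ssh", "-t", "-t"]  # doubled -t: allocate a tty even without a local one
    if identity_file:
        args += ["-i", identity_file]
    args.append("%s@%s" % (user, host))
    args.append(remote_command)
    return args


print(ssh_command("node1.example.org", "echo ready", identity_file="key.pem"))
```

The resulting list can be handed to `subprocess.check_call` unchanged, which avoids shell quoting problems in the remote command.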

964a3b69

Merge pull request #245 from gregakespret/task-maxfailures-fix · 883e034a

Reynold Xin authored 11 years ago

Fix for spark.task.maxFailures not enforced correctly.

Docs at http://spark.incubator.apache.org/docs/latest/configuration.html say:

```
spark.task.maxFailures

Number of individual task failures before giving up on the job. Should be greater than or equal to 1. Number of allowed retries = this value - 1.
```

The previous implementation was incorrect: when, for example, `spark.task.maxFailures` was set to 1, the job was aborted only after the second task failure rather than after the first.
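
The documented semantics can be sketched as below. This is a Python illustration, not the scheduler code; `should_abort` and `run_task` are hypothetical names. An off-by-one comparison such as `>` in place of `>=` would reproduce the behaviour described above, though the commit message does not say that this was the exact cause.

```python
def should_abort(failure_count, max_failures):
    """Give up once a task has failed `max_failures` times."""
    return failure_count >= max_failures


def run_task(attempt, max_failures):
    """Call `attempt` until it succeeds or the failure limit is reached."""
    failures = 0
    while True:
        try:
            return attempt()
        except Exception:
            failures += 1
            if should_abort(failures, max_failures):
                raise


attempts = []


def always_fails():
    attempts.append(1)
    raise RuntimeError("task failed")


try:
    run_task(always_fails, max_failures=1)
except RuntimeError:
    pass
print(len(attempts))  # 1: no retries are allowed when maxFailures is 1
```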

883e034a

Force pseudo-tty allocation in spark-ec2 script. · d17c1426

Ewen Cheslack-Postava authored 11 years ago

ssh commands need the -t argument repeated twice if there is no local
tty, e.g. if the process running spark-ec2 uses nohup and the parent
process exits.

d17c1426

Merge pull request #265 from markhamstra/scala.binary.version · a51f3404

Patrick Wendell authored 11 years ago

DRY out the POMs with scala.binary.version

...instead of hard-coding 2.10 repeatedly.

As long as it's not a `<project>`-level `<artifactId>`, I think that we are okay parameterizing these.

a51f3404

Dec 15, 2013

Fix Cygwin support in several scripts. · f8ba89da

Josh Rosen authored 11 years ago

This PR was inspired by, and subsumes, #253 (so close #253 after this is merged).

f8ba89da

Merge pull request #256 from MLnick/master · d2ced6d5

Josh Rosen authored 11 years ago

Fix 'IPYTHON=1 ./pyspark' throwing ValueError

This fixes an annoying issue where running ```IPYTHON=1 ./pyspark``` resulted in:

```
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 0.8.0
      /_/

Using Python version 2.7.5 (default, Jun 20 2013 11:06:30)
Spark context avaiable as sc.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/usr/local/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
    202             else:
    203                 filename = fname
--> 204             __builtin__.execfile(filename, *where)

/Users/Nick/workspace/scala/spark-0.8.0-incubating-bin-hadoop1/python/pyspark/shell.py in <module>()
     30 add_files = os.environ.get("ADD_FILES").split(',') if os.environ.get("ADD_FILES") != None else None
     31
---> 32 sc = SparkContext(os.environ.get("MASTER", "local"), "PySparkShell", pyFiles=add_files)
     33
     34 print """Welcome to

/Users/Nick/workspace/scala/spark-0.8.0-incubating-bin-hadoop1/python/pyspark/context.pyc in __init__(self, master, jobName, sparkHome, pyFiles, environment, batchSize)
     70         with SparkContext._lock:
     71             if SparkContext._active_spark_context:
---> 72                 raise ValueError("Cannot run multiple SparkContexts at once")
     73             else:
     74                 SparkContext._active_spark_context = self

ValueError: Cannot run multiple SparkContexts at once
```

The issue arises because IPython previously did not seem to respect `$PYTHONSTARTUP`, but it has since at least 1.0.0. Technically this might break for older versions of IPython, but most users should be able to upgrade IPython to at least 1.0.0 (and should be encouraged to do so :).
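
The error in the traceback comes from a single-instance guard. A simplified sketch of that guard follows, assuming the startup script ends up being executed twice (once via `$PYTHONSTARTUP` and once explicitly). `SketchContext` and `run_startup_script` are hypothetical names and this is not PySpark's actual class.

```python
import threading


class SketchContext(object):
    """Hypothetical stand-in for SparkContext's single-instance guard."""

    _lock = threading.Lock()
    _active_context = None

    def __init__(self, master):
        with SketchContext._lock:
            if SketchContext._active_context is not None:
                raise ValueError("Cannot run multiple SparkContexts at once")
            SketchContext._active_context = self
        self.master = master


def run_startup_script():
    # Stands in for shell.py, which constructs the context as a side effect.
    return SketchContext("local")


first = run_startup_script()        # first execution succeeds
try:
    run_startup_script()            # a second execution trips the guard
except ValueError as err:
    print(err)
```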

New behaviour:
```
Nicks-MacBook-Pro:incubator-spark-mlnick Nick$ IPYTHON=1 ./pyspark
Python 2.7.5 (default, Jun 20 2013, 11:06:30)
Type "copyright", "credits" or "license" for more information.

IPython 1.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Nick/workspace/scala/incubator-spark-mlnick/tools/target/scala-2.9.3/spark-tools-assembly-0.9.0-incubating-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/Nick/workspace/scala/incubator-spark-mlnick/assembly/target/scala-2.9.3/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
13/12/12 13:08:15 WARN Utils: Your hostname, Nicks-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 10.0.0.4 instead (on interface en0)
13/12/12 13:08:15 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
13/12/12 13:08:15 INFO Slf4jEventHandler: Slf4jEventHandler started
13/12/12 13:08:15 INFO SparkEnv: Registering BlockManagerMaster
13/12/12 13:08:15 INFO DiskBlockManager: Created local directory at /var/folders/_l/06wxljt13wqgm7r08jlc44_r0000gn/T/spark-local-20131212130815-0e76
13/12/12 13:08:15 INFO MemoryStore: MemoryStore started with capacity 326.7 MB.
13/12/12 13:08:15 INFO ConnectionManager: Bound socket to port 53732 with id = ConnectionManagerId(10.0.0.4,53732)
13/12/12 13:08:15 INFO BlockManagerMaster: Trying to register BlockManager
13/12/12 13:08:15 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 10.0.0.4:53732 with 326.7 MB RAM
13/12/12 13:08:15 INFO BlockManagerMaster: Registered BlockManager
13/12/12 13:08:15 INFO HttpBroadcast: Broadcast server started at http://10.0.0.4:53733
13/12/12 13:08:15 INFO SparkEnv: Registering MapOutputTracker
13/12/12 13:08:15 INFO HttpFileServer: HTTP File server directory is /var/folders/_l/06wxljt13wqgm7r08jlc44_r0000gn/T/spark-8f40e897-8211-4628-a7a8-755562d5244c
13/12/12 13:08:16 INFO SparkUI: Started Spark Web UI at http://10.0.0.4:4040
2013-12-12 13:08:16.337 java[56801:4003] Unable to load realm info from SCDynamicStore
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 0.9.0-SNAPSHOT
      /_/

Using Python version 2.7.5 (default, Jun 20 2013 11:06:30)
Spark context avaiable as sc.
```

d2ced6d5

Merge pull request #257 from tgravescs/sparkYarnFixName · c55e6985

Reynold Xin authored 11 years ago

Fix the --name option for Spark on Yarn

Looks like the --name option accidentally got broken in one of the merges.  The Client hangs if the --name option is used right now.

c55e6985

Merge pull request #264 from shivaram/spark-class-fix · ab85f88f

Reynold Xin authored 11 years ago

Use CoarseGrainedExecutorBackend in spark-class

ab85f88f

Use scala.binary.version in POMs · 09ed7ddf
Mark Hamstra authored 11 years ago

09ed7ddf

Use CoarseGrainedExecutorBackend in spark-class · fc96ca9f
Shivaram Venkataraman authored 11 years ago

fc96ca9f

Making IPython PySpark compatible across versions <1.0.0. Also cleaned up '-i'... · bb5277b1
Nick Pentreath authored 11 years ago

Making IPython PySpark compatible across versions <1.0.0. Also cleaned up '-i' option and made IPYTHON_OPTS work

bb5277b1

Merge remote-tracking branch 'upstream/master' · d36ee3b1
Nick Pentreath authored 11 years ago

d36ee3b1

Dec 14, 2013

Merge pull request #251 from pwendell/master · 7db91659

Reynold Xin authored 11 years ago

Fix list rendering in YARN markdown docs.

This is some minor clean-up which makes the list render correctly.

7db91659

Merge pull request #249 from ngbinh/partitionInJavaSortByKey · 2fd781d3

Josh Rosen authored 11 years ago

Expose numPartitions parameter in JavaPairRDD.sortByKey()

This change makes Java and Scala API on sortByKey() the same.

2fd781d3

Merge pull request #259 from pwendell/scala-2.10 · 97ac0601

Patrick Wendell authored 11 years ago

Migration to Scala 2.10

== Below description was written by Prashant Sharma ==

This PR migrates Spark to Scala 2.10.

Summary of changes apart from the Scala 2.10 migration:
(No implications for users.)
1. Migrated Akka to 2.2.3.

2. Does not use remote death watch, because it has a bug where it tries to send messages to a dead node indefinitely.

3. Uses an indestructible actor system which tolerates errors only on executors.

(Might be useful for users.)
4. New configuration settings introduced:

System.getProperty("spark.akka.heartbeat.pauses", "600")
System.getProperty("spark.akka.failure-detector.threshold", "300.0")
System.getProperty("spark.akka.heartbeat.interval", "1000")

The defaults for these are fairly large, so as to effectively disable the failure detector that comes with Akka. The reason is that we have our own failure-detector-like mechanism in place, so this is just overhead on top of that, and it leads to a lot of false positives. With these properties, however, it is possible to enable it. A good use case for enabling it could be when someone wants Spark to be sensitive (in a controllable manner, of course) to GC pauses or network lags and to quickly evict executors that experience them. More information is included in configuration.md.

Once SPARK-544 is merged, I'd like to deprecate at least these Akka properties, and maybe others too.

This PR is a duplicate of #221 (where all the discussion happened); that one pointed to master, while this one points to the scala-2.10 branch.

97ac0601

Merge pull request #262 from pwendell/mvn-fix · 7ac944fc

Patrick Wendell authored 11 years ago

Fix maven build issues in 2.10 branch

Found some issues when locally testing maven.

7ac944fc

Fix maven build issues in 2.10 branch · 6e8a96c7
Patrick Wendell authored 11 years ago

6e8a96c7

Dec 13, 2013

Merge pull request #261 from ScrapCodes/scala-2.10 · 6defb061

Reynold Xin authored 11 years ago

Added a comment about ActorRef and ActorSelection difference.

6defb061

Added a comment about ActorRef and ActorSelection difference. · 1ae3c0fc
Prashant Sharma authored 11 years ago

1ae3c0fc

Merge pull request #260 from ScrapCodes/scala-2.10 · 76566b1f

Reynold Xin authored 11 years ago

Review comments on the PR for scala 2.10 migration.

76566b1f

Review comments on the PR for scala 2.10 migration. · a854cc53
Prashant Sharma authored 11 years ago

a854cc53