- Dec 29, 2013
-
-
Matei Zaharia authored
Conflicts:
    core/src/main/scala/org/apache/spark/SparkContext.scala
    core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
    core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
    core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
    core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala
    core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala
    core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala
    core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
    new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
    streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
    streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
    streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
    streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
    streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala
    streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala
    streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
    streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala
-
Matei Zaharia authored
-
Matei Zaharia authored
The test in context.py created two different instances of the SparkContext class by copying "globals", so that some tests can have a global "sc" object and others can try initializing their own contexts. This led to two JVM gateways being created since SparkConf also looked at pyspark.context.SparkContext to get the JVM.
-
Matei Zaharia authored
-
- Dec 28, 2013
-
-
Matei Zaharia authored
-
Matei Zaharia authored
-
Matei Zaharia authored
-
Matei Zaharia authored
-
Matei Zaharia authored
-
Matei Zaharia authored
-
Matei Zaharia authored
sometimes be set that way (undoes a change in previous commit)
-
Matei Zaharia authored
  - Got rid of global SparkContext.globalConf
  - Pass SparkConf to serializers and compression codecs
  - Made SparkConf public instead of private[spark]
  - Improved API of SparkContext and SparkConf
  - Switched executor environment vars to be passed through SparkConf
  - Fixed some places that were still using system properties
  - Fixed some tests, though others are still failing

This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation).
-
- Dec 27, 2013
-
-
Matei Zaharia authored
Removed unused OtherFailure TaskEndReason. The OtherFailure TaskEndReason was added by @mateiz 3 years ago in this commit: https://github.com/apache/incubator-spark/commit/24a1e7f8380bfd8d4fbdda688482a451bd6ea215 Unless I am missing something, it doesn't seem to have been used then, and is not used now, so seems safe for deletion.
-
Matei Zaharia authored
Remove unused hasPendingTasks methods
-
Kay Ousterhout authored
-
Kay Ousterhout authored
-
Patrick Wendell authored
Fixed >100char lines in DAGScheduler.scala There's no changed functionality here -- only line spacing and one grammatical fix in a comment.
-
Kay Ousterhout authored
-
Kay Ousterhout authored
-
Reynold Xin authored
Minor: Decrease margin of left side of Log page [Before and after screenshots] It's a start anyway...
-
Reynold Xin authored
SPARK-1007: spark-class2.cmd should change SCALA_VERSION to be 2.10 Reported by Qiuzhuang Lian
-
Patrick Wendell authored
-
- Dec 26, 2013
-
-
Matei Zaharia authored
Avoid a lump of coal (NPE) in JobProgressListener's stocking.
-
Aaron Davidson authored
-
Matei Zaharia authored
Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn package
-
liguoqiang authored
-
Mark Hamstra authored
-
Matei Zaharia authored
Python bindings for mllib This pull request contains Python bindings for the regression, clustering, classification, and recommendation tools in mllib. For each 'train' frontend exposed, there is a Scala stub in PythonMLLibAPI.scala and a Python stub in mllib.py. The Python stub serialises the input RDD and any vector/matrix arguments into a mutually-understood format and calls the Scala stub. The Scala stub deserialises the RDD and the vector/matrix arguments, calls the appropriate 'train' function, serialises the resulting model, and returns the serialised model. ALSModel is slightly different since a MatrixFactorizationModel has RDDs inside. The Scala stub returns a handle to a Scala MatrixFactorizationModel; prediction is done by calling the Scala predict method. I have tested these bindings on an x86_64 machine running Linux. There is a risk that these bindings may fail on some choose-your-own-endian platform if Python's endian differs from java.nio.ByteBuffer's idea of the native byte order.
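The endianness risk mentioned above can be illustrated with a minimal sketch. The wire format here (an 8-byte length followed by float64 values) and the names `pack_vector` and `unpack_vector` are hypothetical and are not taken from PythonMLLibAPI.scala or mllib.py; the point is only that pinning the byte order on both sides removes the dependence on the platform's native order.

```python
import struct

def pack_vector(values, byte_order="<"):
    """Serialise a list of floats as: int64 length, then float64 values.

    byte_order is a struct prefix: "<" little-endian, ">" big-endian.
    This layout is an assumption made for illustration only.
    """
    return struct.pack(f"{byte_order}q{len(values)}d", len(values), *values)

def unpack_vector(data, byte_order="<"):
    """Inverse of pack_vector; must be called with the same byte_order."""
    (n,) = struct.unpack_from(f"{byte_order}q", data, 0)
    return list(struct.unpack_from(f"{byte_order}{n}d", data, 8))

# Round trip with an explicit, agreed-upon byte order.
payload = pack_vector([1.0, 2.5, -3.0])
restored = unpack_vector(payload)
```

If one side used its native order and the other assumed a different one, the length field and every value would be misread, which is the failure mode the commit message warns about.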
-
- Dec 25, 2013
-
-
liguoqiang authored
-
liguoqiang authored
-
Tor Myklebust authored
-
Tor Myklebust authored
-
Matei Zaharia authored
Typo: avaiable -> available
-
Reynold Xin authored
Fixed job name in the java streaming example.
-
- Dec 24, 2013
-
-
Tor Myklebust authored
-
Tor Myklebust authored
-
Tor Myklebust authored
-
Andrew Ash authored
-
Patrick Wendell authored
Deduplicate Local and Cluster schedulers. The code in LocalScheduler/LocalTaskSetManager was nearly identical to the code in ClusterScheduler/ClusterTaskSetManager. The redundancy made updating the schedulers unnecessarily painful and error-prone. This commit combines the two into a single TaskScheduler/TaskSetManager. Unfortunately the diff makes this change look much more invasive than it is -- TaskScheduler.scala is only superficially changed (names updated, overrides removed) from the old ClusterScheduler.scala, and the same with TaskSetManager.scala. Thanks @rxin for suggesting this change!
-
Patrick Wendell authored
Clean up shuffle files once their metadata is gone Previously, we would only clean the in-memory metadata for consolidated shuffle files. Additionally, fixes a bug where the Metadata Cleaner was ignoring type-specific TTLs.
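The two behaviours described above can be sketched roughly as follows. This is not the MetadataCleaner code: the configuration keys `cleaner.ttl` and `cleaner.ttl.<type>`, the metadata layout, and both function names are assumptions chosen for illustration.

```python
import os

def resolve_ttl(conf, cleaner_type, default=-1):
    """Prefer a type-specific TTL; fall back to the global TTL, then default.

    conf is a plain dict; the key names are hypothetical.
    """
    global_ttl = conf.get("cleaner.ttl", default)
    return conf.get(f"cleaner.ttl.{cleaner_type}", global_ttl)

def clean_expired(metadata, ttl_seconds, now, delete=os.remove):
    """Drop expired metadata entries and delete the files they referenced.

    metadata maps a shuffle id to (created_at, [file paths]).
    Returns the ids that were removed.
    """
    expired = [sid for sid, (created_at, _) in metadata.items()
               if now - created_at > ttl_seconds]
    for sid in expired:
        _, paths = metadata.pop(sid)
        for path in paths:
            delete(path)  # remove the on-disk file along with its metadata
    return expired
```

Passing `delete` as a parameter keeps the sketch easy to exercise without touching the filesystem.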
-