- Dec 31, 2013
-
-
Patrick Wendell authored
Bug fixes for file input stream and checkpointing. - Fixed bugs in the file input stream that caused the stream to fail on transient HDFS errors (e.g., listing files while a background thread is deleting them caused errors). - Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, so that checkpoints can be written to any HDFS-compatible store that requires special configuration. - Changed the API of SparkContext.setCheckpointDir(): eliminated the unnecessary 'useExisting' parameter. SparkContext now always creates a unique subdirectory within the user-specified checkpoint directory, which ensures that previous checkpoint files are not accidentally overwritten. - Fixed a bug where setting the checkpoint directory to a relative local path caused checkpointing to fail.
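The unique-subdirectory behaviour described in this commit message can be sketched as follows. This is a minimal illustration in plain Python, not Spark's implementation: the helper name `make_checkpoint_dir` and the use of a random UUID as the subdirectory name are assumptions made for the example.

```python
import os
import tempfile
import uuid


def make_checkpoint_dir(base_dir):
    """Create and return a fresh subdirectory under base_dir.

    Hypothetical helper: it resolves a relative base path to an absolute
    one and never reuses an existing subdirectory, so checkpoint files
    written earlier are not overwritten.
    """
    base = os.path.abspath(base_dir)               # tolerate relative local paths
    path = os.path.join(base, str(uuid.uuid4()))   # assumed naming scheme
    os.makedirs(path)                              # raises if the path already exists
    return path


# Usage: two calls against the same base directory yield distinct directories.
base = tempfile.mkdtemp()
first = make_checkpoint_dir(base)
second = make_checkpoint_dir(base)
print(first != second)
```

Generating a new name on every call is what makes a 'useExisting'-style flag unnecessary in this sketch: there is never an existing directory to reuse.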
-
Tathagata Das authored
-
- Dec 30, 2013
-
-
Patrick Wendell authored
Changed naming of StageCompleted event to be consistent. The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.
-
- Dec 29, 2013
-
-
Kay Ousterhout authored
-
Reynold Xin authored
This reverts commit 79b20e4d, reversing changes made to 7375047d.
-
Reynold Xin authored
Fix typo in the Accumulators section. Change 'val' to 'var'.
-
- Dec 28, 2013
-
-
Jyun-Fan Tsai authored
val => var
-
Patrick Wendell authored
Removed unused failed and causeOfFailure variables (in TaskSetManager)
-
- Dec 27, 2013
-
-
Matei Zaharia authored
Removed unused OtherFailure TaskEndReason. The OtherFailure TaskEndReason was added by @mateiz 3 years ago in this commit: https://github.com/apache/incubator-spark/commit/24a1e7f8380bfd8d4fbdda688482a451bd6ea215 Unless I am missing something, it doesn't seem to have been used then, and is not used now, so seems safe for deletion.
-
Matei Zaharia authored
Remove unused hasPendingTasks methods
-
Kay Ousterhout authored
The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.
-
Kay Ousterhout authored
-
Kay Ousterhout authored
-
Patrick Wendell authored
Fixed >100char lines in DAGScheduler.scala. There's no changed functionality here, only line spacing and one grammatical fix in a comment.
-
Tathagata Das authored
-
Kay Ousterhout authored
-
Kay Ousterhout authored
-
Kay Ousterhout authored
-
Reynold Xin authored
Minor: Decrease margin of left side of Log page. [Before and after screenshots omitted.] It's a start anyway...
-
Reynold Xin authored
SPARK-1007: spark-class2.cmd should change SCALA_VERSION to be 2.10. Reported by Qiuzhuang Lian.
-
Patrick Wendell authored
-
- Dec 26, 2013
-
-
Matei Zaharia authored
Avoid a lump of coal (NPE) in JobProgressListener's stocking.
-
Aaron Davidson authored
-
Tathagata Das authored
-
Tathagata Das authored
Changed file stream to not catch any exceptions related to finding new files (FileNotFound exception is still caught and ignored).
-
Matei Zaharia authored
Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn package
-
Tathagata Das authored
Removed slack time in file stream and added better handling of failures caused by FileNotFound exceptions.
-
liguoqiang authored
-
Mark Hamstra authored
-
Matei Zaharia authored
Python bindings for mllib. This pull request contains Python bindings for the regression, clustering, classification, and recommendation tools in mllib. For each 'train' frontend exposed, there is a Scala stub in PythonMLLibAPI.scala and a Python stub in mllib.py. The Python stub serialises the input RDD and any vector/matrix arguments into a mutually-understood format and calls the Scala stub. The Scala stub deserialises the RDD and the vector/matrix arguments, calls the appropriate 'train' function, serialises the resulting model, and returns the serialised model. ALSModel is slightly different since a MatrixFactorizationModel has RDDs inside. The Scala stub returns a handle to a Scala MatrixFactorizationModel; prediction is done by calling the Scala predict method. I have tested these bindings on an x86_64 machine running Linux. There is a risk that these bindings may fail on some choose-your-own-endian platform if Python's endian differs from java.nio.ByteBuffer's idea of the native byte order.
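The endianness risk mentioned at the end of this message can be illustrated with a small sketch. The wire layout below (an 8-byte length header followed by 64-bit floats) and the function names are hypothetical; the commit message does not specify the format actually used by mllib.py and PythonMLLibAPI.scala. The sketch only shows that fixing the byte order explicitly on both sides removes the dependence on the platform's native order.

```python
import struct


def serialize_doubles(values, byte_order="<"):
    """Pack a vector as an 8-byte length followed by float64 values.

    Hypothetical layout. byte_order is a struct prefix: "<" for
    little-endian, ">" for big-endian.
    """
    return struct.pack(f"{byte_order}q{len(values)}d", len(values), *values)


def deserialize_doubles(data, byte_order="<"):
    """Inverse of serialize_doubles; must be given the same byte_order."""
    (count,) = struct.unpack_from(f"{byte_order}q", data, 0)
    return list(struct.unpack_from(f"{byte_order}{count}d", data, 8))


vector = [1.0, 2.5, -3.0]
payload = serialize_doubles(vector)      # explicit little-endian encoding
restored = deserialize_doubles(payload)  # reader uses the same order
```

If the writer and the reader each fell back to their own native order instead, the same bytes could decode to a different length and different values on a platform where the two disagree.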
-
- Dec 25, 2013
-
-
liguoqiang authored
-
liguoqiang authored
-
Tor Myklebust authored
-
Tor Myklebust authored
-
Matei Zaharia authored
Typo: avaiable -> available
-
Reynold Xin authored
Fixed job name in the java streaming example.
-
- Dec 24, 2013
-
-
Tor Myklebust authored
-
Tor Myklebust authored
-
Tor Myklebust authored
-
Andrew Ash authored
-