- Feb 13, 2013
  - Tathagata Das authored
  - Tathagata Das authored
    Added filter functionality to reduceByKeyAndWindow with inverse. Consolidated reduceByKeyAndWindow's many functions into a smaller number of functions with optional parameters.
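The reduce-with-inverse idea behind this change can be sketched outside Spark: rather than re-reducing the entire window on every slide, the previous window's result is updated by reducing in the batch that entered and "inverse-reducing" out the batch that left, and an optional filter then drops keys whose aggregates are no longer needed (e.g. counts that hit zero). A minimal Python sketch with plain dictionaries; this is not the Spark API, and all names here are illustrative:

```python
def slide_window(prev, entering, leaving, reduce_fn, inv_reduce_fn,
                 filter_fn=lambda k, v: True):
    """Incrementally update per-key window aggregates.

    prev      -- {key: aggregate} for the previous window position
    entering  -- {key: aggregate} for the batch that entered the window
    leaving   -- {key: aggregate} for the batch that left the window
    """
    out = dict(prev)
    for k, v in entering.items():        # fold in the new batch
        out[k] = reduce_fn(out[k], v) if k in out else v
    for k, v in leaving.items():         # "subtract" the departed batch
        out[k] = inv_reduce_fn(out[k], v)
    # Drop keys the filter rejects, e.g. counts that fell to zero.
    return {k: v for k, v in out.items() if filter_fn(k, v)}

# Windowed word counts: "b" slides out entirely and is filtered away.
prev     = {"a": 3, "b": 1}
entering = {"a": 2, "c": 1}
leaving  = {"b": 1}
window = slide_window(prev, entering, leaving,
                      lambda x, y: x + y, lambda x, y: x - y,
                      filter_fn=lambda k, v: v > 0)
```

The filter is what keeps the state from growing without bound: without it, keys whose counts reach zero would linger in the window state forever.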
  - Tathagata Das authored
    Changed the scheduler and file input stream to fix bugs in driver fault tolerance. Added MasterFailureTest to rigorously test master fault tolerance with a file input stream.
- Feb 10, 2013
  - Tathagata Das authored
    Fixed bugs in FileInputDStream and Scheduler that occasionally failed to reprocess old files after recovering from master failure. Completely modified spark.streaming.FailureTest to test multiple master failures using the file input stream.
  - Tathagata Das authored
- Feb 09, 2013
  - Tathagata Das authored
- Feb 07, 2013
  - Tathagata Das authored
  - Tathagata Das authored
  - Tathagata Das authored
  - Tathagata Das authored
    Removing offset management code that is non-existent in Kafka 0.7.0+.
  - Tathagata Das authored
    StateDStream changes to give updateStateByKey consistent behavior.
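The contract being made consistent here can be illustrated with a toy version of the updateStateByKey idea: an update function receives the new values for a key together with its previous state, runs even for keys that got no new data so state can be aged out, and removes a key by returning None. A Python sketch of that contract; this is not the actual StateDStream code, and the helper names are illustrative:

```python
def update_state(state, new_values_by_key, update_fn):
    """Apply update_fn(new_values, old_state) to every key that has
    either new data or existing state; returning None drops the key."""
    keys = set(state) | set(new_values_by_key)
    next_state = {}
    for k in keys:
        s = update_fn(new_values_by_key.get(k, []), state.get(k))
        if s is not None:          # None means "forget this key"
            next_state[k] = s
    return next_state

# Running counts that expire once a key receives no new values.
def counter(new_values, old):
    if not new_values:
        return None                # age out idle keys
    return (old or 0) + sum(new_values)

state = update_state({"a": 2, "b": 5}, {"a": [1, 1], "c": [7]}, counter)
```

The important consistency property is that the update function is invoked for every keyed state on every batch, not only for keys that happened to receive data, so expiry logic like the `None` return above behaves predictably.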
- Feb 05, 2013
  - Matei Zaharia authored
    Inline mergePair to look more like the narrow dep branch.
  - Matei Zaharia authored
    Handle Terminated to avoid endless DeathPactExceptions.
  - Stephen Haberman authored
    Conflicts: core/src/main/scala/spark/deploy/worker/Worker.scala
  - Matei Zaharia authored
    Increase DriverSuite timeout.
  - Stephen Haberman authored
    Credit to Roland Kuhn, Akka's tech lead, for pointing out this very obvious fix: StandaloneExecutorBackend.preStart's catch block would never get hit, because all of the operations in preStart are asynchronous. So the System.exit in the catch block was skipped, and instead Akka was sending Terminated messages which, since we didn't handle them, turned into DeathPactExceptions, which started an infinite postRestart/preStart loop.
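The bug pattern described above, a catch block that can never fire because the guarded work is scheduled asynchronously, is easy to reproduce in any async framework. A Python asyncio sketch of the same shape (illustrative only, not the Akka/Scala code): the failure surfaces on the task object, never in pre_start's except clause, so some supervisor (Akka's Terminated message in the original) has to handle it.

```python
import asyncio

caught_in_pre_start = False

async def remote_op():
    raise RuntimeError("connection failed")        # fails asynchronously

async def pre_start():
    global caught_in_pre_start
    try:
        task = asyncio.ensure_future(remote_op())  # only *schedules* the work
        await asyncio.sleep(0)                     # let the task actually run
    except RuntimeError:
        caught_in_pre_start = True                 # dead code, like the Scala catch
    return task

async def main():
    task = await pre_start()
    # The exception is attached to the task itself; retrieving it here
    # plays the role of the supervisor handling Terminated.
    return task.exception()

error = asyncio.run(main())
```

Leaving such a failure unhandled is what produced the postRestart/preStart loop: each restart re-ran the same async operation, which failed again the same way.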
  - Stephen Haberman authored
  - Stephen Haberman authored
    No functionality changes; I think this is just more consistent given mergePair isn't called multiple times or recursively. Also added a comment to explain the usual case of having two parent RDDs.
  - Matei Zaharia authored
    Streaming constructor which takes JavaSparkContext.
  - Patrick Wendell authored
    It's sometimes helpful to directly pass a JavaSparkContext and take advantage of the various constructors available for that.
- Feb 04, 2013
  - Matei Zaharia authored
  - Matei Zaharia authored
- Feb 03, 2013
  - Matei Zaharia authored
    Fix exit status in PySpark unit tests; fix/optimize PySpark's RDD.take().
  - Josh Rosen authored
  - Matei Zaharia authored
    Add spark.executor.memory to differentiate executor memory from spark-shell.
  - Matei Zaharia authored
    RDDInfo available from SparkContext.
  - Matei Zaharia authored
    Once we find a split with no block, we don't have to look for more.
  - Matei Zaharia authored
    Fix createActorSystem not actually using the systemName parameter.
  - Matei Zaharia authored
  - Josh Rosen authored
  - Josh Rosen authored
- Feb 02, 2013
  - Matei Zaharia authored
  - Matei Zaharia authored
    Tests for DAGScheduler.
  - Stephen Haberman authored
  - Charles Reiss authored
    Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala
  - Stephen Haberman authored
  - Stephen Haberman authored
  - Stephen Haberman authored
  - Stephen Haberman authored
    This meant all system names were "spark", which worked, but didn't lead to the most intuitive log output. This fixes createActorSystem to use the passed system name, and refactors Master/Worker to encapsulate their system/actor names instead of having the clients guess at them. Note that the driver system name, "spark", is left as is, and is still repeated a few times, but that seems like a separate issue.
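The refactor described can be sketched in a few lines: each daemon owns its system and actor names and exposes the full actor path, so clients ask for the path instead of hard-coding "spark". A Python sketch; the URI merely mimics the shape of an Akka remote actor path, and the class and names are illustrative, not the actual Master/Worker code:

```python
class ActorSystemId:
    """Holds a daemon's system/actor names so clients never guess them."""
    def __init__(self, system_name, host, port, actor_name):
        self.system_name = system_name
        self.host = host
        self.port = port
        self.actor_name = actor_name

    def to_uri(self):
        # Shaped like an Akka remote actor path: system@host:port/user/name.
        return (f"akka://{self.system_name}@{self.host}:{self.port}"
                f"/user/{self.actor_name}")

# Distinct system names make log lines attributable to the right daemon.
master = ActorSystemId("sparkMaster", "host1", 7077, "Master")
worker = ActorSystemId("sparkWorker", "host2", 7078, "Worker")
```

With one authoritative source for each name, a rename touches a single place, and log output shows "sparkMaster"/"sparkWorker" rather than an undifferentiated "spark".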
  - Charles Reiss authored