- Aug 12, 2013
-
-
Andre Schumacher authored
Now ADD_FILES uses a comma as file name separator.
-
- Feb 05, 2013
-
-
Matei Zaharia authored
Increase DriverSuite timeout.
-
Stephen Haberman authored
-
Matei Zaharia authored
Streaming constructor which takes JavaSparkContext
-
Patrick Wendell authored
It's sometimes helpful to directly pass a JavaSparkContext, and take advantage of the various constructors available for that.
-
- Feb 04, 2013
-
-
Matei Zaharia authored
-
Matei Zaharia authored
-
- Feb 03, 2013
-
-
Matei Zaharia authored
Fix exit status in PySpark unit tests; fix/optimize PySpark's RDD.take()
-
Josh Rosen authored
-
Matei Zaharia authored
Add spark.executor.memory to differentiate executor memory from spark-shell
-
Matei Zaharia authored
RDDInfo available from SparkContext
-
Matei Zaharia authored
Once we find a split with no block, we don't have to look for more.
-
Matei Zaharia authored
Fix createActorSystem not actually using the systemName parameter.
-
Matei Zaharia authored
-
Josh Rosen authored
-
Josh Rosen authored
-
- Feb 02, 2013
-
-
Matei Zaharia authored
-
Matei Zaharia authored
Tests for DAGScheduler
-
Stephen Haberman authored
-
Charles Reiss authored
Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala
-
Stephen Haberman authored
-
Stephen Haberman authored
-
Stephen Haberman authored
-
Stephen Haberman authored
Previously createActorSystem ignored the passed system name, so all system names were "spark". That worked, but didn't lead to the most intuitive log output. This fixes createActorSystem to use the passed system name, and refactors Master/Worker to encapsulate their system/actor names instead of having the clients guess at them. Note that the driver system name, "spark", is left as is, and is still repeated a few times, but that seems like a separate issue.
-
Charles Reiss authored
-
- Feb 01, 2013
-
-
Matei Zaharia authored
-
Matei Zaharia authored
Reduce the amount of duplicate logging Akka does to stdout.
-
Stephen Haberman authored
Since Akka logging already goes through SLF4J to log4j, we don't need the extra noise from Akka's stdout logger. It is supposedly used only during Akka init time, but it seems to keep logging lots of noisy network events that we either don't care about or that are in the log4j logs anyway. See: http://doc.akka.io/docs/akka/2.0/general/configuration.html

# Log level for the very basic logger activated during AkkaApplication startup
# Options: ERROR, WARNING, INFO, DEBUG
# stdout-loglevel = "WARNING"
-
Matei Zaharia authored
These operations used to wait for all the results to be available in an array on the driver program before merging them. They now merge values incrementally as they arrive.
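The difference between the two approaches can be sketched in plain Python. This is an illustration only, not the Spark code from this commit: the function names `reduce_buffered` and `reduce_incremental` are hypothetical, a thread pool stands in for the cluster, and each partition is assumed to be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import reduce
import operator


def reduce_buffered(partitions, op):
    """Old style: hold every partial result in a list, then merge at the end."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda part: reduce(op, part), partitions))
    return reduce(op, partials)


def reduce_incremental(partitions, op):
    """New style: fold each partial result into a running value as it arrives."""
    result = None
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(reduce, op, part) for part in partitions]
        for future in as_completed(futures):
            partial = future.result()
            # Arrival order is not fixed, so op must be commutative and associative.
            result = partial if result is None else op(result, partial)
    return result


if __name__ == "__main__":
    data = [[1, 2], [3, 4], [5]]
    print(reduce_buffered(data, operator.add))
    print(reduce_incremental(data, operator.add))
```

The incremental version only ever keeps one merged value on the driver side instead of one value per partition, at the cost of requiring an order-insensitive merge function.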
-
Matei Zaharia authored
-
Matei Zaharia authored
-
Matei Zaharia authored
Add more private declarations.
-
Matei Zaharia authored
Stop BlockManager's metadataCleaner.
-
Matei Zaharia authored
Use spark.local.dir for PySpark temp files (SPARK-580).
-
Josh Rosen authored
-
Matei Zaharia authored
Do not launch JavaGateways on workers (SPARK-674).
-
Josh Rosen authored
The problem was that the gateway was being initialized whenever the pyspark.context module was loaded. The fix uses lazy initialization that occurs only when SparkContext instances are actually constructed. I also made the gateway and jvm variables private. This change results in a ~3-4x performance improvement when running the PySpark unit tests.
-
Imran Rashid authored
-
Stephen Haberman authored
-
Matei Zaharia authored
Changed PartitionPruningRDD's split to make sure it returns the correct split index.
-