Commits · 8fd5c7bc00b1104e4282959ec95b699955ded976 · cs525-sp18-g07 / spark

Aug 12, 2013
- Implementing SPARK-865: Add the equivalent of ADD_JARS to PySpark · 8fd5c7bc
  Andre Schumacher authored 11 years ago
  
  Now ADD_FILES uses a comma as file name separator.
  8fd5c7bc
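
  A minimal sketch of the comma-separated convention described in this commit, not the actual PySpark code. The ADD_FILES variable name comes from the commit message; the helper name parse_add_files and the choice to strip whitespace and drop empty entries are assumptions made for illustration.

```python
import os


def parse_add_files(value):
    """Split a comma-separated list of file names.

    Hypothetical helper: whitespace around names is stripped and
    empty entries (for example from a trailing comma) are dropped.
    """
    if not value:
        return []
    return [name.strip() for name in value.split(",") if name.strip()]


# Assumed usage: read the list from the environment, defaulting to no files.
add_files = parse_add_files(os.environ.get("ADD_FILES"))
```

  For example, parse_add_files("deps.zip,helpers.py") yields the two file names as separate list entries.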
Feb 05, 2013
- Merge pull request #449 from stephenh/longerdriversuite · a4611d66
  Matei Zaharia authored 12 years ago
  
  Increase DriverSuite timeout.
  a4611d66
- Increase DriverSuite timeout. · 1ba3393c
  Stephen Haberman authored 12 years ago
  
  1ba3393c
- Merge pull request #447 from pwendell/streaming-constructor · 2d9eca9f
  Matei Zaharia authored 12 years ago
  
  Streaming constructor which takes JavaSparkContext
  2d9eca9f
- Streaming constructor which takes JavaSparkContext · 7eea64aa
  Patrick Wendell authored 12 years ago
  
  It's sometimes helpful to directly pass a JavaSparkContext, and take advantage of the various constructors available for that.
  7eea64aa
Feb 04, 2013
- Small fix to test for distinct · f6ec547e
  Matei Zaharia authored 12 years ago
  
  f6ec547e
- Fix failing test · aa4ee1e9
  Matei Zaharia authored 12 years ago
  
  aa4ee1e9
Feb 03, 2013
- Merge pull request #445 from JoshRosen/pyspark_fixes · f7b4e428
  Matei Zaharia authored 12 years ago
  
  Fix exit status in PySpark unit tests; fix/optimize PySpark's RDD.take()
  f7b4e428
- Remove unnecessary doctest __main__ methods. · e6172911
  Josh Rosen authored 12 years ago
  
  e6172911
- Merge pull request #379 from stephenh/sparkmem · 3bfaf3ab
  Matei Zaharia authored 12 years ago
  
  Add spark.executor.memory to differentiate executor memory from spark-shell
  3bfaf3ab
- Merge pull request #422 from squito/blockmanager_info · 88ee6163
  Matei Zaharia authored 12 years ago
  
  RDDInfo available from SparkContext
  88ee6163
- Merge pull request #436 from stephenh/removeextraloop · cd4ca936
  Matei Zaharia authored 12 years ago
  
  Once we find a split with no block, we don't have to look for more.
  cd4ca936
- Merge pull request #442 from stephenh/fixsystemnames · d5daaab3
  Matei Zaharia authored 12 years ago
  
  Fix createActorSystem not actually using the systemName parameter.
  d5daaab3
- Formatting · 9163c370
  Matei Zaharia authored 12 years ago
  
  9163c370
- Fetch fewer objects in PySpark's take() method. · 8fbd5380
  Josh Rosen authored 12 years ago
  
  8fbd5380
- Fix reporting of PySpark doctest failures. · 2415c18f
  Josh Rosen authored 12 years ago
  
  2415c18f
Feb 02, 2013
- Formatting · 34a7bcdb
  Matei Zaharia authored 12 years ago
  
  34a7bcdb
- Merge pull request #427 from woggling/dag-sched-tests · 85019d76
  Matei Zaharia authored 12 years ago
  
  Tests for DAGScheduler
  85019d76
- Further simplify checking for Nil. · 7aba123f
  Stephen Haberman authored 12 years ago
  
  7aba123f
- Merge remote-tracking branch 'base/master' into dag-sched-tests · 61079579
  Charles Reiss authored 12 years ago
  
  Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala
  61079579
- Fix dangling old variable names. · cae8a679
  Stephen Haberman authored 12 years ago
  
  cae8a679
- Move executorMemory up into SchedulerBackend. · 696eec32
  Stephen Haberman authored 12 years ago
  
  696eec32
- Merge branch 'master' into sparkmem · 103c375b
  Stephen Haberman authored 12 years ago
  
  103c375b
- Fix createActorSystem not actually using the systemName parameter. · 28e0cb9f
  Stephen Haberman authored 12 years ago
  
  This meant all system names were "spark", which worked, but didn't lead to the most intuitive log output. This fixes createActorSystem to use the passed system name, and refactors Master/Worker to encapsulate their system/actor names instead of having the clients guess at them. Note that the driver system name, "spark", is left as is, and is still repeated a few times, but that seems like a separate issue.
  28e0cb9f
- Code review changes: add sc.stop; style of multiline comments; parens on procedure calls. · 1fd5ee32
  Charles Reiss authored 12 years ago
  
  1fd5ee32
Feb 01, 2013
- Add back test for distinct without parens · ae26911e
  Matei Zaharia authored 12 years ago
  
  ae26911e
- Merge pull request #441 from stephenh/lessnoisyakka · 7ae4b6a2
  Matei Zaharia authored 12 years ago
  
  Reduce the amount of duplicate logging Akka does to stdout.
  7ae4b6a2
- Reduce the amount of duplicate logging Akka does to stdout. · 12c1eb47
  Stephen Haberman authored 12 years ago
  
  Given we have Akka logging go through SLF4j to log4j, we don't need all the extra noise of Akka's stdout logger that is supposedly only used during Akka init time but seems to continue logging lots of noisy network events that we either don't care about or are in the log4j logs anyway.
  
  See: http://doc.akka.io/docs/akka/2.0/general/configuration.html
  
      # Log level for the very basic logger activated during AkkaApplication startup
      # Options: ERROR, WARNING, INFO, DEBUG
      # stdout-loglevel = "WARNING"
  12c1eb47
- Reduced the memory usage of reduce and similar operations · 8b3041c7
  Matei Zaharia authored 12 years ago
  
  These operations used to wait for all the results to be available in an array on the driver program before merging them. They now merge values incrementally as they arrive.
  8b3041c7
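
  The difference between the two strategies in this commit can be sketched roughly as follows. This is an illustration in plain Python rather than Spark's driver code; the function names merge_all_at_once and merge_incrementally are hypothetical, and partition_results stands in for per-partition results arriving one at a time.

```python
from functools import reduce


def merge_all_at_once(partition_results, op):
    """Old approach (as described): buffer every result, then merge."""
    buffered = list(partition_results)  # holds all results in memory at once
    return reduce(op, buffered)


def merge_incrementally(partition_results, op):
    """New approach (as described): fold each result in as it arrives."""
    merged = None
    have_value = False
    for value in partition_results:
        if have_value:
            merged = op(merged, value)
        else:
            merged, have_value = value, True
    if not have_value:
        raise ValueError("cannot reduce an empty collection")
    return merged
```

  Both functions return the same value for an associative op; the incremental version keeps only one merged value alive instead of the full array of results.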
- Merge branch 'master' of github.com:mesos/spark · 4529876d
  Matei Zaharia authored 12 years ago
  
  4529876d
- formatting · 9970926e
  Matei Zaharia authored 12 years ago
  
  9970926e
- Merge pull request #432 from stephenh/moreprivacy · 79c24abe
  Matei Zaharia authored 12 years ago
  
  Add more private declarations.
  79c24abe
- Merge pull request #437 from stephenh/cancelmetacleaner · de340ddf
  Matei Zaharia authored 12 years ago
  
  Stop BlockManagers metadataCleaner.
  de340ddf
- Merge pull request #439 from JoshRosen/spark-580 · 04556507
  Matei Zaharia authored 12 years ago
  
  Use spark.local.dir for PySpark temp files (SPARK-580).
  04556507
- Use spark.local.dir for PySpark temp files (SPARK-580). · e211f405
  Josh Rosen authored 12 years ago
  
  e211f405
- Merge pull request #438 from JoshRosen/spark-674 · b6a60921
  Matei Zaharia authored 12 years ago
  
  Do not launch JavaGateways on workers (SPARK-674).
  b6a60921
- Do not launch JavaGateways on workers (SPARK-674). · 9cc6ff9c
  Josh Rosen authored 12 years ago
  
  The problem was that the gateway was being initialized whenever the pyspark.context module was loaded. The fix uses lazy initialization that occurs only when SparkContext instances are actually constructed. I also made the gateway and jvm variables private. This change results in ~3-4x performance improvement when running the PySpark unit tests.
  9cc6ff9c
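
  The lazy initialization pattern this commit describes might look something like the sketch below. It is not the real pyspark.context module: the Context class, the _ensure_gateway method, and the fake_launch function are hypothetical stand-ins, and the "gateway" here is just a placeholder object.

```python
launch_calls = []  # records each (simulated) gateway launch


def fake_launch():
    """Hypothetical stand-in for starting an expensive gateway process."""
    launch_calls.append(1)
    return object()


class Context:
    # Private class-level handle; nothing is launched when the module loads.
    _gateway = None

    @classmethod
    def _ensure_gateway(cls, launch):
        # Launch only on first use, then reuse the same gateway.
        if cls._gateway is None:
            cls._gateway = launch()
        return cls._gateway

    def __init__(self, launch=fake_launch):
        self._gw = Context._ensure_gateway(launch)
```

  Under this sketch, code that merely imports the module (as a worker would) never triggers a launch; only constructing a Context does, and later instances share the first gateway.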
- remove unneeded (and unused) filter on block info · c6190067
  Imran Rashid authored 12 years ago
  
  c6190067
- Stop BlockManagers metadataCleaner. · 59c57e48
  Stephen Haberman authored 12 years ago
  
  59c57e48
- Merge pull request #433 from rxin/master · 571af313
  Matei Zaharia authored 12 years ago
  
  Changed PartitionPruningRDD's split to make sure it returns the correct split index.
  571af313