Commits · c5483e39f9427d82273381e2ae3d63b44df03077 · cs525-sp18-g07 / spark

Dec 16, 2010
- ParallelLocalFileShuffle does NOT use HttpPipelining at all. · c5483e39
  Mosharaf Chowdhury authored 14 years ago
  
  - Config option related to pipelining has been removed.
  - Summary: Basic -> Pipelining / Parallel -> NO pipelining
  c5483e39
Dec 15, 2010

- Updated java-opts file of this branch. · 56d8a2af
  Mosharaf Chowdhury authored 14 years ago
  
  - Renamed some ParallelLocalFileShuffle config options for clarity.
  56d8a2af
- Brought back Matei's LocalFileShuffle implementation as BasicLocalFileShuffle · 25fb3c4c
  Mosharaf Chowdhury authored 14 years ago
  
  - Renamed parallel-pull version to ParallelLocalFileShuffle
  - Note that setting max-concurrent connections to 1 in ParallelLocalFileShuffle should essentially be the same as BasicLocalFileShuffle
  25fb3c4c

Dec 07, 2010
- UseHttpPipelining option is brought back in. It works! · f82cc17b
  Mosharaf Chowdhury authored 14 years ago
  
  f82cc17b
Dec 04, 2010
- Multiple connections created at a time. No upper limit on the server side though. · 7e2d72c3
  Mosharaf Chowdhury authored 14 years ago
  
  7e2d72c3
Dec 02, 2010
- UseHttpPipelining is 'true' by default. · 540a4116
  Mosharaf Chowdhury authored 14 years ago
  
  540a4116
- Enabling/disabling HTTP pipelining is a config option now. Performance... · 0de859fb
  Mosharaf Chowdhury authored 14 years ago
  
  Enabling/disabling HTTP pipelining is a config option now. Performance tradeoffs are not obvious yet.
  0de859fb
Nov 28, 2010
- Added log messages for benchmarking. · 8494b3a4
  Mosharaf Chowdhury authored 14 years ago
  
  - Added GroupByTest.scala for benchmarking.
  8494b3a4
Nov 13, 2010
- Remove -unchecked compiler parameter · f8ea98d9
  Matei Zaharia authored 14 years ago
  
  f8ea98d9
Nov 12, 2010
- Added a shuffle test with negative hash codes for some keys (this was a bug earlier) · f8966ffc
  Matei Zaharia authored 14 years ago
  
  f8966ffc
- Unit tests for shuffle operations. Fixes #33. · d0a99665
  Matei Zaharia authored 14 years ago
  
  d0a99665
Nov 09, 2010
- Added options for using an external HTTP server with LocalFileShuffle · 7b25ab87
  Matei Zaharia authored 14 years ago
  
  7b25ab87
Nov 08, 2010
- Removed unnecessary collectAsMap · 504f839c
  Matei Zaharia authored 14 years ago
  
  504f839c
- Made shuffle algorithm pluggable and added LocalFileShuffle. · 9d3f05a9
  Matei Zaharia authored 14 years ago
  
  9d3f05a9
Nov 06, 2010
- Create output files one by one instead of at the same time in the map · d9ea6d69
  Matei Zaharia authored 14 years ago
  
  phase of DfsShuffle.
  d9ea6d69
Nov 04, 2010
- Merge branch 'matei-shuffle' of github.com:mesos/spark into matei-shuffle · 16ff4dc0
  Matei Zaharia authored 14 years ago
  
  16ff4dc0
- Properly set the number of output splits in DFS shuffle · d984b8ab
  Matei Zaharia authored 14 years ago
  
  d984b8ab
- Fixed a small bug in DFS shuffle -- the number of reduce tasks was not being... · 4cc0984b
  root authored 14 years ago
  
  Fixed a small bug in DFS shuffle -- the number of reduce tasks was not being set based on numOutputSplits
  4cc0984b
- Added groupBy function in RDD · 96f0be93
  Matei Zaharia authored 14 years ago
  
  96f0be93
- Added reduceByKey, groupByKey and join operations based on combine, as · 72ec298c
  Matei Zaharia authored 14 years ago
  
  well as versions of the shuffle operations that set the number of splits automatically.
  72ec298c
- Fixed a bug with negative hashcodes · d947cb97
  Matei Zaharia authored 14 years ago
  
  d947cb97
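
The commit message does not say where the negative hash codes caused trouble. A common way this class of bug appears on the JVM is that `hashCode % n` is negative for a negative hash code, which yields an invalid bucket index. The sketch below illustrates that pitfall and one way to avoid it; `PartitionSketch` and `nonNegativeMod` are hypothetical names for illustration, not code taken from this commit.

```scala
object PartitionSketch {
  // Hypothetical helper: maps any hash code to a bucket in [0, numSplits).
  // On the JVM, the % operator keeps the sign of the dividend, so a
  // negative hash code gives a negative remainder that must be shifted up.
  def nonNegativeMod(hash: Int, numSplits: Int): Int = {
    val raw = hash % numSplits
    if (raw < 0) raw + numSplits else raw
  }
}
```

With this helper, a key whose hash code is -7 lands in bucket 1 of 4 rather than at index -3.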
- Made DFS shuffle's "reduce tasks" fetch inputs in a random order so they · 44530c31
  Matei Zaharia authored 14 years ago
  
  don't all hit the same nodes at the same time.
  44530c31
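
The idea of fetching inputs in a random order can be sketched as follows: each reduce task shuffles the list of input indices before fetching, so concurrent tasks are unlikely to request data from the same node at the same moment. This is a minimal sketch under assumptions; `FetchOrderSketch` and `fetchOrder` are hypothetical names, and the actual DFS shuffle code may be organized differently.

```scala
import scala.util.Random

object FetchOrderSketch {
  // Hypothetical helper: returns the input indices 0 until numInputs
  // in a random order. Each reduce task would call this on its own,
  // so different tasks visit the inputs in different sequences.
  def fetchOrder(numInputs: Int, rng: Random = new Random()): List[Int] =
    rng.shuffle((0 until numInputs).toList)
}
```

Every index still appears exactly once, so the task reads the same data as before; only the visiting order changes.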
Nov 03, 2010
- Initial work towards a simple HDFS-based shuffle. · 820dac5a
  Matei Zaharia authored 14 years ago
  
  820dac5a
Nov 02, 2010
- Made alltests write test output as XML in build/test_results · 648f4293
  Matei Zaharia authored 14 years ago
  
  648f4293
- 'Running on Mesos' test is now only run when MESOS_HOME is set · 6f93baa4
  Matei Zaharia authored 14 years ago
  
  6f93baa4
Oct 24, 2010
- Added initial attempt at a BoundedMemoryCache · dd7c5d8e
  Matei Zaharia authored 14 years ago
  
  dd7c5d8e
- Added SizeEstimator class for use by caches · edf86fdb
  Matei Zaharia authored 14 years ago
  
  edf86fdb
Oct 23, 2010
- Made caching pluggable and added soft reference and weak reference caches. · a481e237
  Matei Zaharia authored 14 years ago
  
  a481e237
- Renamed aggregateSplit() to splitRdd(), plus some style fixes · 93a200bc
  Matei Zaharia authored 14 years ago
  
  93a200bc
Oct 19, 2010
- Fixed a bug with scheduling of tasks that have no locality preferences. · 787faf0d
  Matei Zaharia authored 14 years ago
  
  These tasks were being subjected to delay scheduling but then counted as having been launched on a preferred node. The solution is to have a separate queue for them and treat them as preferred during scheduling.
  787faf0d
- Undid some changes that Mosharaf inadvertently committed to master. · 0e0ec835
  Matei Zaharia authored 14 years ago
  
  0e0ec835
Oct 18, 2010
- Merge branch 'master' of git@github.com:mesos/spark · bf7055de
  Mosharaf Chowdhury authored 14 years ago
  
  Conflicts: src/scala/spark/SparkContext.scala
  Using the latest one from Matei.
  bf7055de
Oct 17, 2010
- Less hacky way of preventing config files from being overwritten when a template file changes · b940164d
  Matei Zaharia authored 14 years ago
  
  b940164d
Oct 16, 2010
- Changed the config files that were included in git to templates which · e5fb280e
  Matei Zaharia authored 14 years ago
  
  are used to create an initial copy of each config file if the user does not have one. This way, users won't accidentally commit their changes to config files to git.
  e5fb280e
- Fixed some whitespace · 023ed194
  Matei Zaharia authored 14 years ago
  
  023ed194
- Added support for generic Hadoop InputFormats and refactored textFile to · 74bbfa91
  Matei Zaharia authored 14 years ago
  
  use this. Closes #12.
  74bbfa91
- Renamed HdfsFile to HadoopFile · 03238cb7
  Matei Zaharia authored 14 years ago
  
  03238cb7
- Simplified UnionRDD slightly and added a SparkContext.union method for... · 0e2adecd
  Matei Zaharia authored 14 years ago
  
  Simplified UnionRDD slightly and added a SparkContext.union method for efficiently union-ing a large number of RDDs
  0e2adecd
- Removed setSparkHome method on SparkContext in favor of having an · 166d9f91
  Matei Zaharia authored 14 years ago
  
  optional constructor parameter, so that the scheduler is guaranteed that a Spark home has been set when it first builds its executor arg.
  166d9f91
- Added the ability to specify a list of JAR files when creating a · 1c082ad5
  Matei Zaharia authored 14 years ago
  
  SparkContext and have the master node serve those to workers.
  1c082ad5