Commits · 45d964b60666a794ce8d6cc4e8451a3447327499 · cs525-sp18-g07 / spark

Sep 09, 2013
- Style fix: put body of if within curly braces · fdb8b0ee
  Evan Chan authored 11 years ago
  
  fdb8b0ee
- Print out more friendly error if listFiles() fails · 27726079
  Evan Chan authored 11 years ago
  
  listFiles() could return null if the I/O fails, and this currently results in an ugly NPE which is hard to diagnose.
  27726079
- Add metrics-ganglia to core pom file · 2186d932
  Y.CORP.YAHOO.COM\tgraves authored 11 years ago
  
  2186d932
- Use a set since shuffle could change order. · 59003d38
  Stephen Haberman authored 11 years ago
  
  59003d38
- Reword 'evenly distributed' to 'distributed with a hash partitioner'. · 6471bfec
  Stephen Haberman authored 11 years ago
  
  6471bfec
Sep 08, 2013
- Fix an instance where full standalone mode executor IDs were passed to · f9b7f58d
  Matei Zaharia authored 11 years ago
  
  StandaloneSchedulerBackend instead of the smaller IDs used within Spark
  (that lack the application name).
  
  This was reported by ClearStory in
  https://github.com/clearstorydata/spark/pull/9.
  
  Also fixed some messages that said slave instead of executor.
  f9b7f58d
- Fix unit test failure due to changed default · 170b3869
  Matei Zaharia authored 11 years ago
  
  170b3869
- Adding sc name in metrics source · b4e382c2
  Patrick Wendell authored 11 years ago
  
  b4e382c2
- Adding more docs and some code cleanup · c190b48b
  Patrick Wendell authored 11 years ago
  
  c190b48b
- Add better docs for coalesce. · df5fd352
  Stephen Haberman authored 11 years ago
  
  Include the useful tip that if shuffle=true, coalesce can actually
  increase the number of partitions.
  
  This makes coalesce more like a generic `RDD.repartition` operation.
  
  (Ideally this `RDD.repartition` could automatically choose either a coalesce or
  a shuffle if numPartitions was either less than or greater than, respectively,
  the current number of partitions.)
  df5fd352
- Ganglia sink · 8de8ee5d
  Patrick Wendell authored 11 years ago
  
  8de8ee5d
- More fair scheduler docs and property names. · 651a96ad
  Matei Zaharia authored 11 years ago
  
  Also changed uses of "job" terminology to "application" when they
  referred to an entire Spark program, to avoid confusion.
  651a96ad
- Work in progress: · 98fb6982
  Matei Zaharia authored 11 years ago
  
  - Add job scheduling docs
  - Rename some fair scheduler properties
  - Organize intro page better
  - Link to Apache wiki for "contributing to Spark"
  98fb6982
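
The coalesce note in df5fd352 can be illustrated with a small pure-Python sketch. It does not use Spark: partitions are modeled as plain lists, and `coalesce` and `repartition` here are hypothetical stand-ins rather than Spark's implementation. The point it tries to show is that merging without a shuffle can only lower the partition count, while a shuffle redistributes elements and can therefore raise it.

```python
def coalesce(partitions, num_partitions, shuffle=False):
    """Toy model of coalesce over a list of lists (not Spark's implementation)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if shuffle:
        # Redistribute every element round-robin, so the count may grow or shrink.
        result = [[] for _ in range(num_partitions)]
        i = 0
        for part in partitions:
            for item in part:
                result[i % num_partitions].append(item)
                i += 1
        return result
    # No shuffle: whole partitions are merged, so the count can never grow.
    target = min(num_partitions, len(partitions))
    if target == 0:
        return []
    result = [[] for _ in range(target)]
    for idx, part in enumerate(partitions):
        result[idx * target // len(partitions)].extend(part)
    return result


def repartition(partitions, num_partitions):
    """Hypothetical helper: shuffle only when the partition count must increase."""
    return coalesce(partitions, num_partitions,
                    shuffle=num_partitions > len(partitions))
```

Under these assumptions, `repartition(parts, 2)` on four partitions merges neighbours without moving individual elements, while `repartition(parts, 8)` falls back to the shuffle path because merging alone cannot produce more partitions.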

Sep 07, 2013
- Export StorageLevel and refactor · c1cc8c4d
  Aaron Davidson authored 11 years ago
  
  c1cc8c4d
- Remove reflection, hard-code StorageLevels · 8001687a
  Aaron Davidson authored 11 years ago
  
  The sc.StorageLevel -> StorageLevel pathway is a bit janky, but otherwise
  the shell would have to call a private method of SparkContext. Having
  StorageLevel available in sc also doesn't seem like the end of the world.
  There may be a better solution, though.
  
  As for creating the StorageLevel object itself, this seems to be the best
  way in Python 2 for creating singleton, enum-like objects:
  http://stackoverflow.com/questions/36932/how-can-i-represent-an-enum-in-python
  8001687a
- Fixed the bug that ResultTask was not properly deserializing outputId. · 210eae26
  Reynold Xin authored 11 years ago
  
  210eae26
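
For readers unfamiliar with the "singleton, enum-like objects" idea mentioned in 8001687a, here is a minimal sketch of one common pattern that works in Python 2 as well as Python 3: create a plain class and attach one shared instance per named level as a class attribute. The class layout, attribute names, and flag values below are assumptions made for illustration; the commit log does not show the code that was actually written.

```python
class StorageLevel(object):
    """Hypothetical enum-like holder of storage flags (illustration only)."""

    def __init__(self, use_disk, use_memory, deserialized, replication=1):
        self.use_disk = use_disk
        self.use_memory = use_memory
        self.deserialized = deserialized
        self.replication = replication

    def __repr__(self):
        return "StorageLevel(%s, %s, %s, %s)" % (
            self.use_disk, self.use_memory, self.deserialized, self.replication)


# Hard-code one shared instance per named level; callers refer to these
# attributes instead of constructing their own objects.
StorageLevel.DISK_ONLY = StorageLevel(True, False, False)
StorageLevel.MEMORY_ONLY = StorageLevel(False, True, True)
StorageLevel.MEMORY_AND_DISK = StorageLevel(True, True, True)
```

Because each name is bound once at import time, `StorageLevel.MEMORY_ONLY` always refers to the same object, which is what makes identity comparisons behave like an enum.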

Sep 06, 2013
- Memoize StorageLevels read from JVM · b8a0b6ea
  Aaron Davidson authored 11 years ago
  
  b8a0b6ea
- Hot fix to resolve the compilation error caused by SPARK-821. · 1e15feb5
  Reynold Xin authored 11 years ago
  
  1e15feb5
- SPARK-660: Add StorageLevel support in Python · a63d4c7d
  Aaron Davidson authored 11 years ago
  
  It uses reflection... I am not proud of that fact, but it at least ensures compatibility (sans refactoring of the StorageLevel stuff).
  a63d4c7d
Sep 05, 2013
- Reynold's second round of comments · 3a04e76c
  Aaron Davidson authored 11 years ago
  
  3a04e76c
- Add unit test and address comments · 4f2236a1
  Aaron Davidson authored 11 years ago
  
  4f2236a1
- SPARK-821: Don't cache results when action run locally on driver · 1418d18a
  Aaron Davidson authored 11 years ago
  
  Caching the results of local actions (e.g., rdd.first()) causes the driver to store entire partitions in its own memory, which may be highly constrained. This patch simply makes the CacheManager avoid caching the result of all locally-run computations.
  1418d18a
- Fix bug SPARK-864 · 7c15e3c5
  Andrew xia authored 11 years ago
  
  7c15e3c5
- Fix line over 100 chars · 714e7f9e
  Aaron Davidson authored 11 years ago
  
  714e7f9e
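
The behavior described in SPARK-821 (1418d18a) can be sketched in a few lines. This is a toy model, not Spark's CacheManager: the class, the `get_or_compute` method, and the `running_locally` flag are hypothetical names chosen for illustration. The sketch only shows the rule from the commit message, namely that a result computed locally on the driver is returned but not stored.

```python
class CacheManager(object):
    """Toy cache that skips storing results of locally-run computations."""

    def __init__(self):
        self._cache = {}

    def is_cached(self, key):
        return key in self._cache

    def get_or_compute(self, key, compute, running_locally=False):
        # Serve from the cache whenever a value is already stored.
        if key in self._cache:
            return self._cache[key]
        value = compute()
        # Only keep the value when the computation did not run locally,
        # so the driver's memory is not filled with whole partitions.
        if not running_locally:
            self._cache[key] = value
        return value
```

In this model a call with `running_locally=True` always recomputes unless some earlier non-local call already stored the value.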
Sep 04, 2013
- Address Patrick's comments · 37db141a
  Aaron Davidson authored 11 years ago
  
  37db141a
- SPARK-884: Add unit test to validate Spark JSON output · 9e6f2b68
  Aaron Davidson authored 11 years ago
  
  This unit test simply validates that the outputs of the JsonProtocol methods are syntactically valid JSON.
  9e6f2b68
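
The kind of check described in SPARK-884 (9e6f2b68), validating that output is syntactically valid JSON, can be sketched with Python's standard `json` module. The `render_worker_info` function is a hypothetical stand-in for a method that emits a JSON string; it is not part of Spark's JsonProtocol, and the commit's own test is not shown in this log.

```python
import json


def render_worker_info(worker_id, cores):
    """Hypothetical stand-in for a method that emits a JSON string."""
    return json.dumps({"id": worker_id, "cores": cores})


def assert_valid_json(text):
    """Raise AssertionError if text does not parse as JSON."""
    try:
        json.loads(text)
    except ValueError as exc:  # json's decode errors are ValueError subclasses
        raise AssertionError("invalid JSON: %s" % exc)


# The check passes silently when the output parses.
assert_valid_json(render_worker_info("worker-1", 4))
```

A check like this only confirms that the text parses; it says nothing about whether the fields carry the right values.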
Sep 03, 2013
- Address review comments - rename toHash to nonNegativeHash · 1e2474b8
  Mridul Muralidharan authored 11 years ago
  
  1e2474b8
- Fix hash bug - caused failure after 35k stages, sigh · b3a82b7d
  Mridul Muralidharan authored 11 years ago
  
  b3a82b7d
- Minor spacing fix · c592a3c9
  Patrick Wendell authored 11 years ago
  
  c592a3c9
- Update based on review comments. Change function to prependBaseUri and fix formatting. · 41c1b5b9
  Y.CORP.YAHOO.COM\tgraves authored 11 years ago
  
  41c1b5b9
- Review comment changes and update to org.apache packaging · c8cc2761
  Y.CORP.YAHOO.COM\tgraves authored 11 years ago
  
  c8cc2761
- Using configured akka timeouts · bd078850
  Ali Ghodsi authored 11 years ago
  
  bd078850
Sep 02, 2013
- Sort order of imports to match project guidelines · cbfef9b3
  Ali Ghodsi authored 11 years ago
  
  cbfef9b3
- Reynold's comment fixed · 36d8fca2
  Ali Ghodsi authored 11 years ago
  
  36d8fca2
- Brushing the code up slightly · e452bd6d
  Ali Ghodsi authored 11 years ago
  
  e452bd6d
- Enabling getting the actual WEBUI port · cf7b1154
  Ali Ghodsi authored 11 years ago
  
  cf7b1154
- Add missing license headers found with RAT · 12b2f1f9
  Matei Zaharia authored 11 years ago
  
  12b2f1f9
- Fix test · 246bf67f
  Matei Zaharia authored 11 years ago
  
  246bf67f
- Fix spark.io.compression.codec and change default codec to LZF · 9329a7d4
  Matei Zaharia authored 11 years ago
  
  9329a7d4
Sep 01, 2013
- Allow PySpark to launch worker.py directly on Windows · 6550e5e6
  Matei Zaharia authored 11 years ago
  
  6550e5e6