Commits · df5fd352735005ce0322d287ae27d72d12a7fc8e · cs525-sp18-g07 / spark

Sep 08, 2013

Add better docs for coalesce. · df5fd352

Stephen Haberman authored 11 years ago

Include the useful tip that if shuffle=true, coalesce can actually
increase the number of partitions.

This makes coalesce more like a generic `RDD.repartition` operation.

(Ideally this `RDD.repartition` could automatically choose either a coalesce or
a shuffle if numPartitions was either less than or greater than, respectively,
the current number of partitions.)

df5fd352
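The behaviour this commit message describes can be illustrated with a toy model. The sketch below is plain Python over in-memory lists, not Spark's implementation: the `coalesce` function here only mimics the documented rule that the partition count cannot grow unless `shuffle` is true, and `repartition` is a hypothetical helper that picks a shuffle only when the requested count exceeds the current one, as the parenthetical above suggests.

```python
from typing import List, TypeVar

T = TypeVar("T")


def coalesce(partitions: List[List[T]], num_partitions: int,
             shuffle: bool = False) -> List[List[T]]:
    """Toy model of coalesce over a list of in-memory partitions."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    if shuffle:
        # With a shuffle, every element is redistributed (round-robin here),
        # so the partition count can shrink or grow.
        out: List[List[T]] = [[] for _ in range(num_partitions)]
        i = 0
        for part in partitions:
            for item in part:
                out[i % num_partitions].append(item)
                i += 1
        return out
    # Without a shuffle, whole neighbouring partitions are merged,
    # so the result never has more partitions than the input.
    target = min(num_partitions, len(partitions))
    if target == 0:
        return []
    out = [[] for _ in range(target)]
    for idx, part in enumerate(partitions):
        out[idx * target // len(partitions)].extend(part)
    return out


def repartition(partitions: List[List[T]], num_partitions: int) -> List[List[T]]:
    """Hypothetical helper: shuffle only when the partition count must grow."""
    return coalesce(partitions, num_partitions,
                    shuffle=num_partitions > len(partitions))


parts = [[1, 2], [3], [4, 5, 6], [7]]
print(len(coalesce(parts, 8)))                # 4: cannot grow without a shuffle
print(len(coalesce(parts, 8, shuffle=True)))  # 8: a shuffle can grow the count
print(repartition(parts, 2))                  # merges neighbours, no shuffle
```

The round-robin redistribution is an assumption made for the sketch; it says nothing about how Spark actually assigns elements during a shuffle.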

- Merge pull request #898 from ilikerps/660 · 04cfb3aa
  Matei Zaharia authored 11 years ago
  
  SPARK-660: Add StorageLevel support in Python
  04cfb3aa
- Whoopsy daisy · a3868544
  Aaron Davidson authored 11 years ago
  
  a3868544
- Merge pull request #900 from pwendell/cdh-docs · 38488aca
  Matei Zaharia authored 11 years ago
  
  Provide docs to describe running on CDH/HDP cluster.
  38488aca

Sep 07, 2013
- Merge pull request #904 from pwendell/master · a8e376ec
  Patrick Wendell authored 11 years ago
  
  Adding Apache license to two files
  a8e376ec
- Adding Apache license to two files · 6d219864
  Patrick Wendell authored 11 years ago
  
  6d219864
- Export StorageLevel and refactor · c1cc8c4d
  Aaron Davidson authored 11 years ago
  
  c1cc8c4d
- File rename · 22b982d2
  Patrick Wendell authored 11 years ago
  
  22b982d2
- Merge pull request #901 from ooyala/2013-09/0.8-doc-changes · cfde85e3
  Matei Zaharia authored 11 years ago
  
  0.8 Doc changes for make-distribution.sh
  cfde85e3
- Merge pull request #903 from rxin/resulttask · 4a7813a2
  Matei Zaharia authored 11 years ago
  
  Fixed the bug that ResultTask was not properly deserializing outputId.
  4a7813a2
- Changes based on feedback · 61c4762d
  Patrick Wendell authored 11 years ago
  
  61c4762d
- Remove reflection, hard-code StorageLevels · 8001687a
  Aaron Davidson authored 11 years ago
  
  The sc.StorageLevel -> StorageLevel pathway is a bit janky, but otherwise the shell would have to call a private method of SparkContext. Having StorageLevel available in sc also doesn't seem like the end of the world. There may be a better solution, though. As for creating the StorageLevel object itself, this seems to be the best way in Python 2 for creating singleton, enum-like objects: http://stackoverflow.com/questions/36932/how-can-i-represent-an-enum-in-python
  8001687a
- CR feedback from Matei · be1ee28c
  Evan Chan authored 11 years ago
  
  be1ee28c
- Merge pull request #892 from jey/fix-yarn-assembly · afe46ba3
  Matei Zaharia authored 11 years ago
  
  YARN build fixes
  afe46ba3
- Fixed the bug that ResultTask was not properly deserializing outputId. · 210eae26
  Reynold Xin authored 11 years ago
  
  210eae26
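The message of commit 8001687a above refers to a Python 2 idiom for singleton, enum-like objects. One common form of that idiom is sketched below. It is an illustration only and may not match what the commit actually does: the `enum` helper, the `Levels` name, and the string values are all assumptions standing in for real StorageLevel objects.

```python
def enum(**members):
    """Build a throwaway class whose attributes are the named constants."""
    return type("Enum", (), members)


# Hypothetical stand-in values; real StorageLevel objects carry more state.
Levels = enum(DISK_ONLY="disk", MEMORY_ONLY="memory")

# Each constant is created once and shared, so identity checks are stable.
assert Levels.MEMORY_ONLY is Levels.MEMORY_ONLY
print(Levels.DISK_ONLY)
```

Because the constants live on a class created once at import time, every caller sees the same objects, which is the singleton-like property the commit message is after.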
Sep 06, 2013
- Memoize StorageLevels read from JVM · b8a0b6ea
  Aaron Davidson authored 11 years ago
  
  b8a0b6ea
- Merge pull request #897 from pwendell/master · 2eebeff5
  Patrick Wendell authored 11 years ago
  
  Docs describing Spark monitoring and instrumentation
  2eebeff5
- Add references to make-distribution.sh · ff1dbf21
  Evan Chan authored 11 years ago
  
  ff1dbf21
- "launch" scripts is more accurate terminology · 88d53f0d
  Evan Chan authored 11 years ago
  
  88d53f0d
- Easier way to start the master · 5a18b854
  Evan Chan authored 11 years ago
  
  5a18b854
- Add notes about starting spark-shell · 76d5d2d3
  Evan Chan authored 11 years ago
  
  76d5d2d3
- Docs describing Spark monitoring and instrumentation · a2a0cf9d
  Patrick Wendell authored 11 years ago
  
  a2a0cf9d
- Provide docs to describe running on CDH/HDP cluster. · e653a9d8
  Patrick Wendell authored 11 years ago
  
  This doc consolidates information relevant to CDH/HDP users in a single place.
  e653a9d8
- Minor YARN build cleanups · 30a32c83
  Jey Kottalam authored 11 years ago
  
  30a32c83
- Fix YARN assembly generation under Maven · 70661246
  Jey Kottalam authored 11 years ago
  
  70661246
- Clarify YARN example · 35ed09f1
  Jey Kottalam authored 11 years ago
  
  35ed09f1
- Hot fix to resolve the compilation error caused by SPARK-821. · 1e15feb5
  Reynold Xin authored 11 years ago
  
  1e15feb5
- Merge pull request #895 from ilikerps/821 · ddcb9d31
  Patrick Wendell authored 11 years ago
  
  SPARK-821: Don't cache results when action run locally on driver
  ddcb9d31
- SPARK-660: Add StorageLevel support in Python · a63d4c7d
  Aaron Davidson authored 11 years ago
  
  It uses reflection... I am not proud of that fact, but it at least ensures compatibility (sans refactoring of the StorageLevel stuff).
  a63d4c7d
Sep 05, 2013
- Reynold's second round of comments · 3a04e76c
  Aaron Davidson authored 11 years ago
  
  3a04e76c
- Merge pull request #891 from xiajunluan/SPARK-864 · 699c331f
  Matei Zaharia authored 11 years ago
  
  [SPARK-864] DAGScheduler Exception if we delete Worker and StandaloneExecutorBackend then add Worker
  699c331f
- Add unit test and address comments · 4f2236a1
  Aaron Davidson authored 11 years ago
  
  4f2236a1
- SPARK-821: Don't cache results when action run locally on driver · 1418d18a
  Aaron Davidson authored 11 years ago
  
  Caching the results of local actions (e.g., rdd.first()) causes the driver to store entire partitions in its own memory, which may be highly constrained. This patch simply makes the CacheManager avoid caching the result of all locally-run computations.
  1418d18a
- Fix bug SPARK-864 · 7c15e3c5
  Andrew xia authored 11 years ago
  
  7c15e3c5
- Merge pull request #893 from ilikerps/master · 5c7494d7
  Patrick Wendell authored 11 years ago
  
  SPARK-884: Add unit test to validate Spark JSON output
  5c7494d7
- Fix line over 100 chars · 714e7f9e
  Aaron Davidson authored 11 years ago
  
  714e7f9e
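The idea behind SPARK-821 in the list above (skip caching when a computation runs locally on the driver) can be sketched as follows. This is a toy model in plain Python, not Spark's CacheManager: the class name `ToyCacheManager`, its methods, and the `run_locally` flag are all hypothetical.

```python
class ToyCacheManager:
    """Toy cache that refuses to store results of locally-run computations."""

    def __init__(self):
        self._cache = {}

    def get_or_compute(self, key, compute, run_locally=False):
        # Serve from the cache when a previous non-local run stored the value.
        if key in self._cache:
            return self._cache[key]
        result = compute()
        # A local run happens in the driver, whose memory may be constrained,
        # so its result is returned without being stored.
        if not run_locally:
            self._cache[key] = result
        return result

    def is_cached(self, key):
        return key in self._cache


manager = ToyCacheManager()
manager.get_or_compute("partition-0", lambda: [1, 2, 3], run_locally=True)
print(manager.is_cached("partition-0"))  # False: local results are not kept
manager.get_or_compute("partition-0", lambda: [1, 2, 3])
print(manager.is_cached("partition-0"))  # True: non-local results are kept
```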
Sep 04, 2013
- Address Patrick's comments · 37db141a
  Aaron Davidson authored 11 years ago
  
  37db141a
- Merge pull request #894 from c0s/master · a5478667
  Matei Zaharia authored 11 years ago
  
  Updating assembly README to reflect recent changes in the build.
  a5478667
- Updating assembly README to reflect recent changes in the build. · 7c7c7e10
  Konstantin Boudnik authored 11 years ago
  
  7c7c7e10
- SPARK-884: Add unit test to validate Spark JSON output · 9e6f2b68
  Aaron Davidson authored 11 years ago
  
  This unit test simply validates that the outputs of the JsonProtocol methods are syntactically valid JSON.
  9e6f2b68
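A test of the kind SPARK-884 describes only needs to check that each output parses as JSON, without inspecting the values. A minimal sketch in Python follows. The actual test targets Spark's JsonProtocol; here `write_worker_info` is a hypothetical writer invented for illustration, and `is_valid_json` is an assumed helper name.

```python
import json


def is_valid_json(text):
    """Return True when text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def write_worker_info(worker_id, cores):
    """Hypothetical writer standing in for a protocol method."""
    return json.dumps({"id": worker_id, "cores": cores})


# The check is purely syntactic: any well-formed document passes.
assert is_valid_json(write_worker_info("worker-1", 4))
assert not is_valid_json('{"id": ')
```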