Commits · 02a8f54bfa4572908d2d605a85e7a5adf9a36fbc · cs525-sp18-g07 / spark

Jan 13, 2014
- Miscel doc update. · 02a8f54b
  Reynold Xin authored 11 years ago
  
  02a8f54b
- Merge branch 'scaladoc1' of github.com:rxin/incubator-spark into graphx · dc041cd3
  Reynold Xin authored 11 years ago
  
  dc041cd3
- Merge branch 'master' into graphx · e2d25d2d
  Reynold Xin authored 11 years ago
  
  e2d25d2d
- Updated JavaStreamingContext to make scaladoc compile. · 30328c34
  Reynold Xin authored 11 years ago
  
  `sbt/sbt doc` used to fail. This fixed it.
  30328c34
- Merge pull request #2 from jegonzal/GraphXCCIssue · 8038da23
  Ankur Dave authored 11 years ago
  
  Improving documentation and identifying potential bug in CC calculation.
  8038da23
- Add graph loader links to doc · 97cd27e3
  Ankur Dave authored 11 years ago
  
  97cd27e3
- Fix mapReduceTriplets links in doc · 15ca89b1
  Ankur Dave authored 11 years ago
  
  15ca89b1
- Improving documentation and identifying potential bug in CC calculation. · 80e4d98d
  Joseph E. Gonzalez authored 11 years ago
  
  80e4d98d
- Improve EdgeRDD scaladoc · 9fe88627
  Ankur Dave authored 11 years ago
  
  9fe88627
- Further improve VertexRDD scaladocs · ea69cff7
  Ankur Dave authored 11 years ago
  
  ea69cff7
- Merge pull request #400 from tdas/dstream-move · b93f9d42
  Patrick Wendell authored 11 years ago
  
  Moved DStream and PairDStream to org.apache.spark.streaming.dstream

  Similar to the package location of `org.apache.spark.rdd.RDD`, `DStream` has been moved from `org.apache.spark.streaming.DStream` to `org.apache.spark.streaming.dstream.DStream`. I know that the package name is a little long, but I think it's better to keep it consistent with Spark's structure.

  Also fixed persistence of windowed DStream. The RDDs generated by a windowed DStream are essentially unions of underlying RDDs, and persisting these union RDDs would store numerous copies of the underlying data. Instead, setting the persistence level on the windowed DStream now sets the persistence level of the underlying DStream.
  b93f9d42
- Add LiveJournalPageRank example · 8ca97739
  Ankur Dave authored 11 years ago
  
  8ca97739
- Merge pull request #397 from pwendell/host-port · e6ed13f2
  Reynold Xin authored 11 years ago
  
  Remove now un-needed hostPort option

  I noticed this was logging some scary error messages in various places. After I looked into it, I found this option is no longer really used. I removed the option and rewrote the one remaining use case (it was unnecessary there anyway).
  e6ed13f2
- Fixed import formatting. · ffa1d38e
  Tathagata Das authored 11 years ago
  
  ffa1d38e
- Tested and corrected all examples up to mask in the graphx-programming-guide. · 66c9d009
  Joseph E. Gonzalez authored 11 years ago
  
  66c9d009
- Use GraphLoader for algorithms examples in doc · 1efe78a1
  Ankur Dave authored 11 years ago
  
  1efe78a1
Jan 12, 2014

Merge remote-tracking branch 'apache/master' into dstream-move · 777c181d
Tathagata Das authored 11 years ago
```
Conflicts:
	streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala
```
777c181d
Move algorithms to GraphOps · d691e9f4
Ankur Dave authored 11 years ago

d691e9f4
Add TriangleCount example · 20c509b8
Ankur Dave authored 11 years ago

20c509b8

Merge pull request #399 from pwendell/consolidate-off · 0b96d85c

Patrick Wendell authored 11 years ago

Disable shuffle file consolidation by default

After running various performance tests for the 0.9 release, this still seems to have performance issues even on XFS. So let's keep this off-by-default for 0.9 and users can experiment with it depending on their disk configurations.
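
For users who want to experiment as suggested above, the following is a minimal sketch of opting back in per application. It assumes Spark is on the classpath and that the setting is exposed as `spark.shuffle.consolidateFiles` (the property name is not stated in this commit message); the object name, app name, and local master are placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; only the property line matters here.
object ConsolidationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")            // placeholder master
      .setAppName("ConsolidationSketch") // placeholder app name
      // Assumed property name; consolidation is off unless set to "true".
      .set("spark.shuffle.consolidateFiles", "true")

    val sc = new SparkContext(conf)
    // A small job with a shuffle so the setting is actually exercised.
    val counts = sc.parallelize(Seq("a", "b", "a")).map(w => (w, 1)).reduceByKey(_ + _)
    counts.collect().foreach(println)
    sc.stop()
  }
}
```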

0b96d85c

Merge pull request #395 from hsaputra/remove_simpleredundantreturn_scala · 0ab505a2

Patrick Wendell authored 11 years ago

Remove simple redundant return statements for Scala methods/functions

Remove simple redundant return statements for Scala methods/functions:

-) Only change simple return statements at the end of method
-) Ignore the complex if-else check
-) Ignore the ones inside synchronized
-) Make small changes to turn var into val where possible and remove () for simple getters

This hopefully makes the review simpler =)

Pass compile and tests.
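
The kind of cleanup listed above can be illustrated with a small, self-contained sketch. The classes below are hypothetical and are not taken from the Spark code base; they only show a trailing `return`, a `var` that is never reassigned, and `()` on a simple getter, together with the cleaned-up form.

```scala
object RedundantReturn {
  // Before: explicit `return` on the last expression, a `var` that is
  // never reassigned, and `()` on a side-effect-free getter.
  class Before(values: Seq[Int]) {
    def total(): Int = {
      var sum = values.sum
      return sum
    }
  }

  // After: the last expression is the result, `val` replaces `var`,
  // and the getter drops its empty parameter list.
  class After(values: Seq[Int]) {
    def total: Int = {
      val sum = values.sum
      sum
    }
  }

  def main(args: Array[String]): Unit = {
    val xs = Seq(1, 2, 3)
    // Both forms compute the same value; only the style differs.
    assert(new Before(xs).total() == new After(xs).total)
    println(new After(xs).total)
  }
}
```

A `return` inside a more complex if-else chain or a `synchronized` block is a different case, which is why the change leaves those alone.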

0ab505a2

adding Pregel as an operator in GraphOps and cleaning up documentation of GraphOps · 2216319f
Joseph E. Gonzalez authored 11 years ago

2216319f
Documenting Pregel API · c787ff56
Joseph E. Gonzalez authored 11 years ago

c787ff56

Merge pull request #394 from tdas/error-handling · 405bfe86

Patrick Wendell authored 11 years ago

Better error handling in Spark Streaming and more API cleanup

Previously, errors in jobs generated by Spark Streaming (or in the generation of jobs) could not be caught from the main driver thread (i.e. the thread that called StreamingContext.start()) because they were thrown in different threads. With this change, after `ssc.start`, one can call `ssc.awaitTermination()`, which will block until the ssc is closed or there is an exception. This makes it easier to debug.

This change also adds ssc.stop(<stop-spark-context>) where you can stop StreamingContext without stopping the SparkContext.

Also fixes the bug that came up with PRs #393 and #381. MetadataCleaner default value has been changed from 3500 to -1 for normal SparkContext and 3600 when creating a StreamingContext. Also, updated StreamingListenerBus with changes similar to SparkListenerBus in #392.

And changed a lot of protected[streaming] to private[streaming].
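
A minimal sketch of how a driver might use `awaitTermination()` and `stop(<stop-spark-context>)` as described above. It assumes Spark Streaming is on the classpath; the object name, the socket source on `localhost:9999`, the one-second batch interval, and the local master are illustrative placeholders, not part of this change.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical driver program for illustration only.
object AwaitTerminationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("AwaitTerminationSketch")
    val sc = new SparkContext(conf)
    val ssc = new StreamingContext(sc, Seconds(1))

    // Placeholder input: print a few records of each batch read from a socket.
    ssc.socketTextStream("localhost", 9999).print()

    ssc.start()
    try {
      // Blocks the driver thread until the context is stopped, and
      // rethrows an exception raised while running the streaming jobs.
      ssc.awaitTermination()
    } catch {
      case e: Exception => println("streaming job failed: " + e.getMessage)
    } finally {
      // Stop only the StreamingContext; the SparkContext stays usable.
      ssc.stop(stopSparkContext = false)
      println(sc.parallelize(1 to 10).count())
      sc.stop()
    }
  }
}
```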

405bfe86

Merge pull request #398 from pwendell/streaming-api · 28a6b0cd

Patrick Wendell authored 11 years ago

Rename DStream.foreach to DStream.foreachRDD

`foreachRDD` makes it clear that the granularity of this operator is per-RDD.
As it stands, `foreach` is inconsistent with `map`, `filter`, and the other
DStream operators which get pushed down to individual records within each RDD.
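
The difference in granularity can be sketched as follows. This assumes Spark Streaming is on the classpath; the object name, the socket source on `localhost:9999`, and the `"ERROR"` filter are made up for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical driver program for illustration only.
object ForeachRDDSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("ForeachRDDSketch")
    val ssc = new StreamingContext(conf, Seconds(1))
    val lines = ssc.socketTextStream("localhost", 9999)

    // map and filter are applied to each record inside every RDD of the stream.
    val errors = lines.map(_.trim).filter(_.contains("ERROR"))

    // foreachRDD is invoked once per batch and receives that batch's whole RDD.
    errors.foreachRDD { rdd =>
      println("errors in this batch: " + rdd.count())
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```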

28a6b0cd

Disable shuffle file consolidation by default · 2802cc80
Patrick Wendell authored 11 years ago

2802cc80
Address code review concerns and comments. · 5a8abfb7
Henry Saputra authored 11 years ago

5a8abfb7
Fixed persistence logic of WindowedDStream, and fixed default persistence level of input streams. · 034f89aa
Tathagata Das authored 11 years ago

034f89aa
Adding deprecated versions of old code · e6e20cee
Patrick Wendell authored 11 years ago

e6e20cee
Merge remote-tracking branch 'apache/master' into dstream-move · 74d01262
Tathagata Das authored 11 years ago

74d01262
Merge remote-tracking branch 'apache/master' into error-handling · aa2c9938
Tathagata Das authored 11 years ago

aa2c9938
Merge branch 'error-handling' into dstream-move · d1820fef
Tathagata Das authored 11 years ago

d1820fef
Changed StreamingContext.stopForWait to awaitTermination. · c7fabb74
Tathagata Das authored 11 years ago

c7fabb74

Rename DStream.foreach to DStream.foreachRDD · f4d77f8c

Patrick Wendell authored 11 years ago

`foreachRDD` makes it clear that the granularity of this operator is per-RDD.
As it stands, `foreach` is inconsistent with `map`, `filter`, and the other
DStream operators which get pushed down to individual records within each RDD.

f4d77f8c

Merge pull request #396 from pwendell/executor-env · 074f5023

Patrick Wendell authored 11 years ago

Setting load defaults to true in executor

This preserves the behavior in earlier releases. If properties are set for the executors via `spark-env.sh` on the slaves, then they should take precedence over Spark defaults. This is useful if system administrators are setting properties for a standalone cluster, such as shuffle locations.

/cc @andrewor14 who initially reported this issue.

074f5023

Add connected components example to doc · 7a4bb863
Ankur Dave authored 11 years ago

7a4bb863

Merge pull request #392 from rxin/listenerbus · 82e2b92c

Reynold Xin authored 11 years ago

Stop SparkListenerBus daemon thread when DAGScheduler is stopped.

Otherwise this leads to hundreds of SparkListenerBus daemon threads in our unit tests (and is also problematic if a user application launches multiple SparkContexts).

82e2b92c

Removing mentions in tests · 0bb33076
Patrick Wendell authored 11 years ago

0bb33076
Remove now un-needed hostPort option · 0d4886c0
Patrick Wendell authored 11 years ago

0d4886c0
Fixed bugs to ensure better cleanup of JobScheduler, JobGenerator and... · 7883b8f5
Tathagata Das authored 11 years ago
```
Fixed bugs to ensure better cleanup of JobScheduler, JobGenerator and NetworkInputTracker upon close.
```
7883b8f5