- Jan 15, 2014
-
Kay Ousterhout authored
Prior to this commit, if a task crashes the JVM, the task (and all other tasks running on that executor) is marked as KILLED rather than FAILED. As a result, the TaskSetManager will retry the task indefinitely rather than failing the job after maxFailures. This commit fixes that problem by marking tasks as FAILED rather than KILLED when an executor is lost. The downside of this commit is that if task A fails because another task running on the same executor caused the JVM to crash, the failure will incorrectly be counted as a failure of task A. This should not be an issue because we typically set maxFailures to 3, and it is unlikely that a task will be co-located with a JVM-crashing task multiple times.
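A minimal sketch of the failure accounting this describes, under assumed names (FailureCounter, taskFailed, maxTaskFailures are illustrative, not Spark's actual TaskSetManager internals):

```scala
import scala.collection.mutable

// Hypothetical sketch: tasks lost with an executor now count as FAILED, so
// repeated failures abort the job after maxTaskFailures instead of being
// retried forever as KILLED tasks were.
class FailureCounter(maxTaskFailures: Int) {
  private val numFailures = mutable.Map[Long, Int]().withDefaultValue(0)

  // Called when an executor is lost: its running tasks are treated as FAILED.
  def executorLost(runningTaskIds: Seq[Long]): Unit =
    runningTaskIds.foreach(taskFailed)

  def taskFailed(taskId: Long): Unit = {
    numFailures(taskId) += 1
    if (numFailures(taskId) >= maxTaskFailures)
      sys.error(s"Task $taskId failed $maxTaskFailures times; aborting the job")
    // otherwise the task would be resubmitted for another attempt
  }
}
```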
-
Patrick Wendell authored
GraphX shouldn't list Spark as provided. I noticed this when building an application against GraphX to audit the released artifacts.
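For illustration, the fix amounts to declaring the dependency without provided scope in graphx/pom.xml; the snippet below is a hedged sketch, not the actual diff, and the artifact coordinates are assumptions:

```xml
<!-- Sketch: spark-core as a normal (compile-scope) dependency,
     i.e. without <scope>provided</scope>. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>${project.version}</version>
</dependency>
```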
-
Patrick Wendell authored
-
Patrick Wendell authored
Updated Debian packaging
-
Thomas Graves authored
More YARN code refactoring. Extract the code that the yarn alpha/stable Client and WorkerRunnable have in common into traits in the common directory, and have both versions extend them, to reduce duplication. The same could be done for the remaining files in alpha/stable, but those files' shared code is interleaved with version-specific API calls throughout their functions; refactoring them would need much closer review and might split functions into pieces too small to be worthwhile, so it may not deserve the same treatment. For now this is applied only to these two files.
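An illustrative sketch of the pattern (all names hypothetical): shared logic lives in a trait under the common directory, and the alpha/stable variants extend it, overriding only the version-specific API calls.

```scala
// Shared behavior, identical across YARN versions, lives in one place.
trait ClientBase {
  def prepareResources(): Unit = {
    // logic common to yarn-alpha and yarn-stable
  }
  def submitApplication(): Unit // differs between the two YARN APIs
}

// Each version supplies only its own API calls.
class AlphaClient extends ClientBase {
  override def submitApplication(): Unit = { /* yarn-alpha API call */ }
}

class StableClient extends ClientBase {
  override def submitApplication(): Unit = { /* yarn-stable API call */ }
}
```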
-
Reynold Xin authored
Rename VertexID -> VertexId in GraphX
-
Patrick Wendell authored
Fixed the flaky tests by making SparkConf not serializable. SparkConf was being serialized with CoGroupedRDD and Aggregator, which somehow caused OptionalJavaException while being deserialized as part of a ShuffleMapTask. SparkConf should not even be serializable (according to a conversation with Matei). This change fixes that. @mateiz @pwendell
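A small usage sketch of the consequence of this change (the config key and values are made up): settings a closure needs are read from SparkConf on the driver, so tasks capture only plain serializable values, never the conf itself.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("example").setMaster("local[2]")
val sc = new SparkContext(conf)

// Extract the setting up front; a plain Int serializes fine with the closure.
val threshold = conf.getInt("example.threshold", 10)
val count = sc.parallelize(1 to 100).filter(_ > threshold).count()
```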
-
Patrick Wendell authored
Fixed SVDPlusPlusSuite in Maven build. This should go into 0.9.0 also.
-
Tathagata Das authored
-
Tathagata Das authored
Changed SparkConf to not be serializable. Also fixed the unit-test log paths in the log4j.properties of the external modules.
-
Reynold Xin authored
-
Mark Hamstra authored
-
Ankur Dave authored
-
Mark Hamstra authored
-
- Jan 14, 2014
-
Reynold Xin authored
Additional edits for clarity in the graphx programming guide. Added an overview of the Graph and GraphOps functions and fixed numerous typos.
-
Reynold Xin authored
Describe caching and uncaching in GraphX programming guide
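A brief usage sketch of the pattern that guide section covers, assuming an existing SparkContext sc and a placeholder edge-list path:

```scala
import org.apache.spark.graphx.GraphLoader

// Cache a graph reused by several operations, then release the old vertex
// data once a derived graph supersedes it.
val graph = GraphLoader.edgeListFile(sc, "hdfs://.../edges.txt").cache()
val numTriplets = graph.triplets.count()  // first action materializes the cache
val doubled = graph.mapVertices((id, attr) => attr * 2)
graph.unpersistVertices(blocking = false) // uncache vertices no longer needed
```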
-
Ankur Dave authored
-
Reynold Xin authored
Don't clone records for text files
-
Reynold Xin authored
Add GraphX dependency to examples/pom.xml
-
Reynold Xin authored
Deprecate rather than remove old combineValuesByKey function
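A generic sketch of the deprecate-rather-than-remove pattern (the signatures here are simplified stand-ins, not the actual Aggregator code): the old entry point stays for source compatibility, emits a compile-time warning, and forwards to the replacement.

```scala
import scala.collection.mutable

class Combiner[K, V, C](createCombiner: V => C, mergeValue: (C, V) => C) {
  def combineValuesByKey(iter: Iterator[(K, V)], context: AnyRef): Iterator[(K, C)] = {
    val combiners = mutable.Map[K, C]()
    for ((k, v) <- iter) {
      combiners(k) = combiners.get(k) match {
        case Some(c) => mergeValue(c, v)
        case None    => createCombiner(v)
      }
    }
    combiners.iterator
  }

  // Kept so existing callers still compile, but flagged at compile time.
  @deprecated("use the variant that also takes a context", "0.9.0")
  def combineValuesByKey(iter: Iterator[(K, V)]): Iterator[(K, C)] =
    combineValuesByKey(iter, null)
}
```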
-
Ankur Dave authored
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Reynold Xin authored
API doc update & make Broadcast public. In #413 Broadcast was mistakenly made private[spark]. I changed it back to public. Also exposed the id publicly, since the R frontend requires it. Copied some of the documentation from the programming guide to the API docs for Broadcast and Accumulator. This should be cherry-picked into branch-0.9 as well for the 0.9.0 release.
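A minimal usage sketch of the now-public Broadcast API, including the exposed id (the data and app name are placeholders):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("bc-demo").setMaster("local[2]"))

// Ship a read-only lookup table to all executors once.
val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))
println(s"broadcast id: ${lookup.id}") // the publicly exposed id
val names = sc.parallelize(Seq(1, 2, 3)).map(i => lookup.value.getOrElse(i, "?")).collect()
```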
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Reynold Xin authored
-
Reynold Xin authored
Maintain Serializable API compatibility by reverting to java.io.Serializable for Broadcast and Accumulator.
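In shape, the compatibility point looks roughly like this (the class body is an illustrative stand-in, not the actual source):

```scala
// Extending java.io.Serializable directly keeps the Java-visible type
// hierarchy and serialized form stable for existing user code.
abstract class BroadcastSketch[T](val id: Long) extends java.io.Serializable {
  def value: T
}
```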
-
Reynold Xin authored
-
Reynold Xin authored
-
Reynold Xin authored
-
Reynold Xin authored
Note that previously the Broadcast class was accidentally marked as private[spark]. It needs to be public for broadcast variables to work. Also exposing the broadcast variable id.
-
Joseph E. Gonzalez authored
-
Reynold Xin authored
Improving the graphx-programming-guide. This PR will track a few minor improvements to the content and formatting of the graphx-programming-guide.
-
Joseph E. Gonzalez authored
-
Patrick Wendell authored
Add missing header files
-
Patrick Wendell authored
-
Patrick Wendell authored
Removed unnecessary DStream operations and updated docs. Removed StreamingContext.registerInputStream and registerOutputStream, as they were useless: InputDStream has been made to register itself, and merely registering a DStream as an output stream causes RDD objects to be created but never computed. Also made DStream.register() private[streaming] for the same reasons. Updated the docs; in particular, added package documentation for the streaming package. Also changed NetworkWordCount's input storage level to MEMORY_ONLY, since replication on the local machine fails and produces warning messages that can scare a new user trying out his/her first example.
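A usage sketch reflecting both changes (host and port are placeholders): the input stream registers itself, so no explicit register call appears, and the socket stream is created with MEMORY_ONLY to avoid local replication warnings.

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext("local[2]", "NetworkWordCount", Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_ONLY)
lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
ssc.start()
ssc.awaitTermination()
```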
-
Tathagata Das authored
-
Tathagata Das authored
Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala
-