Commits · 8400536456ecff26145244cf74b7c00dd1c7034b · cs525-sp18-g07 / spark

Jan 15, 2014

fix some format problem. · 84005364
CrazyJvm authored 11 years ago

84005364

fix "set MASTER automatically fails" bug. · 7a0c5b5a

CrazyJvm authored 11 years ago

spark-shell is meant to set MASTER automatically when the option is not provided at startup, but the check is too strict.
The condition is "if [[ "x" != "x$SPARK_MASTER_IP" && "y" != "y$SPARK_MASTER_PORT" ]];". Users will set SPARK_MASTER_IP explicitly, but they often leave SPARK_MASTER_PORT unset and rely on Spark's default port, 7077. When SPARK_MASTER_PORT is unset, the condition can never be true, so MASTER is never set. I think we should fall back to the default port when users do not set the port explicitly.

7a0c5b5a
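
The fallback described in 7a0c5b5a can be sketched roughly as follows. This is a minimal illustration, not the actual spark-shell code: the function name `resolve_master` is hypothetical, and the only assumptions taken from the commit message are the variable names SPARK_MASTER_IP and SPARK_MASTER_PORT and the default port 7077.

```shell
# Hypothetical helper (not the real spark-shell script): print the master URL,
# requiring only SPARK_MASTER_IP and defaulting the port to 7077.
resolve_master() {
  # An explicitly provided MASTER always wins.
  if [ -n "${MASTER:-}" ]; then
    echo "$MASTER"
    return 0
  fi
  # Only the IP is required; ${VAR:-7077} substitutes the default port
  # when SPARK_MASTER_PORT is unset or empty.
  if [ -n "${SPARK_MASTER_IP:-}" ]; then
    echo "spark://${SPARK_MASTER_IP}:${SPARK_MASTER_PORT:-7077}"
    return 0
  fi
  # Nothing to derive a master URL from.
  return 1
}
```

A caller could then write `MASTER="$(resolve_master)"` and handle the non-zero exit status however it prefers when neither MASTER nor SPARK_MASTER_IP is set.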

remove "-XX:+UseCompressedStrings" option · 263933da

CrazyJvm authored 11 years ago

Remove the "-XX:+UseCompressedStrings" option from the tuning guide, since JDK 7 no longer supports it.

263933da

Merge pull request #436 from ankurdave/VertexId-case · 3d9e66d9
Reynold Xin authored 11 years ago
```
Rename VertexID -> VertexId in GraphX
```
3d9e66d9

Merge pull request #435 from tdas/filestream-fix · 139c24ef

Patrick Wendell authored 11 years ago

Fixed the flaky tests by making SparkConf not serializable

SparkConf was being serialized with CoGroupedRDD and Aggregator, which somehow caused OptionalJavaException while being deserialized as part of a ShuffleMapTask. SparkConf should not even be serializable (according to conversation with Matei). This change fixes that.

@mateiz @pwendell

139c24ef

Merge pull request #434 from rxin/graphxmaven · 087487e9
Patrick Wendell authored 11 years ago
```
Fixed SVDPlusPlusSuite in Maven build.

This should go into 0.9.0 also.
```
087487e9
Merge remote-tracking branch 'apache/master' into filestream-fix · 0e15bd78
Tathagata Das authored 11 years ago

0e15bd78
Changed SparkConf to not be serializable. And also fixed unit-test log paths... · 1f4718c4
Tathagata Das authored 11 years ago
```
Changed SparkConf to not be serializable. And also fixed unit-test log paths in log4j.properties of external modules.
```
1f4718c4
Fixed SVDPlusPlusSuite in Maven build. · dfb15244
Reynold Xin authored 11 years ago

dfb15244
VertexID -> VertexId · f4d9019a
Ankur Dave authored 11 years ago

f4d9019a

Jan 14, 2014
- Merge pull request #424 from jegonzal/GraphXProgrammingGuide · 3a386e23
  Reynold Xin authored 11 years ago
  
  Additional edits for clarity in the graphx programming guide. Added an overview of the Graph and GraphOps functions and fixed numerous typos.
  3a386e23
- Merge pull request #431 from ankurdave/graphx-caching-doc · ad294db3
  Reynold Xin authored 11 years ago
  
  Describe caching and uncaching in GraphX programming guide
  ad294db3
- Describe GraphX caching and uncaching in guide · 1210ec29
  Ankur Dave authored 11 years ago
  
  1210ec29
- Merge pull request #428 from pwendell/writeable-objects · 74b46acd
  Reynold Xin authored 11 years ago
  
  Don't clone records for text files
  74b46acd
- Merge pull request #429 from ankurdave/graphx-examples-pom.xml · 193a0757
  Reynold Xin authored 11 years ago
  
  Add GraphX dependency to examples/pom.xml
  193a0757
- Merge pull request #427 from pwendell/deprecate-aggregator · d601a76d
  Reynold Xin authored 11 years ago
  
  Deprecate rather than remove old combineValuesByKey function
  d601a76d
- Add GraphX dependency to examples/pom.xml · 8ea056d7
  Ankur Dave authored 11 years ago
  
  8ea056d7
- Style fix · b1b22b7a
  Patrick Wendell authored 11 years ago
  
  b1b22b7a
- Adding fix covering combineCombinersByKey as well · 8ea2cd56
  Patrick Wendell authored 11 years ago
  
  8ea2cd56
- Merge pull request #425 from rxin/scaladoc · 2ce23a55
  Reynold Xin authored 11 years ago
  
  API doc update & make Broadcast public
  
  In #413 Broadcast was mistakenly made private[spark]. I changed it to public again. Also exposing id in public given the R frontend requires that.
  
  Copied some of the documentation from the programming guide to API Doc for Broadcast and Accumulator.
  
  This should be cherry picked into branch-0.9 as well for 0.9.0 release.
  2ce23a55
- Deprecate rather than remove old combineValuesByKey function · b683608c
  Patrick Wendell authored 11 years ago
  
  b683608c
- Don't clone records for text files · 6f965a46
  Patrick Wendell authored 11 years ago
  
  6f965a46
- Fixed a typo in JavaSparkContext's API doc. · f12e506c
  Reynold Xin authored 11 years ago
  
  f12e506c
- Maintain Serializable API compatibility by reverting back to... · 1b5623fd
  Reynold Xin authored 11 years ago
  
  Maintain Serializable API compatibility by reverting back to java.io.Serializable for Broadcast and Accumulator.
  1b5623fd
- Added license header for package.scala in the Java API package. · 55db7741
  Reynold Xin authored 11 years ago
  
  55db7741
- Added package doc for the Java API. · f8c12e94
  Reynold Xin authored 11 years ago
  
  f8c12e94
- Updated API doc for Accumulable and Accumulator. · 6a12b9eb
  Reynold Xin authored 11 years ago
  
  6a12b9eb
- Broadcast variable visibility change & doc update. · 71b3007d
  Reynold Xin authored 11 years ago
  
  Note that the Broadcast class was previously marked as private[spark] by accident. It needs to be public for broadcast variables to work. Also exposing the broadcast variable id.
  71b3007d
- Additional edits for clarity in the graphx programming guide. · 0bba7738
  Joseph E. Gonzalez authored 11 years ago
  
  0bba7738
- Merge pull request #423 from jegonzal/GraphXProgrammingGuide · 3fcc68bf
  Reynold Xin authored 11 years ago
  
  Improving the graphx-programming-guide
  
  This PR will track a few minor improvements to the content and formatting of the graphx-programming-guide.
  3fcc68bf
- Improving the graphx-programming-guide. · 486f37c5
  Joseph E. Gonzalez authored 11 years ago
  
  486f37c5
- Merge pull request #420 from pwendell/header-files · fa75e5e1
  Patrick Wendell authored 11 years ago
  
  Add missing header files
  fa75e5e1
- Add missing header files · 23034798
  Patrick Wendell authored 11 years ago
  
  23034798
- Merge pull request #416 from tdas/filestream-fix · 980250b1
  Patrick Wendell authored 11 years ago
  
  Removed unnecessary DStream operations and updated docs
  
  Removed StreamingContext.registerInputStream and registerOutputStream, which were useless: InputDStream now registers itself, and merely registering a DStream as an output stream causes RDD objects to be created without the RDDs ever being computed. Also made DStream.register() private[streaming] for the same reasons.
  
  Updated docs; in particular, added package documentation for the streaming package.
  
  Also changed NetworkWordCount's input storage level to MEMORY_ONLY, because replication on the local machine causes warning messages (as replication fails), which is scary for a new user trying out his/her first example.
  980250b1
- Fixed loose ends in docs. · f8bd828c
  Tathagata Das authored 11 years ago
  
  f8bd828c
- Merge remote-tracking branch 'apache/master' into filestream-fix · f8e239e0
  Tathagata Das authored 11 years ago
  
  Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala
  f8e239e0
- Merge pull request #415 from pwendell/shuffle-compress · 055be5c6
  Patrick Wendell authored 11 years ago
  
  Enable compression by default for spills
  055be5c6
- Enable compression by default for spills · 0984647a
  Patrick Wendell authored 11 years ago
  
  0984647a
- Removed StreamingContext.registerInputStream and registerOutputStream - they... · 4e497db8
  Tathagata Das authored 11 years ago
  
  Removed StreamingContext.registerInputStream and registerOutputStream - they were useless as InputDStream has been made to register itself. Also made DStream.register() private[streaming] - not useful to expose the confusing function. Updated a lot of documentation.
  4e497db8
- Merge pull request #380 from mateiz/py-bayes · fdaabdc6
  Patrick Wendell authored 11 years ago
  
  Add Naive Bayes to Python MLlib, and some API fixes
  
  - Added a Python wrapper for Naive Bayes
  - Updated the Scala Naive Bayes to match the style of our other algorithms better and in particular make it easier to call from Java (added builder pattern, removed default value in train method)
  - Updated Python MLlib functions to not require a SparkContext; we can get that from the RDD the user gives
  - Added a toString method in LabeledPoint
  - Made the Python MLlib tests run as part of run-tests as well (before they could only be run individually through each file)
  fdaabdc6