- Jan 15, 2014
-
-
CrazyJvm authored
-
CrazyJvm authored
spark-shell is meant to set MASTER automatically when the option is not provided at startup, but there is a problem. The condition is "if [[ "x" != "x$SPARK_MASTER_IP" && "y" != "y$SPARK_MASTER_PORT" ]];". We will certainly set SPARK_MASTER_IP explicitly, but we probably will not set SPARK_MASTER_PORT and will just rely on Spark's default port 7077. So if SPARK_MASTER_PORT is not set, the condition is never true. I think we should just use the default port when users do not set one explicitly.
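A minimal sketch of the fallback this commit message proposes, not the actual spark-shell change: the helper name resolve_master and the example address are hypothetical, and only the variables SPARK_MASTER_IP, SPARK_MASTER_PORT, MASTER and the default port 7077 come from the message itself. It relies on the standard ${var:-default} parameter expansion so that an unset or empty port falls back to 7077.

```shell
# Hypothetical helper (not from the Spark source): builds a master URL,
# falling back to the default port 7077 when no port is given.
resolve_master() {
  ip="$1"
  port="${2:-7077}"   # use 7077 when the second argument is unset or empty
  if [ -n "$ip" ]; then
    echo "spark://${ip}:${port}"
  fi
}

# Example: the IP is set explicitly, the port is left unset.
SPARK_MASTER_IP="192.168.1.10"   # illustrative address, an assumption
unset SPARK_MASTER_PORT
MASTER="$(resolve_master "$SPARK_MASTER_IP" "${SPARK_MASTER_PORT:-}")"
echo "$MASTER"
```

With this shape, MASTER is still left empty when SPARK_MASTER_IP is not set, so only the port check is relaxed.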
-
CrazyJvm authored
Remove the "-XX:+UseCompressedStrings" option from the tuning guide, since JDK 7 no longer supports it.
-
Reynold Xin authored
Rename VertexID -> VertexId in GraphX
-
Patrick Wendell authored
Fixed the flaky tests by making SparkConf not serializable. SparkConf was being serialized with CoGroupedRDD and Aggregator, which somehow caused OptionalJavaException when it was deserialized as part of a ShuffleMapTask. SparkConf should not even be serializable (according to a conversation with Matei). This change fixes that. @mateiz @pwendell
-
Patrick Wendell authored
Fixed SVDPlusPlusSuite in Maven build. This should go into 0.9.0 also.
-
Tathagata Das authored
-
Tathagata Das authored
Changed SparkConf to not be serializable. Also fixed the unit-test log paths in the log4j.properties of the external modules.
-
Reynold Xin authored
-
Ankur Dave authored
-
- Jan 14, 2014
-
-
Reynold Xin authored
Additional edits for clarity in the GraphX programming guide. Added an overview of the Graph and GraphOps functions and fixed numerous typos.
-
Reynold Xin authored
Describe caching and uncaching in GraphX programming guide
-
Ankur Dave authored
-
Reynold Xin authored
Don't clone records for text files
-
Reynold Xin authored
Add GraphX dependency to examples/pom.xml
-
Reynold Xin authored
Deprecate rather than remove old combineValuesByKey function
-
Ankur Dave authored
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Reynold Xin authored
API doc update & make Broadcast public. In #413 Broadcast was mistakenly made private[spark]. I changed it to public again. Also exposed id publicly, since the R frontend requires it. Copied some of the documentation from the programming guide to the API doc for Broadcast and Accumulator. This should be cherry-picked into branch-0.9 as well for the 0.9.0 release.
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Reynold Xin authored
-
Reynold Xin authored
Maintain Serializable API compatibility by reverting to java.io.Serializable for Broadcast and Accumulator.
-
Reynold Xin authored
-
Reynold Xin authored
-
Reynold Xin authored
-
Reynold Xin authored
Note that the Broadcast class was previously marked as private[spark] by accident. It needs to be public for broadcast variables to work. Also exposed the broadcast variable id.
-
Joseph E. Gonzalez authored
-
Reynold Xin authored
Improving the graphx-programming-guide. This PR will track a few minor improvements to the content and formatting of the graphx-programming-guide.
-
Joseph E. Gonzalez authored
-
Patrick Wendell authored
Add missing header files
-
Patrick Wendell authored
-
Patrick Wendell authored
Removed unnecessary DStream operations and updated docs. Removed StreamingContext.registerInputStream and registerOutputStream, as they were useless. InputDStream has been made to register itself, and merely registering a DStream as an output stream causes RDD objects to be created without the RDDs ever being computed. Also made DStream.register() private[streaming] for the same reasons. Updated docs, and in particular added package documentation for the streaming package. Also changed NetworkWordCount's input storage level to MEMORY_ONLY: replication on the local machine causes warning messages (as replication fails), which is scary for a new user trying out their first example.
-
Tathagata Das authored
-
Tathagata Das authored
Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala
-
Patrick Wendell authored
Enable compression by default for spills
-
Patrick Wendell authored
-
Tathagata Das authored
Removed StreamingContext.registerInputStream and registerOutputStream - they were useless as InputDStream has been made to register itself. Also made DStream.register() private[streaming] - not useful to expose the confusing function. Updated a lot of documentation.
-
Patrick Wendell authored
Add Naive Bayes to Python MLlib, and some API fixes
  - Added a Python wrapper for Naive Bayes
  - Updated the Scala Naive Bayes to match the style of our other algorithms better and in particular make it easier to call from Java (added builder pattern, removed default value in train method)
  - Updated Python MLlib functions to not require a SparkContext; we can get that from the RDD the user gives
  - Added a toString method in LabeledPoint
  - Made the Python MLlib tests run as part of run-tests as well (before they could only be run individually through each file)
-