Commits · 3db404a43a90a9cca37090381857dc955496385a · cs525-sp18-g07 / spark

Sep 01, 2013
- Move some classes to more appropriate packages: · 0a8cc309
  Matei Zaharia authored 11 years ago
  
  * RDD, *RDDFunctions -> org.apache.spark.rdd
  * Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
  * JavaSerializer, KryoSerializer -> org.apache.spark.serializer
- Fix some URLs · 5701eb92
  Matei Zaharia authored 11 years ago
  
- Initial work to rename package to org.apache.spark · 46eecd11
  Matei Zaharia authored 11 years ago
  
Aug 29, 2013

- Update Maven build to create assemblies expected by new scripts · 666d93c2
  Matei Zaharia authored 11 years ago
  
  This includes the following changes:
  - The "assembly" package now builds in Maven by default, and creates an
    assembly containing both hadoop-client and Spark, unlike the old
    BigTop distribution assembly that skipped hadoop-client
  - There is now a bigtop-dist package to build the old BigTop assembly
  - The repl-bin package is no longer built by default since the scripts
    don't rely on it; instead it can be enabled with -Prepl-bin
  - Py4J is now included in the assembly/lib folder as a local Maven repo,
    so that the Maven package can link to it
  - run-example now adds the original Spark classpath as well because the
    Maven examples assembly lists spark-core and such as provided
  - The various Maven projects add a spark-yarn dependency correctly


- Fix finding of assembly JAR, as well as some pointers to ./run · aab345c4
  Matei Zaharia authored 11 years ago


- Change build and run instructions to use assemblies · 53cd50c0
  Matei Zaharia authored 11 years ago
  
  This commit makes Spark invocation saner by using an assembly JAR to
  find all of Spark's dependencies instead of adding all the JARs in
  lib_managed. It also packages the examples into an assembly and uses
  that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
  with two better-named scripts: "run-example" for examples, and
  "spark-class" for Spark internal classes (e.g. REPL, master). This
  is also designed to reduce the confusion people have when trying to use
  "run" to run their own classes; it was not meant to do that, but now,
  if they look at it, they can at least modify run-example to do a decent
  job for them.
  
  As part of this, Bagel's examples are now moved to the examples
  package instead of bagel.


Aug 18, 2013
- Remove redundant dependencies from POMs · 23f4622a
  Jey Kottalam authored 11 years ago
  
Aug 16, 2013
- Updates to repl and example POMs to match SBT build · c1e547bb
  Jey Kottalam authored 11 years ago
  
- Maven build now also works with YARN · ad580b94
  Jey Kottalam authored 11 years ago
  
- Don't mark hadoop-client as 'provided' · 9dd15fe7
  Jey Kottalam authored 11 years ago
  
- Maven build now works with CDH hadoop-2.0.0-mr1 · 11b42a84
  Jey Kottalam authored 11 years ago
  
- Initial changes to make Maven build agnostic of hadoop version · 353fab24
  Jey Kottalam authored 11 years ago
  
Aug 15, 2013
- make SparkHadoopUtil a member of SparkEnv · 4f43fd79
  Jey Kottalam authored 11 years ago
  
Aug 11, 2013
- Fixed path to JavaALS.java and JavaKMeans.java, fixed hadoop2-yarn profile · 2d97cc46
  Alexander Pivovarov authored 11 years ago
  
Aug 10, 2013
- Optimize Scala PageRank to use reduceByKey · 4c4f7691
  Matei Zaharia authored 11 years ago
  
Aug 08, 2013
- Optimize JavaPageRank to use reduceByKey instead of groupByKey · 06303a62
  Matei Zaharia authored 11 years ago
  
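The switch from groupByKey to reduceByKey in the two PageRank commits above can be illustrated without Spark. The following is a minimal sketch using plain Java collections, not the Spark API: the class `ReduceByKeySketch`, the `Contribution` type, and both method names are hypothetical stand-ins chosen for illustration. It contrasts collecting every value per key before summing with folding each value into a running total per key, which avoids materializing the per-key lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReduceByKeySketch {
    // Hypothetical stand-in for a (page, rank contribution) record.
    public static final class Contribution {
        public final String page;
        public final double value;

        public Contribution(String page, double value) {
            this.page = page;
            this.value = value;
        }
    }

    // groupByKey style: collect every value per key first, then sum each group.
    public static Map<String, Double> groupThenSum(List<Contribution> contributions) {
        Map<String, List<Double>> groups = new HashMap<>();
        for (Contribution c : contributions) {
            groups.computeIfAbsent(c.page, k -> new ArrayList<>()).add(c.value);
        }
        Map<String, Double> sums = new HashMap<>();
        for (Map.Entry<String, List<Double>> e : groups.entrySet()) {
            double total = 0.0;
            for (double v : e.getValue()) {
                total += v;
            }
            sums.put(e.getKey(), total);
        }
        return sums;
    }

    // reduceByKey style: fold each value into a running total per key,
    // so no intermediate list of values is ever built.
    public static Map<String, Double> reduceByKey(List<Contribution> contributions) {
        Map<String, Double> sums = new HashMap<>();
        for (Contribution c : contributions) {
            sums.merge(c.page, c.value, Double::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Contribution> contributions = Arrays.asList(
            new Contribution("a", 0.5), new Contribution("b", 0.25),
            new Contribution("a", 0.25), new Contribution("b", 0.5));
        // Both strategies produce the same per-page totals.
        System.out.println(groupThenSum(contributions).equals(reduceByKey(contributions)));
    }
}
```

In a distributed setting the second form also lets partial sums be combined before data is shuffled, which is the usual motivation for this kind of change; the sketch only shows the single-machine aggregation pattern.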
- Add setters for optimizer, gradient in SGD. · 2812e722
  Shivaram Venkataraman authored 11 years ago
  
  Also remove Java-specific constructor for LabeledPoint.
- Remove Java-specific constructor for Rating. · e1a209f7
  Shivaram Venkataraman authored 11 years ago
  
  The Scala constructor works for native Java types. Modify examples to match this.
- Style changes as per Matei's comments · c4eea875
  Nick Pentreath authored 11 years ago
  
Aug 07, 2013
- Adding Scala version of PageRank example · cce758b8
  Nick Pentreath authored 11 years ago
  
Aug 06, 2013

- Refactor GLM algorithms and add Java tests · 7db69d56
  Shivaram Venkataraman authored 11 years ago
  
  This change adds Java examples and unit tests for all GLM algorithms
  to make sure the MLlib interface works from Java. Changes include:
  - Introduce LabeledPoint and avoid using Doubles in train arguments
  - Rename train to run in class methods
  - Make the optimizer a member variable of GLM to make sure the builder
    pattern works

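The three changes listed for this commit could be sketched roughly as follows. This is an illustration under stated assumptions, not the MLlib code: every class name here (`GlmBuilderSketch`, `LabeledPoint`, `Optimizer`, `LinearRegressionSketch`) is a hypothetical stand-in, plain Java lists replace RDDs, and the optimizer is a toy batch gradient descent for least squares. It shows a labeled-point type in place of raw doubles, an optimizer held as a member so chained setters take effect before training, and an instance method named `run` alongside a static `train` convenience method.

```java
import java.util.Arrays;
import java.util.List;

public class GlmBuilderSketch {
    // Hypothetical stand-in for a labeled training example.
    public static final class LabeledPoint {
        public final double label;
        public final double[] features;

        public LabeledPoint(double label, double[] features) {
            this.label = label;
            this.features = features;
        }
    }

    // Hypothetical optimizer: setters return this so calls can be chained.
    public static final class Optimizer {
        private double stepSize = 0.1;
        private int numIterations = 100;

        public Optimizer setStepSize(double stepSize) { this.stepSize = stepSize; return this; }
        public Optimizer setNumIterations(int n) { this.numIterations = n; return this; }

        // Toy batch gradient descent on mean squared error.
        double[] optimize(List<LabeledPoint> data, int dim) {
            double[] w = new double[dim];
            for (int it = 0; it < numIterations; it++) {
                double[] grad = new double[dim];
                for (LabeledPoint p : data) {
                    double err = -p.label;
                    for (int j = 0; j < dim; j++) err += w[j] * p.features[j];
                    for (int j = 0; j < dim; j++) grad[j] += err * p.features[j] / data.size();
                }
                for (int j = 0; j < dim; j++) w[j] -= stepSize * grad[j];
            }
            return w;
        }
    }

    // Hypothetical algorithm class. The optimizer is a member, so settings
    // applied through it are still in place when run() is called.
    public static final class LinearRegressionSketch {
        public final Optimizer optimizer = new Optimizer();

        // Instance entry point, named run so it cannot clash with static train.
        public double[] run(List<LabeledPoint> data) {
            return optimizer.optimize(data, data.get(0).features.length);
        }

        // Static convenience method that configures an instance and runs it.
        public static double[] train(List<LabeledPoint> data, int numIterations, double stepSize) {
            LinearRegressionSketch alg = new LinearRegressionSketch();
            alg.optimizer.setNumIterations(numIterations).setStepSize(stepSize);
            return alg.run(data);
        }
    }

    public static void main(String[] args) {
        List<LabeledPoint> data = Arrays.asList(
            new LabeledPoint(2.0, new double[] {1.0}),
            new LabeledPoint(4.0, new double[] {2.0}),
            new LabeledPoint(6.0, new double[] {3.0}));
        LinearRegressionSketch alg = new LinearRegressionSketch();
        alg.optimizer.setNumIterations(200).setStepSize(0.1);
        double[] viaRun = alg.run(data);
        double[] viaTrain = LinearRegressionSketch.train(data, 200, 0.1);
        System.out.println(Arrays.toString(viaRun) + " " + Arrays.equals(viaRun, viaTrain));
    }
}
```

Keeping the instance method and the static method under different names means both can be called unambiguously from Java, which is the concern the commit message raises.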

- Java examples, tests for KMeans and ALS · 471fbadd
  Shivaram Venkataraman authored 11 years ago
  
  - Change ALS to accept RDD[Rating] instead of (Int, Int, Double), making it
    easier to call from Java
  - Rename class methods from `train` to `run` so that the static methods can be
    called from Java
  - Add unit tests that check that both static and class methods can be called
  - Add examples that port the main() functions of ALS and KMeans to the
    examples project
  
  A couple of minor changes to existing code:
  - Add a toJavaRDD method to RDD to convert a Scala RDD to a Java RDD easily
  - Work around a bug where using double[] from Java leads to a
    ClassCastException in KMeans init


- Got rid of unnecessary map function · 882baee4
  stayhf authored 11 years ago

- changes as reviewer requested · 326a7a82
  stayhf authored 11 years ago


Aug 04, 2013
- Updated code with reviewer's suggestions · 98fd6260
  stayhf authored 11 years ago
  
Aug 03, 2013
- Simple PageRank algorithm implementation in Java for SPARK-760 · a6826373
  stayhf authored 11 years ago
  
Jul 16, 2013
- Add Apache license headers and LICENSE and NOTICE files · af3c9d50
  Matei Zaharia authored 11 years ago
  
Jul 08, 2013
- pom cleanup · 0b39d66f
  Mark Hamstra authored 11 years ago
  
- Explicit dependencies for scala-library and scalap to prevent 2.9.2 vs. 2.9.3 problems · afdaf430
  Mark Hamstra authored 11 years ago
  
Jul 01, 2013
- Fixing missed hbase dependency in examples hadoop2-yarn profile · 6fdbc68f
  Konstantin Boudnik authored 11 years ago
  
Jun 25, 2013
- Fix usage and parameter extraction · 176193b1
  James Phillpotts authored 11 years ago
  
- Include a default OAuth implementation, and update examples and JavaStreamingContext · 366572ed
  James Phillpotts authored 11 years ago
  
Jun 13, 2013
- Fixing the style as per feedback · b5b12823
  Rohit Rai authored 11 years ago
  
Jun 03, 2013
- Example to write the output to cassandra · b104c7f5
  Rohit Rai authored 11 years ago
  
- A better way to read column value if you are sure the column exists in every row. · 56c64c40
  Rohit Rai authored 11 years ago
  
Jun 02, 2013
- Adding deps to examples/pom.xml · 6d8423fd
  Rohit Rai authored 11 years ago
  
  Fixing exclusion in examples deps in SparkBuild.scala
- Removing infix call · 81c2adc1
  Rohit Rai authored 11 years ago
  
Jun 01, 2013
- Adding example to make Spark RDD from Cassandra · 3be7bdce
  Rohit Rai authored 11 years ago
  
May 20, 2013
- Add HBase dependency to examples POM · 3217d486
  Ethan Jewett authored 11 years ago
  
May 09, 2013
- Add HBase example · ee6f6aa6
  Ethan Jewett authored 11 years ago
  