- Mar 15, 2014
Sean Owen authored
This change addresses a few minor suboptimalities in how repositories are handled:

1) Use HTTPS consistently to access repos, instead of HTTP.
2) Consolidate repository declarations in the parent POM file for the Maven build, so that their ordering can be controlled and the fully optional Cloudera repo can go at the end, after required repos. (This was prompted by the untimely failure of the Cloudera repo this week, which made the Spark build fail; #2 would have prevented that.)
3) Update the SBT build to match the Maven build in this regard.
4) Update the SBT build to not refer to Sonatype snapshot repos. These weren't in Maven, and a build generally should not refer to external snapshots, but I'm not 100% sure on this one.

Author: Sean Owen <sowen@cloudera.com>

Closes #145 from srowen/SPARK-1254 and squashes the following commits:

42f9bfc [Sean Owen] Use HTTPS for repos; consolidate repos in parent in order to put optional Cloudera repo last; harmonize SBT build repos with Maven; remove snapshot repos from SBT build which weren't in Maven
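For the SBT side, a minimal sketch of what points 1-3 look like in an sbt build definition (the resolver list here is illustrative, not the exact one from SparkBuild.scala):

```scala
// All repos accessed over HTTPS, with the fully optional Cloudera repo listed
// last so that an outage there cannot block resolution of required artifacts.
resolvers ++= Seq(
  "Apache Repository" at "https://repository.apache.org/content/repositories/releases",
  "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos"
)
```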
- Mar 08, 2014
Sandy Ryza authored
Author: Sandy Ryza <sandy@cloudera.com>

Closes #91 from sryza/sandy-spark-1193 and squashes the following commits:

a878124 [Sandy Ryza] SPARK-1193. Fix indentation in pom.xmls
- Mar 04, 2014
Prashant Sharma authored
[java8API] SPARK-964 Investigate the potential for using JDK 8 lambda expressions for the Java/Scala APIs

Author: Prashant Sharma <prashant.s@imaginea.com>
Author: Patrick Wendell <pwendell@gmail.com>

Closes #17 from ScrapCodes/java8-lambdas and squashes the following commits:

95850e6 [Patrick Wendell] Some doc improvements and build changes to the Java 8 patch.
85a954e [Prashant Sharma] Nit. import orderings.
673f7ac [Prashant Sharma] Added support for -java-home as well
80a13e8 [Prashant Sharma] Used fake class tag syntax
26eb3f6 [Prashant Sharma] Patrick's comments on PR.
35d8d79 [Prashant Sharma] Specified java 8 building in the docs
31d4cd6 [Prashant Sharma] Maven build to support -Pjava8-tests flag.
4ab87d3 [Prashant Sharma] Review feedback on the pr
c33dc2c [Prashant Sharma] SPARK-964, Java 8 API Support.
- Mar 03, 2014
Reynold Xin authored
There was actually a problem with the RateLimitedOutputStream implementation: because of integer rounding, nothing was written during the first second, so RateLimitedOutputStream was overly aggressive in throttling.

Author: Reynold Xin <rxin@apache.org>

Closes #55 from rxin/ratelimitest and squashes the following commits:

52ce1b7 [Reynold Xin] SPARK-1158: Fix flaky RateLimitedOutputStreamSuite.
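A minimal sketch of the rounding pitfall (standalone arithmetic for illustration, not the actual stream code): dividing before multiplying truncates the byte budget to zero for any sub-second elapsed time, which produces exactly the "first second writes nothing" behavior described above.

```scala
val bytesPerSec = 10000
val elapsedMillis = 500L // half a second into the first interval

// Dividing first truncates to zero whole seconds, so the budget is 0:
val budgetTruncated = bytesPerSec * (elapsedMillis / 1000) // 0

// Multiplying first preserves the fractional second's allowance:
val budgetCorrect = (bytesPerSec * elapsedMillis) / 1000 // 5000
```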
- Mar 02, 2014
Patrick Wendell authored
This lets us explicitly include Avro based on a profile for 0.23.x builds. It makes me sad how convoluted it is to express this logic in Maven. @tgraves and @sryza, curious if this works for you. I'm also considering just reverting to how it was before. The only real problem was that Spark advertised a dependency on Avro even though it only depends on Avro transitively, through other deps.

Author: Patrick Wendell <pwendell@gmail.com>

Closes #49 from pwendell/avro-build-fix and squashes the following commits:

8d6ee92 [Patrick Wendell] SPARK-1121: Add avro to yarn-alpha profile
Sean Owen authored
(Ported from https://github.com/apache/incubator-spark/pull/650) This adds one more change, though: fixing the Scala version warning recently introduced by json4s.

Author: Sean Owen <sowen@cloudera.com>

Closes #32 from srowen/SPARK-1084.2 and squashes the following commits:

9240abd [Sean Owen] Avoid scala version conflict in scalap induced by json4s dependency
1561cec [Sean Owen] Remove "exclude *" dependencies that are causing Maven warnings, and that are apparently unneeded anyway
Reynold Xin authored
This test has been flaky. We can re-enable it after @tdas has a chance to look at it.

Author: Reynold Xin <rxin@apache.org>

Closes #54 from rxin/ratelimit and squashes the following commits:

1a12198 [Reynold Xin] Ignore RateLimitedOutputStreamSuite for now.
Patrick Wendell authored
This removes some loose ends not caught by the other (incubating -> tlp) patches. @markhamstra this updates the version as you mentioned earlier.

Author: Patrick Wendell <pwendell@gmail.com>

Closes #51 from pwendell/tlp and squashes the following commits:

d553b1b [Patrick Wendell] Remove remaining references to incubation
- Feb 27, 2014
Sean Owen authored
(Ported from https://github.com/apache/incubator-spark/pull/637)

Author: Sean Owen <sowen@cloudera.com>

Closes #31 from srowen/SPARK-1084.1 and squashes the following commits:

6c4a32c [Sean Owen] Suppress warnings about legitimate unchecked array creations, or change code to avoid it
f35b833 [Sean Owen] Fix two misc javadoc problems
254e8ef [Sean Owen] Fix one new style error introduced in scaladoc warning commit
5b2fce2 [Sean Owen] Fix scaladoc invocation warning, and enable javac warnings properly, with plugin config updates
007762b [Sean Owen] Remove dead scaladoc links
b8ff8cb [Sean Owen] Replace deprecated Ant <tasks> with <target>
- Feb 23, 2014
Sean Owen authored
Prompted by a recent thread on the mailing list, I tried and failed to see if Spark can be made independent of log4j. There are a few cases where control of the underlying logging is pretty useful, and to do that, you have to bind to a specific logger. Instead I propose some tidying that leaves Spark's use of log4j, but gets rid of warnings and should still enable downstream users to switch.

The idea is to pipe everything (except log4j) through SLF4J, have Spark use SLF4J directly when logging, and, where Spark needs to output info (REPL and tests), bind from SLF4J to log4j. This leaves the same behavior in Spark. It means that downstream users who want to use something other than log4j should:

- Exclude dependencies on log4j and slf4j-log4j12 from Spark
- Include a dependency on log4j-over-slf4j
- Include a dependency on another logger X, and the matching slf4j-X binding
- Recreate any log config that Spark provides, as needed, in the other logger's config

That sounds about right. Here are the key changes:

- Include the jcl-over-slf4j shim everywhere by depending on it in core
- Exclude dependencies on commons-logging from third-party libraries
- Include the jul-to-slf4j shim everywhere by depending on it in core
- Exclude slf4j-* dependencies from third-party libraries to prevent collisions or warnings
- Add the missing slf4j-log4j12 binding to the GraphX and Bagel module tests

And minor/incidental changes:

- Update to SLF4J 1.7.5, which happily matches Hadoop 2's version and is a recommended update over 1.7.2
- Remove a duplicate HBase dependency declaration in SparkBuild.scala
- Remove a duplicate mockito dependency declaration that was causing warnings and bugging me

Author: Sean Owen <sowen@cloudera.com>

Closes #570 from srowen/SPARK-1071 and squashes the following commits:

52eac9f [Sean Owen] Add slf4j-over-log4j12 dependency to core (non-test) and remove it from things that depend on core.
77a7fa9 [Sean Owen] SPARK-1071: Tidy logging strategy and use of log4j
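As a minimal sketch of the "use SLF4J directly when logging" part (the class here is hypothetical): application code targets only the SLF4J API, and the concrete backend (slf4j-log4j12 for log4j, or any other slf4j-X binding) is chosen by what is on the classpath, which is what lets downstream users swap log4j out as described above.

```scala
import org.slf4j.{Logger, LoggerFactory}

// Hypothetical class for illustration: it codes against the SLF4J API only,
// so the logging backend is decided entirely by the bindings on the classpath.
class WordCountJob {
  private val log: Logger = LoggerFactory.getLogger(classOf[WordCountJob])

  def run(): Unit = {
    log.info("Starting job")
  }
}
```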
- Feb 10, 2014
Prashant Sharma authored
SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2

Continuation of PR #557. With this, all Scala style errors are fixed across the code base! The reason for creating a separate PR was to not interrupt an already reviewed and ready-to-merge PR. Hope this gets reviewed soon and merged too.

Author: Prashant Sharma <prashant.s@imaginea.com>

Closes #567 and squashes the following commits:

3b1ec30 [Prashant Sharma] scala style fixes
- Feb 09, 2014
Patrick Wendell authored
SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.

Author: Patrick Wendell <pwendell@gmail.com>
Author: Prashant Sharma <scrapcodes@gmail.com>

== Merge branch commits ==

commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
Author: Prashant Sharma <scrapcodes@gmail.com>
Date: Sun Feb 9 17:39:07 2014 +0530

    scala style fixes

commit f91709887a8e0b608c5c2b282db19b8a44d53a43
Author: Patrick Wendell <pwendell@gmail.com>
Date: Fri Jan 24 11:22:53 2014 -0800

    Adding scalastyle snapshot
- Feb 08, 2014
Mark Hamstra authored
Version number to 1.0.0-SNAPSHOT

Since 0.9.0-incubating is done and out the door, we shouldn't be building 0.9.0-incubating-SNAPSHOT anymore. @pwendell

Author: Mark Hamstra <markhamstra@gmail.com>

== Merge branch commits ==

commit 1b00a8a7c1a7f251b4bb3774b84b9e64758eaa71
Author: Mark Hamstra <markhamstra@gmail.com>
Date: Wed Feb 5 09:30:32 2014 -0800

    Version number to 1.0.0-SNAPSHOT
- Feb 02, 2014
Henry Saputra authored
Change the ⇒ character (maybe from scalariform) to => in Scala code for style consistency

It looks like there are some ⇒ Unicode characters (maybe from scalariform) in the Scala code. This PR changes them to => for consistency across the Scala code. If we wanted ⇒ as the default, we could use the sbt scalariform plugin to make sure all Scala code has ⇒ instead of =>. Also removed unused imports found in TwitterInputDStream.scala while I was there =)

Author: Henry Saputra <hsaputra@apache.org>

== Merge branch commits ==

commit 29c1771d346dff901b0b778f764e6b4409900234
Author: Henry Saputra <hsaputra@apache.org>
Date: Sat Feb 1 22:05:16 2014 -0800

    Change the ⇒ character (maybe from scalariform) to => in Scala code for style consistency.
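For illustration, the two spellings are interchangeable to the Scala compiler; the change is purely cosmetic:

```scala
// Both forms compile identically; only the ASCII arrow is kept for consistency.
val doubledUnicode = Seq(1, 2, 3).map { n ⇒ n * 2 } // scalariform-style arrow
val doubledAscii   = Seq(1, 2, 3).map { n => n * 2 } // standardized ASCII arrow
```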
- Jan 25, 2014
Josh Rosen authored
This fixes an issue where collectAsMap() could fail when called on a JavaPairRDD that was derived by transforming a non-JavaPairRDD. The root problem was that we were creating the JavaPairRDD's ClassTag by casting a ClassTag[AnyRef] to a ClassTag[Tuple2[K2, V2]]. To fix this, I cast a ClassTag[Tuple2[_, _]] instead, since this actually produces a ClassTag of the appropriate type because ClassTags don't capture type parameters:

    scala> implicitly[ClassTag[Tuple2[_, _]]] == implicitly[ClassTag[Tuple2[Int, Int]]]
    res8: Boolean = true

    scala> implicitly[ClassTag[AnyRef]].asInstanceOf[ClassTag[Tuple2[Int, Int]]] == implicitly[ClassTag[Tuple2[Int, Int]]]
    res9: Boolean = false
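A sketch of the fix in that spirit (simplified, with a hypothetical helper name, not the exact JavaPairRDD code): because ClassTags erase their type parameters, casting from a tag whose runtime class is already Tuple2 is sound, where casting from ClassTag[AnyRef] was not.

```scala
import scala.reflect.ClassTag

// The tag's runtime class is Tuple2 regardless of the type parameters, so this
// cast yields a tag that compares equal to ClassTag[Tuple2[K2, V2]].
def pairClassTag[K2, V2]: ClassTag[(K2, V2)] =
  implicitly[ClassTag[Tuple2[_, _]]].asInstanceOf[ClassTag[(K2, V2)]]
```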
- Jan 16, 2014
Tathagata Das authored
- Jan 15, 2014
Tathagata Das authored
Tathagata Das authored
Changed SparkConf to not be serializable. Also fixed unit-test log paths in the log4j.properties of external modules.
- Jan 14, 2014
Patrick Wendell authored
Tathagata Das authored
Tathagata Das authored
Removed StreamingContext.registerInputStream and registerOutputStream - they were unnecessary, as InputDStream has been made to register itself. Also made DStream.register() private[streaming], since there was no point in exposing the confusing function. Updated a lot of documentation.
- Jan 13, 2014
Reynold Xin authored
Tathagata Das authored
Reynold Xin authored
`sbt/sbt doc` used to fail. This fixed it.
Tathagata Das authored
- Jan 12, 2014
Tathagata Das authored
Patrick Wendell authored
Tathagata Das authored
Patrick Wendell authored
`foreachRDD` makes it clear that the granularity of this operator is per-RDD. As it stands, `foreach` is inconsistent with `map`, `filter`, and the other DStream operators, which get pushed down to individual records within each RDD.
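A minimal usage sketch (the DStream[String] parameter is scaffolding for the example) showing the per-RDD granularity the new name advertises:

```scala
import org.apache.spark.streaming.dstream.DStream

// The closure runs once per generated RDD (i.e., per batch), not once per
// record, in contrast to map/filter, which apply to individual records.
def logBatchSizes(lines: DStream[String]): Unit =
  lines.foreachRDD { rdd =>
    println("Batch contains " + rdd.count() + " records")
  }
```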
Patrick Wendell authored
Tathagata Das authored
Fixed bugs to ensure better cleanup of JobScheduler, JobGenerator and NetworkInputTracker upon close.
Tathagata Das authored
Moved DStream, DStreamCheckpointData and PairDStream from org.apache.spark.streaming to org.apache.spark.streaming.dstream.
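For downstream code, the move shows up as an import change; a sketch:

```scala
// After the move described above, DStream lives in the dstream subpackage;
// the old org.apache.spark.streaming.DStream path no longer resolves.
import org.apache.spark.streaming.dstream.DStream
```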
Henry Saputra authored
- Only change simple return statements at the end of a method
- Ignore the complex if-else checks
- Ignore the ones inside synchronized blocks
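For illustration, the kind of change the first bullet covers (hypothetical method):

```scala
// A `return` on the final expression of a Scala method is redundant, since a
// method's result is its last expression.
def before(x: Int): Int = { return x * 2 } // flagged: simple return at end of method
def after(x: Int): Int = x * 2             // equivalent, idiomatic form
```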
Tathagata Das authored
Tathagata Das authored
Tathagata Das authored
Converted JobScheduler to use actors for event handling. Changed protected[streaming] to private[streaming] in StreamingContext and DStream. Added waitForStop to StreamingContext, along with a StreamingContextSuite.
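A minimal sketch of the actor-based event handling pattern described (event and class names are hypothetical, not JobScheduler's actual ones, assuming the Akka API of that era): events become messages, and the actor processes them one at a time, serializing event handling without explicit locks.

```scala
import akka.actor.{Actor, ActorSystem, Props}

sealed trait SchedulerEvent
case class JobStarted(id: Long) extends SchedulerEvent
case class JobCompleted(id: Long) extends SchedulerEvent

// The actor's mailbox serializes events, so no explicit locking is needed.
class EventHandlerActor extends Actor {
  def receive = {
    case JobStarted(id)   => println("job " + id + " started")
    case JobCompleted(id) => println("job " + id + " completed")
  }
}

// Usage sketch:
// val system  = ActorSystem("scheduler")
// val handler = system.actorOf(Props[EventHandlerActor], "eventHandler")
// handler ! JobStarted(1)
```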
- Jan 11, 2014
Patrick Wendell authored
This reverts commit 942c80b3.
- Jan 10, 2014
Matei Zaharia authored
Tathagata Das authored
Tathagata Das authored