  1. Dec 13, 2013
  2. Dec 12, 2013
  3. Dec 10, 2013
  4. Dec 03, 2013
  5. Nov 26, 2013
  6. Nov 15, 2013
      Use Kafka 2.10 (again) · ce1d2af7
      Aaron Davidson authored
      Various merge corrections · f629ba95
      Aaron Davidson authored
I've diff'd this patch against my own -- since they were both created
independently, two sets of eyes have gone over all the merge conflicts,
so I'm feeling significantly more confident in the resulting PR.
      
      @rxin has looked at the changes to the repl and is resoundingly
      confident that they are correct.
  7. Nov 14, 2013
  8. Nov 12, 2013
      Upgrade to sbt-assembly 0.9.2 · 48eac0bc
      Nathan Howell authored
      spark-assembly.jar fails to authenticate with YARN ResourceManager · 23146a67
      Nathan Howell authored
sbt-assembly is set up to pick the first META-INF/services/org.apache.hadoop.security.SecurityInfo file instead of merging them. This causes Kerberos authentication to fail, which manifests itself in the "info:null" debug log statements:
      
          DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null
          DEBUG SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null
          ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
          DEBUG UserGroupInformation: PrivilegedAction as:foo@BAR (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:583)
          WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
          ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      
Previously, the merged service file would contain just a single class:
      
      $ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo
      Archive:  assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar
        inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo
      
          org.apache.hadoop.security.AnnotatedSecurityInfo
      
Now it contains the full list of classes:
      
$ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo
Archive:  assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar
        inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo
      
          org.apache.hadoop.security.AnnotatedSecurityInfo
          org.apache.hadoop.mapreduce.v2.app.MRClientSecurityInfo
          org.apache.hadoop.mapreduce.v2.security.client.ClientHSSecurityInfo
          org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo
          org.apache.hadoop.yarn.security.ContainerManagerSecurityInfo
          org.apache.hadoop.yarn.security.SchedulerSecurityInfo
          org.apache.hadoop.yarn.security.admin.AdminSecurityInfo
          org.apache.hadoop.yarn.server.RMNMSecurityInfoClass
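
The fix is to merge ServiceLoader registry files line by line instead of keeping only the first copy found. A minimal sketch of the corresponding sbt-assembly (0.9.x) setting, which may differ from the patch's exact patterns:

    // Build definition sketch (sbt-assembly 0.9.x): merge all
    // META-INF/services registries, de-duplicating identical lines,
    // rather than keeping only the first file on the classpath.
    import sbtassembly.Plugin._
    import AssemblyKeys._

    mergeStrategy in assembly := {
      case PathList("META-INF", "services", xs @ _*) =>
        MergeStrategy.filterDistinctLines
      case _ =>
        MergeStrategy.first // simplification; a real build keys on more cases
    }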
  9. Nov 11, 2013
  10. Nov 09, 2013
  11. Nov 08, 2013
      Add graphite sink for metrics · ef85a51f
      Russell Cardullo authored
This adds a metrics sink for Graphite. The sink must
be configured with the host and port of a Graphite node,
and may optionally be configured with a prefix that will
be prepended to all metrics sent to Graphite.
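
A hedged example of enabling the sink in conf/metrics.properties; the property names follow Spark's sink configuration scheme, and the host value is a placeholder:

    # Send metrics from all instances to a Graphite node.
    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite.example.com
    *.sink.graphite.port=2003
    # Optional prefix prepended to every metric name.
    *.sink.graphite.prefix=spark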
  12. Oct 25, 2013
      Exclude jopt from kafka dependency. · af4a529f
      Patrick Wendell authored
      Kafka uses an older version of jopt that causes bad conflicts with the version
      used by spark-perf. It's not easy to remove this downstream because of the way
      that spark-perf uses Spark (by including a spark assembly as an unmanaged jar).
      This fixes the problem at its source by just never including it.
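
In sbt, the exclusion looks roughly like the following sketch; the Kafka version shown is illustrative, while "net.sf.jopt-simple" % "jopt-simple" are jopt's standard Maven coordinates:

    // Sketch: drop jopt-simple at the source of the conflict.
    libraryDependencies += "org.apache.kafka" % "kafka" % "0.8.0" exclude("net.sf.jopt-simple", "jopt-simple")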
  13. Oct 24, 2013
  14. Oct 23, 2013
  15. Oct 16, 2013
  16. Oct 12, 2013
  17. Oct 11, 2013
  18. Oct 09, 2013
  19. Oct 07, 2013
  20. Oct 05, 2013
  21. Oct 01, 2013
  22. Sep 26, 2013
      Standalone Scheduler fault tolerance using ZooKeeper · f549ea33
      Aaron Davidson authored
      This patch implements full distributed fault tolerance for standalone scheduler Masters.
      There is only one master Leader at a time, which is actively serving scheduling
      requests. If this Leader crashes, another master will eventually be elected, reconstruct
      the state from the first Master, and continue serving scheduling requests.
      
      Leader election is performed using the ZooKeeper leader election pattern. We try to minimize
      the use of ZooKeeper and the assumptions about ZooKeeper's behavior, so there is a layer of
      retries and session monitoring on top of the ZooKeeper client.
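
For illustration, the same election pattern expressed with Apache Curator's LeaderLatch recipe; the patch implements its own thin client layer rather than using Curator:

    // Sketch only: Curator stands in for the patch's hand-rolled
    // ZooKeeper client with retries and session monitoring.
    import org.apache.curator.framework.CuratorFrameworkFactory
    import org.apache.curator.framework.recipes.leader.LeaderLatch
    import org.apache.curator.retry.ExponentialBackoffRetry

    object MasterElectionSketch extends App {
      // The retry policy papers over transient connection loss
      // (ZooKeeper hosts are placeholders).
      val zk = CuratorFrameworkFactory.newClient(
        "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3))
      zk.start()

      // Every Master contends on the same path; at most one holds leadership.
      val latch = new LeaderLatch(zk, "/spark/leader_election")
      latch.start()
      latch.await() // blocks until this Master is elected
      println("Elected Leader: recover state, then serve scheduling requests")
    }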
      
      Master failover follows directly from the single-node Master recovery via the file
      system (patch 194ba4b8), save that the Master state is stored in ZooKeeper instead.
      
      Configuration:
      By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE).
      By setting spark.deploy.recoveryMode to ZOOKEEPER and setting spark.deploy.zookeeper.url
      to an appropriate ZooKeeper URL, ZooKeeper recovery mode is enabled.
      By setting spark.deploy.recoveryMode to FILESYSTEM and setting spark.deploy.recoveryDirectory
to an appropriate directory accessible by the Master, we keep the behavior from 194ba4b8.
      
Additionally, places where a Master could be specified by a spark:// URL can now take
      comma-delimited lists to specify backup masters. Note that this is only used for registration
      of NEW Workers and application Clients. Once a Worker or Client has registered with the
      Master Leader, it is "in the system" and will never need to register again.
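
A hedged end-to-end example, assuming the deploy properties above are passed to the Master as Java system properties (hostnames are placeholders):

    # spark-env.sh on each Master: enable ZooKeeper recovery mode.
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
      -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181"

    # Workers and application Clients can then list all candidate Masters:
    #   spark://master1:7077,master2:7077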
      
      Forthcoming:
Documentation, tests (only ad hoc testing has been performed so far).
      I do not intend for this commit to be merged until tests are added, but this patch should
      still be mostly reviewable until then.
      Removed scala -optimize flag. · 3f283278
      Reynold Xin authored
      fixed maven build for scala 2.10 · 7ff4c2d3
      Prashant Sharma authored
  23. Sep 24, 2013
  24. Sep 21, 2013
  25. Sep 15, 2013
  26. Sep 14, 2013
  27. Sep 11, 2013
  28. Sep 10, 2013
  29. Sep 08, 2013
  30. Sep 07, 2013
  31. Sep 06, 2013