Commit 797f8a00 authored by Pierre Borckmans, committed by Sean Owen

[SPARK-6402][DOC] - Remove some references to Shark in docs and ec2

The EC2 script and job scheduling documentation still referred to Shark.
I removed these references.

I also removed a remaining `SHARK_VERSION` variable from `ec2-variables.sh`.

Author: Pierre Borckmans <pierre.borckmans@realimpactanalytics.com>

Closes #5083 from pierre-borckmans/remove_refererences_to_shark_in_docs and squashes the following commits:

4e90ffc [Pierre Borckmans] Removed deprecated SHARK_VERSION
caea407 [Pierre Borckmans] Remove shark reference from ec2 script doc
196c744 [Pierre Borckmans] Removed references to Shark
parent 2c3f83c3
@@ -5,7 +5,7 @@ title: Running Spark on EC2
 
 The `spark-ec2` script, located in Spark's `ec2` directory, allows you
 to launch, manage and shut down Spark clusters on Amazon EC2. It automatically
-sets up Spark, Shark and HDFS on the cluster for you. This guide describes
+sets up Spark and HDFS on the cluster for you. This guide describes
 how to use `spark-ec2` to launch clusters, how to run jobs on them, and how
 to shut them down. It assumes you've already signed up for an EC2 account
 on the [Amazon Web Services site](http://aws.amazon.com/).
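
Note: as a minimal sketch of the `spark-ec2` workflow described in this hunk (the keypair and cluster names below are hypothetical placeholders; `-k`, `-i`, and `-s` name the EC2 keypair, its private key file, and the number of slaves):

    # launch a cluster named "demo-cluster" with 2 slaves
    ./spark-ec2 -k my-keypair -i ~/my-keypair.pem -s 2 launch demo-cluster

    # log in to the cluster's master node
    ./spark-ec2 -k my-keypair -i ~/my-keypair.pem login demo-cluster

    # terminate the cluster's instances when done
    ./spark-ec2 destroy demo-cluster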
@@ -14,8 +14,7 @@ runs an independent set of executor processes. The cluster managers that Spark r
 facilities for [scheduling across applications](#scheduling-across-applications). Second,
 _within_ each Spark application, multiple "jobs" (Spark actions) may be running concurrently
 if they were submitted by different threads. This is common if your application is serving requests
-over the network; for example, the [Shark](http://shark.cs.berkeley.edu) server works this way. Spark
-includes a [fair scheduler](#scheduling-within-an-application) to schedule resources within each SparkContext.
+over the network. Spark includes a [fair scheduler](#scheduling-within-an-application) to schedule resources within each SparkContext.
 
 # Scheduling Across Applications
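
Note: the fair scheduler this hunk refers to is opt-in; a minimal sketch of enabling it for an application submitted through `spark-submit` (the master URL, class name, and jar below are hypothetical placeholders):

    # run the application with fair scheduling between jobs in one SparkContext
    ./bin/spark-submit \
      --master spark://master-host:7077 \
      --conf spark.scheduler.mode=FAIR \
      --class com.example.RequestServer \
      request-server.jar

By default `spark.scheduler.mode` is FIFO; FAIR lets jobs submitted from different threads share resources instead of queueing behind one another.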
@@ -52,8 +51,7 @@ an application to gain back cores on one node when it has work to do. To use thi
 Note that none of the modes currently provide memory sharing across applications. If you would like to share
 data this way, we recommend running a single server application that can serve multiple requests by querying
-the same RDDs. For example, the [Shark](http://shark.cs.berkeley.edu) JDBC server works this way for SQL
-queries. In future releases, in-memory storage systems such as [Tachyon](http://tachyon-project.org) will
+the same RDDs. In future releases, in-memory storage systems such as [Tachyon](http://tachyon-project.org) will
 provide another approach to share RDDs.
 
 ## Dynamic Resource Allocation
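
Note: one concrete instance of the single-server pattern described in this hunk is Spark SQL's Thrift JDBC server, the successor to the Shark JDBC server removed here; a minimal sketch of starting it (the master URL is a hypothetical placeholder):

    # start the JDBC/ODBC server; it accepts the same options as spark-submit
    ./sbin/start-thriftserver.sh --master spark://master-host:7077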
@@ -25,7 +25,6 @@ export MAPRED_LOCAL_DIRS="{{mapred_local_dirs}}"
 export SPARK_LOCAL_DIRS="{{spark_local_dirs}}"
 export MODULES="{{modules}}"
 export SPARK_VERSION="{{spark_version}}"
-export SHARK_VERSION="{{shark_version}}"
 export TACHYON_VERSION="{{tachyon_version}}"
 export HADOOP_MAJOR_VERSION="{{hadoop_major_version}}"
 export SWAP_MB="{{swap}}"