diff --git a/docs/index.md b/docs/index.md
index 2daa208b3b9031c3268aab60c33b30fa8eb88db9..e3647717a1f187387677aae31cd46f8b2dbf2923 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -9,17 +9,18 @@ It also supports a rich set of higher-level tools including [Shark](http://shark
 
 # Downloading
 
-Get Spark by visiting the [downloads page](http://spark.apache.org/downloads.html) of the Apache Spark site. This documentation is for Spark version {{site.SPARK_VERSION}}.
+Get Spark by visiting the [downloads page](http://spark.apache.org/downloads.html) of the Apache Spark site. This documentation is for Spark version {{site.SPARK_VERSION}}. The downloads page
+contains Spark packages for many popular HDFS versions. If you'd like to build Spark from
+scratch, visit the [building with Maven](building-with-maven.html) page.
 
-Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS). All you need to run it is to have `java` to installed on your system `PATH`, or the `JAVA_HOME` environment variable pointing to a Java installation.
+Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS). All you need to run it is
+to have `java` installed on your system `PATH`, or the `JAVA_HOME` environment variable
+pointing to a Java installation.
 
-# Building
-
-Spark uses [Simple Build Tool](http://www.scala-sbt.org), which is bundled with it. To compile the code, go into the top-level Spark directory and run
-
-    sbt/sbt assembly
-
-For its Scala API, Spark {{site.SPARK_VERSION}} depends on Scala {{site.SCALA_BINARY_VERSION}}. If you write applications in Scala, you will need to use a compatible Scala version (e.g. {{site.SCALA_BINARY_VERSION}}.X) -- newer major versions may not work. You can get the right version of Scala from [scala-lang.org](http://www.scala-lang.org/download/).
+For its Scala API, Spark {{site.SPARK_VERSION}} depends on Scala {{site.SCALA_BINARY_VERSION}}.
+If you write applications in Scala, you will need to use a compatible Scala version
+(e.g. {{site.SCALA_BINARY_VERSION}}.X) -- newer major versions may not work. You can get the
+right version of Scala from [scala-lang.org](http://www.scala-lang.org/download/).
 
 # Running the Examples and Shell
 
@@ -50,23 +51,6 @@ options for deployment:
 * [Apache Mesos](running-on-mesos.html)
 * [Hadoop YARN](running-on-yarn.html)
 
-# A Note About Hadoop Versions
-
-Spark uses the Hadoop-client library to talk to HDFS and other Hadoop-supported
-storage systems. Because the HDFS protocol has changed in different versions of
-Hadoop, you must build Spark against the same version that your cluster uses.
-By default, Spark links to Hadoop 1.0.4. You can change this by setting the
-`SPARK_HADOOP_VERSION` variable when compiling:
-
-    SPARK_HADOOP_VERSION=2.2.0 sbt/sbt assembly
-
-In addition, if you wish to run Spark on [YARN](running-on-yarn.html), set
-`SPARK_YARN` to `true`:
-
-    SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_YARN=true sbt/sbt assembly
-
-Note that on Windows, you need to set the environment variables on separate lines, e.g., `set SPARK_HADOOP_VERSION=1.2.1`.
-
 # Where to Go from Here
 
 **Programming guides:**
diff --git a/docs/quick-start.md b/docs/quick-start.md
index 64996b52e0404c4622bf1d0d45490d046e2a3ec2..478b790f92e179afee67dd2767c01d268f73fcfe 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -9,11 +9,9 @@ title: Quick Start
 
 This tutorial provides a quick introduction to using Spark.
 We will first introduce the API through Spark's interactive Scala shell (don't worry if you don't know Scala -- you will not need much for this), then show how to write standalone applications in Scala, Java, and Python. See the [programming guide](scala-programming-guide.html) for a more complete reference.
 
-To follow along with this guide, you only need to have successfully built Spark on one machine. Simply go into your Spark directory and run:
-
-{% highlight bash %}
-$ sbt/sbt assembly
-{% endhighlight %}
+To follow along with this guide, first download a packaged release of Spark from the
+[Spark website](http://spark.apache.org/downloads.html). Since we won't be using HDFS,
+you can download a package for any version of Hadoop.
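 
 # Interactive Analysis with the Spark Shell

The patch above replaces the `sbt/sbt assembly` build step with a download-and-run flow. A minimal sketch of that flow, assuming a hypothetical mirror URL, tarball name, and shell script location (none of these appear in the patch itself -- pick the actual package for your HDFS version from the downloads page):

{% highlight bash %}
# Fetch a prebuilt Spark package instead of building from source.
# The URL, tarball name, and version are illustrative only; any
# Hadoop variant works when HDFS is not involved.
wget http://archive.apache.org/dist/spark/spark-0.9.0-incubating/spark-0.9.0-incubating-bin-hadoop2.tgz
tar xzf spark-0.9.0-incubating-bin-hadoop2.tgz
cd spark-0.9.0-incubating-bin-hadoop2

# Spark only needs `java` on the PATH, or JAVA_HOME pointing at a JDK.
java -version

# Start the interactive Scala shell used throughout the quick start
# (script location assumed from Spark layouts of this era).
./bin/spark-shell
{% endhighlight %}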
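The Scala-compatibility note in index.md also applies to standalone applications: they must compile against the same Scala binary version as Spark. A sketch of a minimal `build.sbt`, written here from the shell; the project name and version numbers are assumptions for illustration, not taken from the patch:

{% highlight bash %}
# Hypothetical minimal build definition for a standalone Scala app.
# Keep scalaVersion on Spark's Scala binary version (e.g. 2.10.x);
# a newer major Scala version may fail to link against spark-core.
cat > build.sbt <<'EOF'
name := "simple-app"

version := "0.1"

scalaVersion := "2.10.3"

libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating"
EOF
{% endhighlight %}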