Commit 2c5a4b89 authored by Matei Zaharia

Small fixes to README

parent 2b29a1d4

# Apache Spark

Lightning-Fast Cluster Computing - <http://spark.incubator.apache.org/>

## Online Documentation

You can find the latest Spark documentation, including a programming
guide, on the project webpage at <http://spark.incubator.apache.org/documentation.html>.
This README file only contains basic setup instructions.

To build Spark and its example programs, run:

    sbt/sbt assembly

Once you've built Spark, the easiest way to start using it is the shell:

    ./spark-shell

Or, for the Python API, the Python shell (`./pyspark`).
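
As a quick sanity check, you can type a small job directly into the Scala
shell; a minimal sketch (the shell pre-binds a `SparkContext` as `sc`; the
dataset and numbers are illustrative):

    // Inside ./spark-shell: `sc` is the ready-made SparkContext.
    // Count the even numbers in a small local dataset.
    val evens = sc.parallelize(1 to 1000).filter(_ % 2 == 0)
    println(evens.count())   // prints 500
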
Spark also comes with several sample programs in the `examples` directory.
To run one of them, use `./run-example <class> <params>`. For example:

    ./run-example spark.examples.SparkLR local[2]

will run the Logistic Regression example locally on 2 CPUs.

Each of the example programs prints usage help if no params are given.

All of the Spark samples take a `<master>` parameter that is the cluster URL
to connect to. This can be a mesos:// or spark:// URL, or "local" to run
locally with one thread, or "local[N]" to run locally with N threads.
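
The same `<master>` strings apply when creating a `SparkContext` in your own
program. A minimal sketch, assuming the pre-rename `spark` package that the
example class names above use (the object name is illustrative):

    import spark.SparkContext

    // Pass the master URL on the command line, mirroring the samples:
    // "local", "local[4]", "spark://host:7077", or "mesos://host:5050".
    object MasterUrlDemo {
      def main(args: Array[String]) {
        val sc = new SparkContext(args(0), "MasterUrlDemo")
        println(sc.parallelize(1 to 100).reduce(_ + _))  // prints 5050
        sc.stop()
      }
    }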

For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
versions without YARN, use:

    $ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.2.0 sbt/sbt assembly

For Apache Hadoop 2.x, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
with YARN, also set `SPARK_YARN=true`:

    # Apache Hadoop 2.0.5-alpha
    $ SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_YARN=true sbt/sbt assembly

    # Cloudera CDH 4.2.0 with MapReduce v2
    $ SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0 SPARK_YARN=true sbt/sbt assembly

For convenience, these variables may also be set through the `conf/spark-env.sh` file
described below.

If your project is built with Maven, add this to your POM file's
`<dependencies>` section:

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>1.2.1</version>
    </dependency>
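
If your project is built with sbt instead, the equivalent dependency
declaration would look roughly like this (a sketch; the version string should
match whatever your cluster runs):

    // In build.sbt: pin hadoop-client to your cluster's Hadoop version.
    libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "1.2.1"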

## Configuration

Please refer to the [Configuration guide](http://spark.incubator.apache.org/docs/latest/configuration.html)
in the online documentation for an overview on how to configure Spark.

## Contributing to Spark