Commit 1c73d897 authored by Matei Zaharia

Update README

parent 64dbf8d3
@@ -20,7 +20,7 @@ which is packaged with it. To build Spark and its example programs, run:
 
 To run Spark, you will need to have Scala's bin in your `PATH`, or you
 will need to set the `SCALA_HOME` environment variable to point to where
 you've installed Scala. Scala must be accessible through one of these
-methods on Mesos slave nodes as well as on the master.
+methods on your cluster's worker nodes as well as its master.
 
 To run one of the examples, use `./run <class> <params>`. For example:
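
The requirement above amounts to a couple of shell lines. A minimal sketch, using a hypothetical install path:

    # Either put Scala's bin directory on your PATH...
    export PATH=/usr/local/scala/bin:$PATH
    # ...or point SCALA_HOME at the Scala installation root:
    export SCALA_HOME=/usr/local/scala
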
@@ -31,11 +31,11 @@ will run the Logistic Regression example locally on 2 CPUs.
 
 Each of the example programs prints usage help if no params are given.
 
 All of the Spark samples take a `<host>` parameter that is the Mesos master
-to connect to. This can be a Mesos URL, or "local" to run locally with one
-thread, or "local[N]" to run locally with N threads.
+to connect to. This can be a Mesos or Spark cluster URL, or "local" to run
+locally with one thread, or "local[N]" to run locally with N threads.
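
Concretely, each form of `<host>` can be passed straight to an example launch. A sketch, in which the example class name and both cluster URLs are assumptions rather than values taken from this excerpt:

    ./run spark.examples.SparkLR local                # run locally with one thread
    ./run spark.examples.SparkLR local[2]             # run locally with two threads
    ./run spark.examples.SparkLR spark://host:7077    # hypothetical Spark cluster URL
    ./run spark.examples.SparkLR mesos://host:5050    # hypothetical Mesos master URL
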
-## A Note About Hadoop
+## A Note About Hadoop Versions
 
 Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
 storage systems. Because the HDFS API has changed in different versions of
@@ -46,27 +46,23 @@ of `project/SparkBuild.scala`, then rebuilding Spark.
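
Changing the Hadoop version comes down to editing one variable and rebuilding. A hedged sketch, in which the `HADOOP_VERSION` variable name, the version string, and the `sbt/sbt compile` build command are all assumptions not shown in this excerpt:

    # In project/SparkBuild.scala, set the (assumed) version variable to match
    # your cluster, for example:  val HADOOP_VERSION = "1.0.3"
    # Then rebuild Spark:
    sbt/sbt compile
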
 ## Configuration
 
-Spark can be configured through two files: `conf/java-opts` and
-`conf/spark-env.sh`.
-
-In `java-opts`, you can add flags to be passed to the JVM when running Spark.
-
-In `spark-env.sh`, you can set any environment variables you wish to be available
-when running Spark programs, such as `PATH`, `SCALA_HOME`, etc. There are also
-several Spark-specific variables you can set:
-- `SPARK_CLASSPATH`: Extra entries to be added to the classpath, separated by ":".
-- `SPARK_MEM`: Memory for Spark to use, in the format used by java's `-Xmx`
-  option (for example, `-Xmx200m` means 200 MB, `-Xmx1g` means 1 GB, etc).
-- `SPARK_LIBRARY_PATH`: Extra entries to add to `java.library.path` for locating
-  shared libraries.
-- `SPARK_JAVA_OPTS`: Extra options to pass to JVM.
-- `MESOS_NATIVE_LIBRARY`: Your Mesos library, if you want to run on a Mesos
-  cluster. For example, this might be `/usr/local/lib/libmesos.so` on Linux.
-
-Note that `spark-env.sh` must be a shell script (it must be executable and start
-with a `#!` header to specify the shell to use).
+Please refer to the "Configuration" guide in the online documentation for a
+full overview on how to configure Spark. At the minimum, you will need to
+create a `conf/spark-env.sh` script (copy `conf/spark-env.sh.template`) and
+set the following two variables:
+
+- `SCALA_HOME`: Location where Scala is installed.
+
+- `MESOS_NATIVE_LIBRARY`: Your Mesos library (only needed if you want to run
+  on Mesos). For example, this might be `/usr/local/lib/libmesos.so` on Linux.
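
Taken together, the new minimum configuration is small. A minimal sketch of `conf/spark-env.sh` (the paths are hypothetical, and per the removed note above the file must be an executable shell script with a `#!` header):

    #!/usr/bin/env bash
    # Minimal conf/spark-env.sh, copied from conf/spark-env.sh.template
    export SCALA_HOME=/usr/local/scala              # where Scala is installed
    # Only needed if you want to run on Mesos:
    export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
    # Optional knob from the removed list above (memory in -Xmx format):
    # export SPARK_MEM=1g
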
 ## Contributing to Spark
 
 Contributions via GitHub pull requests are gladly accepted from their original
 author. Along with any pull requests, please state that the contribution is
 your original work and that you license the work to the project under the
 project's open source license. Whether or not you state this explicitly, by
 submitting any copyrighted material via pull request, email, or other means
 you agree to license the material under the project's open source license and
 warrant that you have the legal authority to do so.