diff --git a/README.md b/README.md
index e7af34f51319737493d876893a8e34b2f9bdc012..89b5a0abfd7f17044a656dc2c58ed559642a9940 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ Spark requires Scala 2.9.3 (Scala 2.10 is not yet supported). The project is
 built using Simple Build Tool (SBT), which is packaged with it. To build
 Spark and its example programs, run:

-    sbt/sbt package assembly
+    sbt/sbt assembly

 Spark also supports building using Maven. If you would like to build using Maven,
 see the [instructions for building Spark with Maven](http://spark-project.org/docs/latest/building-with-maven.html)
@@ -52,19 +52,19 @@ For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
 versions without YARN, use:

     # Apache Hadoop 1.2.1
-    $ SPARK_HADOOP_VERSION=1.2.1 sbt/sbt package assembly
+    $ SPARK_HADOOP_VERSION=1.2.1 sbt/sbt assembly

     # Cloudera CDH 4.2.0 with MapReduce v1
-    $ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.2.0 sbt/sbt package assembly
+    $ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.2.0 sbt/sbt assembly

 For Apache Hadoop 2.x, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
 with YARN, also set `SPARK_WITH_YARN=true`:

     # Apache Hadoop 2.0.5-alpha
-    $ SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_WITH_YARN=true sbt/sbt package assembly
+    $ SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_WITH_YARN=true sbt/sbt assembly

     # Cloudera CDH 4.2.0 with MapReduce v2
-    $ SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0 SPARK_WITH_YARN=true sbt/sbt package assembly
+    $ SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0 SPARK_WITH_YARN=true sbt/sbt assembly

 For convenience, these variables may also be set through the `conf/spark-env.sh` file
 described below.
diff --git a/docs/building-with-maven.md b/docs/building-with-maven.md
index a9f2cb8a7ae1828401b9e8998d81a61070a0b396..72d37fec0ae976b5062625ed372557e02388353b 100644
--- a/docs/building-with-maven.md
+++ b/docs/building-with-maven.md
@@ -15,18 +15,18 @@ To enable support for HDFS and other Hadoop-supported storage systems, specify t
 For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop versions without YARN, use:

     # Apache Hadoop 1.2.1
-    $ mvn -Dhadoop.version=1.2.1 clean install
+    $ mvn -Dhadoop.version=1.2.1 clean package

     # Cloudera CDH 4.2.0 with MapReduce v1
-    $ mvn -Dhadoop.version=2.0.0-mr1-cdh4.2.0 clean install
+    $ mvn -Dhadoop.version=2.0.0-mr1-cdh4.2.0 clean package

 For Apache Hadoop 2.x, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions with YARN, enable the "hadoop2-yarn" profile:

     # Apache Hadoop 2.0.5-alpha
-    $ mvn -Phadoop2-yarn -Dhadoop.version=2.0.5-alpha clean install
+    $ mvn -Phadoop2-yarn -Dhadoop.version=2.0.5-alpha clean package

     # Cloudera CDH 4.2.0 with MapReduce v2
-    $ mvn -Phadoop2-yarn -Dhadoop.version=2.0.0-cdh4.2.0 clean install
+    $ mvn -Phadoop2-yarn -Dhadoop.version=2.0.0-cdh4.2.0 clean package

 ## Spark Tests in Maven ##

@@ -35,7 +35,7 @@ Tests are run by default via the scalatest-maven-plugin. With this you can do th

 Skip test execution (but not compilation):

-    $ mvn -Dhadoop.version=... -DskipTests clean install
+    $ mvn -Dhadoop.version=... -DskipTests clean package

 To run a specific test suite:

@@ -72,8 +72,8 @@ This setup works fine in IntelliJ IDEA 11.1.4. After opening the project via the

 ## Building Spark Debian Packages ##

-It includes support for building a Debian package containing a 'fat-jar' which includes the repl, the examples and bagel. This can be created by specifying the deb profile:
+It includes support for building a Debian package containing a 'fat-jar' which includes the repl, the examples and bagel. This can be created by specifying the following profiles:

-    $ mvn -Pdeb clean install
+    $ mvn -Prepl-bin -Pdeb clean package

 The debian package can then be found under repl/target. We added the short commit hash to the file name so that we can distinguish individual packages build for SNAPSHOT versions.
diff --git a/docs/index.md b/docs/index.md
index e51a6998f6c41b55814d4e269df8359942aa677c..ec9c7dd4f30eb2e380b3573433b4952ad84b4780 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -20,7 +20,7 @@ of these methods on slave nodes on your cluster.

 Spark uses [Simple Build Tool](http://www.scala-sbt.org), which is bundled with it. To compile the code, go into the top-level Spark directory and run

-    sbt/sbt package
+    sbt/sbt assembly

 Spark also supports building using Maven. If you would like to build using Maven, see the
 [instructions for building Spark with Maven](building-with-maven.html).
diff --git a/docs/python-programming-guide.md b/docs/python-programming-guide.md
index 794bff56474c6b7331ba6906a09f64fa93de5b39..15d3ebfcae60c776f8d45ef7ce3e1ea2e5ca0780 100644
--- a/docs/python-programming-guide.md
+++ b/docs/python-programming-guide.md
@@ -70,7 +70,7 @@ The script automatically adds the `pyspark` package to the `PYTHONPATH`.

 The `pyspark` script launches a Python interpreter that is configured to run PySpark jobs. To use `pyspark` interactively, first build Spark, then launch it directly from the command line without any options:

 {% highlight bash %}
-$ sbt/sbt package
+$ sbt/sbt assembly
 $ ./pyspark
 {% endhighlight %}
diff --git a/docs/quick-start.md b/docs/quick-start.md
index 335643536aac959386cf8a915f5912d393a06d2f..4e9deadbaa8a39660219e553747c1ccc8b7ba2ce 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -12,7 +12,7 @@ See the [programming guide](scala-programming-guide.html) for a more complete re
 To follow along with this guide, you only need to have successfully built Spark on one machine. Simply go into your Spark directory and run:

 {% highlight bash %}
-$ sbt/sbt package
+$ sbt/sbt assembly
 {% endhighlight %}

 # Interactive Analysis with the Spark Shell
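
For reference, a minimal end-to-end sketch assembled from the commands already shown in the docs changed above. Nothing here is new behavior: the Hadoop version is just the README's example value, `-DskipTests` is optional, and `./pyspark` is the launch command from the Python guide, so adjust all three to your environment.

    # Build the Spark assembly with SBT (1.2.1 is only the README's example Hadoop version)
    $ SPARK_HADOOP_VERSION=1.2.1 sbt/sbt assembly

    # Or build with Maven -- note 'package' rather than 'install'
    $ mvn -Dhadoop.version=1.2.1 -DskipTests clean package

    # Then launch an interactive shell against the freshly built assembly, e.g. PySpark
    $ ./pyspark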