---
layout: global
displayTitle: Spark Configuration
title: Configuration
---

* This will become a table of contents (this text will be scraped).
{:toc}

Spark provides three locations to configure the system:

* Spark properties control most application parameters and can be set by using a `SparkConf` object, or through Java system properties (see the sketch after this list).
* Environment variables can be used to set per-machine settings, such as the IP address, through the `conf/spark-env.sh` script on each node.
* Logging can be configured through `log4j.properties`.
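As a minimal sketch of the first mechanism, the snippet below shows that `spark.*` Java system properties are picked up automatically by `SparkConf`'s default constructor; the property values here are illustrative placeholders, and in practice such properties would normally be passed on the JVM command line (e.g. `-Dspark.master=local[2]`) rather than set in code:

{% highlight scala %}
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative only: spark.* system properties set before SparkConf is
// constructed are loaded by its default constructor.
System.setProperty("spark.master", "local[2]")
System.setProperty("spark.app.name", "CountingSheep")

val conf = new SparkConf()   // picks up the spark.* system properties above
val sc = new SparkContext(conf)
{% endhighlight %}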

# Spark Properties

Spark properties control most application settings and are configured separately for each application. These properties can be set directly on a `SparkConf` passed to your `SparkContext`. `SparkConf` allows you to configure some of the common properties (e.g. master URL and application name), as well as arbitrary key-value pairs through the `set()` method. For example, we could initialize an application with two threads as follows:

Note that we run with `local[2]`, meaning two threads, which represents "minimal" parallelism and can help detect bugs that only exist when we run in a distributed context.

{% highlight scala %}
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
             .setMaster("local[2]")
             .setAppName("CountingSheep")
val sc = new SparkContext(conf)
{% endhighlight %}

Note that we can have more than one thread in local mode, and in cases like Spark Streaming we may actually require more than one thread to prevent starvation issues.
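As a concrete illustration of the streaming case, the sketch below (the `localhost`/`9999` socket source is a placeholder assumption) shows why at least two threads are needed: the receiver occupies one thread full-time, so a second is required to process the received batches.

{% highlight scala %}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// With local[1] the single thread would be consumed by the receiver and no
// batches would ever be processed; local[2] avoids that starvation.
val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
val ssc = new StreamingContext(conf, Seconds(1))

// "localhost" and 9999 are placeholder values for this sketch.
val lines = ssc.socketTextStream("localhost", 9999)
lines.print()

ssc.start()
ssc.awaitTermination()
{% endhighlight %}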

Properties that specify some time duration should be configured with a unit of time. The following format is accepted: