[DOC] add config option spark.ui.enabled into document
WeichenXu authored
    ## What changes were proposed in this pull request?
    
The configuration doc is missing the config option `spark.ui.enabled` (default value: `true`).
This option is important because in many cases we want to turn the web UI off, so this patch adds it.
    
    ## How was this patch tested?
    
    N/A
    
    Author: WeichenXu <WeichenXu123@outlook.com>
    
    Closes #14604 from WeichenXu123/add_doc_param_spark_ui_enabled.
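
For illustration, a minimal sketch of turning the web UI off through this option; the master and app name here are placeholders, not part of the patch:

{% highlight scala %}
import org.apache.spark.{SparkConf, SparkContext}

// spark.ui.enabled defaults to true; setting it to "false"
// prevents the web UI from starting for this application.
val conf = new SparkConf()
  .setMaster("local[2]")
  .setAppName("NoUIExample") // hypothetical app name
  .set("spark.ui.enabled", "false")
val sc = new SparkContext(conf)
{% endhighlight %}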
configuration.md
---
layout: global
displayTitle: Spark Configuration
title: Configuration
---

* This will become a table of contents (this text will be scraped).
{:toc}

Spark provides three locations to configure the system:

* Spark properties control most application parameters and can be set by using a `SparkConf` object, or through Java system properties (see the sketch after this list).
* Environment variables can be used to set per-machine settings, such as the IP address, through the `conf/spark-env.sh` script on each node.
* Logging can be configured through `log4j.properties`.
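
As a minimal sketch of the first location, here are the two ways a Spark property can be set; the property name and value are arbitrary examples:

{% highlight scala %}
import org.apache.spark.SparkConf

// Option 1: set the property programmatically on a SparkConf.
val conf = new SparkConf().set("spark.executor.memory", "1g")

// Option 2: set a Java system property before the SparkConf is created;
// a SparkConf built with loadDefaults = true (the default) picks up every
// system property whose name starts with "spark.".
System.setProperty("spark.executor.memory", "1g")
val confFromSysProps = new SparkConf()
{% endhighlight %}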

# Spark Properties

Spark properties control most application settings and are configured separately for each application. These properties can be set directly on a `SparkConf` passed to your `SparkContext`. `SparkConf` allows you to configure some of the common properties (e.g. master URL and application name), as well as arbitrary key-value pairs through the `set()` method. For example, we could initialize an application with two threads as follows:

Note that we run with `local[2]`, meaning two threads - which represents "minimal" parallelism and can help detect bugs that only exist when we run in a distributed context.

{% highlight scala %}
val conf = new SparkConf()
             .setMaster("local[2]")
             .setAppName("CountingSheep")
val sc = new SparkContext(conf)
{% endhighlight %}

Note that we can have more than one thread in local mode, and in cases like Spark Streaming we may actually require more than one thread to prevent starvation issues.
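
A minimal sketch of the streaming case, assuming a receiver-based socket source; the host, port, and batch interval are arbitrary:

{% highlight scala %}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// At least two threads: one for the receiver, one for processing the received data.
val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingSketch")
val ssc = new StreamingContext(conf, Seconds(1))

// A receiver-based source such as socketTextStream occupies one thread on its own;
// with local[1] no thread would be left to process the incoming data.
val lines = ssc.socketTextStream("localhost", 9999)
lines.print()

ssc.start()
ssc.awaitTermination()
{% endhighlight %}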

Properties that specify some time duration should be configured with a unit of time. The following format is accepted: