Commit 2462dbcc authored by Sun Rui, committed by Shivaram Venkataraman

[SPARK-10971][SPARKR] RRunner should allow setting path to Rscript.

Add a new spark conf option "spark.sparkr.r.driver.command" to specify the executable for an R script in client modes.

The existing spark conf option "spark.sparkr.r.command" is used to specify the executable for an R script in cluster modes for both driver and workers. See also [launch R worker script](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/r/RRDD.scala#L395).

BTW, the [environment variable "SPARKR_DRIVER_R"](https://github.com/apache/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java#L275) is used to locate the R shell on the local host.

For your information, PySpark has two environment variables serving a similar purpose:
PYSPARK_PYTHON	      Python binary executable to use for PySpark in both driver and workers (default is `python`).
PYSPARK_DRIVER_PYTHON	Python binary executable to use for PySpark in driver only (default is PYSPARK_PYTHON).
PySpark uses the code [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala#L41) to determine the Python executable for a Python script.

Author: Sun Rui <rui.sun@intel.com>

Closes #9179 from sun-rui/SPARK-10971.
parent 4725cb98
@@ -40,7 +40,16 @@ object RRunner {
     // Time to wait for SparkR backend to initialize in seconds
     val backendTimeout = sys.env.getOrElse("SPARKR_BACKEND_TIMEOUT", "120").toInt
-    val rCommand = "Rscript"
+    val rCommand = {
+      // "spark.sparkr.r.command" is deprecated and replaced by "spark.r.command",
+      // but kept here for backward compatibility.
+      var cmd = sys.props.getOrElse("spark.sparkr.r.command", "Rscript")
+      cmd = sys.props.getOrElse("spark.r.command", cmd)
+      if (sys.props.getOrElse("spark.submit.deployMode", "client") == "client") {
+        cmd = sys.props.getOrElse("spark.r.driver.command", cmd)
+      }
+      cmd
+    }
     // Check if the file path exists.
     // If not, change directory to current working directory for YARN cluster mode
......
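
The lookup order in the RRunner hunk above can be checked in isolation with a small sketch. It assumes the same three property keys and the same `spark.submit.deployMode` default as the diff; the object name `RCommandSketch`, the helper `resolveRCommand`, and the example paths are hypothetical, and the helper reads from a plain `Map` rather than `sys.props` so that it has no side effects.

```scala
object RCommandSketch {
  // Hypothetical helper mirroring the precedence in the diff:
  // deprecated key, then the new key, then the driver-only override.
  def resolveRCommand(props: Map[String, String]): String = {
    // Start from the deprecated key so that the newer key wins when both are set.
    var cmd = props.getOrElse("spark.sparkr.r.command", "Rscript")
    cmd = props.getOrElse("spark.r.command", cmd)
    // The driver-specific override is consulted only in client deploy mode.
    if (props.getOrElse("spark.submit.deployMode", "client") == "client") {
      cmd = props.getOrElse("spark.r.driver.command", cmd)
    }
    cmd
  }

  def main(args: Array[String]): Unit = {
    // Example paths are made up for illustration.
    val base = Map(
      "spark.r.command" -> "/opt/R/bin/Rscript",
      "spark.r.driver.command" -> "/usr/local/bin/Rscript")

    println(resolveRCommand(Map.empty))                                      // Rscript
    println(resolveRCommand(base))                                           // /usr/local/bin/Rscript
    println(resolveRCommand(base + ("spark.submit.deployMode" -> "cluster"))) // /opt/R/bin/Rscript
  }
}
```

Because the deploy mode defaults to "client" when unset, the driver override applies unless cluster mode is requested explicitly.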
@@ -1589,6 +1589,20 @@ Apart from these, the following properties are also available, and may be useful
     Number of threads used by RBackend to handle RPC calls from SparkR package.
   </td>
 </tr>
+<tr>
+  <td><code>spark.r.command</code></td>
+  <td>Rscript</td>
+  <td>
+    Executable for executing R scripts in cluster modes for both driver and workers.
+  </td>
+</tr>
+<tr>
+  <td><code>spark.r.driver.command</code></td>
+  <td>spark.r.command</td>
+  <td>
+    Executable for executing R scripts in client modes for driver. Ignored in cluster modes.
+  </td>
+</tr>
 </table>
 #### Cluster Managers
@@ -1628,6 +1642,10 @@ The following variables can be set in `spark-env.sh`:
   <td><code>PYSPARK_DRIVER_PYTHON</code></td>
   <td>Python binary executable to use for PySpark in driver only (default is <code>PYSPARK_PYTHON</code>).</td>
 </tr>
+<tr>
+  <td><code>SPARKR_DRIVER_R</code></td>
+  <td>R binary executable to use for SparkR shell (default is <code>R</code>).</td>
+</tr>
 <tr>
   <td><code>SPARK_LOCAL_IP</code></td>
   <td>IP address of the machine to bind to.</td>
......