[SPARK-19720][CORE] Redact sensitive information from SparkSubmit console
## What changes were proposed in this pull request?

This change redacts sensitive information (based on the `spark.redaction.regex` property) from the Spark Submit console logs. Such sensitive information is already being redacted from event logs, YARN logs, etc.

## How was this patch tested?

Testing was done manually to make sure that the console logs were not printing any sensitive information.

Here's some output from the console:

```
Spark properties used, including those specified through --conf and those from the properties file /etc/spark2/conf/spark-defaults.conf:
  (spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
  (spark.authenticate,false)
  (spark.executorEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
```

```
System properties:
(spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
(spark.authenticate,false)
(spark.executorEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
```

There is a risk that, if new print statements are added to the console down the road, sensitive information may still be leaked, since there is no test that asserts on the console log output. I considered it out of the scope of this JIRA to write an integration test to make sure new leaks don't happen in the future.

Running unit tests to make sure nothing else is broken by this change.

Author: Mark Grover <mark@apache.org>

Closes #17047 from markgrover/master_redaction.
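For illustration only, key-based redaction of this kind can be sketched as below. This is not the code from the patch: the object name `RedactionSketch`, the helper `redact`, the sample value, and the pattern `(?i)secret|password` are assumptions chosen to reproduce the shape of the console output above. In Spark the pattern would come from the `spark.redaction.regex` property.

```scala
import scala.util.matching.Regex

// Hypothetical, self-contained sketch; names here are not taken from the patch.
object RedactionSketch {
  // Placeholder text, copied from the console output shown in the commit message.
  val ReplacementText = "*********(redacted)"

  // Replace the value of every (key, value) pair whose key matches the pattern.
  // Pairs whose key does not match are returned unchanged.
  def redact(pattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] =
    kvs.map { case (key, value) =>
      if (pattern.findFirstIn(key).isDefined) (key, ReplacementText)
      else (key, value)
    }

  def main(args: Array[String]): Unit = {
    // Assumed example pattern; a real run would read it from configuration.
    val pattern = "(?i)secret|password".r
    val props = Seq(
      "spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD" -> "not-a-real-password",
      "spark.authenticate" -> "false"
    )
    // Each pair prints in Scala's default tuple form, (key,value).
    redact(pattern, props).foreach(println)
  }
}
```

Matching on the key rather than the value keeps the check cheap and predictable, but it also means a secret stored under an innocuous key name would pass through untouched, which is consistent with the leak risk noted in the commit message.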
Changed files:
- core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (2 additions, 1 deletion)
- core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala (9 additions, 3 deletions)
- core/src/main/scala/org/apache/spark/util/Utils.scala (20 additions, 1 deletion)