[SPARK-14468] Always enable OutputCommitCoordinator
## What changes were proposed in this pull request?

`OutputCommitCoordinator` was introduced to deal with concurrent task attempts racing to write output, which can lead to data loss or corruption. For more detail, read the [JIRA description](https://issues.apache.org/jira/browse/SPARK-14468).

Before: `OutputCommitCoordinator` is enabled only if speculation is enabled.

After: `OutputCommitCoordinator` is always enabled.

Users may still disable this through `spark.hadoop.outputCommitCoordination.enabled`, but they really shouldn't...

## How was this patch tested?

`OutputCommitCoordinator*Suite`

Author: Andrew Or <andrew@databricks.com>

Closes #12244 from andrewor14/always-occ.
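
The before/after behavior described above can be sketched as follows. This is a minimal, self-contained illustration and not the patched code in `SparkHadoopMapRedUtil.scala`: the object name, the helper `getBoolean`, and the use of a plain `Map[String, String]` in place of `SparkConf` are assumptions made so the snippet runs without Spark on the classpath. The key `spark.speculation` is Spark's standard speculation setting; the other key is the one named in this commit message.

```scala
// Hypothetical sketch: a plain Map stands in for SparkConf.
object CommitCoordinationSketch {
  private val CoordinationKey = "spark.hadoop.outputCommitCoordination.enabled"
  private val SpeculationKey = "spark.speculation"

  // Read a boolean setting, falling back to `default` when the key is unset.
  private def getBoolean(conf: Map[String, String], key: String, default: Boolean): Boolean =
    conf.get(key).map(_.toBoolean).getOrElse(default)

  // Before: coordination defaulted to whatever speculation was set to.
  def shouldCoordinateBefore(conf: Map[String, String]): Boolean = {
    val speculationEnabled = getBoolean(conf, SpeculationKey, default = false)
    getBoolean(conf, CoordinationKey, default = speculationEnabled)
  }

  // After: coordination defaults to true; the explicit key remains an escape hatch.
  def shouldCoordinateAfter(conf: Map[String, String]): Boolean =
    getBoolean(conf, CoordinationKey, default = true)
}
```

With an empty configuration, `shouldCoordinateBefore` returns `false` while `shouldCoordinateAfter` returns `true`; setting `spark.hadoop.outputCommitCoordination.enabled` to `false` turns coordination off in both versions.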
Showing 3 changed files:
- core/src/main/scala/org/apache/spark/mapred/SparkHadoopMapRedUtil.scala: 6 additions, 10 deletions
- core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorIntegrationSuite.scala: 1 addition, 1 deletion
- core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala: 1 addition, 1 deletion