[SPARK-11328][SQL] Improve error message when hitting this issue
The issue is that the output committer is not idempotent, so retry attempts fail because the output file already exists. It is not safe to clean up the file, as this output committer is by design not retryable. Currently, the job fails with a confusing "file exists" error. This patch is a stopgap that tells the user to look at the top of the error log for the proper message.

This is difficult to test locally, as Spark is hardcoded not to retry. Manually verified by upping the retry attempts.

Author: Nong Li <nong@databricks.com>
Author: Nong Li <nongli@gmail.com>

Closes #10080 from nongli/spark-11328.
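The failure mode described above can be sketched in isolation. The following is a minimal illustration, not the code from this patch: `DirectWriteSketch` and `writeDirect` are hypothetical names, and plain `java.nio` file creation stands in for the output committer. It assumes a writer that creates its final output file directly, so a retry of the same task finds the file already present and can only report a clearer error.

```scala
import java.nio.file.{FileAlreadyExistsException, Files, Path}

// Hypothetical stand-in for a "direct" writer: it creates the final output
// file immediately, so a second attempt at the same path cannot succeed.
object DirectWriteSketch {
  def writeDirect(target: Path, attempt: Int): Unit = {
    try {
      // Fails with FileAlreadyExistsException if an earlier attempt got this far.
      Files.createFile(target)
      Files.write(target, s"written by attempt $attempt".getBytes("UTF-8"))
    } catch {
      case e: FileAlreadyExistsException if attempt > 0 =>
        // The file is deliberately left in place: the write is not retryable,
        // so the only improvement is a message that points at the real cause.
        throw new IllegalStateException(
          s"Attempt $attempt found existing output $target. " +
            "This write is not retryable; the original failure appears " +
            "earlier in the error log.",
          e)
    }
  }
}
```

Calling `writeDirect(path, 0)` and then `writeDirect(path, 1)` on the same path makes the second call fail with the wrapped message while keeping the original `FileAlreadyExistsException` as its cause.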
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala (20 additions, 2 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/DirectParquetOutputCommitter.scala (2 additions, 1 deletion)