Commit 173aa949 authored by Michael Armbrust

[SPARK-12546][SQL] Change default number of open parquet files

A common problem that users encounter with Spark 1.6.0 is that writing to a partitioned parquet table OOMs.  The root cause is that parquet allocates a significant amount of memory that is not accounted for by our own mechanisms.  As a workaround, we can ensure that only a single file is open per task unless the user explicitly asks for more.

Author: Michael Armbrust <michael@databricks.com>

Closes #11308 from marmbrus/parquetWriteOOM.
parent 4a91806a
@@ -430,7 +430,7 @@ private[spark] object SQLConf {
   val PARTITION_MAX_FILES =
     intConf("spark.sql.sources.maxConcurrentWrites",
-      defaultValue = Some(5),
+      defaultValue = Some(1),
       doc = "The maximum number of concurrent files to open before falling back on sorting when " +
         "writing out files using dynamic partitioning.")
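For users who want the pre-1.6.1 behaviour back and are willing to spend the extra memory, the config can be raised explicitly. The sketch below is illustrative only and assumes the Spark 1.6-era SQLContext API; the application name, input path, output path, and partition column are hypothetical placeholders, not part of this commit.

// Illustrative sketch (Spark 1.6-era API), not part of this change.
// App name, paths, and the partition column are hypothetical placeholders.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("partitioned-parquet-write"))
val sqlContext = new SQLContext(sc)

// Explicitly allow more than one open file per task again; higher values
// trade memory for avoiding the sort-based fallback during dynamic partitioning.
sqlContext.setConf("spark.sql.sources.maxConcurrentWrites", "5")

sqlContext.read.parquet("/data/events")      // hypothetical input path
  .write
  .partitionBy("date")                       // hypothetical partition column
  .parquet("/data/events_by_date")           // hypothetical output path

The same key can typically also be supplied at submit time (e.g. via spark-submit --conf) rather than in code, depending on how the SQLContext is created.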