Commit d94a44d7 authored by Cheng Lian, committed by Michael Armbrust

[SPARK-3269][SQL] Decreases initial buffer size for row set to prevent OOM

When a large batch size is specified, `SparkSQLOperationManager` OOMs even if the whole result set is much smaller than the batch size.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #2171 from liancheng/jdbc-fetch-size and squashes the following commits:

5e1623b [Cheng Lian] Decreases initial buffer size for row set to prevent OOM
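For context, a minimal, self-contained Scala sketch of the failure mode and of the fix applied below (illustrative only; the object name, the String rows, and the small Iterator are stand-ins, not the Thrift server code): `new ArrayBuffer[Row](n)` pre-allocates a backing array of `n` slots, so passing a huge client-supplied fetch size straight through can OOM before a single row is read, while a small initial size hint still lets the buffer grow to hold the full result set.

import scala.collection.mutable.ArrayBuffer

object BufferSizeSketch extends App {
  // Hypothetical client-requested batch/fetch size, far larger than the actual result.
  val maxRows = 50 * 1000 * 1000

  // Before the fix (kept commented out): a backing array of `maxRows` slots is
  // allocated up front, which alone can exhaust the heap even if the query
  // returns only a handful of rows.
  // val rows = new ArrayBuffer[String](maxRows)

  // After the fix: use a capped initial size hint; ArrayBuffer grows on demand,
  // so this is not a limit on how many rows can be collected.
  val rows = new ArrayBuffer[String](maxRows.min(1024))

  val iter = Iterator("a", "b", "c") // stand-in for the query's row iterator
  var curRow = 0
  while (curRow < maxRows && iter.hasNext) {
    rows += iter.next()
    curRow += 1
  }
  println(s"collected ${rows.size} rows") // collected 3 rows
}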
parent b1eccfc8
@@ -66,9 +66,10 @@ class SparkSQLOperationManager(hiveContext: HiveContext) extends OperationManager
       if (!iter.hasNext) {
         new RowSet()
       } else {
-        val maxRows = maxRowsL.toInt // Do you really want a row batch larger than Int Max? No.
+        // maxRowsL here typically maps to java.sql.Statement.getFetchSize, which is an int
+        val maxRows = maxRowsL.toInt
         var curRow = 0
-        var rowSet = new ArrayBuffer[Row](maxRows)
+        var rowSet = new ArrayBuffer[Row](maxRows.min(1024))
         while (curRow < maxRows && iter.hasNext) {
           val sparkRow = iter.next()
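A sanity note on the change (illustrative, not part of the patch): the 1024 passed to `ArrayBuffer` is only an initial capacity hint, so result sets larger than 1024 rows are still collected in full; the `while (curRow < maxRows && iter.hasNext)` condition remains the actual limit.

import scala.collection.mutable.ArrayBuffer

// Appending past the initial size hint is fine: the buffer reallocates and grows as needed.
val buf = new ArrayBuffer[Int](16)
(1 to 100000).foreach(buf += _)
assert(buf.size == 100000)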