Commit ab3363e9 authored by Drew Robb, committed by Yanbo Liang

[SPARK-17986][ML] SQLTransformer should remove temporary tables

## What changes were proposed in this pull request?

Previously, each call to `SQLTransformer.transform` created a temporary table and never deleted it. This change adds a call to `dropTempView()` that deletes the temporary table before the result is returned, so the table no longer remains in Spark's table catalog. Because `tableName` is randomized and not exposed, nothing outside the `transform` method is expected to use this table.
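
The create, query, drop sequence described above can be sketched outside of `SQLTransformer` as follows. This is an illustrative sketch, not the patched code: the object name, the view name `tmp_view_example`, and the sample data are hypothetical, and it assumes a local Spark 2.x (or later) session is available on the classpath. It relies on the returned DataFrame staying usable after the view is dropped, which is the behaviour the patch depends on.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone example; names and data are for illustration only.
object TempViewCleanupSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("temp-view-cleanup-sketch")
      .getOrCreate()
    import spark.implicits._

    val dataset = Seq((1, "a"), (2, "b")).toDF("id", "label")

    // Stand-in for the randomized name that SQLTransformer generates.
    val viewName = "tmp_view_example"

    // 1. Register the input as a temporary view.
    dataset.createOrReplaceTempView(viewName)

    // 2. Build the result while the view still exists.
    val result = spark.sql(s"SELECT id FROM $viewName WHERE id > 1")

    // 3. Drop the view before handing the result back to the caller.
    spark.catalog.dropTempView(viewName)

    // The view is gone from the catalog...
    assert(!spark.catalog.listTables().collect().exists(_.name == viewName))
    // ...but the result can still be evaluated.
    assert(result.collect().map(_.getInt(0)).toSeq == Seq(2))

    spark.stop()
  }
}
```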

## How was this patch tested?

A single new assertion was added to the existing test of the `SQLTransformer.transform` method, checking that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would hurt performance or correctness, but review from someone with more expertise here would be helpful.
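
For readers who want to reproduce the check outside the test suite, a minimal sketch might look like the following. It assumes a Spark build that includes this change and a local session; the object name, column names, and SQL statement are illustrative and not taken from the suite. Unlike the suite's assertion, it compares the catalog size before and after `transform` rather than against zero, so it also holds when the session already contains tables.

```scala
import org.apache.spark.ml.feature.SQLTransformer
import org.apache.spark.sql.SparkSession

// Hypothetical standalone check; not part of SQLTransformerSuite.
object SQLTransformerCleanupCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("sqltransformer-cleanup-check")
      .getOrCreate()
    import spark.implicits._

    val original = Seq((0, 1.0, 3.0), (2, 2.0, 5.0)).toDF("id", "v1", "v2")

    // __THIS__ is the placeholder that SQLTransformer replaces with the
    // generated temporary table name.
    val sqlTrans = new SQLTransformer()
      .setStatement("SELECT *, (v1 + v2) AS v3 FROM __THIS__")

    val tablesBefore = spark.catalog.listTables().count()
    val result = sqlTrans.transform(original)

    // The transformed output carries the extra column.
    assert(result.columns.toSeq == Seq("id", "v1", "v2", "v3"))
    // With the fix in place, transform leaves no additional entry behind.
    assert(spark.catalog.listTables().count() == tablesBefore)

    spark.stop()
  }
}
```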

Author: Drew Robb <drewrobb@gmail.com>

Closes #15526 from drewrobb/SPARK-17986.
parent 01b26a06
@@ -67,7 +67,9 @@ class SQLTransformer @Since("1.6.0") (@Since("1.6.0") override val uid: String)
     val tableName = Identifiable.randomUID(uid)
     dataset.createOrReplaceTempView(tableName)
     val realStatement = $(statement).replace(tableIdentifier, tableName)
-    dataset.sparkSession.sql(realStatement)
+    val result = dataset.sparkSession.sql(realStatement)
+    dataset.sparkSession.catalog.dropTempView(tableName)
+    result
   }
   @Since("1.6.0")
......
@@ -43,6 +43,7 @@ class SQLTransformerSuite
     assert(result.schema.toString == resultSchema.toString)
     assert(resultSchema == expected.schema)
     assert(result.collect().toSeq == expected.collect().toSeq)
+    assert(original.sparkSession.catalog.listTables().count() == 0)
   }
   test("read/write") {
......