Commit d99b49b1, authored by Eric Liang, committed by Herman van Hovell

[SPARK-20450][SQL] Unexpected first-query schema inference cost with 2.1.1

## What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-19611 fixes a regression from 2.0 where Spark silently fails to read case-sensitive fields when the table properties are missing a case-sensitive schema. The fix detects this situation, infers the schema, and writes the case-sensitive schema into the metastore.

However, this can incur an unexpected performance hit the first time such a table is queried, and the false-positive rate is high because most tables don't actually have case-sensitive fields.

This PR changes the default to NEVER_INFER (same behavior as 2.1.0). In 2.2, we can consider leaving the default as INFER_AND_SAVE.
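
The trade-off between the two modes can be illustrated with a toy model. This is a minimal sketch, not Spark's implementation: `Table`, `resolve_schema`, and `infer_from_files` are hypothetical names introduced only for illustration, and the model assumes the metastore holds lower-cased field names while the case-sensitive schema, when present, lives in the table properties.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Mode names taken from this PR description.
NEVER_INFER = "NEVER_INFER"
INFER_AND_SAVE = "INFER_AND_SAVE"


@dataclass
class Table:
    """Hypothetical stand-in for a metastore table entry."""
    metastore_fields: List[str]                         # assumed lower-cased
    case_sensitive_schema: Optional[List[str]] = None   # from table properties


def resolve_schema(table: Table, mode: str,
                   infer_from_files: Callable[[], List[str]]) -> List[str]:
    """Return the field names a query would see under the given mode."""
    # A saved case-sensitive schema is always used; no inference is needed.
    if table.case_sensitive_schema is not None:
        return table.case_sensitive_schema
    # NEVER_INFER: fall back to the metastore schema and skip the file scan.
    if mode == NEVER_INFER:
        return table.metastore_fields
    # INFER_AND_SAVE: pay the inference cost once, then persist the result.
    if mode == INFER_AND_SAVE:
        inferred = infer_from_files()
        table.case_sensitive_schema = inferred
        return inferred
    raise ValueError(f"unknown inference mode: {mode}")
```

In this model the expensive `infer_from_files` call runs only on the first query under INFER_AND_SAVE, which corresponds to the first-query cost described above, and never runs under NEVER_INFER.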

## How was this patch tested?

Unit tests.

Author: Eric Liang <ekl@databricks.com>

Closes #17749 from ericl/spark-20450.
parent ba505805