Commit d99b49b1 authored 7 years ago by Eric Liang Committed by Herman van Hovell 7 years ago

[SPARK-20450][SQL] Unexpected first-query schema inference cost with 2.1.1

## What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-19611 fixes a regression from 2.0 where Spark silently fails to read case-sensitive fields missing a case-sensitive schema in the table properties. The fix is to detect this situation, infer the schema, and write the case-sensitive schema into the metastore.

However this can incur an unexpected performance hit the first time such a problematic table is queried (and there is a high false-positive rate here since most tables don't actually have case-sensitive fields).

This PR changes the default to NEVER_INFER (same behavior as 2.1.0). In 2.2, we can consider leaving the default to INFER_AND_SAVE.

## How was this patch tested?

Unit tests.

Author: Eric Liang <ekl@databricks.com>

Closes #17749 from ericl/spark-20450.

parent ba505805

No related branches found

No related tags found

No related merge requests found

Hide whitespace changes

Inline Side-by-side

Showing with 1 addition and 1 deletion

Please register or to comment