[SPARK-21912][SQL] ORC/Parquet table should not create invalid column names
## What changes were proposed in this pull request?

Currently, jobs abort when users create or alter ORC/Parquet tables with invalid column names. We should prevent this by raising **AnalysisException** with guidance to use aliases instead, as Parquet data source tables already do.

**BEFORE**
```scala
scala> sql("CREATE TABLE orc1 USING ORC AS SELECT 1 `a b`")
17/09/04 13:28:21 ERROR Utils: Aborting task
java.lang.IllegalArgumentException: Error: : expected at the position 8 of 'struct<a b:int>' but ' ' is found.
17/09/04 13:28:21 ERROR FileFormatWriter: Job job_20170904132821_0001 aborted.
17/09/04 13:28:21 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
org.apache.spark.SparkException: Task failed while writing rows.
```

**AFTER**
```scala
scala> sql("CREATE TABLE orc1 USING ORC AS SELECT 1 `a b`")
17/09/04 13:27:40 ERROR CreateDataSourceTableAsSelectCommand: Failed to write to table orc1
org.apache.spark.sql.AnalysisException: Attribute name "a b" contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it.;
```

## How was this patch tested?

Passes Jenkins with a new test case.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #19124 from dongjoon-hyun/SPARK-21912.
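The check described above can be illustrated with a minimal, standalone sketch. It is not the code from this patch: the object and method names (`FieldNameCheck`, `isValid`, `checkFieldName`) are hypothetical, the character set is copied from the AFTER error message, and a plain `IllegalArgumentException` stands in for Spark's `AnalysisException` so the snippet runs without Spark on the classpath.

```scala
// Hypothetical sketch; names and exception type are assumptions, not the patch's code.
object FieldNameCheck {
  // Characters listed in the AFTER error message: space , ; { } ( ) newline tab =
  private val invalidChars: Set[Char] = " ,;{}()\n\t=".toSet

  /** Returns true when the name contains none of the listed characters. */
  def isValid(name: String): Boolean = !name.exists(invalidChars.contains)

  /** Fails fast on an invalid name, before any write job would be started. */
  def checkFieldName(name: String): Unit = {
    if (!isValid(name)) {
      // The actual patch raises org.apache.spark.sql.AnalysisException here.
      throw new IllegalArgumentException(
        s"""Attribute name "$name" contains invalid character(s). Please use alias to rename it.""")
    }
  }
}
```

With a check of this kind, a name such as `a b` is rejected up front, while an aliased name such as `a_b` passes.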
Showing 9 changed files
- sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala: 21 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala: 3 additions, 2 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala: 2 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala: 42 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala: 1 addition, 1 deletion
- sql/core/src/test/resources/sql-tests/inputs/show_columns.sql: 2 additions, 2 deletions
- sql/core/src/test/resources/sql-tests/results/show_columns.sql.out: 2 additions, 2 deletions
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala: 2 additions, 0 deletions
- sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala: 34 additions, 0 deletions