Commit 3b7fb84c authored by gatorsmile, committed by Yin Huai

[SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables

#### What changes were proposed in this pull request?
When creating a Hive table (as opposed to a data source table), a common user error is to specify an existing column name as a partition column. Below is what Hive returns in this case:
```
hive> CREATE TABLE partitioned (id bigint, data string) PARTITIONED BY (data string, part string);
FAILED: SemanticException [Error 10035]: Column repeated in partitioning columns
```
Currently, the error we issue is very confusing:
```
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.);
```
This PR fixes the issue by catching the usage error at parse time, in `SparkSqlAstBuilder`.
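
For illustration only, here is a minimal standalone sketch of the new check (names simplified; `validateColumnNames` is a hypothetical helper, and a plain exception stands in for the parser's `operationNotAllowed`):
```
// Minimal sketch, assuming column names are plain strings (the real code works on column
// definitions and reports the error through operationNotAllowed with the parser context).
def validateColumnNames(cols: Seq[String], partitionCols: Seq[String]): Unit = {
  // Reject duplicate column names in the table definition.
  val duplicateColumns = cols.groupBy(identity).collect {
    case (name, occurrences) if occurrences.length > 1 => "\"" + name + "\""
  }
  if (duplicateColumns.nonEmpty) {
    throw new IllegalArgumentException(
      "Duplicated column names found in table definition: " +
        duplicateColumns.mkString("[", ",", "]"))
  }
  // Reject partition columns that repeat a column already present in the schema.
  val badPartCols = partitionCols.toSet.intersect(cols.toSet)
  if (badPartCols.nonEmpty) {
    throw new IllegalArgumentException(
      "Partition columns may not be specified in the schema: " +
        badPartCols.map("\"" + _ + "\"").mkString("[", ",", "]"))
  }
}

// Mirrors the failing Hive statement above: `data` appears both in the schema and in
// PARTITIONED BY, so this call throws
// "Partition columns may not be specified in the schema: [\"data\"]".
validateColumnNames(Seq("id", "data"), Seq("data", "part"))
```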

#### How was this patch tested?
Added test cases to `DDLCommandSuite`.

Author: gatorsmile <gatorsmile@gmail.com>

Closes #13415 from gatorsmile/partitionColumnsInTableSchema.
parent a6a18a45
@@ -903,6 +903,23 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
    val properties = Option(ctx.tablePropertyList).map(visitPropertyKeyValues).getOrElse(Map.empty)
    val selectQuery = Option(ctx.query).map(plan)

    // Ensuring whether no duplicate name is used in table definition
    val colNames = cols.map(_.name)
    if (colNames.length != colNames.distinct.length) {
      val duplicateColumns = colNames.groupBy(identity).collect {
        case (x, ys) if ys.length > 1 => "\"" + x + "\""
      }
      throw operationNotAllowed(s"Duplicated column names found in table definition of $name: " +
        duplicateColumns.mkString("[", ",", "]"), ctx)
    }

    // For Hive tables, partition columns must not be part of the schema
    val badPartCols = partitionCols.map(_.name).toSet.intersect(colNames.toSet)
    if (badPartCols.nonEmpty) {
      throw operationNotAllowed(s"Partition columns may not be specified in the schema: " +
        badPartCols.map("\"" + _ + "\"").mkString("[", ",", "]"), ctx)
    }

    // Note: Hive requires partition columns to be distinct from the schema, so we need
    // to include the partition columns here explicitly
    val schema = cols ++ partitionCols
...
@@ -334,6 +334,20 @@ class DDLCommandSuite extends PlanTest {
    assert(ct.table.storage.locationUri == Some("/something/anything"))
  }

  test("create table - column repeated in partitioning columns") {
    val query = "CREATE TABLE tab1 (key INT, value STRING) PARTITIONED BY (key INT, hr STRING)"
    val e = intercept[ParseException] { parser.parsePlan(query) }
    assert(e.getMessage.contains(
      "Operation not allowed: Partition columns may not be specified in the schema: [\"key\"]"))
  }

  test("create table - duplicate column names in the table definition") {
    val query = "CREATE TABLE default.tab1 (key INT, key STRING)"
    val e = intercept[ParseException] { parser.parsePlan(query) }
    assert(e.getMessage.contains("Operation not allowed: Duplicated column names found in " +
      "table definition of `default`.`tab1`: [\"key\"]"))
  }

  test("create table using - with partitioned by") {
    val query = "CREATE TABLE my_tab(a INT, b STRING) USING parquet PARTITIONED BY (a)"
    val expected = CreateTableUsing(
...