[SPARK-21144][SQL] Print a warning if the data schema and partition schema have the duplicate columns

## What changes were proposed in this pull request?

The current master outputs unexpected results when the data schema and the partition schema have duplicate columns:

```
withTempPath { dir =>
  val basePath = dir.getCanonicalPath
  spark.range(0, 3).toDF("foo").write.parquet(new Path(basePath, "foo=1").toString)
  spark.range(0, 3).toDF("foo").write.parquet(new Path(basePath, "foo=a").toString)
  spark.read.parquet(basePath).show()
}

+---+
|foo|
+---+
|  1|
|  1|
|  a|
|  a|
|  1|
|  a|
+---+
```

This patch adds code to print a warning when such duplication is found.

## How was this patch tested?

Manually checked.

Author: Takeshi Yamamuro <yamamuro@apache.org>

Closes #18375 from maropu/SPARK-21144-3.
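The kind of check described above can be sketched in plain Scala without any Spark dependency. This is only an illustration of the idea, not the code added to `SchemaUtils.scala`: the object name `DuplicateColumnCheck`, the helpers `findDuplicateColumns` and `duplicateWarning`, the `caseSensitive` parameter, and the warning text are all hypothetical, and the schemas are reduced to plain lists of column names.

```scala
import java.util.Locale

// Hypothetical, Spark-free sketch: schemas are modeled as lists of column names.
object DuplicateColumnCheck {

  // Returns the (normalized) column names that occur in both schemas.
  // Assumption: names are compared case-insensitively unless caseSensitive is set.
  def findDuplicateColumns(
      dataColumns: Seq[String],
      partitionColumns: Seq[String],
      caseSensitive: Boolean = false): Seq[String] = {
    def normalize(name: String): String =
      if (caseSensitive) name else name.toLowerCase(Locale.ROOT)
    val partitionNames = partitionColumns.map(normalize).toSet
    dataColumns.map(normalize).filter(partitionNames.contains).distinct
  }

  // Builds a warning message, or None when the two schemas do not overlap.
  def duplicateWarning(
      dataColumns: Seq[String],
      partitionColumns: Seq[String],
      caseSensitive: Boolean = false): Option[String] = {
    val duplicates = findDuplicateColumns(dataColumns, partitionColumns, caseSensitive)
    if (duplicates.isEmpty) None
    else Some(
      "Found duplicate column(s) in the data schema and the partition schema: " +
        duplicates.mkString(", "))
  }

  def main(args: Array[String]): Unit = {
    // Mirrors the example above: the data column "foo" collides with the
    // partition column "foo" derived from the directories foo=1 and foo=a.
    duplicateWarning(Seq("foo"), Seq("foo")).foreach { msg =>
      Console.err.println(s"WARN $msg")
    }
  }
}
```

In this sketch the caller decides what to do with the message; the patch itself only logs a warning and does not reject the read.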
Showing 2 changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala (53 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala (6 additions, 0 deletions)