-
- Downloads
[SPARK-3007][SQL] Adds dynamic partitioning support
PR #2226 was reverted because it broke Jenkins builds for unknown reason. This debugging PR aims to fix the Jenkins build. This PR also fixes two bugs: 1. Compression configurations in `InsertIntoHiveTable` are disabled by mistake The `FileSinkDesc` object passed to the writer container doesn't have compression related configurations. These configurations are not taken care of until `saveAsHiveFile` is called. This PR moves compression code forward, right after instantiation of the `FileSinkDesc` object. 1. `PreInsertionCasts` doesn't take table partitions into account In `castChildOutput`, `table.attributes` only contains non-partition columns, thus for partitioned table `childOutputDataTypes` never equals to `tableOutputDataTypes`. This results funny analyzed plan like this: ``` == Analyzed Logical Plan == InsertIntoTable Map(partcol1 -> None, partcol2 -> None), false MetastoreRelation default, dynamic_part_table, None Project [c_0#1164,c_1#1165,c_2#1166] Project [c_0#1164,c_1#1165,c_2#1166] Project [c_0#1164,c_1#1165,c_2#1166] ... (repeats 99 times) ... Project [c_0#1164,c_1#1165,c_2#1166] Project [c_0#1164,c_1#1165,c_2#1166] Project [1 AS c_0#1164,1 AS c_1#1165,1 AS c_2#1166] Filter (key#1170 = 150) MetastoreRelation default, src, None ``` Awful though this logical plan looks, it's harmless because all projects will be eliminated by optimizer. Guess that's why this issue hasn't been caught before. Author: Cheng Lian <lian.cs.zju@gmail.com> Author: baishuo(白硕) <vc_java@hotmail.com> Author: baishuo <vc_java@hotmail.com> Closes #2616 from liancheng/dp-fix and squashes the following commits: 21935b6 [Cheng Lian] Adds back deleted trailing space f471c4b [Cheng Lian] PreInsertionCasts should take table partitions into account a132c80 [Cheng Lian] Fixes output compression 9c6eb2d [Cheng Lian] Adds tests to verify dynamic partitioning folder layout 0eed349 [Cheng Lian] Addresses @yhuai's comments 26632c3 [Cheng Lian] Adds more tests 9227181 [Cheng Lian] Minor refactoring c47470e [Cheng Lian] Refactors InsertIntoHiveTable to a Command 6fb16d7 [Cheng Lian] Fixes typo in test name, regenerated golden answer files d53daa5 [Cheng Lian] Refactors dynamic partitioning support b821611 [baishuo] pass check style 997c990 [baishuo] use HiveConf.DEFAULTPARTITIONNAME to replace hive.exec.default.partition.name 761ecf2 [baishuo] modify according micheal's advice 207c6ac [baishuo] modify for some bad indentation caea6fb [baishuo] modify code to pass scala style checks b660e74 [baishuo] delete a empty else branch cd822f0 [baishuo] do a little modify 8e7268c [baishuo] update file after test 3f91665 [baishuo(白硕)] Update Cast.scala 8ad173c [baishuo(白硕)] Update InsertIntoHiveTable.scala 051ba91 [baishuo(白硕)] Update Cast.scala d452eb3 [baishuo(白硕)] Update HiveQuerySuite.scala 37c603b [baishuo(白硕)] Update InsertIntoHiveTable.scala 98cfb1f [baishuo(白硕)] Update HiveCompatibilitySuite.scala 6af73f4 [baishuo(白硕)] Update InsertIntoHiveTable.scala adf02f1 [baishuo(白硕)] Update InsertIntoHiveTable.scala 1867e23 [baishuo(白硕)] Update SparkHadoopWriter.scala 6bb5880 [baishuo(白硕)] Update HiveQl.scala
Showing
- sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala 17 additions, 0 deletions...che/spark/sql/hive/execution/HiveCompatibilitySuite.scala
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala 2 additions, 1 deletion...cala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala 0 additions, 5 deletions...ive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
- sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala 119 additions, 99 deletions...apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
- sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala 217 additions, 0 deletions...cala/org/apache/spark/sql/hive/hiveWriterContainers.scala
- sql/hive/src/test/resources/golden/dynamic_partition-0-be33aaa7253c8f248ff3921cd7dae340 0 additions, 0 deletions...lden/dynamic_partition-0-be33aaa7253c8f248ff3921cd7dae340
- sql/hive/src/test/resources/golden/dynamic_partition-1-640552dd462707563fd255a713f83b41 0 additions, 0 deletions...lden/dynamic_partition-1-640552dd462707563fd255a713f83b41
- sql/hive/src/test/resources/golden/dynamic_partition-2-36456c9d0d2e3ef72ab5ba9ba48e5493 1 addition, 0 deletions...lden/dynamic_partition-2-36456c9d0d2e3ef72ab5ba9ba48e5493
- sql/hive/src/test/resources/golden/dynamic_partition-3-b7f7fa7ebf666f4fee27e149d8c6961f 0 additions, 0 deletions...lden/dynamic_partition-3-b7f7fa7ebf666f4fee27e149d8c6961f
- sql/hive/src/test/resources/golden/dynamic_partition-4-8bdb71ad8cb3cc3026043def2525de3a 0 additions, 0 deletions...lden/dynamic_partition-4-8bdb71ad8cb3cc3026043def2525de3a
- sql/hive/src/test/resources/golden/dynamic_partition-5-c630dce438f3792e7fb0f523fbbb3e1e 0 additions, 0 deletions...lden/dynamic_partition-5-c630dce438f3792e7fb0f523fbbb3e1e
- sql/hive/src/test/resources/golden/dynamic_partition-6-7abc9ec8a36cdc5e89e955265a7fd7cf 0 additions, 0 deletions...lden/dynamic_partition-6-7abc9ec8a36cdc5e89e955265a7fd7cf
- sql/hive/src/test/resources/golden/dynamic_partition-7-be33aaa7253c8f248ff3921cd7dae340 0 additions, 0 deletions...lden/dynamic_partition-7-be33aaa7253c8f248ff3921cd7dae340
- sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala 94 additions, 6 deletions.../org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
Loading
Please register or sign in to comment