[SPARK-3036][SPARK-3037][SQL] Add MapType/ArrayType containing null value support to Parquet.
JIRA:
- https://issues.apache.org/jira/browse/SPARK-3036
- https://issues.apache.org/jira/browse/SPARK-3037

Currently this uses the following Parquet schema for `MapType` when `valueContainsNull` is `true`:

```
message root {
  optional group a (MAP) {
    repeated group map (MAP_KEY_VALUE) {
      required int32 key;
      optional int32 value;
    }
  }
}
```

and for `ArrayType` when `containsNull` is `true`:

```
message root {
  optional group a (LIST) {
    repeated group bag {
      optional int32 array;
    }
  }
}
```

We still have to consider compatibility with older versions of Spark, with Hive, and with the other systems I mentioned in the JIRA issues.

Note: This PR is based on #1963 and #1889. Please check them first.

/cc marmbrus, yhuai

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #2032 from ueshin/issues/SPARK-3036_3037 and squashes the following commits:

- 4e8e9e7 [Takuya UESHIN] Add ArrayType containing null value support to Parquet.
- 013c2ca [Takuya UESHIN] Add MapType containing null value support to Parquet.
- 62989de [Takuya UESHIN] Merge branch 'issues/SPARK-2969' into issues/SPARK-3036_3037
- 8e38b53 [Takuya UESHIN] Merge branch 'issues/SPARK-3063' into issues/SPARK-3036_3037
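To make the two layouts easier to compare, here is a minimal Python sketch that renders them as text for an arbitrary field name and primitive type. It is not the Scala code in this PR; the helper names (`parquet_map_schema`, `parquet_array_schema`, `message`) are hypothetical, and only the nullable-value and nullable-element variants quoted in the description are covered.

```python
from textwrap import indent


def parquet_map_schema(name: str, key_type: str = "int32", value_type: str = "int32") -> str:
    """Render the MAP layout from the description (valueContainsNull = true).

    The key stays `required`; only the value is `optional`.
    """
    return (
        f"optional group {name} (MAP) {{\n"
        f"  repeated group map (MAP_KEY_VALUE) {{\n"
        f"    required {key_type} key;\n"
        f"    optional {value_type} value;\n"
        f"  }}\n"
        f"}}"
    )


def parquet_array_schema(name: str, element_type: str = "int32") -> str:
    """Render the LIST layout from the description (containsNull = true).

    The extra `bag` group is what lets each element be `optional`.
    """
    return (
        f"optional group {name} (LIST) {{\n"
        f"  repeated group bag {{\n"
        f"    optional {element_type} array;\n"
        f"  }}\n"
        f"}}"
    )


def message(*fields: str, root: str = "root") -> str:
    """Wrap already-rendered fields in a top-level `message` block."""
    body = "\n".join(indent(field, "  ") for field in fields)
    return f"message {root} {{\n{body}\n}}"


if __name__ == "__main__":
    print(message(parquet_map_schema("a")))
    print(message(parquet_array_schema("a")))
```

In both layouts the nullable part is wrapped in a repeated group, so a null entry can be recorded per repetition without ending the collection.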
Showing
- sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala (83 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala (33 additions, 21 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala (39 additions, 15 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetQuerySuite.scala (12 additions, 4 deletions)