[SPARK-11007] [SQL] Adds dictionary aware Parquet decimal converters
For Parquet decimal columns that are encoded using plain-dictionary encoding, we can make the upper-level converter aware of the dictionary, so that we can pre-instantiate all the decimals to avoid duplicated instantiation. Note that plain-dictionary encoding isn't available for `FIXED_LEN_BYTE_ARRAY` for Parquet writer version `PARQUET_1_0`. So currently only decimals written as `INT32` and `INT64` can benefit from this optimization.

Author: Cheng Lian <lian@databricks.com>

Closes #9040 from liancheng/spark-11007.decimal-converter-dict-support.
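To illustrate the idea behind the optimization, here is a minimal, self-contained sketch of a dictionary-aware decimal converter. All names here (`IntDictionary`, `DictDecimalConverter`) are hypothetical stand-ins, not Spark's or Parquet's actual classes: the point is that once the dictionary is known, every decimal it can produce is instantiated exactly once, and the per-value hot path becomes a plain array lookup instead of a fresh allocation.

```scala
object DictDecimalSketch {
  // Hypothetical stand-in for a Parquet page dictionary of INT64 unscaled values.
  final class IntDictionary(values: Array[Long]) {
    def size: Int = values.length
    def decodeToLong(id: Int): Long = values(id)
  }

  // Converter that pre-expands the whole dictionary into decimals up front,
  // so repeated dictionary ids reuse the same BigDecimal instance.
  final class DictDecimalConverter(scale: Int) {
    private var expanded: Array[BigDecimal] = _

    def setDictionary(dict: IntDictionary): Unit = {
      expanded = Array.tabulate(dict.size) { id =>
        BigDecimal(dict.decodeToLong(id)) / BigDecimal(10).pow(scale)
      }
    }

    // Hot path: no allocation per value, just an index into the expanded array.
    def addValueFromDictionary(id: Int): BigDecimal = expanded(id)
  }

  def main(args: Array[String]): Unit = {
    val conv = new DictDecimalConverter(scale = 2)
    conv.setDictionary(new IntDictionary(Array(12345L, 678L)))
    println(conv.addValueFromDictionary(0)) // 123.45
    println(conv.addValueFromDictionary(1)) // 6.78
  }
}
```

This also shows why the optimization is limited to `INT32`/`INT64` storage: it only pays off when the reader is handed dictionary ids, which doesn't happen for `FIXED_LEN_BYTE_ARRAY` columns under writer version `PARQUET_1_0`.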
Showing 6 changed files:
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala (71 additions, 12 deletions)
- sql/core/src/test/resources/dec-in-i32.parquet (0 additions, 0 deletions)
- sql/core/src/test/resources/dec-in-i64.parquet (0 additions, 0 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala (19 additions, 0 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetProtobufCompatibilitySuite.scala (8 additions, 14 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetTest.scala (5 additions, 0 deletions)