[SPARK-14015][SQL] Support TimestampType in vectorized parquet reader
## What changes were proposed in this pull request?

This PR adds support for `TimestampType` in the vectorized parquet reader.

## How was this patch tested?

1. `VectorizedColumnReader` initially had a gating condition on `primitiveType.getPrimitiveTypeName() == PrimitiveType.PrimitiveTypeName.INT96` that made us fall back on parquet-mr for handling timestamps. This condition is now removed.
2. The `ParquetHadoopFsRelationSuite` (which tests all supported Hive types, including `TimestampType`) fails when the gating condition is removed (https://github.com/apache/spark/pull/11808) and should now pass with this change. Similarly, the `ParquetHiveCompatibilitySuite.SPARK-10177 timestamp` test, which also fails when the gating condition is removed, should now pass as well.
3. Added tests in `HadoopFsRelationTest` that cover both the dictionary-encoded and non-encoded versions across all supported datatypes.

Author: Sameer Agarwal <sameer@databricks.com>

Closes #11882 from sameeragarwal/timestamp-parquet.
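
For readers unfamiliar with the INT96 representation mentioned above, the following is a minimal sketch of how such a value is commonly decoded into microseconds since the Unix epoch. It is not the code from this patch. It assumes the widely used layout of 8 little-endian bytes of nanoseconds within the day followed by 4 little-endian bytes of Julian day number, and it assumes that Julian day 2440588 corresponds to 1970-01-01. The class name `Int96TimestampSketch` and both method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class Int96TimestampSketch {
    // Assumption for this sketch: Julian day number of 1970-01-01.
    static final long JULIAN_DAY_OF_EPOCH = 2440588L;
    static final long MICROS_PER_DAY = 86_400_000_000L;

    private Int96TimestampSketch() {}

    /** Decodes a 12-byte INT96 value into microseconds since the Unix epoch. */
    static long int96ToEpochMicros(byte[] int96) {
        if (int96.length != 12) {
            throw new IllegalArgumentException("INT96 values must be exactly 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong(); // first 8 bytes: nanoseconds within the day
        int julianDay = buf.getInt();    // last 4 bytes: Julian day number
        // Sub-microsecond precision is truncated by the integer division.
        return (julianDay - JULIAN_DAY_OF_EPOCH) * MICROS_PER_DAY + nanosOfDay / 1000L;
    }

    /** Hypothetical helper that builds an INT96 value in the assumed layout. */
    static byte[] encodeInt96(int julianDay, long nanosOfDay) {
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }
}
```

Under these assumptions, `encodeInt96(2440588, 0L)` decodes to `0`, the Unix epoch itself. Storing the result as a single 64-bit microsecond count is what lets a timestamp column fit in a long-based column vector.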
Showing 6 changed files:
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java (26 additions, 3 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java (0 additions, 13 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java (1 addition, 1 deletion)
- sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java (2 additions, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala (10 additions, 0 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/sources/hadoopFsRelationSuites.scala (47 additions, 35 deletions)