[SPARK-13530][SQL] Add ShortType support to UnsafeRowParquetRecordReader
JIRA: https://issues.apache.org/jira/browse/SPARK-13530

## What changes were proposed in this pull request?

When the vectorized Parquet scanner is enabled by default, the unit test `ParquetHadoopFsRelationSuite`, which is based on `HadoopFsRelationTest`, fails because `UnsafeRowParquetRecordReader` lacks short type support. This patch adds the missing support.

The exception:

    [info] ParquetHadoopFsRelationSuite:
    [info] - test all data types - StringType (499 milliseconds)
    [info] - test all data types - BinaryType (447 milliseconds)
    [info] - test all data types - BooleanType (520 milliseconds)
    [info] - test all data types - ByteType (418 milliseconds)
    00:22:58.920 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 124.0 (TID 1949)
    org.apache.commons.lang.NotImplementedException: Unimplemented type: ShortType
        at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader$ColumnReader.readIntBatch(UnsafeRowParquetRecordReader.java:769)
        at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader$ColumnReader.readBatch(UnsafeRowParquetRecordReader.java:640)
        at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader$ColumnReader.access$000(UnsafeRowParquetRecordReader.java:461)
        at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader.nextBatch(UnsafeRowParquetRecordReader.java:224)

## How was this patch tested?

Without this change, the unit test `ParquetHadoopFsRelationSuite`, based on `HadoopFsRelationTest`, [fails](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52110/consoleFull) because `UnsafeRowParquetRecordReader` lacks short type support. With the support added, the test passes.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #11412 from viirya/add-shorttype-support.
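For readers unfamiliar with the failure mode, the following is a minimal, self-contained sketch and not the code from this patch. It relies on the fact that Parquet has no 16-bit physical type, so short values are stored as INT32 and the reader has to narrow them when filling a short column. The class `ShortBatchSketch`, the enum `ColumnType`, and this simplified `readIntBatch` signature are all hypothetical stand-ins for illustration. The real reader works on column vectors and definition levels.

```java
import java.util.Arrays;

public class ShortBatchSketch {
    // Hypothetical stand-in for the Spark data types an INT32 column can map to.
    public enum ColumnType { INT, SHORT }

    // Hypothetical, simplified analogue of a type dispatch over decoded INT32 values.
    // Without the SHORT branch, a short column would fall through to the
    // "Unimplemented type" error, much like the stack trace above.
    public static Object readIntBatch(int[] physicalValues, ColumnType type) {
        switch (type) {
            case INT:
                // INT32 maps directly to int; copy the decoded values.
                return Arrays.copyOf(physicalValues, physicalValues.length);
            case SHORT:
                // Shorts are stored as INT32 in Parquet; narrow each value.
                short[] out = new short[physicalValues.length];
                for (int i = 0; i < physicalValues.length; i++) {
                    out[i] = (short) physicalValues[i];
                }
                return out;
            default:
                throw new UnsupportedOperationException("Unimplemented type: " + type);
        }
    }

    public static void main(String[] args) {
        short[] shorts = (short[]) readIntBatch(new int[] {1, -2, 32767}, ColumnType.SHORT);
        System.out.println(Arrays.toString(shorts)); // prints [1, -2, 32767]
    }
}
```

The narrowing cast is lossless here only because values written from a short column always fit in 16 bits. The sketch does not validate that assumption.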
Showing
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java (3 additions, 0 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java (33 additions, 1 deletion)