[SPARK-12992] [SQL] Support vectorized decoding in UnsafeRowParquetRecordReader.
WIP: running tests. The code needs some cleanup.

This patch completes the vectorized decoding, with the goal of passing the existing tests. More patches are still needed to support the rest of the format spec, even just for flat schemas.

This patch adds a new flag to enable the vectorized decoding. Tests were updated to run in both modes where applicable. Once this is working well, we can remove the previous code path.

Author: Nong Li <nong@databricks.com>

Closes #11055 from nongli/spark-12992-2.
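For readers unfamiliar with the term, the difference between row-at-a-time and vectorized decoding can be sketched roughly as follows. This is a minimal, self-contained illustration and not code from this patch: the class `PlainIntDecodingSketch` and its methods are hypothetical names, and the only assumption taken from the Parquet format is that PLAIN-encoded INT32 values are stored as 4-byte little-endian integers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from the Spark patch.
public class PlainIntDecodingSketch {

    // Row-at-a-time style: one call decodes exactly one value.
    static int readInteger(ByteBuffer page) {
        return page.getInt();
    }

    // Vectorized style: one call decodes `total` values into column[rowId ...].
    static void readIntegers(ByteBuffer page, int total, int[] column, int rowId) {
        // The int view inherits the byte order of `page` and starts at its position.
        page.asIntBuffer().get(column, rowId, total);
        // The view has its own position, so advance the underlying buffer manually.
        page.position(page.position() + 4 * total);
    }

    // Builds a little-endian buffer of 4-byte ints to stand in for a data page.
    static ByteBuffer encode(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length).order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            buf.putInt(v);
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        int[] values = {7, -1, 42, 100000};

        // Decode one value per call.
        ByteBuffer page = encode(values);
        int[] rowAtATime = new int[values.length];
        for (int i = 0; i < rowAtATime.length; i++) {
            rowAtATime[i] = readInteger(page);
        }

        // Decode the whole batch in a single call.
        int[] vectorized = new int[values.length];
        readIntegers(encode(values), values.length, vectorized, 0);

        System.out.println(Arrays.equals(rowAtATime, vectorized));
    }
}
```

Both paths produce the same values; the batch call simply amortizes per-value call overhead, which is the general motivation for keeping two code paths behind a flag while the new one is validated.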
Changed files:
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java (154 additions, 20 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java (53 additions, 6 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java (174 additions, 6 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedValuesReader.java (9 additions, 4 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVectorUtils.java (4 additions, 0 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnarBatch.java (33 additions, 6 deletions)
- sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java (3 additions, 1 deletion)
- sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java (1 addition, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala (8 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SqlNewHadoopRDD.scala (3 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystSchemaConverter.scala (2 additions, 1 deletion)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala (57 additions, 29 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala (4 additions, 2 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadBenchmark.scala (24 additions, 9 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetTest.scala (18 additions, 4 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala (2 additions, 1 deletion)