[SPARK-10917] [SQL] improve performance of complex type in columnar cache
This PR improves the performance of complex types in the columnar cache by using UnsafeProjection instead of KryoSerializer.

A simple benchmark shows that this PR could improve the performance of scanning a cached table with complex columns by 15x (compared to Spark 1.5).

Here is the code used to benchmark:

```
# Python 2; assumes a PySpark shell where `sc` and `sqlContext` are already defined.
from pyspark.sql import Row

df = sc.range(1<<23).map(lambda i: Row(a=Row(b=i, c=str(i)), d=range(10), e=dict(zip(range(10), [str(i) for i in range(10)])))).toDF()
df.write.parquet("table")
```

```
import time

df = sqlContext.read.parquet("table")
df.cache()
df.count()  # materialize the columnar cache before timing
t = time.time()
# Time a full scan of the cached table, including the complex columns.
print df.select("*")._jdf.queryExecution().toRdd().count()
print time.time() - t
```

Author: Davies Liu <davies@databricks.com>

Closes #8971 from davies/complex.
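The commit message does not show the encoding itself. As a rough illustration only, the sketch below contrasts a fixed binary layout with a general-purpose object serializer for a struct like column `a` in the benchmark (a long `b` and a string `c`). It is not Spark's actual unsafe row format and not Kryo: Python's `struct` and `pickle` modules stand in for the two approaches, and `encode_struct` and `decode_struct` are hypothetical names introduced here.

```python
import pickle
import struct

# Assumed layout for this sketch: 8-byte little-endian long,
# 4-byte little-endian length, then the UTF-8 bytes of the string.
HEADER = "<qi"


def encode_struct(b, c):
    """Pack a (long, string) pair into a flat byte string."""
    data = c.encode("utf-8")
    return struct.pack(HEADER, b, len(data)) + data


def decode_struct(buf, offset=0):
    """Read the pair back directly from the buffer at the given offset."""
    b, n = struct.unpack_from(HEADER, buf, offset)
    start = offset + struct.calcsize(HEADER)
    return b, buf[start:start + n].decode("utf-8")


value = (42, "42")
flat = encode_struct(*value)    # fixed layout, fields read at known offsets
generic = pickle.dumps(value)   # generic serializer, self-describing stream

# Both approaches round-trip the value; they differ in how it is laid out.
assert decode_struct(flat) == value
assert pickle.loads(generic) == value
```

With a fixed layout, a reader can pull each field from a known offset without rebuilding an object graph, which is the general idea behind replacing a generic serializer with a projection into a binary row format.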
Changed files:
- sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java (0 additions, 1 deletion)
- sql/catalyst/src/main/scala/org/apache/spark/sql/types/ArrayBasedMapData.scala (5 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnAccessor.scala (31 additions, 7 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnBuilder.scala (28 additions, 12 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnStats.scala (7 additions, 4 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala (190 additions, 29 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/columnar/ColumnStatsSuite.scala (4 additions, 5 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/columnar/ColumnTypeSuite.scala (51 additions, 184 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/columnar/ColumnarTestUtils.scala (12 additions, 6 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/columnar/InMemoryColumnarQuerySuite.scala (4 additions, 4 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/columnar/NullableColumnAccessorSuite.scala (9 additions, 4 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/columnar/NullableColumnBuilderSuite.scala (11 additions, 10 deletions)