[SPARK-18980][SQL] implement Aggregator with TypedImperativeAggregate
## What changes were proposed in this pull request?

Currently we implement `Aggregator` with `DeclarativeAggregate`, which serializes and deserializes the buffer object every time we process an input.

This PR implements `Aggregator` with `TypedImperativeAggregate`, which avoids serializing and deserializing the buffer object repeatedly. The benchmark shows about a 2x speedup.

For simple buffer objects that don't need serialization, we still use `DeclarativeAggregate` to avoid a performance regression.

## How was this patch tested?

N/A

Author: Wenchen Fan <wenchen@databricks.com>

Closes #16383 from cloud-fan/aggregator.
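To illustrate the difference between the two buffer-handling strategies described in this PR, here is a minimal standalone sketch. It does not use Spark and is not the code in this patch: `SumCount`, `encode`, `decode`, `perInputRoundTrip`, and `serializeOnce` are hypothetical names, and a plain `ByteBuffer` encoding stands in for Spark's encoders. It only shows why keeping the buffer as a JVM object between inputs reduces the number of serialization round trips.

```scala
import java.nio.ByteBuffer

// Hypothetical buffer type standing in for a user-defined Aggregator buffer.
final case class SumCount(sum: Long, count: Long)

object BufferSerializationSketch {
  // Counters so the two strategies can be compared.
  var encodeCalls = 0
  var decodeCalls = 0

  def reset(): Unit = { encodeCalls = 0; decodeCalls = 0 }

  // Assumption: a fixed 16-byte layout replaces Spark's real encoders.
  def encode(b: SumCount): Array[Byte] = {
    encodeCalls += 1
    ByteBuffer.allocate(16).putLong(b.sum).putLong(b.count).array()
  }

  def decode(bytes: Array[Byte]): SumCount = {
    decodeCalls += 1
    val buf = ByteBuffer.wrap(bytes)
    SumCount(buf.getLong(), buf.getLong())
  }

  // The per-input update logic, analogous to Aggregator.reduce.
  def reduce(b: SumCount, x: Long): SumCount = SumCount(b.sum + x, b.count + 1)

  // Simplified model of the old behaviour: the buffer is stored in
  // serialized form, so every input costs one decode and one encode.
  def perInputRoundTrip(inputs: Seq[Long]): Array[Byte] =
    inputs.foldLeft(encode(SumCount(0L, 0L))) { (bytes, x) =>
      encode(reduce(decode(bytes), x))
    }

  // Simplified model of the new behaviour: the buffer stays an object
  // while inputs are processed and is encoded once at the end.
  def serializeOnce(inputs: Seq[Long]): Array[Byte] =
    encode(inputs.foldLeft(SumCount(0L, 0L))(reduce))
}
```

In this sketch, processing N inputs with `perInputRoundTrip` performs N + 1 encodes and N decodes, while `serializeOnce` performs a single encode. Both produce the same bytes.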
Showing 10 changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala (4 additions, 2 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala (4 additions, 2 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala (8 additions, 2 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala (14 additions, 9 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/Column.scala (4 additions, 4 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TypedAggregateExpression.scala (160 additions, 25 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/DatasetBenchmark.scala (6 additions, 6 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/TypedImperativeAggregateSuite.scala (4 additions, 2 deletions)
- sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala (4 additions, 2 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/TestingTypedCount.scala (4 additions, 2 deletions)