-
- Downloads
[SPARK-10988] [SQL] Reduce duplication in Aggregate2's expression rewriting logic
In `aggregate/utils.scala`, there is a substantial amount of duplication in the expression-rewriting logic. As a prerequisite to supporting imperative aggregate functions in `TungstenAggregate`, this patch refactors this file so that the same expression-rewriting logic is used for both `SortAggregate` and `TungstenAggregate`. In order to allow both operators to use the same rewriting logic, `TungstenAggregationIterator. generateResultProjection()` has been updated so that it first evaluates all declarative aggregate functions' `evaluateExpression`s and writes the results into a temporary buffer, and then uses this temporary buffer and the grouping expressions to evaluate the final resultExpressions. This matches the logic in SortAggregateIterator, where this two-pass approach is necessary in order to support imperative aggregates. If this change turns out to cause performance regressions, then we can look into re-implementing the single-pass evaluation in a cleaner way as part of a followup patch. Since the rewriting logic is now shared across both operators, this patch also extracts that logic and places it in `SparkStrategies`. This makes the rewriting logic a bit easier to follow, I think. Author: Josh Rosen <joshrosen@databricks.com> Closes #9015 from JoshRosen/SPARK-10988.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala 52 additions, 15 deletions...cala/org/apache/spark/sql/execution/SparkStrategies.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala 4 additions, 0 deletions...che/spark/sql/execution/aggregate/TungstenAggregate.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala 18 additions, 4 deletions...sql/execution/aggregate/TungstenAggregationIterator.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala 68 additions, 176 deletions...cala/org/apache/spark/sql/execution/aggregate/utils.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIteratorSuite.scala 1 addition, 1 deletion...xecution/aggregate/TungstenAggregationIteratorSuite.scala
Loading
Please register or sign in to comment