[SPARK-18969][SQL] Support grouping by nondeterministic expressions
## What changes were proposed in this pull request?

Currently, nondeterministic expressions are allowed in `Aggregate` (see the [comment](https://github.com/apache/spark/blob/v2.0.2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L249-L251)), but the `PullOutNondeterministic` analyzer rule fails to handle `Aggregate`. This PR fixes that.

Closes https://github.com/apache/spark/pull/16379

One issue remains: `SELECT a + rand() FROM t GROUP BY a + rand()` is still not allowed, because the two `rand()` calls are different expressions (a random seed is generated as the default seed for each `rand()`). https://issues.apache.org/jira/browse/SPARK-19035 tracks this issue.

## How was this patch tested?

A new test suite.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #16404 from cloud-fan/groupby.

(cherry picked from commit 871d2666)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
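
To illustrate the idea behind pulling a nondeterministic grouping expression out of an `Aggregate`, here is a minimal, self-contained sketch. It does not use Catalyst: the `Expr` and `Plan` case classes, the `PullOutSketch` object, and the `_nondeterministic_N` alias names are all hypothetical stand-ins. It also simplifies by aliasing each whole nondeterministic grouping expression, which may differ from what the actual rule in `Analyzer.scala` does.

```scala
// Toy model only: hypothetical stand-ins, not Catalyst's real classes.
sealed trait Expr { def deterministic: Boolean }
case class Attr(name: String) extends Expr { val deterministic = true }
case class Rand(seed: Long) extends Expr { val deterministic = false }
case class Add(left: Expr, right: Expr) extends Expr {
  // An expression is deterministic only if all of its children are.
  val deterministic: Boolean = left.deterministic && right.deterministic
}
case class Alias(child: Expr, name: String) extends Expr {
  val deterministic: Boolean = child.deterministic
}

sealed trait Plan
case class Relation(output: Seq[Attr]) extends Plan
case class Project(projectList: Seq[Expr], child: Plan) extends Plan
case class Aggregate(grouping: Seq[Expr], child: Plan) extends Plan

object PullOutSketch {
  // Rewrites Aggregate(grouping, relation) so that every nondeterministic
  // grouping expression is evaluated once, in a Project below the Aggregate,
  // and the Aggregate groups by a plain attribute that refers to it.
  def pullOut(plan: Plan): Plan = plan match {
    case Aggregate(grouping, child: Relation) if grouping.exists(!_.deterministic) =>
      val nondet: Seq[Expr] = grouping.filterNot(_.deterministic).distinct
      // Hypothetical alias naming, chosen for this sketch only.
      val aliases: Map[Expr, Alias] = nondet.zipWithIndex.map {
        case (e, i) => e -> Alias(e, s"_nondeterministic_$i")
      }.toMap
      val newGrouping: Seq[Expr] =
        grouping.map(e => aliases.get(e).map(a => Attr(a.name)).getOrElse(e))
      Aggregate(newGrouping, Project(child.output ++ nondet.map(aliases), child))
    case other => other // plans without nondeterministic grouping are left as-is
  }
}
```

In this toy model, grouping by `Add(Attr("a"), Rand(42L))` becomes grouping by an attribute produced by the inserted `Project`, so the nondeterministic expression is evaluated in exactly one place.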
Changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (23 additions, 14 deletions)
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/PullOutNondeterministicSuite.scala (56 additions, 0 deletions)
- sql/core/src/test/resources/sql-tests/results/group-by-ordinal.sql.out (7 additions, 3 deletions)