[SPARK-13657] [SQL] Support parsing very long AND/OR expressions
## What changes were proposed in this pull request?

To avoid a stack overflow when parsing an expression with hundreds of ORs, we should use a loop instead of recursive functions to flatten the tree into a list. This PR also builds a balanced tree to reduce the depth of the generated And/Or expression, which avoids stack overflows in the analyzer and optimizer.

## How was this patch tested?

Added new unit tests. Manually tested with TPCDS Q3 with hundreds of predicates in it [1]. These predicates help reduce the number of partitions, and the query time went from 60 seconds to 8 seconds.

[1] https://github.com/cloudera/impala-tpcds-kit/blob/master/queries/q3.sql

Author: Davies Liu <davies@databricks.com>

Closes #11501 from davies/long_or.
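
The two ideas in the description (flatten with a loop, then rebuild a balanced tree) can be sketched roughly as follows. This is a minimal illustration and not the code in this patch: `Expr`, `Pred`, `Or`, and the `BalancedOr` helper are hypothetical stand-ins, not Catalyst's expression or AST classes.

```scala
// Hypothetical mini expression tree; these are NOT Spark's Catalyst classes.
sealed trait Expr
final case class Pred(name: String) extends Expr
final case class Or(left: Expr, right: Expr) extends Expr

object BalancedOr {
  // Flatten nested Or nodes into their operands using an explicit stack
  // instead of recursion, so a very deep tree cannot overflow the call stack.
  def flatten(root: Expr): Vector[Expr] = {
    val out = Vector.newBuilder[Expr]
    var stack: List[Expr] = List(root)
    while (stack.nonEmpty) {
      val head = stack.head
      stack = stack.tail
      head match {
        case Or(l, r) => stack = l :: r :: stack // left operand is visited first
        case leaf     => out += leaf
      }
    }
    out.result()
  }

  // Rebuild the operands in [lo, hi) as a balanced tree by splitting the
  // range in half; the recursion depth here is only O(log n).
  def balance(xs: IndexedSeq[Expr], lo: Int, hi: Int): Expr = {
    require(lo < hi, "need at least one operand")
    if (hi - lo == 1) xs(lo)
    else {
      val mid = lo + (hi - lo) / 2
      Or(balance(xs, lo, mid), balance(xs, mid, hi))
    }
  }

  // Flatten, then rebalance.
  def rebalance(root: Expr): Expr = {
    val operands = flatten(root)
    balance(operands, 0, operands.length)
  }

  // Number of Or levels above the deepest leaf. Recursive, so only call it
  // on trees that are already known to be shallow.
  def depth(e: Expr): Int = e match {
    case Or(l, r) => 1 + math.max(depth(l), depth(r))
    case _        => 0
  }
}
```

In this sketch, a left-deep chain of n predicates has depth n - 1, while the rebalanced tree has depth of about log2(n), which is what keeps later recursive passes over the tree within the stack limit.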
Showing 2 changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/CatalystQl.scala (40 additions, 2 deletions)
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/CatalystQlSuite.scala (11 additions, 0 deletions)