Commit c8da5356 authored by wangzhenhua, committed by Wenchen Fan

[SPARK-20718][SQL] FileSourceScanExec with different filter orders should be the same after canonicalization

## What changes were proposed in this pull request?

Because `constraints` in `QueryPlan` is a set, the order in which filters are produced can vary from run to run. Usually this is fine, since canonicalization makes such plans compare equal. However, `FileSourceScanExec` keeps its data filters and partition filters as sequences, and their order is not canonicalized. As a result, `sameResult` returns different answers for semantically identical plans whose filters merely appear in different orders. This changes downstream decisions such as whether `ReuseExchange` fires, and thus leads to unstable performance.
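To make the failure mode concrete, here is a minimal sketch (assuming a Spark 2.2-era Catalyst on the classpath; the object name, attribute names, and literals are illustrative) showing that `Seq` comparison is positional, while a single canonicalized `And` tree is order-insensitive:

```scala
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.types.IntegerType

object FilterOrderDemo {
  def main(args: Array[String]): Unit = {
    val a = AttributeReference("a", IntegerType)()
    val c = AttributeReference("c", IntegerType)()
    // The same two conjuncts, in different orders.
    val f1: Seq[Expression] = Seq(GreaterThan(a, Literal(1)), LessThan(c, Literal(9)))
    val f2: Seq[Expression] = Seq(LessThan(c, Literal(9)), GreaterThan(a, Literal(1)))

    // Seq equality is positional, so a field-by-field plan comparison
    // sees two "different" scans here.
    assert(f1 != f2)

    // Folding the conjuncts into one And and canonicalizing it makes the
    // comparison order-insensitive: canonicalization reorders commutative
    // operators deterministically.
    assert(f1.reduce(And).canonicalized == f2.reduce(And).canonicalized)
  }
}
```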

## How was this patch tested?

Added a new test for `FileSourceScanExec.sameResult`.

Author: wangzhenhua <wangzhenhua@huawei.com>

Closes #17959 from wzhfy/canonicalizeFileSourceScanExec.
parent 2b36eb69
@@ -38,7 +38,7 @@ import org.apache.spark.sql.sources.BaseRelation
 import org.apache.spark.sql.types.StructType
 import org.apache.spark.util.Utils
 
-trait DataSourceScanExec extends LeafExecNode with CodegenSupport {
+trait DataSourceScanExec extends LeafExecNode with CodegenSupport with PredicateHelper {
   val relation: BaseRelation
   val metastoreTableIdentifier: Option[TableIdentifier]
@@ -519,8 +519,18 @@ case class FileSourceScanExec(
       relation,
       output.map(QueryPlan.normalizeExprId(_, output)),
       requiredSchema,
-      partitionFilters.map(QueryPlan.normalizeExprId(_, output)),
-      dataFilters.map(QueryPlan.normalizeExprId(_, output)),
+      canonicalizeFilters(partitionFilters, output),
+      canonicalizeFilters(dataFilters, output),
       None)
   }
+
+  private def canonicalizeFilters(filters: Seq[Expression], output: Seq[Attribute])
+    : Seq[Expression] = {
+    if (filters.nonEmpty) {
+      val normalizedFilters = QueryPlan.normalizeExprId(filters.reduce(And), output)
+      splitConjunctivePredicates(normalizedFilters)
+    } else {
+      Nil
+    }
+  }
 }
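For reference, a hedged, self-contained sketch of what `canonicalizeFilters` achieves. It uses `Expression.canonicalized` directly instead of `QueryPlan.normalizeExprId` (which additionally rewrites exprIds against `output`), so the demo object and its `canonicalOrder` helper are illustrative, not the patch itself. Note that mixing in `PredicateHelper` (the trait change in the first hunk) is what supplies `splitConjunctivePredicates`, the inverse of `reduce(And)`:

```scala
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.types.IntegerType

object CanonicalizeFiltersDemo extends PredicateHelper {
  // Mirrors canonicalizeFilters above, minus the exprId normalization:
  // fold into one And, canonicalize (which orders commutative children
  // deterministically), then split back into a stable sequence.
  def canonicalOrder(filters: Seq[Expression]): Seq[Expression] =
    if (filters.nonEmpty) {
      splitConjunctivePredicates(filters.reduce(And).canonicalized)
    } else {
      Nil
    }

  def main(args: Array[String]): Unit = {
    val a = AttributeReference("a", IntegerType)()
    val c = AttributeReference("c", IntegerType)()
    val p1 = GreaterThan(a, Literal(1))
    val p2 = LessThan(c, Literal(9))
    // Both input orders collapse to the same canonical sequence.
    assert(canonicalOrder(Seq(p1, p2)) == canonicalOrder(Seq(p2, p1)))
  }
}
```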
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.spark.sql.execution

import org.apache.spark.sql.{DataFrame, QueryTest}
import org.apache.spark.sql.test.SharedSQLContext

/**
 * Tests for the sameResult function for [[SparkPlan]]s.
 */
class SameResultSuite extends QueryTest with SharedSQLContext {

  test("FileSourceScanExec: different orders of data filters and partition filters") {
    withTempPath { path =>
      val tmpDir = path.getCanonicalPath
      spark.range(10)
        .selectExpr("id as a", "id + 1 as b", "id + 2 as c", "id + 3 as d")
        .write
        .partitionBy("a", "b")
        .parquet(tmpDir)
      val df = spark.read.parquet(tmpDir)
      // partition filters: a > 1 AND b < 9
      // data filters: c > 1 AND d < 9
      val plan1 = getFileSourceScanExec(df.where("a > 1 AND b < 9 AND c > 1 AND d < 9"))
      val plan2 = getFileSourceScanExec(df.where("b < 9 AND a > 1 AND d < 9 AND c > 1"))
      assert(plan1.sameResult(plan2))
    }
  }

  private def getFileSourceScanExec(df: DataFrame): FileSourceScanExec = {
    df.queryExecution.sparkPlan.find(_.isInstanceOf[FileSourceScanExec]).get
      .asInstanceOf[FileSourceScanExec]
  }
}
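To run just this suite, an invocation along the lines of `build/sbt "sql/testOnly *SameResultSuite"` should work with Spark's standard sbt build (the exact command depends on the Spark version's build setup).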