Skip to content
Snippets Groups Projects
user avatar
Josh Rosen authored
## What changes were proposed in this pull request?

CollectLimit.execute() incorrectly omits per-partition limits, leading to performance regressions in case this case is hit (which should not happen in normal operation, but can occur in some cases (see #15068 for one example).

## How was this patch tested?

Regression test in SQLQuerySuite that asserts the number of records scanned from the input RDD.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #15070 from JoshRosen/SPARK-17515.
3f6a2bb3
History
Name Last commit Last update