-
- Downloads
[SPARK-14092] [SQL] move shouldStop() to end of while loop
## What changes were proposed in this pull request? This PR rollback some changes in #11274 , which introduced some performance regression when do a simple aggregation on parquet scan with one integer column. Does not really understand how this change introduce this huge impact, maybe related show JIT compiler inline functions. (saw very different stats from profiling). ## How was this patch tested? Manually run the parquet reader benchmark, before this change: ``` Intel(R) Core(TM) i7-4558U CPU 2.80GHz Int and String Scan: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------- SQL Parquet Vectorized 2391 / 3107 43.9 22.8 1.0X ``` After this change ``` Java HotSpot(TM) 64-Bit Server VM 1.7.0_60-b19 on Mac OS X 10.9.5 Intel(R) Core(TM) i7-4558U CPU 2.80GHz Int and String Scan: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------- SQL Parquet Vectorized 2032 / 2626 51.6 19.4 1.0X``` Author: Davies Liu <davies@databricks.com> Closes #11912 from davies/fix_regression.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala 5 additions, 3 deletions...in/scala/org/apache/spark/sql/execution/ExistingRDD.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala 5 additions, 3 deletions...la/org/apache/spark/sql/execution/WholeStageCodegen.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala 2 additions, 1 deletion...scala/org/apache/spark/sql/execution/basicOperators.scala
Please register or sign in to comment