[SPARK-18016][SQL][CATALYST] Code Generation: Constant Pool Limit - Class Splitting
## What changes were proposed in this pull request?

This pull-request exclusively includes the class splitting feature described in #16648. When code for a given class would grow beyond 1600k bytes, a private, nested sub-class is generated into which subsequent functions are inlined. Additional sub-classes are generated as the code threshold is met subsequent times. This code includes 3 changes:

1. Includes helper maps, lists, and functions for keeping track of sub-classes during code generation (included in the `CodeGenerator` class). These helper functions allow nested classes and split functions to be initialized/declared/inlined to the appropriate locations in the various projection classes.
2. Changes `addNewFunction` to return a string to support instances where a split function is inlined to a nested class and not the outer class (and so must be invoked using the class-qualified name). Uses of `addNewFunction` throughout the codebase are modified so that the returned name is properly used.
3. Removes instances of the `this` keyword when used on data inside generated classes. All state declared in the outer class is by default global and accessible to the nested classes. However, if a reference to global state in a nested class is prepended with the `this` keyword, it would attempt to reference state belonging to the nested class (which would not exist), rather than the correct variable belonging to the outer class.

## How was this patch tested?

Added a test case to the `GeneratedProjectionSuite` that increases the number of columns tested in various projections to a threshold that would previously have triggered a `JaninoRuntimeException` for the Constant Pool.

Note: This PR does not address the second Constant Pool issue with code generation (also mentioned in #16648): excess global mutable state. A second PR may be opened to resolve that issue.

Author: ALeksander Eskilson <alek.eskilson@cerner.com>

Closes #18075 from bdrillard/class_splitting_only.
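
To illustrate change 2 above, here is a minimal, self-contained sketch of the bookkeeping idea: functions accumulate in the current class until its size passes a threshold, after which a new nested class is opened and `addNewFunction` returns an instance-qualified name. This is not the code from the patch. The class `ClassSplittingSketch`, the names `NestedClass<N>` and `nestedClassInstance<N>`, the helper `filler`, and the character-count size measure are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the CodeGenerator implementation from the patch.
class ClassSplittingSketch {
    private static final String OUTER = "OuterClass";

    private final int threshold;  // assumed size limit, in characters of source
    private final Map<String, List<String>> functionsByClass = new LinkedHashMap<>();
    private final Map<String, Integer> sizeByClass = new LinkedHashMap<>();
    private String currentClass = OUTER;
    private int nestedCount = 0;

    ClassSplittingSketch(int threshold) {
        this.threshold = threshold;
        functionsByClass.put(OUTER, new ArrayList<>());
        sizeByClass.put(OUTER, 0);
    }

    // Registers a function and returns the name callers must use to invoke it.
    String addNewFunction(String funcName, String funcCode) {
        // Once the current class has outgrown the threshold, open a new nested class.
        if (sizeByClass.get(currentClass) > threshold) {
            nestedCount++;
            currentClass = "NestedClass" + nestedCount;
            functionsByClass.put(currentClass, new ArrayList<>());
            sizeByClass.put(currentClass, 0);
        }
        functionsByClass.get(currentClass).add(funcCode);
        sizeByClass.merge(currentClass, funcCode.length(), Integer::sum);

        // Functions in the outer class are called by bare name; functions in a
        // nested class are called through that class's instance.
        return currentClass.equals(OUTER)
            ? funcName
            : "nestedClassInstance" + nestedCount + "." + funcName;
    }

    Map<String, List<String>> functionsByClass() {
        return functionsByClass;
    }

    // Builds a placeholder function body of exactly n characters.
    static String filler(int n) {
        return new String(new char[n]).replace('\0', 'x');
    }

    public static void main(String[] args) {
        ClassSplittingSketch gen = new ClassSplittingSketch(40);
        for (int i = 1; i <= 5; i++) {
            System.out.println(gen.addNewFunction("f" + i, filler(30)));
        }
    }
}
```

The sketch shows why call sites have to use the returned string rather than the name they passed in: once a function lands in a nested class, the bare name no longer resolves from the outer class.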
Showing 20 changed files:

- sql/catalyst/pom.xml (7 additions, 0 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala (3 additions, 3 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala (117 additions, 23 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala (10 additions, 7 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala (3 additions, 0 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratePredicate.scala (3 additions, 0 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateSafeProjection.scala (6 additions, 3 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala (6 additions, 3 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala (3 additions, 3 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala (2 additions, 2 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala (3 additions, 3 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala (1 addition, 1 deletion)
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratedProjectionSuite.scala (61 additions, 11 deletions)
- sql/core/pom.xml (7 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/ColumnarBatchScan.scala (3 additions, 3 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/SortExec.scala (2 additions, 2 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala (3 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala (4 additions, 4 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala (6 additions, 5 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/GenerateColumnAccessor.scala (7 additions, 6 deletions)