[SPARK-12813][SQL] Eliminate serialization for back to back operations
The goal of this PR is to eliminate unnecessary translations when there are back-to-back `MapPartitions` operations. In order to achieve this I also made the following simplifications:

- Operators no longer hold encoders; instead they have only the expressions that they need. The benefits here are twofold: the expressions are visible to transformations, so they go through the normal resolution/binding process, and now that they are visible we can change them on a case-by-case basis.
- Operators no longer have type parameters. Since the engine is responsible for its own type checking, having the types visible to the compiler was an unnecessary complication. We still leverage the Scala compiler in the companion factory when constructing a new operator, but after this the types are discarded.

Deferred to a follow-up PR:

- Remove as much of the resolution/binding from Dataset/GroupedDataset as possible. We should still eagerly check resolution and throw an error, though, in the case of mismatches for an `as` operation.
- Eliminate serializations in more cases by adding more cases to `EliminateSerialization`.

Author: Michael Armbrust <michael@databricks.com>

Closes #10747 from marmbrus/encoderExpressions.
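To illustrate the idea behind eliminating back-to-back translations, here is a minimal toy sketch. It is not Spark's actual `EliminateSerialization` rule or its operator classes: the `Plan` node types, the string-based type tags, and the `optimize` function are all hypothetical names invented for this example. It only shows the general shape of the rewrite, in which a deserialize step sitting directly on top of a serialize step of the same type is dropped so that objects flow straight from one function to the next.

```scala
object EliminateSerializationSketch {
  // Hypothetical toy plan nodes; these are NOT Spark's Catalyst classes.
  sealed trait Plan
  // A leaf that produces rows in the engine's internal format.
  case class Source(name: String) extends Plan
  // Converts objects of type `tpe` into internal rows.
  case class Serialize(tpe: String, child: Plan) extends Plan
  // Converts internal rows into objects of type `tpe`.
  case class Deserialize(tpe: String, child: Plan) extends Plan
  // Applies a user function (identified by name here) to objects.
  case class MapObjects(fn: String, child: Plan) extends Plan

  // Removes a Deserialize placed directly over a Serialize of the same type.
  // When the types differ, the round trip is kept because it may be a real conversion.
  def optimize(plan: Plan): Plan = plan match {
    case Deserialize(t1, Serialize(t2, child)) if t1 == t2 => optimize(child)
    case Deserialize(t, child) => Deserialize(t, optimize(child))
    case Serialize(t, child)   => Serialize(t, optimize(child))
    case MapObjects(fn, child) => MapObjects(fn, optimize(child))
    case s: Source             => s
  }

  // Toy plan for two chained map operations, f then g, both over "Int" objects.
  val backToBack: Plan =
    Serialize("Int",
      MapObjects("g",
        Deserialize("Int",
          Serialize("Int",
            MapObjects("f",
              Deserialize("Int", Source("t")))))))
}
```

In this toy model, `optimize(backToBack)` keeps only the outermost `Serialize` and the innermost `Deserialize`, so `g` consumes the objects produced by `f` directly. The equality guard on the type tags is a stand-in for whatever type check a real rule would perform.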
Changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: 4 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala: 5 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: 10 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala: 3 additions, 1 deletion
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala: 6 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects.scala: 4 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: 15 additions, 1 deletion
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala: 0 additions, 119 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala: 185 additions, 0 deletions
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/EliminateSerializationSuite.scala: 76 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala: 2 additions, 7 deletions
- sql/core/src/main/scala/org/apache/spark/sql/GroupedDataset.scala: 1 addition, 5 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala: 9 additions, 10 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala: 0 additions, 127 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala: 182 additions, 0 deletions
- sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala: 12 additions, 0 deletions
- sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala: 4 additions, 4 deletions