Commit 2837bf85 authored by Michael Armbrust

[SPARK-3798][SQL] Store the output of a generator in a val

This prevents it from changing during serialization, which led to corrupted results.

Author: Michael Armbrust <michael@databricks.com>

Closes #2656 from marmbrus/generateBug and squashes the following commits:

efa32eb [Michael Armbrust] Store the output of a generator in a val. This prevents it from changing during serialization.
parent 4e9b551a
@@ -39,7 +39,8 @@ case class Generate(
     child: SparkPlan)
   extends UnaryNode {
-  protected def generatorOutput: Seq[Attribute] = {
+  // This must be a val since the generator output expr ids are not preserved by serialization.
+  protected val generatorOutput: Seq[Attribute] = {
     if (join && outer) {
       generator.output.map(_.withNullability(true))
     } else {
@@ -62,7 +63,7 @@ case class Generate(
           newProjection(child.output ++ nullValues, child.output)
         val joinProjection =
-          newProjection(child.output ++ generator.output, child.output ++ generator.output)
+          newProjection(child.output ++ generatorOutput, child.output ++ generatorOutput)
         val joinedRow = new JoinedRow
         iter.flatMap {row =>
......
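
The difference between a `def` and a `val` here can be illustrated with a toy sketch. It assumes that recomputing a generator's output produces attributes with fresh ids; it does not reproduce Spark's serialization path. `Attr`, `IdSource`, `ToyGenerator`, `WithDef`, and `WithVal` are hypothetical stand-ins, not Spark classes.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for an attribute that carries a unique expression id.
final case class Attr(name: String, exprId: Long)

// Hypothetical id source: every call hands out a new id.
object IdSource {
  private val counter = new AtomicLong(0L)
  def fresh(): Long = counter.getAndIncrement()
}

// Hypothetical generator whose output is rebuilt, with new ids, on every call.
class ToyGenerator {
  def output: Seq[Attr] = Seq(Attr("col", IdSource.fresh()))
}

// With a def, the body runs again on each access, so the ids can drift.
class WithDef(g: ToyGenerator) {
  def generatorOutput: Seq[Attr] = g.output
}

// With a val, the body runs once at construction and the result is stored.
class WithVal(g: ToyGenerator) {
  val generatorOutput: Seq[Attr] = g.output
}

object Demo {
  def main(args: Array[String]): Unit = {
    val g = new ToyGenerator
    val d = new WithDef(g)
    val v = new WithVal(g)
    println(d.generatorOutput == d.generatorOutput) // false: two different ids
    println(v.generatorOutput == v.generatorOutput) // true: one stored value
  }
}
```

In this sketch, any code that binds against one access of `generatorOutput` and evaluates against another sees mismatched ids in the `def` version, which is the kind of inconsistency the stored `val` avoids.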