Commit 80f58510 authored by Ryan Blue, committed by Yin Huai

[SPARK-18368][SQL] Fix regexp replace when serialized


## What changes were proposed in this pull request?

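This makes the `result` value both transient and lazy. Previously it was a plain `@transient val`, so if a `RegExpReplace` object was initialized and then serialized, the deserialized copy was left with a null `result`, because Java serialization skips transient fields and does not rerun the constructor. As a `@transient lazy val`, `result: StringBuffer` is initialized on first access, including the first access after deserialization.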
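
The difference can be reproduced outside Spark. The following is a minimal sketch, not Spark code: `EagerBuffer`, `LazyBuffer`, and `TransientDemo` are hypothetical names introduced only for illustration, and the round trip uses plain `java.io` serialization rather than Spark's `JavaSerializer`.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for the old field declaration.
class EagerBuffer extends Serializable {
  // Assigned once in the constructor; transient fields are not written,
  // and the constructor is not rerun on deserialization.
  @transient private val result: StringBuffer = new StringBuffer
  def isInitialized: Boolean = result != null
}

// Hypothetical stand-in for the fixed field declaration.
class LazyBuffer extends Serializable {
  // Assigned on first access, so a deserialized copy rebuilds it on demand.
  @transient private lazy val result: StringBuffer = new StringBuffer
  def isInitialized: Boolean = result != null
}

object TransientDemo {
  // Serialize an object to bytes and read it back with Java serialization.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    println(roundTrip(new EagerBuffer).isInitialized) // false: field is null after the round trip
    println(roundTrip(new LazyBuffer).isInitialized)  // true: field is created on first access
  }
}
```

In this sketch the eager variant fails only after a round trip, which matches the description above: the object works where it was constructed and breaks where it was deserialized.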

## How was this patch tested?

* Verified that this patch fixed the query that found the bug.
* Added a test case that fails without the fix.

Author: Ryan Blue <blue@apache.org>

Closes #15834 from rdblue/SPARK-18368-fix-regexp-replace.

(cherry picked from commit d4028de9)
Signed-off-by: Yin Huai <yhuai@databricks.com>
parent 626f6d6d
@@ -230,7 +230,7 @@ case class RegExpReplace(subject: Expression, regexp: Expression, rep: Expressio
   @transient private var lastReplacement: String = _
   @transient private var lastReplacementInUTF8: UTF8String = _
   // result buffer write by Matcher
-  @transient private val result: StringBuffer = new StringBuffer
+  @transient private lazy val result: StringBuffer = new StringBuffer
   override def nullSafeEval(s: Any, p: Any, r: Any): Any = {
     if (!p.equals(lastRegex)) {
@@ -17,7 +17,8 @@
 package org.apache.spark.sql.catalyst.expressions
-import org.apache.spark.SparkFunSuite
+import org.apache.spark.{SparkConf, SparkFunSuite}
+import org.apache.spark.serializer.JavaSerializer
 import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.types.StringType
@@ -191,4 +192,17 @@ class RegexpExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     checkEvaluation(StringSplit(s1, s2), null, row3)
   }
+  test("RegExpReplace serialization") {
+    val serializer = new JavaSerializer(new SparkConf()).newInstance
+    val row = create_row("abc", "b", "")
+    val s = 's.string.at(0)
+    val p = 'p.string.at(1)
+    val r = 'r.string.at(2)
+    val expr: RegExpReplace = serializer.deserialize(serializer.serialize(RegExpReplace(s, p, r)))
+    checkEvaluation(expr, "ac", row)
+  }
 }