Commit ec79183a authored by Dongjoon Hyun, committed by Reynold Xin

[SPARK-16340][SQL] Support column arguments for `regexp_replace` Dataset operation

## What changes were proposed in this pull request?

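Currently, the `regexp_replace` function accepts `Column` arguments for the pattern and replacement only in a SQL query. This PR adds the same support to the `Dataset` operation through a new overload, `regexp_replace(e: Column, pattern: Column, replacement: Column)`.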
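
As a rough illustration of the `Column`-based overload, the sketch below applies a per-row pattern and replacement taken from columns `b` and `c`. The object name, helper name, local `SparkSession` settings, and sample rows are assumptions made for this example and are not part of the patch. It needs a Spark version that includes this change (the new overload is marked `@since 2.1.0`).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.regexp_replace

// Hypothetical example object; not part of the patch.
object RegexpReplaceColumnExample {

  // Applies regexp_replace with the pattern and replacement read from columns,
  // so each row can use a different regex and a different replacement string.
  def replaceWithColumns(spark: SparkSession): Seq[String] = {
    import spark.implicits._

    // Illustrative data (an assumption): a = input, b = pattern, c = replacement.
    val df = Seq(
      ("100-200", "(\\d+)-(\\d+)", "300"),
      ("100-200", "(\\d+)", "400")
    ).toDF("a", "b", "c")

    df.select(regexp_replace($"a", $"b", $"c").as("replaced"))
      .collect()
      .map(_.getString(0))
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    // A local single-threaded session, assumed here purely for demonstration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("regexp_replace-column-args")
      .getOrCreate()
    try {
      replaceWithColumns(spark).foreach(println)
    } finally {
      spark.stop()
    }
  }
}
```

With these sample rows, the first row's pattern matches the whole string and the second row's pattern matches each run of digits separately, so the same input can produce different results per row. The existing `regexp_replace(e: Column, pattern: String, replacement: String)` overload remains available when the pattern and replacement are constants.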

## How was this patch tested?

Pass the Jenkins tests with an updated test case.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #14060 from dongjoon-hyun/SPARK-16340.
parent ec18cd0a
......@@ -2193,6 +2193,16 @@ object functions {
RegExpReplace(e.expr, lit(pattern).expr, lit(replacement).expr)
}
/**
* Replace all substrings of the specified string value that match regexp with rep.
*
* @group string_funcs
* @since 2.1.0
*/
def regexp_replace(e: Column, pattern: Column, replacement: Column): Column = withExpr {
RegExpReplace(e.expr, pattern.expr, replacement.expr)
}
/**
* Decodes a BASE64 encoded string column and returns it as a binary column.
* This is the reverse of base64.
......
......@@ -77,8 +77,10 @@ class StringFunctionsSuite extends QueryTest with SharedSQLContext {
checkAnswer(
df.select(
regexp_replace($"a", "(\\d+)", "num"),
regexp_replace($"a", $"b", $"c"),
regexp_extract($"a", "(\\d+)-(\\d+)", 1)),
Row("num-num", "100") :: Row("num-num", "100") :: Row("num-num", "100") :: Nil)
Row("num-num", "300", "100") :: Row("num-num", "400", "100") ::
Row("num-num", "400-400", "100") :: Nil)
// for testing the mutable state of the expression in code gen.
// This is a hack way to enable the codegen, thus the codegen is enable by default,
......