Skip to content
Snippets Groups Projects
Commit 15ef3740 authored by Kousuke Saruta's avatar Kousuke Saruta Committed by Herman van Hovell
Browse files

[SPARK-19334][SQL] Fix the code injection vulnerability related to Generator functions.

## What changes were proposed in this pull request?

Similar to SPARK-15165, codegen is in danger of arbitrary code injection. The root cause is how variable names are created by codegen.
In GenerateExec#codeGenAccessor, a variable name is created like as follows.

```
val value = ctx.freshName(name)
```

The variable `value` is named based on the value of the variable `name` and the value of `name` is from schema given by users so an attacker can attack with queries like as follows.

```
SELECT inline(array(cast(struct(1) AS struct<`=new Object() { {f();} public void f() {throw new RuntimeException("This exception is injected.");} public int x;}.x`:int>)))
```

In the example above, a RuntimeException is thrown but an attacker can replace it with arbitrary code.

## How was this patch tested?

Added a new test case.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #16681 from sarutak/SPARK-19334.
parent cdb691eb
No related branches found
No related tags found
No related merge requests found
...@@ -181,7 +181,14 @@ case class GenerateExec( ...@@ -181,7 +181,14 @@ case class GenerateExec(
val row = codeGenAccessor(ctx, data.value, "col", index, st, nullable, checks) val row = codeGenAccessor(ctx, data.value, "col", index, st, nullable, checks)
val fieldChecks = checks ++ optionalCode(nullable, row.isNull) val fieldChecks = checks ++ optionalCode(nullable, row.isNull)
val columns = st.fields.toSeq.zipWithIndex.map { case (f, i) => val columns = st.fields.toSeq.zipWithIndex.map { case (f, i) =>
codeGenAccessor(ctx, row.value, f.name, i.toString, f.dataType, f.nullable, fieldChecks) codeGenAccessor(
ctx,
row.value,
s"st_col${i}",
i.toString,
f.dataType,
f.nullable,
fieldChecks)
} }
("", row.code, columns) ("", row.code, columns)
...@@ -247,7 +254,7 @@ case class GenerateExec( ...@@ -247,7 +254,7 @@ case class GenerateExec(
val values = e.dataType match { val values = e.dataType match {
case ArrayType(st: StructType, nullable) => case ArrayType(st: StructType, nullable) =>
st.fields.toSeq.zipWithIndex.map { case (f, i) => st.fields.toSeq.zipWithIndex.map { case (f, i) =>
codeGenAccessor(ctx, current, f.name, s"$i", f.dataType, f.nullable, checks) codeGenAccessor(ctx, current, s"st_col${i}", s"$i", f.dataType, f.nullable, checks)
} }
} }
......
...@@ -2548,4 +2548,20 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { ...@@ -2548,4 +2548,20 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
checkAnswer( sql("SELECT * FROM `_tbl`"), Row(1) :: Row(2) :: Row(3) :: Nil) checkAnswer( sql("SELECT * FROM `_tbl`"), Row(1) :: Row(2) :: Row(3) :: Nil)
} }
} }
test("SPARK-19334: check code injection is prevented") {
// The end of comment (*/) should be escaped.
val badQuery =
"""|SELECT inline(array(cast(struct(1) AS
| struct<`=
| new Object() {
| {f();}
| public void f() {throw new RuntimeException("This exception is injected.");}
| public int x;
| }.x
| `:int>)))""".stripMargin.replaceAll("\n", "")
checkAnswer(sql(badQuery), Row(1) :: Nil)
}
} }
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment