Commit d30238f1, authored by actuaryzhang, committed by Felix Cheung

[SPARK-19682][SPARKR] Issue warning (or error) when subset method "[[" takes vector index


## What changes were proposed in this pull request?
The `[[` method is supposed to take a single index and return a column. This differs from base R, where `[[` accepts a vector index. We should check for this and issue a warning or an error when a vector index is supplied, which is quite likely given the behavior in base R.

Currently I issue a warning and use only the first element of the vector index. We could change this to an error if that's better.
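
To illustrate the difference, here is a minimal base R sketch that runs without Spark. `first_index` is a hypothetical helper introduced only for this example; it mirrors the length check added in this patch and is not part of SparkR. The `people` list and `cols` vector are likewise made-up stand-ins for a data frame and its column names.

```r
# Hypothetical helper (not part of SparkR): reduce an index to length one,
# warning when the extra elements are dropped.
first_index <- function(i) {
  if (length(i) > 1) {
    warning("Subset index has length > 1. Only the first index is used.")
    i <- i[1]
  }
  i
}

# Base R: a vector index passed to `[[` on a list is applied recursively,
# so people[[c(2, 1)]] is the same as people[[2]][[1]].
people <- list(name = c("Ann", "Bob"), age = c(30, 25))
base_result <- people[[c(2, 1)]]

# The patched behavior, mimicked on a plain vector of column names:
# only the first index survives, so the first column name is picked.
cols <- c("name", "age")
picked <- suppressWarnings(cols[[first_index(1:2)]])
```

Someone used to the recursive base R semantics would get a single column back from SparkR rather than a nested element, which is why the warning seems worth having.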

## How was this patch tested?
new tests

Author: actuaryzhang <actuaryzhang10@gmail.com>

Closes #17017 from actuaryzhang/sparkRSubsetter.

(cherry picked from commit 7bf09433)
Signed-off-by: Felix Cheung <felixcheung@apache.org>
parent 21afc453
@@ -1800,6 +1800,10 @@ setClassUnion("numericOrcharacter", c("numeric", "character"))
 #' @note [[ since 1.4.0
 setMethod("[[", signature(x = "SparkDataFrame", i = "numericOrcharacter"),
           function(x, i) {
+            if (length(i) > 1) {
+              warning("Subset index has length > 1. Only the first index is used.")
+              i <- i[1]
+            }
             if (is.numeric(i)) {
               cols <- columns(x)
               i <- cols[[i]]
@@ -1813,6 +1817,10 @@ setMethod("[[", signature(x = "SparkDataFrame", i = "numericOrcharacter"),
 #' @note [[<- since 2.1.1
 setMethod("[[<-", signature(x = "SparkDataFrame", i = "numericOrcharacter"),
           function(x, i, value) {
+            if (length(i) > 1) {
+              warning("Subset index has length > 1. Only the first index is used.")
+              i <- i[1]
+            }
             if (is.numeric(i)) {
               cols <- columns(x)
               i <- cols[[i]]
@@ -1007,6 +1007,18 @@ test_that("select operators", {
   expect_is(df[[2]], "Column")
   expect_is(df[["age"]], "Column")
 
+  expect_warning(df[[1:2]],
+                 "Subset index has length > 1. Only the first index is used.")
+  expect_is(suppressWarnings(df[[1:2]]), "Column")
+  expect_warning(df[[c("name", "age")]],
+                 "Subset index has length > 1. Only the first index is used.")
+  expect_is(suppressWarnings(df[[c("name", "age")]]), "Column")
+
+  expect_warning(df[[1:2]] <- df[[1]],
+                 "Subset index has length > 1. Only the first index is used.")
+  expect_warning(df[[c("name", "age")]] <- df[[1]],
+                 "Subset index has length > 1. Only the first index is used.")
+
   expect_is(df[, 1, drop = F], "SparkDataFrame")
   expect_equal(columns(df[, 1, drop = F]), c("name"))
   expect_equal(columns(df[, "age", drop = F]), c("age"))