Update data frame via function doesn't work


Solution 1

test in your function is a copy of the object from your global environment (I'm assuming that's where it is defined). Assignment happens in the current environment unless specified otherwise, so any changes that happen inside the function apply only to the copy inside the function, not the object in your global environment.

And it's good form to pass all necessary objects as arguments to the function.

Personally, I would return(test) at the end of your function and make the assignment outside of the function, but I'm not sure if you can do this in your actual situation.

test.fun <- function (x, test) {
    test[test$v1==x,"v2"] <- 10
    return(test)
}
test <- data.frame(v1=c(rep(1,3),rep(2,3)),v2=0)
(test <- test.fun(1, test))
#  v1 v2
#1  1 10
#2  1 10
#3  1 10
#4  2  0
#5  2  0
#6  2  0

If it is absolutely necessary to modify an object outside your function directly, you need to tell R that you want to assign the local copy of test to the test in the .GlobalEnv.

test.fun <- function (x, test) {
    test[test$v1==x,"v2"] <- 10
    assign('test',test,envir=.GlobalEnv)
    #test <<- test  # This also works, but the above is more explicit.
}
(test.fun(1, test))
#  v1 v2
#1  1 10
#2  1 10
#3  1 10
#4  2  0
#5  2  0
#6  2  0

Using assign or <<- in this fashion is fairly uncommon, though, and many experienced R programmers will recommend against it.

Solution 2

Changing the <- to <<- in your function does the trick as well; see the R manual. Quote from that page:

The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment.
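To illustrate the quoted rule, here is a minimal sketch (the names make_counter, count and increment are hypothetical, not from the question). It shows <<- finding an existing variable in an enclosing function's environment and redefining it there, rather than assigning in the global environment:

```r
# Hypothetical example: a closure that keeps its own counter.
make_counter <- function() {
  count <- 0                 # lives in the environment of this call
  increment <- function() {
    count <<- count + 1      # searches parent environments, finds `count` one level up
    count
  }
  increment
}

counter <- make_counter()
counter()   # 1
counter()   # 2

# `count` was found in the enclosing environment, so no global `count` was created.
exists("count", envir = globalenv(), inherits = FALSE)   # FALSE
```

In the question's code there is no enclosing function, so the search ends at the global environment and the global test is the one that gets modified.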

Your code should then be:

test <- data.frame(v1=c(rep(1,3),rep(2,3)),v2=0) 

test.fun <- function (x) {
  test[test$v1==x,"v2"] <<- 10
  print(test)
}

test.fun(1)

Solution 3

It is good practice to not change global variables in functions, because this may have undesirable side effects. To avoid this in R, any changes to objects inside a function actually only change copies that are local to that function's environment.

If you really want to change test, you have to assign the return value of the function to test (it would be better to write the function with a more explicit return value):

 test <- test.fun(1)

Or choose the global environment to assign to within test.fun,

test.fun <- function (x) {
    test[test$v1==x,"v2"] <- 10
    print(test)
    assign("test",test,.GlobalEnv)
}

Solution 4

I think this happens because of the different environments that are evaluated. Your function copies test from the global environment into a temporary local environment (which is created on the function call) and then test is only evaluated (i.e., changed) in this local environment.
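A small sketch can make this visible. The helper show.local below is hypothetical (it is not part of the question); it only checks whether a variable named test exists in the function's own environment before and after the assignment. Strictly speaking, the local copy appears at the moment of assignment, not at the moment of the call:

```r
test <- data.frame(v1 = c(rep(1, 3), rep(2, 3)), v2 = 0)

# Hypothetical helper: reports whether `test` exists in the function's
# own environment before and after the assignment.
show.local <- function(x) {
  before <- exists("test", envir = environment(), inherits = FALSE)
  test[test$v1 == x, "v2"] <- 10   # reads the outer `test`, writes a local one
  after <- exists("test", envir = environment(), inherits = FALSE)
  c(before = before, after = after)
}

res <- show.local(1)
res            # before is FALSE, after is TRUE
sum(test$v2)   # 0: the original data frame is unchanged
```

The local test disappears when the function returns, which is why the change is lost unless it is returned or assigned elsewhere.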

You could overcome this issue by using the super-assignment <<-, but this is NOT recommended and will lead to horrible unforeseen problems (your computer catches a virus, your girlfriend starts to cheat on you,...).

Generally, the solution given by Joshua Ulrich is the way to go for this kind of problem. You pass the original object in and return it. When you call the function, you assign the result back to your original object.

Solution 5

You could write a replacement function. This is a function whose name ends in '<-'; R essentially wraps the call to it in a:

foo = bar(foo)

wrapper. So in your case:

> "setV2<-" = function (x,value,m){x[x$v1==m,"v2"]=value;return(x)}
> test <- data.frame(v1=c(rep(1,3),rep(2,3)),v2=0) 
> setV2(test,1)=10
> test
  v1 v2
1  1 10
2  1 10
3  1 10
4  2  0
5  2  0
6  2  0
> setV2(test,2)=99
> test
  v1 v2
1  1 10
2  1 10
3  1 10
4  2 99
5  2 99
6  2 99

Note that you have to quote the function name on creation, or R gets confused.
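To make the wrapping concrete, the sketch below compares the replacement-call syntax with an explicit call to the same function. The variable names a and b are only for illustration, and the explicit call is an approximation of what R evaluates (internally R goes through a temporary variable):

```r
# Same definition as above, written with `<-` and an explicit body.
"setV2<-" <- function(x, value, m) {
  x[x$v1 == m, "v2"] <- value
  x
}

a <- data.frame(v1 = c(rep(1, 3), rep(2, 3)), v2 = 0)
b <- a

setV2(a, 1) <- 10                  # replacement-call syntax
b <- `setV2<-`(b, 1, value = 10)   # roughly the call R makes for the line above

identical(a, b)   # TRUE
```

The right-hand side is always passed as the argument named value, so the extra argument 1 is matched to m.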

Author: donodarazao

Updated on January 15, 2020

Comments

  • donodarazao, over 4 years ago

    I ran into a little problem using R…

    In the following data frame

    test <- data.frame(v1=c(rep(1,3),rep(2,3)),v2=0) 
    

    I want to change values for v2 in the rows where v1 is 1.

    test[test$v1==1,"v2"] <- 10
    

    works just fine.

    test
      v1 v2
    1  1 10
    2  1 10
    3  1 10
    4  2  0
    5  2  0
    6  2  0
    

    However, I need to do that in a function.

    test <- data.frame(v1=c(rep(1,3),rep(2,3)),v2=0)
    
    test.fun <- function (x) {
        test[test$v1==x,"v2"] <- 10
        print(test)
    }
    

    Calling the function seems to work.

    test.fun(1)
      v1 v2
    1  1 10
    2  1 10
    3  1 10
    4  2  0
    5  2  0
    6  2  0
    

    However, when I now look at test:

    test
      v1 v2
    1  1  0
    2  1  0
    3  1  0
    4  2  0
    5  2  0
    6  2  0
    

    it didn’t work. Is there a command that tells R to really update the data frame in the function? Thanks a lot for any help!