Writing functions in R, keeping scoping in mind

r scope

20,400

Solution 1

If I know that I'm going to need a function parametrized by some values and called repeatedly, I avoid globals by using a closure:

make.fn2 <- function(a, b) {
    fn2 <- function(x) {
        return( x + a + b )
    }
    return( fn2 )
}

a <- 2; b <- 3
fn2.1 <- make.fn2(a, b)
fn2.1(3)    # 8
fn2.1(4)    # 9

a <- 4
fn2.2 <- make.fn2(a, b)
fn2.2(3)    # 10
fn2.1(3)    # 8

This neatly avoids referencing global variables, instead using the enclosing environment of the function for a and b. Modification of globals a and b doesn't lead to unintended side effects when fn2 instances are called.

Solution 2

There's a reason that some languages don't allow global variables: they can easily lead to broken code.

The scoping rules in R allow you to write code in a lazy fashion - letting functions use variables in other environments can save you some typing, and it's great for playing around in simple cases.

If you are doing anything remotely complicated however, then I recommend that you pass a function all the variables that it needs (or at the very least, have some thorough sanity checking in place to have a fallback in case the variables don't exist).

In the example above:

The best practise is to use fn1.

Alternatively, try something like

 fn3 <- function(x)
   {
      if(!exists("a", envir=.GlobalEnv))
      {
         warning("Variable 'a' does not exist in the global environment")
         a <- 1
      }

      if(!exists("b", envir=.GlobalEnv))
      {
         warning("Variable 'b' does not exist in the global environment")
         b <- 2
      }

      x + a + b
   }

Solution 3

Does the problem come about when you're just using a global variable in a function or when you try to assign the variable? If it's the latter I suspect it's because you're not using <<- as an assignment within the function. And while using <<- appears to be the dark side 1 it may very well work for your purposes. If it is the former, the function is probably masking the global variable.

Naming global variables in a manner that it would be difficult to mask them locally might help. e.g.: global.pimultiples <- 1:4*pi

20,400

Author by

Christopher DuBois

Graduate student at UC Irvine.

Updated on February 08, 2020

Comments

Christopher DuBois about 4 years
I often write functions that need to see other objects in my environment. For example:
```
> a <- 3
> b <- 3
> x <- 1:5
> fn1 <- function(x,a,b) a+b+x
> fn2 <- function(x) a+b+x
> fn1(x,a,b)
[1]  7  8  9 10 11
> fn2(x)
[1]  7  8  9 10 11
```
As expected, both these functions are identical because fn2 can "see" a and b when it executes. But whenever I start to take advantage of this, within about 30 minutes I end up calling the function without one of the necessary variables (e.g. a or b). If I don't take advantage of this, then I feel like I am passing around objects unnecessarily.

Is it better to be explicit about what a function requires? Or should this be taken care of via inline comments or other documentation of the function? Is there a better way?