Writing functions in R, keeping scoping in mind
Solution 1
If I know that I'm going to need a function parametrized by some values and called repeatedly, I avoid globals by using a closure:
make.fn2 <- function(a, b) {
  fn2 <- function(x) {
    return( x + a + b )
  }
  return( fn2 )
}
a <- 2; b <- 3
fn2.1 <- make.fn2(a, b)
fn2.1(3) # 8
fn2.1(4) # 9
a <- 4
fn2.2 <- make.fn2(a, b)
fn2.2(3) # 10
fn2.1(3) # 8
This neatly avoids referencing global variables, instead using the enclosing environment of the function for a and b. Modification of globals a and b doesn't lead to unintended side effects when fn2 instances are called.
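One caveat worth sketching: R evaluates function arguments lazily, so a closure only captures the value of a and b once they are first used. In the code above that happens on the first call to fn2.1, which comes before a is changed. The variant below is a minimal sketch, assuming you want the values fixed at creation time no matter when the first call happens; make.adder and add5 are hypothetical names, not part of the original answer.

```r
# Hypothetical variant of make.fn2 that forces its arguments up front,
# so the closure captures the values as they were when it was created.
make.adder <- function(a, b) {
  force(a)
  force(b)
  function(x) x + a + b
}

a <- 2; b <- 3
add5 <- make.adder(a, b)

# Changing the global before the first call has no effect on add5.
a <- 4
add5(3)                 # 8

# The captured values live in the closure's enclosing environment.
ls(environment(add5))   # "a" "b"
environment(add5)$a     # 2
```

Without the force() calls, changing a before the first call to add5 would make the closure pick up the new value instead.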
Solution 2
There's a reason that some languages don't allow global variables: they can easily lead to broken code.
The scoping rules in R allow you to write code in a lazy fashion - letting functions use variables in other environments can save you some typing, and it's great for playing around in simple cases.
If you are doing anything remotely complicated, however, I recommend that you pass a function all the variables that it needs (or, at the very least, have some thorough sanity checking in place with a fallback in case the variables don't exist).
In the question's example, the best practice is to use fn1, which takes a and b as explicit arguments.
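To illustrate what explicit arguments buy you, here is a rough sketch. The name fn1.checked and the stopifnot() checks are my own additions for illustration, not part of the question. The point is that a forgotten argument fails right away with an error, instead of silently picking up whatever a or b happens to be lying around in the workspace.

```r
# Hypothetical version of fn1 with explicit arguments and basic checks.
fn1.checked <- function(x, a, b) {
  stopifnot(is.numeric(x), is.numeric(a), is.numeric(b))
  x + a + b
}

fn1.checked(1:5, a = 3, b = 3)   # 7 8 9 10 11

# Leaving out b is an error, even if a global b exists,
# because the function never looks outside its own arguments.
b <- 100
result <- tryCatch(
  fn1.checked(1:5, a = 3),
  error = function(e) conditionMessage(e)
)
is.character(result)             # TRUE: the call failed and we got a message
```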
Alternatively, try something like
fn3 <- function(x)
{
  if(!exists("a", envir=.GlobalEnv))
  {
    warning("Variable 'a' does not exist in the global environment")
    a <- 1
  }
  if(!exists("b", envir=.GlobalEnv))
  {
    warning("Variable 'b' does not exist in the global environment")
    b <- 2
  }
  x + a + b
}
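The same fallback idea can be factored into a small helper so the check is not repeated for every variable. This is only a sketch under my own assumptions: value.or.default and fn4 are hypothetical names, and the env argument is added so the lookup can be pointed at something other than the global environment (handy for trying it out). It uses inherits = FALSE so that only the given environment is searched, not its parents.

```r
# Hypothetical helper: return the value of `name` from `env` (that
# environment only), or warn and return `default` if it is missing.
value.or.default <- function(name, default, env = globalenv()) {
  if (exists(name, envir = env, inherits = FALSE)) {
    get(name, envir = env, inherits = FALSE)
  } else {
    warning("Variable '", name, "' not found; using default ", default)
    default
  }
}

# Hypothetical counterpart of fn3 built on the helper.
fn4 <- function(x, env = globalenv()) {
  x + value.or.default("a", 1, env) + value.or.default("b", 2, env)
}

# Try it against a scratch environment that defines a but not b.
e <- new.env()
assign("a", 10, envir = e)
suppressWarnings(fn4(1, env = e))   # 1 + 10 + 2 = 13
```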
Solution 3
Does the problem come about when you're just using a global variable in a function, or when you try to assign to the variable? If it's the latter, I suspect it's because you're not using <<- as the assignment operator within the function. And while using <<- appears to be the dark side, it may very well work for your purposes. If it is the former, the function is probably masking the global variable.
Naming global variables in a way that makes them hard to mask locally might help, e.g.: global.pimultiples <- 1:4*pi
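A small sketch of the difference between the two assignment operators, using made-up names (counter, bump.local, bump.outer): a plain <- inside a function creates a local variable that masks the outer one, while <<- modifies the variable in the enclosing environment.

```r
counter <- 0

# `<-` creates a new local `counter`; the outer one is untouched.
bump.local <- function() {
  counter <- counter + 1
  invisible(NULL)
}

# `<<-` assigns to the existing `counter` in the enclosing environment.
bump.outer <- function() {
  counter <<- counter + 1
  invisible(NULL)
}

bump.local()
counter   # 0

bump.outer()
counter   # 1
```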
Comments
Christopher DuBois about 4 years
I often write functions that need to see other objects in my environment. For example:
> a <- 3
> b <- 3
> x <- 1:5
> fn1 <- function(x,a,b) a+b+x
> fn2 <- function(x) a+b+x
> fn1(x,a,b)
[1]  7  8  9 10 11
> fn2(x)
[1]  7  8  9 10 11
As expected, both of these functions return the same result, because fn2 can "see" a and b when it executes. But whenever I start to take advantage of this, within about 30 minutes I end up calling the function without one of the necessary variables (e.g. a or b). If I don't take advantage of this, then I feel like I am passing around objects unnecessarily. Is it better to be explicit about what a function requires? Or should this be taken care of via inline comments or other documentation of the function? Is there a better way?