vector of variable names in R

Solution 1

You can use the "get" function to get an object based on a character string of its name, but in the long run it is better to store the variables in a list and access them that way. Things become much simpler: you can grab subsets, and you can use lapply or sapply to run the same code on every element. When saving or deleting, you can work on the entire list rather than trying to remember every element. For example:

mylist <- list(a=rnorm(100), b=rnorm(100) )
names(mylist)
summary(mylist[[1]])
# or
summary(mylist[['a']])
# or
summary(mylist$a)
# or 
d <- 'a'
summary(mylist[[d]])

# or
lapply( mylist, summary )
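
If the variables already exist as separate objects, as in the question, one way to move them into a list is mget(), which looks up several objects by name at once. This is only a minimal sketch: the names a, b and k are taken from the question, and it assumes the objects live in the environment where the code is run.

```r
# Two separate vectors and a character vector holding their names
a <- rnorm(100)
b <- rnorm(100)
k <- c("a", "b")

# mget() returns a named list with one element per name in k
mylist <- mget(k)

# k[1] can now be used to reach the vector a through the list
summary(mylist[[k[1]]])

# Run the same code on every element
means <- sapply(mylist, mean)

# The loose copies can be removed in one call once they are in the list
rm(list = k)
```

After this, mylist[[k[1]]] plays the role that k[1] was meant to play in the question.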

If you are programmatically creating models for analysis with lm (or other modeling functions), then one approach is to subset your data and use "." on the right-hand side of the formula, e.g.:

yvar <- 'Sepal.Width'
xvars <- c('Petal.Width','Sepal.Length')
fit <- lm( Sepal.Width ~ ., data=iris[, c(yvar,xvars)] )

Or you can build the formula using "paste" or "sprintf" then use "as.formula" to convert it to a formula, e.g.:

yvar <- 'Sepal.Width'
xvars <- c('Petal.Width','Sepal.Length')
my.formula <- paste( yvar, '~', paste( xvars, collapse=' + ' ) )
my.formula <- as.formula(my.formula)
fit <- lm( my.formula, data=iris )

Note also the problem of multiple comparisons if you are looking at many different models fit automatically.
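
As a rough illustration of that last point, the sketch below fits one univariate model per predictor and then adjusts the slope p-values with p.adjust(). The choice of predictors and of the Holm method are assumptions made for the example, not a recommendation for any particular analysis.

```r
yvar  <- "Sepal.Width"
xvars <- c("Petal.Width", "Sepal.Length", "Petal.Length")

# One univariate model per predictor; reformulate() builds yvar ~ x
fits <- lapply(xvars, function(x) {
  lm(reformulate(x, response = yvar), data = iris)
})
names(fits) <- xvars

# Raw p-value of the slope (second row of the coefficient table) in each model
pvals <- sapply(fits, function(f) coef(summary(f))[2, "Pr(>|t|)"])

# Adjust for having looked at several models
padj <- p.adjust(pvals, method = "holm")
```

The adjusted values in padj are never smaller than the raw ones in pvals, which is the price paid for screening several models at once.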

Solution 2

You could use a list: k = list(a, b). This creates a list with components a and b, but it is not a list of variable names.

Solution 3

get() is what you're looking for:

summary(get(k[1]))

Edit: get() is not what you're looking for; list() is. get() could still be useful, though.

If you're looking for automatic generation of regression analyses, you might actually benefit from using eval(), although every R programmer will warn you against using eval() unless you know very well what you're doing. Please read the help files for eval() and parse() very carefully before you use them.

An example:

d <- data.frame(
  var1 = rnorm(1000),
  var2 = rpois(1000,4),
  var3 = sample(letters[1:3],1000,replace=TRUE)
)

vars <- names(d)

auto.lm <- function(d,dep,indep){
      expr <- paste(
          "out <- lm(",
          dep,
          "~",
          paste(indep,collapse="*"),
          ",data=d)"
      )
      eval(parse(text=expr))
      return(out)
}

auto.lm(d,vars[1],vars[2:3])
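
For comparison, the same model can be built without eval() and parse() by converting the pasted string with as.formula(), as Solution 1 does. The function name auto.lm2 is made up for this sketch; the data frame mirrors the one above.

```r
d <- data.frame(
  var1 = rnorm(1000),
  var2 = rpois(1000, 4),
  var3 = sample(letters[1:3], 1000, replace = TRUE)
)

# Hypothetical variant of auto.lm: build the formula from strings, then fit
auto.lm2 <- function(d, dep, indep) {
  f <- as.formula(paste(dep, "~", paste(indep, collapse = "*")))
  lm(f, data = d)
}

vars <- names(d)
fit <- auto.lm2(d, vars[1], vars[2:3])
```

Here only the formula is constructed from text; the call to lm() itself is ordinary R code.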

Author: Misha

Updated on August 28, 2020

Comments

  • Misha
    Misha over 3 years

    I'd like to create a function that automatically generates uni- and multivariate regression analyses, but I can't figure out how to specify variables in vectors. This seems very easy, but after skimming the documentation I haven't figured it out so far...

    Easy example

    a<-rnorm(100)
    b<-rnorm(100)
    k<-c("a","b")
    d<-c(a,b)
    summary(k[1])
    

    But k[1] is "a", a character vector, and d is just b appended to a, not the variable names. In effect I'd like k[1] to represent the vector a.

    Appreciate any answers...

    //M

  • Joris Meys
    Joris Meys over 13 years
    You're welcome. But actually Halpo is right. If you want k[1] to represent the vector a, then you need a list. It's worth looking into as well.
  • Joris Meys
    Joris Meys over 13 years
    Indeed, using as.formula() is a lot cleaner than the eval() parse() construct I used.
  • Roman Luštrik
    Roman Luštrik over 13 years
    A nice way of pre-allocating a list is via vector("list", n), where n is the number of elements the list is supposed to hold. Sorry to be a bit off topic. :)
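
A small sketch of the pre-allocation idea from the last comment, assuming n = 3 and using random draws as placeholder content:

```r
n <- 3

# A list of length n whose elements are all NULL to begin with
results <- vector("list", n)

# Fill the slots one at a time
for (i in seq_len(n)) {
  results[[i]] <- rnorm(10, mean = i)
}

lens <- lengths(results)
```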