VIFs returning aliased coefficients in R


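Use R's alias() function to see which variables are linearly dependent on the others. The dependence can involve more than two variables (one variable being an exact sum of several others, for example), which is why pairwise correlations from cor() may not reveal it. Remove the dependent variables and vif() should then run without the error.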
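To make the alias() output easier to read, here is a minimal sketch on made-up data. The data frame `toy` and its columns are hypothetical and are not taken from the IHA dataset; `x3` is built as an exact sum of `x1` and `x2`, so no pair of predictors is perfectly correlated, yet the model still has an aliased coefficient.

```r
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2                      # exact linear combination of x1 and x2
y  <- 1 + x1 - x2 + rnorm(n)
toy <- data.frame(y, x1, x2, x3)   # hypothetical data, for illustration only

# Pairwise correlations: none of the off-diagonal entries is exactly 1
cor_toy <- cor(toy[, c("x1", "x2", "x3")])
print(round(cor_toy, 2))

fit_toy <- lm(y ~ x1 + x2 + x3, data = toy)

# Each row of $Complete is an aliased term; the columns hold the weights of
# the remaining terms whose linear combination reproduces that term
alias_toy <- alias(fit_toy)$Complete
print(alias_toy)

# Aliased terms also appear as NA coefficients in the fitted model
aliased_names <- names(which(is.na(coef(fit_toy))))
print(aliased_names)
```

In the printed alias table, the row name is the term that lm() could not estimate, and the nonzero entries in that row point at the terms it depends on.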

library(car)  # provides vif()

# as.formula() has no data argument; pass the data frame to lm() instead
formula <- Spring_Autumn ~ Oct + Nov + Dec + Jan + Feb + Mar + Apr + May + Jun + Jul + Aug + Sep + X1min + X3min + X7min + X30min + X90min + X1max + X3max + X7max + X30max + X90max + BF + Dmin + Dmax + LP + LPD + HP + HPD + RR + FR + Rev
fit <- lm(formula, data = IHA_stats)

#the linearly dependent variables
ld.vars <- attributes(alias(fit)$Complete)$dimnames[[1]]

#remove the linearly dependent variables
formula.new <- as.formula(
    paste(
        paste(deparse(formula), collapse=""), 
        paste(ld.vars, collapse="-"),
        sep="-"
    )
)

#run model again
fit.new <- lm(formula.new, data = IHA_stats)
vif(fit.new)

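NOTE: This will not work if an automatically generated dummy variable (created from a factor) is identical to another variable. In that case alias() reports the dummy's coefficient name rather than the name of a variable in the data, so subtracting it from the formula fails. You will need your own workaround for that case.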
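One possible workaround for the dummy-variable case, as a rough sketch rather than a definitive fix: work with the columns of the model matrix instead of the formula, keep only the columns whose coefficients were estimated, and compute VIFs from the inverse of their correlation matrix. Everything here (`dat`, `g`, `isb`, `fit_d`) is hypothetical example data. Note that this gives one VIF per model-matrix column, which differs from car::vif(), which reports generalized VIFs for multi-level factors.

```r
n   <- 60
dat <- data.frame(
  g  = factor(rep(c("a", "b", "c"), length.out = n)),
  x1 = sin(seq_len(n))             # arbitrary numeric predictor
)
dat$isb <- as.numeric(dat$g == "b")  # identical to the dummy for level "b"
set.seed(2)
dat$y <- rnorm(n)

# isb comes first, so the auto-generated dummy "gb" is the aliased term;
# "gb" is a coefficient name, not a column of dat
fit_d <- lm(y ~ isb + g + x1, data = dat)
aliased <- names(which(is.na(coef(fit_d))))
print(aliased)

# Keep the estimated, non-intercept columns of the model matrix
X      <- model.matrix(fit_d)
keep   <- !is.na(coef(fit_d)) & colnames(X) != "(Intercept)"
X_keep <- X[, keep, drop = FALSE]

# Per-column VIFs: diagonal of the inverse correlation matrix
vifs <- diag(solve(cor(X_keep)))
print(round(vifs, 2))
```

Working on the model matrix sidesteps the name mismatch because the aliased coefficient names always match its column names.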

Author: James White

Updated on July 05, 2022

Comments

  • James White, almost 2 years ago

    I was wondering if anyone could help me with the following problem. When I run a VIF analysis on a set of explanatory variables, I get the following error message.

    test <-vif(lm(Spring_Autumn ~ Oct + Nov + Dec + Jan + Feb +  
     Mar + Apr + May + Jun + Jul + Aug + Sep + X1min + X3min +   X7min + X30min + X90min + X1max + X3max + X7max + X30max + X90max + BF + Dmin + Dmax+ LP + LPD + HP + HPD + RR + FR + Rev, data = IHA_stats))
    
    
    Error in vif.default(lm(Spring_Autumn ~ Oct + Nov + Dec + Jan + Feb +  : 
      there are aliased coefficients in the model
    

    After reading online, it seems I have two variables that are perfectly collinear, but I couldn't find any two perfectly correlated variables using the cor function, and I don't know how to interpret the table produced by the alias function. Does anyone have any suggestions? Thank you in advance.

    James (a link to the original dataset is pasted below but can email if there are any issues with accessing this).

    https://www.dropbox.com/s/nqmagu9m3mjhy9n/IHA_statistics.csv?dl=0