How can I obtain the R-squared out of an anova in R?

Solution 1

tl;dr: you can get the R-squared of the anova by looking at the summary output of the corresponding linear model

Let's go step by step:

1) Let's use the following example data (pain scores for patients given three different drugs):

pain <- c(4, 5, 4, 3, 2, 4, 3, 4, 4, 6, 8, 4, 5, 4, 6, 5, 8, 6, 6, 7, 6, 6, 7, 5, 6, 5, 5)
drug <- c(rep("A", 9), rep("B", 9), rep("C", 9))
migraine <- data.frame(pain, drug)
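
As a quick optional check (a small sketch using base R's str() and table()), the data frame should contain 27 observations, 9 per drug group:

# Optional sanity check of the data
str(migraine)
table(migraine$drug)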

2) Let's get the anova:

AOV <- aov(pain ~ drug, data=migraine)

summary(AOV)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## drug         2  28.22  14.111   11.91 0.000256 ***
## Residuals   24  28.44   1.185                     
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
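
As a side note, aov() fits the model via lm() internally, so the R-squared can also be pulled straight from the AOV object by calling the lm summary method on it (a small sketch of that shortcut):

# An aov object inherits from class "lm", so summary.lm() works on it directly
summary.lm(AOV)$r.squared   # same value as the R-squared shown in step 4 below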

3) Now, an anova is directly related to an underlying linear model, so let's fit that linear model and get the anova table from it:

LM <- lm(pain ~ drug, data=migraine)

anova(LM)

## Analysis of Variance Table
## 
## Response: pain
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## drug       2 28.222 14.1111  11.906 0.0002559 ***
## Residuals 24 28.444  1.1852                      
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

As expected, the results are exactly the same, because aov() is just a different interface to the same underlying linear model fit. This means that...

4) We can get the R-squared from the linear model's summary:

summary(LM)

## Call:
## lm(formula = pain ~ drug, data = migraine)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7778 -0.7778  0.1111  0.3333  2.2222 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.6667     0.3629  10.104 4.01e-10 ***
## drugB         2.1111     0.5132   4.114 0.000395 ***
## drugC         2.2222     0.5132   4.330 0.000228 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## 
## Residual standard error: 1.089 on 24 degrees of freedom
## Multiple R-squared:  0.498,  Adjusted R-squared:  0.4562 
## F-statistic: 11.91 on 2 and 24 DF,  p-value: 0.0002559

So the R-squared is 0.498
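
If you just want the number rather than the whole printout, it can be extracted from the summary object directly (a minimal sketch):

# Extract the value itself instead of reading it off the printout
summary(LM)$r.squared

## [1] 0.4980392

# summary(LM)$adj.r.squared gives the adjusted version in the same way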

But what if we don't believe this?

5) What is the R-squared? It is the regression sum of squares divided by the total sum of squares (i.e., the regression sum of squares plus the residual sum of squares). So let's find those numbers in the anova table and calculate the R-squared directly:

# We use the tidy function from the broom package to extract values
library(broom)

tidy_aov <- tidy(AOV)
tidy_aov

##        term df    sumsq    meansq statistic      p.value
## 1      drug  2 28.22222 14.111111  11.90625 0.0002558807
## 2 Residuals 24 28.44444  1.185185        NA           NA

# The values we need are in the sumsq column of this data frame

sum_squares_regression <- tidy_aov$sumsq[1]
sum_squares_residuals <- tidy_aov$sumsq[2]

R_squared <- sum_squares_regression /
            (sum_squares_regression + sum_squares_residuals)

R_squared

## 0.4980392

So we get the same result: R-squared is 0.4980392
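
If you would rather not depend on broom, the same sums of squares can be read from the base-R anova table, which is a data frame with a "Sum Sq" column (a small equivalent sketch):

# Base-R equivalent, without broom
ss <- anova(LM)[["Sum Sq"]]
ss[1] / sum(ss)

## [1] 0.4980392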

Solution 2

If you want to calculate the adjusted R-squared, you can apply the following formula (from https://www.statisticshowto.datasciencecentral.com/adjusted-r2/):

s <- summary(LM)
r2 <- s$r.squared        # multiple R-squared
n <- dim(migraine)[1]    # number of observations (27)
k <- 2                   # number of predictors (drug has 3 levels, so 2 dummy variables)

# adjusted R-squared
1 - ((1 - r2) * (n - 1) / (n - k - 1))

# the same as
s$adj.r.squared
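
For this model n = 27 and k = 2 (the two dummy variables for drug), so the formula gives 1 - (1 - 0.498) * 26 / 24 ≈ 0.456, matching the Adjusted R-squared of 0.4562 reported by summary(LM) above.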

The adjustment penalizes additional variables ('k' in the formula), much as the AIC does. If adding a new independent variable does not improve the goodness of fit, i.e. the ratio of explained to residual variation, enough to offset that penalty, you shouldn't include it.

So, the (multiple) R-squared will always increase as you add more and more variables, while the adjusted R-squared stops improving once additional regressors no longer add enough explanatory power.
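
One way to see this in action (a small sketch using a hypothetical pure-noise predictor that is not part of the original data):

# Hypothetical illustration: add a predictor that is pure noise
set.seed(1)
migraine$noise <- rnorm(nrow(migraine))
LM2 <- lm(pain ~ drug + noise, data = migraine)

summary(LM2)$r.squared       # slightly higher than summary(LM)$r.squared
summary(LM2)$adj.r.squared   # typically lower than summary(LM)$adj.r.squared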

Comments

  • jakzr almost 2 years

    I'm looking for the method/function that returns the R-squared of an anova model in R.

    Could not find anything so far.

    Thanks