pull out p-values and r-squared from a linear regression

317,806

Solution 1

r-squared: You can return the r-squared value directly from the summary object: summary(fit)$r.squared. See names(summary(fit)) for a list of all the items you can extract directly.
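As a minimal sketch of that extraction, the snippet below rebuilds the data from the question (the set.seed() call and the variable names s, r2 and adj_r2 are additions for illustration) and stores the values in ordinary variables:

```r
# Simulated data as in the question; the seed is an added assumption
# so that the example is reproducible.
set.seed(1)
x <- cumsum(c(0, runif(100, -1, 1)))
y <- cumsum(c(0, runif(100, -1, 1)))
fit <- lm(y ~ x)

s <- summary(fit)
names(s)                    # lists every component you can pull out with $
r2     <- s$r.squared       # multiple R-squared
adj_r2 <- s$adj.r.squared   # adjusted R-squared
```

Storing summary(fit) once in s avoids recomputing the summary for each value you extract.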

Model p-value: If you want to obtain the p-value of the overall regression model, this blog post outlines a function to return the p-value:

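lmp <- function (modelobject) {
    # inherits() also works when the class vector has more than one element
    if (!inherits(modelobject, "lm")) stop("Not an object of class 'lm' ")
    f <- summary(modelobject)$fstatistic
    # f holds (value, numdf, dendf); take the upper tail of the F distribution
    p <- pf(f[1], f[2], f[3], lower.tail = FALSE)
    attributes(p) <- NULL
    return(p)
}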

> lmp(fit)
[1] 1.622665e-05

In the case of a simple regression with one predictor, the model p-value and the p-value for the coefficient will be the same.
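A quick way to check this equivalence, sketched on the question's simulated data (the seed and the names model_p and coef_p are illustrative additions):

```r
set.seed(1)
x <- cumsum(c(0, runif(100, -1, 1)))
y <- cumsum(c(0, runif(100, -1, 1)))
fit <- lm(y ~ x)

s <- summary(fit)
f <- s$fstatistic                     # named vector: value, numdf, dendf

# p-value of the overall F test
model_p <- unname(pf(f[1], f[2], f[3], lower.tail = FALSE))

# p-value of the t test on the slope
coef_p <- s$coefficients["x", "Pr(>|t|)"]

all.equal(model_p, coef_p)            # compares the two up to numerical tolerance
```

With one predictor the F statistic is the square of the slope's t statistic, which is why the two tests agree.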

Coefficient p-values: If you have more than one predictor, then the above will return the model p-value, and the p-value for coefficients can be extracted using:

summary(fit)$coefficients[,4]  

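Alternatively, you can grab p-values from the anova(fit) object in a similar fashion to the summary object above. Note that anova() reports a sequential F test for each term rather than a t test for each coefficient, so its values can differ from the summary table when the model has more than one predictor.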
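A minimal sketch putting the two tables side by side on a two-predictor model. The data and the names fit2, x1, x2, coef_p and term_p are made up for illustration and do not come from the question:

```r
# Hypothetical data with two correlated predictors
set.seed(42)
n  <- 50
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + x1 + 0.5 * x2 + rnorm(n)
fit2 <- lm(y ~ x1 + x2)

# One t-test p-value per coefficient: (Intercept), x1, x2
coef_p <- summary(fit2)$coefficients[, "Pr(>|t|)"]

# One F-test p-value per term: x1, x2, then NA for the Residuals row
term_p <- anova(fit2)[["Pr(>F)"]]

coef_p
term_p
```

For the term entered last (x2 here) the two p-values coincide; for earlier terms they generally do not, because the anova() table tests each term given only the terms listed before it.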

Solution 2

Notice that summary(fit) generates an object with all the information you need. The beta, se, t and p vectors are stored in it. Get the p-values by selecting the 4th column of the coefficients matrix (stored in the summary object):

summary(fit)$coefficients[,4] 
summary(fit)$r.squared

Try str(summary(fit)) to see all the info that this object contains.

Edit: I had misread Chase's answer which basically tells you how to get to what I give here.

Solution 3

You can see the structure of the object returned by summary() by calling str(summary(fit)). Each piece can be accessed using $. For a model with a single predictor, the p-value for the F statistic is more easily obtained from the object returned by anova().

Concisely, you can do this:

rSquared <- summary(fit)$r.squared
pVal <- anova(fit)$'Pr(>F)'[1]

Solution 4

I came across this question while exploring suggested solutions for a similar problem. For future reference, it seems worthwhile to add a solution that uses the broom package to the list of answers.

Sample code

x = cumsum(c(0, runif(100, -1, +1)))
y = cumsum(c(0, runif(100, -1, +1)))
fit = lm(y ~ x)
require(broom)
glance(fit)

Results

>> glance(fit)
  r.squared adj.r.squared    sigma statistic    p.value df    logLik      AIC      BIC deviance df.residual
1 0.5442762     0.5396729 1.502943  118.2368 1.3719e-18  2 -183.4527 372.9055 380.7508 223.6251          99

Side notes

I find the glance function is useful as it neatly summarises the key values. The results are stored as a data.frame which makes further manipulation easy:

>> class(glance(fit))
[1] "data.frame"

Solution 5

While both of the answers above are good, the procedure for extracting parts of objects is more general.

In many cases, functions return lists, and the individual components can be inspected using str(), which prints the components along with their names. You can then access them using the $ operator, i.e. myobject$componentname.

In the case of lm objects, there are a number of predefined methods one can use, such as coef(), resid(), summary(), etc., but you won't always be so lucky.
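The same inspect-then-extract pattern carries over to other objects. As a sketch, here it is applied to the result of t.test() on made-up data (the names tt, p and ci are illustrative):

```r
set.seed(1)
tt <- t.test(rnorm(20, mean = 0.5))   # returns a list of class "htest"

str(tt)            # prints each component with its name and type
names(tt)          # just the component names

p  <- tt$p.value   # extract single components with $
ci <- tt$conf.int  # confidence interval: lower and upper bound
```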

Author: grautur

Updated on December 30, 2020

Comments

  • grautur
    grautur over 3 years

    How do you pull out the p-value (for the significance of the coefficient of the single explanatory variable being non-zero) and R-squared value from a simple linear regression model? For example...

    x = cumsum(c(0, runif(100, -1, +1)))
    y = cumsum(c(0, runif(100, -1, +1)))
    fit = lm(y ~ x)
    summary(fit)
    

    I know that summary(fit) displays the p-value and R-squared value, but I want to be able to stick these into other variables.

  • hadley
    hadley over 12 years
    It's a bit better to use inherits rather than class directly. And maybe you want unname(pf(f[1],f[2],f[3],lower.tail=F))?
  • Bakaburg
    Bakaburg over 9 years
    this works only for univariate regressions where the p val of the regression is the same of the predictor
  • António Ribeiro
    António Ribeiro about 8 years
    Care to provide an explanation, even if briefly, on why this code works?
  • Ben Bolker
    Ben Bolker about 8 years
    how does this improve on the existing answers (and in particular the accepted answer)?
  • j_v_wow_d
    j_v_wow_d over 7 years
    I tried this method, but it will fail if the linear model contains any NA terms
  • Andrew Brēza
    Andrew Brēza about 4 years
    This is a great answer!
  • Comfort Eagle
    Comfort Eagle over 3 years
    If you prefer a one-liner: summary(fit)$fstatistic %>% {unname(pf(.[1],.[2],.[3],lower.tail=F))}
  • Comfort Eagle
    Comfort Eagle over 3 years
    Or as pipe-less one-liner: with(summary(fit), pf(fstatistic[1],fstatistic[2],fstatistic[3],lower.tail=F))
