How to calculate the 95% confidence interval for the slope in a linear regression model in R


Let's fit the model:

> library(ISwR)
> fit <- lm(metabolic.rate ~ body.weight, rmr)
> summary(fit)

Call:
lm(formula = metabolic.rate ~ body.weight, data = rmr)

Residuals:
    Min      1Q  Median      3Q     Max 
-245.74 -113.99  -32.05  104.96  484.81 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 811.2267    76.9755  10.539 2.29e-13 ***
body.weight   7.0595     0.9776   7.221 7.03e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 157.9 on 42 degrees of freedom
Multiple R-squared: 0.5539, Adjusted R-squared: 0.5433 
F-statistic: 52.15 on 1 and 42 DF,  p-value: 7.025e-09 

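The 95% confidence interval for the slope is approximately the estimated coefficient (7.0595) ± two standard errors (0.9776 each). The exact multiplier is the 0.975 quantile of the t distribution with 42 residual degrees of freedom, which is about 2.02 rather than 2.

The exact interval can be computed using confint: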

> confint(fit, 'body.weight', level=0.95)
               2.5 % 97.5 %
body.weight 5.086656 9.0324
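
To see how close the "two standard errors" rule comes to the exact interval, here is a minimal sketch that works only from the numbers printed in the summary above (estimate, standard error, residual degrees of freedom). The variable names are illustrative, and because the inputs are rounded the result can differ from confint in the last digits.

```r
# Values copied from summary(fit) above (rounded as printed)
est <- 7.0595   # slope estimate for body.weight
se  <- 0.9776   # standard error of the slope
df  <- 42       # residual degrees of freedom

# Rule of thumb: estimate +/- 2 standard errors
approx_ci <- est + c(-1, 1) * 2 * se

# Exact t-based interval: the multiplier is the 0.975 quantile of t(df)
t_crit   <- qt(0.975, df)
exact_ci <- est + c(-1, 1) * t_crit * se

approx_ci
exact_ci

# The multiplier shrinks toward the normal quantile as df grows
dfs <- c(5, 10, 42, 100, 1000)
data.frame(df = dfs, multiplier = qt(0.975, dfs), normal = qnorm(0.975))
```

With 42 degrees of freedom the multiplier is only slightly above 2, so the rule of thumb gives a marginally narrower interval than confint.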
Author: Yu Fu

Updated on July 09, 2022

Comments

  • Yu Fu
    Yu Fu almost 2 years

    Here is an exercise from Introductory Statistics with R:

    With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression model to the relation. According to the fitted model, what is the predicted metabolic rate for a body weight of 70 kg? Give a 95% confidence interval for the slope of the line.

    The rmr data set is in the 'ISwR' package. It looks like this:

    > rmr
       body.weight metabolic.rate
    1         49.9           1079
    2         50.8           1146
    3         51.8           1115
    4         52.6           1161
    5         57.6           1325
    6         61.4           1351
    7         62.3           1402
    8         64.9           1365
    9         43.1            870
    10        48.1           1372
    11        52.2           1132
    12        53.5           1172
    13        55.0           1034
    14        55.0           1155
    15        56.0           1392
    16        57.8           1090
    17        59.0            982
    18        59.0           1178
    19        59.2           1342
    20        59.5           1027
    21        60.0           1316
    22        62.1           1574
    23        64.9           1526
    24        66.0           1268
    25        66.4           1205
    26        72.8           1382
    27        74.8           1273
    28        77.1           1439
    29        82.0           1536
    30        82.0           1151
    31        83.4           1248
    32        86.2           1466
    33        88.6           1323
    34        89.3           1300
    35        91.6           1519
    36        99.8           1639
    37       103.0           1382
    38       104.5           1414
    39       107.7           1473
    40       110.2           2074
    41       122.0           1777
    42       123.1           1640
    43       125.2           1630
    44       143.3           1708
    

    I know how to calculate the predicted y at a given x, but how can I calculate the confidence interval for the slope?

  • ds440
    ds440 about 11 years
    This is equivalent to the following:

    > coef = summary(fit)$coefficients[2,1]
    > err = summary(fit)$coefficients[2,2]
    > coef + c(-1,1)*err*qt(0.975, 42)
    [1] 5.086656 9.032400

    That is, the estimated coefficient +- qt(1-alpha/2, df) standard errors.
  • Yu Fu
    Yu Fu about 11 years
    Thank you, NPE! So the estimated coefficient +/- two standard errors is an approximation, and the latter method provides an accurate way to calculate the confidence interval, right?
  • ds440
    ds440 about 11 years
    Yes, two SE is a good ballpark. If the linear model assumptions are correct, the standardized slope estimate follows a t distribution, so the multiplier approaches ~1.96 as the sample size increases and is higher for smaller samples.
  • jpcgandre
    jpcgandre over 9 years
    @NPE: Are you assuming a Gaussian pdf for the slope values? If this assumption does not hold, you could use bootstrap methods.
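
The bootstrap suggestion in the last comment could look roughly like the sketch below. It is one possible approach (case resampling with a percentile interval), not something anyone in the thread spelled out. The data are typed in from the listing in the question so the code runs without the ISwR package; the seed and the number of resamples (n_boot) are arbitrary choices.

```r
# rmr data as listed in the question, entered by hand so the sketch is self-contained
rmr <- data.frame(
  body.weight = c(49.9, 50.8, 51.8, 52.6, 57.6, 61.4, 62.3, 64.9,
                  43.1, 48.1, 52.2, 53.5, 55.0, 55.0, 56.0, 57.8,
                  59.0, 59.0, 59.2, 59.5, 60.0, 62.1, 64.9, 66.0,
                  66.4, 72.8, 74.8, 77.1, 82.0, 82.0, 83.4, 86.2,
                  88.6, 89.3, 91.6, 99.8, 103.0, 104.5, 107.7, 110.2,
                  122.0, 123.1, 125.2, 143.3),
  metabolic.rate = c(1079, 1146, 1115, 1161, 1325, 1351, 1402, 1365,
                     870, 1372, 1132, 1172, 1034, 1155, 1392, 1090,
                     982, 1178, 1342, 1027, 1316, 1574, 1526, 1268,
                     1205, 1382, 1273, 1439, 1536, 1151, 1248, 1466,
                     1323, 1300, 1519, 1639, 1382, 1414, 1473, 2074,
                     1777, 1640, 1630, 1708)
)

set.seed(1)      # arbitrary seed, only for reproducibility
n_boot <- 2000   # number of bootstrap resamples (an assumption)

# Fit on the original data for reference
fit <- lm(metabolic.rate ~ body.weight, data = rmr)

# Case resampling: draw rows with replacement, refit, keep the slope
boot_slopes <- replicate(n_boot, {
  idx <- sample(nrow(rmr), replace = TRUE)
  coef(lm(metabolic.rate ~ body.weight, data = rmr[idx, ]))[["body.weight"]]
})

# Percentile 95% interval for the slope
boot_ci <- quantile(boot_slopes, c(0.025, 0.975))
boot_ci
```

The percentile interval does not rely on the t distribution of the slope estimate, so comparing it with the confint result gives a rough check on that assumption. Other bootstrap variants (residual resampling, BCa intervals) may be preferable depending on the data.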