How to calculate the 95% confidence interval for the slope in a linear regression model in R

r statistics linear-regression confidence-interval

208,539

Let's fit the model:

> library(ISwR)
> fit <- lm(metabolic.rate ~ body.weight, rmr)
> summary(fit)

Call:
lm(formula = metabolic.rate ~ body.weight, data = rmr)

Residuals:
    Min      1Q  Median      3Q     Max 
-245.74 -113.99  -32.05  104.96  484.81 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 811.2267    76.9755  10.539 2.29e-13 ***
body.weight   7.0595     0.9776   7.221 7.03e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 157.9 on 42 degrees of freedom
Multiple R-squared: 0.5539, Adjusted R-squared: 0.5433 
F-statistic: 52.15 on 1 and 42 DF,  p-value: 7.025e-09

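The 95% confidence interval for the slope is approximately the estimated coefficient (7.0595) ± two standard errors (the standard error is 0.9776). The exact multiplier is the t quantile qt(0.975, 42), about 2.018, where 42 is the residual degrees of freedom.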

The exact interval can be computed using confint:

> confint(fit, 'body.weight', level=0.95)
               2.5 % 97.5 %
body.weight 5.086656 9.0324
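
As a rough check, the two-standard-error rule of thumb and the exact t-based interval can be compared directly from the numbers printed by summary(fit). This is only a sketch: the variable names (est, se, df, approx_ci, exact_ci) are illustrative, and the inputs are the rounded values from the summary table, so the result matches confint only to about four significant figures.

```r
# Values copied (rounded) from the summary(fit) output above
est <- 7.0595   # estimated slope for body.weight
se  <- 0.9776   # its standard error
df  <- 42       # residual degrees of freedom

# Rule of thumb: estimate +/- 2 standard errors
approx_ci <- est + c(-1, 1) * 2 * se

# Exact interval: estimate +/- t quantile * standard error
exact_ci <- est + c(-1, 1) * qt(0.975, df) * se

approx_ci
exact_ci
```

The exact interval is slightly wider because qt(0.975, 42) is a little larger than 2.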

Author by

Yu Fu

Updated on July 09, 2022

Comments

Yu Fu almost 2 years

Here is an exercise from Introductory Statistics with R:

With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression model to the relation. According to the fitted model, what is the predicted metabolic rate for a body weight of 70 kg? Give a 95% confidence interval for the slope of the line.

The rmr data set is in the 'ISwR' package. It looks like this:

> rmr
   body.weight metabolic.rate
1         49.9           1079
2         50.8           1146
3         51.8           1115
4         52.6           1161
5         57.6           1325
6         61.4           1351
7         62.3           1402
8         64.9           1365
9         43.1            870
10        48.1           1372
11        52.2           1132
12        53.5           1172
13        55.0           1034
14        55.0           1155
15        56.0           1392
16        57.8           1090
17        59.0            982
18        59.0           1178
19        59.2           1342
20        59.5           1027
21        60.0           1316
22        62.1           1574
23        64.9           1526
24        66.0           1268
25        66.4           1205
26        72.8           1382
27        74.8           1273
28        77.1           1439
29        82.0           1536
30        82.0           1151
31        83.4           1248
32        86.2           1466
33        88.6           1323
34        89.3           1300
35        91.6           1519
36        99.8           1639
37       103.0           1382
38       104.5           1414
39       107.7           1473
40       110.2           2074
41       122.0           1777
42       123.1           1640
43       125.2           1630
44       143.3           1708

I know how to calculate the predicted y at a given x but how can I calculate the confidence interval for the slope?

ds440 about 11 years

This is equivalent to the following:

> coef <- summary(fit)$coefficients[2,1]
> err <- summary(fit)$coefficients[2,2]
> coef + c(-1,1)*err*qt(0.975, 42)
[1] 5.086656 9.032400

That is, it's the estimated coefficient ± qt(1-alpha/2, df) standard errors.
Yu Fu about 11 years

Thank you, NPE! So the estimated coefficient +/- two standard errors is an approximation, and the latter method provides an accurate way to calculate the confidence interval, right?
ds440 about 11 years

Yes, two SE is a good ballpark. If the linear model assumptions are correct, the standardized slope estimate follows a t distribution, so the exact multiplier is a t quantile: it approaches ~1.96 as the sample size increases, and it is higher for smaller samples.
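
To see how the multiplier behaves, here is a small sketch that tabulates qt(0.975, df) for a few degrees of freedom and compares it with the normal quantile. The chosen df values and the name t_mult are arbitrary illustrations; 42 is the residual degrees of freedom of the model above.

```r
# 97.5% t quantiles for increasing degrees of freedom
df <- c(5, 10, 20, 42, 100, 1000)
t_mult <- qt(0.975, df)
data.frame(df = df, multiplier = round(t_mult, 3))

# Limiting value as df grows: the standard normal quantile
qnorm(0.975)
```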
jpcgandre over 9 years

@NPE: Are you assuming a Gaussian pdf for the slope values? If this hypothesis does not hold, you could use bootstrap methods.
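
One possible way to do that is a case-resampling (pairs) bootstrap with a percentile interval, sketched below in base R. This is an illustration, not the method used in the answer: the number of resamples (B = 2000), the seed, and the names boot_slopes and boot_ci are assumptions, and the data frame is rebuilt from the listing in the question so the snippet runs without the ISwR package.

```r
# rmr data copied from the listing in the question (44 subjects)
body.weight <- c(
  49.9, 50.8, 51.8, 52.6, 57.6, 61.4, 62.3, 64.9, 43.1, 48.1, 52.2,
  53.5, 55.0, 55.0, 56.0, 57.8, 59.0, 59.0, 59.2, 59.5, 60.0, 62.1,
  64.9, 66.0, 66.4, 72.8, 74.8, 77.1, 82.0, 82.0, 83.4, 86.2, 88.6,
  89.3, 91.6, 99.8, 103.0, 104.5, 107.7, 110.2, 122.0, 123.1, 125.2, 143.3)
metabolic.rate <- c(
  1079, 1146, 1115, 1161, 1325, 1351, 1402, 1365, 870, 1372, 1132,
  1172, 1034, 1155, 1392, 1090, 982, 1178, 1342, 1027, 1316, 1574,
  1526, 1268, 1205, 1382, 1273, 1439, 1536, 1151, 1248, 1466, 1323,
  1300, 1519, 1639, 1382, 1414, 1473, 2074, 1777, 1640, 1630, 1708)
rmr <- data.frame(body.weight, metabolic.rate)

set.seed(1)          # arbitrary seed, only for reproducibility
B <- 2000            # number of bootstrap resamples (assumed value)
n <- nrow(rmr)

# Resample rows with replacement, refit, and keep the slope each time
boot_slopes <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  coef(lm(metabolic.rate ~ body.weight, data = rmr[idx, ]))[["body.weight"]]
})

# Percentile bootstrap 95% interval for the slope
boot_ci <- quantile(boot_slopes, c(0.025, 0.975))
boot_ci
```

The percentile interval does not rely on normal errors, but with only 44 observations it can differ noticeably from the t-based interval, and other bootstrap variants (for example BCa) may be preferable.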