NMSA407 Linear Regression: Tutorial

Confidence interval for the model based mean, prediction interval

Data Hosi0




Introduction

Load used data and calculate basic summaries

data(Hosi0, package = "mffSM")
head(Hosi0)
##   bweight blength weight12 length12 mage fage
## 1    3000      50     8900       78   23   30
## 2    3700      51    11550       77   26   28
## 3    3410      49    12000       80   21   24
## 4    3250      49     9600       73   19   22
## 5    3000      48     9600       74   16   22
## 6    3370      50    11100       73   19   18
dim(Hosi0)
## [1] 4838    6
summary(Hosi0)
##     bweight        blength         weight12        length12         mage            fage      
##  Min.   :1750   Min.   :46.00   Min.   : 6500   Min.   :63.0   Min.   :16.00   Min.   :17.00  
##  1st Qu.:3140   1st Qu.:49.00   1st Qu.: 9700   1st Qu.:74.0   1st Qu.:22.00   1st Qu.:24.00  
##  Median :3450   Median :50.00   Median :10350   Median :76.0   Median :25.00   Median :28.00  
##  Mean   :3429   Mean   :50.28   Mean   :10433   Mean   :76.6   Mean   :25.52   Mean   :28.49  
##  3rd Qu.:3710   3rd Qu.:51.00   3rd Qu.:11100   3rd Qu.:79.0   3rd Qu.:29.00   3rd Qu.:32.00  
##  Max.   :5100   Max.   :54.00   Max.   :16000   Max.   :94.0   Max.   :45.00   Max.   :69.00

Create a jittered version of blength

set.seed(221913282)
Hosi0 <- transform(Hosi0, blength.jitter = blength + runif(nrow(Hosi0), -0.1, 0.1))




Linear model (regression line) for dependence of birth weight on birth length

summary(mwl <- lm(bweight ~ blength, data = Hosi0))
## 
## Call:
## lm(formula = bweight ~ blength, data = Hosi0)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1520.33  -188.20   -10.33   189.67  1531.80 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -6230.146    121.095  -51.45   <2e-16 ***
## blength       192.124      2.407   79.81   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 291.7 on 4836 degrees of freedom
## Multiple R-squared:  0.5685, Adjusted R-squared:  0.5684 
## F-statistic:  6370 on 1 and 4836 DF,  p-value: < 2.2e-16




Confidence intervals for the model based means

LSE of the model based mean (and prediction) for blength equal to 47.5, 50, 52.5 cm

pdata <- data.frame(blength = c(47.5, 50, 52.5))
predict(mwl, newdata = pdata)
##        1        2        3 
## 2895.766 3376.077 3856.388

Calculate also standard errors

predict(mwl, newdata = pdata, se.fit = TRUE)
## $fit
##        1        2        3 
## 2895.766 3376.077 3856.388 
## 
## $se.fit
##        1        2        3 
## 7.889673 4.245714 6.799569 
## 
## $df
## [1] 4836
## 
## $residual.scale
## [1] 291.6664

95% confidence intervals for the model based means

predict(mwl, newdata = pdata, interval = "confidence", level = 0.95)
##        fit      lwr      upr
## 1 2895.766 2880.298 2911.233
## 2 3376.077 3367.753 3384.400
## 3 3856.388 3843.058 3869.718




Prediction intervals

95% prediction intervals

predict(mwl, newdata = pdata, interval = "prediction", level = 0.95)
##        fit      lwr      upr
## 1 2895.766 2323.758 3467.774
## 2 3376.077 2804.217 3947.936
## 3 3856.388 3284.434 4428.342




Confidence and prediction bands

Calculation

blengthNew <- seq(45.8, 54.2, length = 500)
pdata <- data.frame(blength = blengthNew)
pwlmean <- predict(mwl, newdata = pdata, interval = "confidence", level = 0.95)
pwlpred <- predict(mwl, newdata = pdata, interval = "prediction", level = 0.95)

Plot

par(mfrow = c(1, 1), bty = BTY, mar = c(5, 4, 1, 1)+0.1)
plot(bweight ~ blength.jitter, data = Hosi0, col = COL2, bg = BGC2, ylim = c(1750, 5100),
     pch = PCH, xlab = "Birth length [cm]", ylab = "Birth weight [g]")

  ### Estimated means/predictions
lines(blengthNew, pwlmean[, "fit"], col = "red4", lwd = 2)

  ### Confidence band around the regression function
lines(blengthNew, pwlmean[, "lwr"], col = "red2", lwd = 2, lty = 2)
lines(blengthNew, pwlmean[, "upr"], col = "red2", lwd = 2, lty = 2)

  ### Prediction band
lines(blengthNew, pwlpred[, "lwr"], col = "blue3", lwd = 2, lty = 5)
lines(blengthNew, pwlpred[, "upr"], col = "blue3", lwd = 2, lty = 5)

legend(46, 5000, legend = c("Confidence band", "Prediction band"), col = c("red2", "blue3"), lwd = 2, lty = c(2, 5))

plot of chunk fig-NormalLM-03-01