Prediction Interval for MLR

Assume that the error term ϵ in the multiple linear regression (MLR) model is independent of x_k (k = 1, 2, ..., p), and is normally distributed, with zero mean and constant variance. For a given set of values of x_k (k = 1, 2, ..., p), the interval estimate of the dependent variable y is called the prediction interval.

Problem

In the data set stackloss, develop a 95% prediction interval for the stack loss if the air flow is 72, the water temperature is 20, and the acid concentration is 85.

Solution

We apply the lm function to a formula that describes the variable stack.loss in terms of the variables Air.Flow, Water.Temp and Acid.Conc. (the trailing period is part of the variable name). We then save the fitted linear regression model in a new variable stackloss.lm.

> attach(stackloss) # attach the data frame
> stackloss.lm = lm(stack.loss ~
+ Air.Flow + Water.Temp + Acid.Conc.)
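As a side note, the same model can be fit without attaching the data frame by passing it to lm through the data argument. The following is a minimal sketch of that variant; the name fit.alt is introduced here for illustration only and does not appear elsewhere in this tutorial.

```r
# Fit the same model without attach(): lm looks up the variables in `data`.
# fit.alt is an illustrative name, not part of the tutorial's own session.
fit.alt <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,
              data = stackloss)

# One intercept plus one slope per predictor.
print(coef(fit.alt))
```

With this form there is nothing to detach afterwards, since the variables are never placed on the search path.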

Then we wrap the given predictor values in a new data frame named newdata.

> newdata = data.frame(Air.Flow=72,
+ Water.Temp=20,
+ Acid.Conc.=85)

We now apply the predict function and pass the predictor values through the newdata argument. We also set the interval type to "predict" (which R matches to the full option name "prediction"), and use the default 0.95 confidence level.

> predict(stackloss.lm, newdata, interval="predict")
     fit    lwr    upr
1 24.582 16.466 32.697
> detach(stackloss) # clean up
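To see where these bounds come from, the interval can be rebuilt by hand from the standard textbook form for a new observation at x0: fitted value ± t quantile × s × sqrt(1 + x0' (X'X)^(-1) x0), where s is the residual standard error and X is the design matrix. The sketch below assumes that form and refits the model with the data argument so that it runs on its own; the names fit, x0, se.pred, t.crit and pi.manual are illustrative.

```r
# Refit the model so this block is self-contained.
fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)

# New observation; the leading 1 corresponds to the intercept column.
x0 <- c(1, 72, 20, 85)

X  <- model.matrix(fit)      # design matrix, columns in the same order as x0
s  <- summary(fit)$sigma     # residual standard error
y0 <- sum(x0 * coef(fit))    # point prediction at x0

# Standard error of a new observation: the "1 +" term accounts for the
# error of the new observation itself, on top of the estimation error.
se.pred <- s * sqrt(1 + drop(t(x0) %*% solve(crossprod(X)) %*% x0))

# Two-sided 95% t quantile on the residual degrees of freedom.
t.crit <- qt(0.975, df = df.residual(fit))

pi.manual <- c(fit = y0,
               lwr = y0 - t.crit * se.pred,
               upr = y0 + t.crit * se.pred)
print(round(pi.manual, 3))
```

The three values should agree with the fit, lwr and upr columns returned by predict above.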

Answer

The 95% prediction interval of the stack loss with the given parameters is between 16.466 and 32.697.

Note

Further details of the predict function for linear regression models can be found in the R documentation.

> help(predict.lm)
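Two arguments described on that help page are worth trying: interval, which also accepts "confidence" for an interval on the mean response, and level, which changes the confidence level. The following sketch compares them for the same predictor values; it refits the model so that it runs on its own, and the names fit, ci95, pi95 and pi99 are illustrative.

```r
# Refit the model and rebuild the new-data frame so this block is self-contained.
fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
newdata <- data.frame(Air.Flow = 72, Water.Temp = 20, Acid.Conc. = 85)

# Interval for the mean response at these predictor values.
ci95 <- predict(fit, newdata, interval = "confidence")

# Interval for a single new observation (the case treated in this tutorial).
pi95 <- predict(fit, newdata, interval = "prediction")

# Same prediction interval at a 99% level instead of the default 95%.
pi99 <- predict(fit, newdata, interval = "prediction", level = 0.99)

print(rbind(ci95 = ci95[1, ], pi95 = pi95[1, ], pi99 = pi99[1, ]))
```

All three share the same fitted value. The prediction interval is wider than the confidence interval because it also covers the error of the new observation, and raising the level widens it further.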
