Significance Test for MLR
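Assume that the error term ϵ in the multiple linear regression (MLR) model is independent of xk (k = 1, 2, ..., p), and is normally distributed with zero mean and constant variance. Under these assumptions we can test, for each independent variable xk, whether it has a statistically significant relationship with the dependent variable y. The null hypothesis is that the coefficient of xk is zero, and it is rejected when the p-value of the corresponding t test falls below the chosen significance level.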
Problem
Decide which of the independent variables in the multiple linear regression model of the data set stackloss are statistically significant at the .05 significance level.
Solution
We apply the lm function to a formula that describes the variable stack.loss by the variables Air.Flow, Water.Temp and Acid.Conc. (the trailing period is part of the variable name), and we save the linear regression model in a new variable stackloss.lm.
The t values of the independent variables, together with their p-values, can be found with the summary function.
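The two steps described above can be sketched in R as follows. This is a minimal sketch that assumes the built-in stackloss data frame from R's datasets package, which is attached by default; printing the summary produces output of the form shown next.

```r
# Fit the multiple linear regression model of stack.loss on the
# three independent variables and save it as stackloss.lm.
stackloss.lm <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,
                   data = stackloss)

# Print the coefficient table with t values and p-values.
print(summary(stackloss.lm))
```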
Call:
lm(formula = stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,
    data = stackloss)

Residuals:
   Min     1Q Median     3Q    Max
-7.238 -1.712 -0.455  2.361  5.698

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -39.920     11.896   -3.36   0.0038 **
Air.Flow       0.716      0.135    5.31  5.8e-05 ***
Water.Temp     1.295      0.368    3.52   0.0026 **
Acid.Conc.    -0.152      0.156   -0.97   0.3440
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.24 on 17 degrees of freedom
Multiple R-squared: 0.914,     Adjusted R-squared: 0.898
F-statistic: 59.9 on 3 and 17 DF,  p-value: 3.02e-09
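To see where the t value and Pr(>|t|) columns come from, the sketch below recomputes them by hand. Each t value is the estimate divided by its standard error, and each p-value is the two-sided tail probability of a t distribution with the residual degrees of freedom (17 here). The names coefs, t.manual and p.manual are illustrative only.

```r
# Refit the model so that this example is self-contained.
stackloss.lm <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,
                   data = stackloss)

# Coefficient table as a numeric matrix.
coefs <- coef(summary(stackloss.lm))

# t value = estimate / standard error.
t.manual <- coefs[, "Estimate"] / coefs[, "Std. Error"]

# Two-sided p-value from the t distribution with the
# residual degrees of freedom of the fitted model.
df <- df.residual(stackloss.lm)
p.manual <- 2 * pt(-abs(t.manual), df)

# Both comparisons should agree with the summary table.
print(all.equal(t.manual, coefs[, "t value"]))
print(all.equal(p.manual, coefs[, "Pr(>|t|)"]))
```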
Answer
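As the p-values of Air.Flow and Water.Temp are less than 0.05, they are both statistically significant in the multiple linear regression model of stackloss. The p-value of Acid.Conc. is 0.3440, which is greater than 0.05, so Acid.Conc. is not statistically significant at this level.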
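The same decision can be read off programmatically instead of by eye. The following is a minimal sketch, not the only way to do it; the names alpha, p.values and significant are illustrative.

```r
# Refit the model so that this example is self-contained.
stackloss.lm <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,
                   data = stackloss)

# p-values of the independent variables (first row, the intercept, dropped).
p.values <- coef(summary(stackloss.lm))[-1, "Pr(>|t|)"]

# Keep the variables whose p-value is below the significance level.
alpha <- 0.05
significant <- names(p.values)[p.values < alpha]
print(significant)   # "Air.Flow"   "Water.Temp"
```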
Note
Further details of the summary function for linear regression models can be found in the R documentation, for example with help(summary.lm).