Interval Estimate of Population Mean with Unknown Variance
After finding a point estimate of the population mean, we need a way to quantify its accuracy. Here, we discuss the case where the population variance is not assumed to be known.
Let us denote the 100(1 −α∕2) percentile of the Student t distribution with n− 1 degrees of freedom as tα∕2. For a random sample of sufficiently large size n, with sample mean x̄ and sample standard deviation s, the end points of the interval estimate at the (1 −α) confidence level are given as follows:
x̄ ± tα∕2 · s∕√n
Problem
Without assuming the population standard deviation of the student heights in the survey data set, find the margin of error and the interval estimate at the 95% confidence level.
Solution
We first filter out missing values in survey$Height with the na.omit function, and save it in height.response.
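A minimal sketch of this step, assuming the survey data frame is the one shipped with the MASS package (the text above does not name the package):

```r
library(MASS)                              # assumed source of the survey data frame
height.response = na.omit(survey$Height)   # drop the missing height values
```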
Then we compute the sample size, the sample standard deviation, and the standard error estimate.
> n = length(height.response) # sample size
> s = sd(height.response) # sample standard deviation
> SE = s/sqrt(n); SE # standard error estimate
[1] 0.68117
Since the Student t distribution has two tails, the 95% confidence level leaves 2.5% in each tail, so the critical value is the 97.5th percentile of the Student t distribution. Therefore, tα∕2 is given by qt(.975, df=n-1). We multiply it by the standard error estimate SE to get the margin of error.
We then subtract it from and add it to the sample mean to find the end points of the confidence interval.
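The margin of error computation could be sketched as follows. The variable name E is an illustrative choice, and the block reloads the data from the MASS package (an assumption) so that it runs on its own:

```r
library(MASS)                              # assumed source of the survey data frame
height.response = na.omit(survey$Height)   # drop the missing height values
n = length(height.response)                # sample size
SE = sd(height.response)/sqrt(n)           # standard error estimate
E = qt(.975, df=n-1)*SE                    # margin of error (E is a hypothetical name)
E
```

The printed value should agree with the margin of error reported in the Answer.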
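A sketch of this last step, under the same assumptions as before (data from the MASS package; xbar and E are illustrative names not taken from the text):

```r
library(MASS)                              # assumed source of the survey data frame
height.response = na.omit(survey$Height)   # drop the missing height values
n = length(height.response)                # sample size
SE = sd(height.response)/sqrt(n)           # standard error estimate
E = qt(.975, df=n-1)*SE                    # margin of error
xbar = mean(height.response)               # sample mean
ci = xbar + c(-E, E)                       # lower and upper end points
ci
```

Adding the vector c(-E, E) to the sample mean yields both end points in one expression.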
Answer
Without assuming the population standard deviation, the margin of error for the student height survey at the 95% confidence level is 1.3429 centimeters. The confidence interval is between 171.04 and 173.72 centimeters.
Alternative Solution
Instead of using the textbook formula, we can apply the t.test function in the built-in stats package.
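A minimal sketch of the t.test approach, again assuming the data come from the MASS package. The name result is illustrative. By default t.test uses a 95% confidence level, and the interval is available in the conf.int component of the returned object:

```r
library(MASS)                              # assumed source of the survey data frame
height.response = na.omit(survey$Height)   # drop the missing height values
result = t.test(height.response)           # one-sample t-test, conf.level = 0.95 by default
result$conf.int                            # end points of the confidence interval
```

The interval should match the one obtained from the formula in the Answer. A different confidence level can be requested through the conf.level argument.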