When we collect data from a sample, we use it to make educated guesses about an entire population. For example, if you want to know the average height of all high school students in your state, you can't measure everyone, so you measure a sample and estimate the population average. But how confident should you be in that estimate? A confidence interval (often abbreviated CI) gives us a range of values within which we believe the true population parameter lies, along with a level of confidence that the interval actually contains that parameter. This chapter will introduce you to the concept of confidence intervals, explain how they're constructed, and show you how to interpret them correctly in real-world contexts.
In statistics, we distinguish between two key ideas: a parameter and a statistic. A parameter is a numerical characteristic of a population (for example, the true average weight of all apples in an orchard). A statistic is a numerical characteristic of a sample (for example, the average weight of 50 apples you picked and weighed). We use statistics to estimate parameters.
Because we're working with only a sample rather than the whole population, our estimates come with uncertainty. Imagine you want to estimate the average commute time for all workers in a city. You survey 100 workers and find an average commute time of 28 minutes. Does that mean the true average for all workers is exactly 28 minutes? Probably not. If you took a different sample of 100 workers, you might get 27 minutes or 30 minutes. This variation from sample to sample is called sampling variability.
Instead of giving just one number (a point estimate), we can provide a range (an interval estimate) that likely contains the true parameter. This is where confidence intervals come in.
A confidence interval is a range of values, calculated from sample data, that is likely to contain the true population parameter. It consists of two parts: the interval itself (a lower and an upper bound) and a confidence level (such as 95%).
Think of a confidence interval like casting a net to catch a fish. The interval is your net, and the true population parameter is the fish. A 95% confidence level means that if you cast this net 100 times (taking 100 different samples), about 95 of those nets would actually catch the fish.
The most common confidence level used in practice is 95%, though 90% and 99% are also frequently used. The confidence level you choose reflects how sure you want to be. A higher confidence level gives a wider interval.
It's essential to interpret confidence intervals correctly. If we construct a 95% confidence interval for the mean commute time and get (25.3, 30.7) minutes, here's what that means:
Correct interpretation: "We are 95% confident that the true average commute time for all workers in the city is between 25.3 and 30.7 minutes."
This does not mean there's a 95% probability that the true mean is in this specific interval. Once the interval is calculated, the true mean either is or isn't in it; we just don't know which. The "95%" refers to the process: if we repeated this sampling and interval-construction process many times, about 95% of the intervals we create would contain the true mean.
When constructing a confidence interval for a population mean \( \mu \) (the Greek letter mu), we need several pieces of information: the sample mean \( \bar{x} \), the sample standard deviation \( s \), the sample size \( n \), and the chosen confidence level.
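The "process" interpretation can be checked with a small simulation. The sketch below is illustrative only: the population values (`true_mean=28.0`, `sigma=8.0`) are made-up assumptions, the population is assumed normal, and the standard deviation is treated as known so that a z critical value can be used.

```python
import random
from statistics import NormalDist

def coverage_simulation(true_mean=28.0, sigma=8.0, n=100, trials=2000,
                        confidence=0.95, seed=0):
    """Return the fraction of simulated intervals that contain true_mean."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Assumption: sigma is known, so every interval has the same margin.
    margin = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample and build an interval around its mean.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

coverage = coverage_simulation()
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true mean or it doesn't; the 95% describes how often the procedure succeeds over many repetitions.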
The confidence interval for a population mean takes the form:
\[ \text{Point Estimate} \pm \text{Margin of Error} \]
More specifically:
\[ \bar{x} \pm (\text{critical value}) \times \left(\frac{\text{standard deviation}}{\sqrt{n}}\right) \]
Here, \( \bar{x} \) is the sample mean, the critical value depends on the confidence level and the distribution used, and \( \frac{\text{standard deviation}}{\sqrt{n}} \) is called the standard error of the mean.
The standard error measures how much we expect sample means to vary from sample to sample. It's calculated as:
\[ SE = \frac{s}{\sqrt{n}} \]
where \( s \) is the sample standard deviation and \( n \) is the sample size. Notice that as the sample size increases, the standard error decreases: larger samples give more precise estimates.
In most real-world situations, we don't know the population standard deviation \( \sigma \). Instead, we use the sample standard deviation \( s \) as an estimate. When we do this, we use the t-distribution instead of the normal distribution to find our critical value.
The t-distribution is similar to the normal distribution but has heavier tails, meaning it accounts for the extra uncertainty that comes from estimating the standard deviation. The shape of the t-distribution depends on the degrees of freedom (df), which for a single sample mean equals \( n - 1 \).
Example: A researcher wants to estimate the average amount of time high school students spend on homework each night.
She surveys 25 students and finds a sample mean of 82 minutes with a sample standard deviation of 15 minutes.
Construct a 95% confidence interval for the true average homework time.
Solution:
Step 1: Identify the given values
\( \bar{x} = 82 \) minutes, \( s = 15 \) minutes, \( n = 25 \), confidence level = 95%
Step 2: Calculate degrees of freedom
df = \( n - 1 = 25 - 1 = 24 \)
Step 3: Find the critical value \( t^* \)
For 95% confidence and df = 24, from a t-table: \( t^* = 2.064 \)
Step 4: Calculate the standard error
\( SE = \frac{s}{\sqrt{n}} = \frac{15}{\sqrt{25}} = \frac{15}{5} = 3 \) minutes
Step 5: Calculate the margin of error
\( ME = t^* \times SE = 2.064 \times 3 = 6.192 \) minutes
Step 6: Construct the interval
Lower bound = \( 82 - 6.192 = 75.808 \) minutes
Upper bound = \( 82 + 6.192 = 88.192 \) minutes
We are 95% confident that the true average homework time for all high school students is between 75.8 and 88.2 minutes.
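The six steps above can be sketched in a few lines of Python. The function name `mean_confidence_interval` is a hypothetical helper, not part of any library, and the critical value is passed in by hand (here the t-table value 2.064 for df = 24) because Python's standard library has no t-distribution.

```python
import math

def mean_confidence_interval(x_bar, s, n, t_star):
    """Confidence interval for a mean: x_bar +/- t_star * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)          # Step 4
    margin_of_error = t_star * standard_error  # Step 5
    return x_bar - margin_of_error, x_bar + margin_of_error  # Step 6

# Homework example: x_bar = 82, s = 15, n = 25, t* = 2.064 (df = 24).
lower, upper = mean_confidence_interval(x_bar=82, s=15, n=25, t_star=2.064)
print(f"({lower:.3f}, {upper:.3f})")  # (75.808, 88.192)
```

If SciPy happens to be installed, `scipy.stats.t.ppf(0.975, df=24)` should return approximately the same critical value as the table lookup.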
The width of a confidence interval tells us how precise our estimate is. A narrower interval means a more precise estimate. Three main factors affect the width:
Higher confidence levels produce wider intervals. If you want to be more confident that your interval contains the true parameter, you need to cast a wider net. For example, a 99% confidence interval will be wider than a 95% confidence interval for the same data.
Larger samples produce narrower intervals. Since standard error equals \( \frac{s}{\sqrt{n}} \), increasing \( n \) decreases the standard error, which decreases the margin of error. Doubling your sample size doesn't halve the interval width, though; you'd need to quadruple the sample size to halve the margin of error.
More variability (larger standard deviation \( s \)) produces wider intervals. If the data values are very spread out, our estimate of the mean is less precise. If the data are tightly clustered, our estimate is more precise.
Think of shooting arrows at a target. If your arrows land all over the place (high variability), you're less sure where the center should be. If they cluster tightly (low variability), you can pinpoint the center more accurately.
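A quick numerical check of the three factors, as a rough sketch. It uses \( z^* \) rather than \( t^* \) as the critical value to stay within the standard library (an approximation for small samples), and the baseline values \( s = 15 \), \( n = 25 \) are borrowed from the homework example.

```python
import math
from statistics import NormalDist

def margin_of_error(s, n, confidence=0.95):
    """Approximate margin of error for a mean, using z* as the critical value."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * s / math.sqrt(n)

base = margin_of_error(s=15, n=25)
doubled_n = margin_of_error(s=15, n=50)         # shrinks by a factor of sqrt(2)
quadrupled_n = margin_of_error(s=15, n=100)     # exactly half of base
higher_conf = margin_of_error(s=15, n=25, confidence=0.99)  # wider
larger_s = margin_of_error(s=30, n=25)          # twice as wide

for label, value in [("baseline", base), ("n doubled", doubled_n),
                     ("n quadrupled", quadrupled_n),
                     ("99% confidence", higher_conf), ("s doubled", larger_s)]:
    print(f"{label:>15}: {value:.2f}")
```

Comparing the printed values against the baseline shows each effect in isolation: only quadrupling \( n \) cuts the margin in half, while raising the confidence level or the spread widens it.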
Besides estimating means, we often want to estimate proportions. For example, what proportion of voters support a particular candidate? What percentage of smartphones have a certain defect?
A proportion is a value between 0 and 1 that represents the fraction of a population with a particular characteristic. We denote the population proportion by \( p \) and the sample proportion by \( \hat{p} \) (read as "p-hat").
The confidence interval for a population proportion is:
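\[ \hat{p} \pm z^* \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
where \( \hat{p} \) is the sample proportion, \( z^* \) is the critical value for the chosen confidence level, and \( n \) is the sample size.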
For proportions, we typically use the z-distribution (standard normal) rather than the t-distribution, provided the sample size is large enough. A common rule of thumb is that both \( n\hat{p} \) and \( n(1-\hat{p}) \) should be at least 10.
| Confidence Level | Critical Value z* |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
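The table values can be reproduced from the standard normal distribution. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is a made-up name for illustration. A two-sided interval at confidence level \( C \) leaves \( (1 - C)/2 \) in each tail, so \( z^* \) is the quantile at \( 1 - (1 - C)/2 \).

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z* for the given confidence level."""
    upper_tail_point = 1 - (1 - confidence) / 2
    return NormalDist().inv_cdf(upper_tail_point)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_critical(level):.3f}")
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```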
Example: A poll of 400 randomly selected voters finds that 220 support a new city ordinance.
Construct a 95% confidence interval for the true proportion of all voters who support the ordinance.
Solution:
Step 1: Calculate the sample proportion
\( \hat{p} = \frac{220}{400} = 0.55 \)
Step 2: Check conditions
\( n\hat{p} = 400 \times 0.55 = 220 \geq 10 \) ✓
\( n(1-\hat{p}) = 400 \times 0.45 = 180 \geq 10 \) ✓
Conditions are satisfied.
Step 3: Find the critical value
For 95% confidence: \( z^* = 1.960 \)
Step 4: Calculate the standard error
\( SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.55 \times 0.45}{400}} = \sqrt{\frac{0.2475}{400}} = \sqrt{0.00061875} \approx 0.0249 \)
Step 5: Calculate the margin of error
\( ME = z^* \times SE = 1.960 \times 0.0249 \approx 0.0488 \)
Step 6: Construct the interval
Lower bound = \( 0.55 - 0.0488 = 0.5012 \)
Upper bound = \( 0.55 + 0.0488 = 0.5988 \)
Converting to percentages: (50.12%, 59.88%)
We are 95% confident that the true proportion of all voters who support the ordinance is between 50.1% and 59.9%.
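The same calculation can be sketched in Python. The function name `proportion_confidence_interval` is hypothetical, and the condition check mirrors the rule of thumb above (note that \( n\hat{p} \) is just the count of successes and \( n(1-\hat{p}) \) the count of failures).

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """z-based confidence interval for a population proportion."""
    # Rule of thumb: at least 10 successes and 10 failures.
    if successes < 10 or n - successes < 10:
        raise ValueError("sample too small for the normal approximation")
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

# Poll example: 220 of 400 voters support the ordinance.
lower, upper = proportion_confidence_interval(220, 400)
print(f"({lower:.4f}, {upper:.4f})")  # (0.5012, 0.5988)
```

The code carries full precision through each step instead of rounding \( SE \) to 0.0249 first, but the bounds agree with the hand calculation to four decimal places.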
Confidence intervals are powerful tools for making informed decisions. They help us determine whether differences we observe are meaningful or could just be due to random chance.
Suppose a company claims their batteries last an average of 100 hours. You test a sample and construct a 95% confidence interval of (94, 98) hours. Since 100 is not in this interval, you have evidence that the true mean is likely less than the company claims.
When confidence intervals from two different groups don't overlap, this suggests a real difference between the groups. However, if they do overlap, the comparison is inconclusive: the difference may or may not be statistically meaningful. For a more rigorous comparison, statisticians construct confidence intervals for the difference between two means or proportions.
Example: A school district tests a new reading program in 30 classrooms and compares results to 30 classrooms using the traditional method.
The new program group has a mean score of 78 with a 95% CI of (75, 81).
The traditional group has a mean score of 72 with a 95% CI of (69, 75).
What can we conclude about the effectiveness of the new program?
Solution:
Step 1: Examine the confidence intervals
New program: (75, 81)
Traditional: (69, 75)
Step 2: Check for overlap
The intervals touch at a single endpoint but do not overlap: the lower bound for the new program (75) equals the upper bound for the traditional method (75), so the interiors of the intervals share no values.
Step 3: Draw a conclusion
Since the confidence intervals barely touch and don't overlap substantially, we have evidence that the new reading program produces higher average scores than the traditional method.
We can conclude with reasonable confidence that the new program is more effective than the traditional method.
The margin of error (ME) is half the width of the confidence interval. At the stated confidence level, it represents the largest difference we expect between the sample statistic and the true population parameter. Smaller margins of error give more precise estimates.
Sometimes we want to determine how large a sample we need to achieve a specific margin of error. We can rearrange the confidence interval formula to solve for \( n \).
To find the required sample size for a desired margin of error \( ME \) when estimating a mean:
\[ n = \left(\frac{z^* \times \sigma}{ME}\right)^2 \]
Since we often don't know \( \sigma \), we use an estimate from a pilot study or previous research. We use \( z^* \) here as an approximation; for more precision with smaller samples, use \( t^* \), but this requires iteration since \( t^* \) depends on \( n \).
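The rearrangement can be written out step by step. Starting from the margin of error for a mean, with \( z^* \) as the critical value and \( \sigma \) standing in for the standard deviation, isolate \( \sqrt{n} \) and then square both sides:
\[ ME = z^* \times \frac{\sigma}{\sqrt{n}} \quad \Rightarrow \quad \sqrt{n} = \frac{z^* \times \sigma}{ME} \quad \Rightarrow \quad n = \left(\frac{z^* \times \sigma}{ME}\right)^2 \]
Because \( n \) must be a whole number and the margin of error must not exceed the target, the result is always rounded up.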
To find the required sample size for a desired margin of error when estimating a proportion:
\[ n = \left(\frac{z^*}{ME}\right)^2 \times \hat{p}(1-\hat{p}) \]
If you don't have an estimate for \( \hat{p} \), use \( \hat{p} = 0.5 \), which gives the maximum possible value of \( \hat{p}(1-\hat{p}) \) and thus the largest (most conservative) sample size.
Example: A researcher wants to estimate the proportion of adults who exercise regularly, with a margin of error of no more than 3 percentage points (0.03) at 95% confidence.
How large a sample is needed if no prior estimate is available?
Solution:
Step 1: Identify the given information
\( ME = 0.03 \), confidence level = 95%, so \( z^* = 1.960 \)
No prior estimate, so use \( \hat{p} = 0.5 \)
Step 2: Apply the sample size formula
\( n = \left(\frac{z^*}{ME}\right)^2 \times \hat{p}(1-\hat{p}) \)
Step 3: Substitute values
\( n = \left(\frac{1.960}{0.03}\right)^2 \times 0.5 \times 0.5 \)
\( n = (65.333)^2 \times 0.25 \)
\( n = 4268.44 \times 0.25 \)
\( n = 1067.11 \)
Step 4: Round up to the nearest whole number
\( n = 1068 \)
The researcher needs a sample of at least 1068 adults to achieve the desired margin of error.
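Both sample size formulas can be sketched as small Python helpers. The function names are hypothetical, and the values used for the mean (\( \sigma = 15 \), \( ME = 5 \)) are made up purely to show the call.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z* for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_for_proportion(me, confidence=0.95, p_hat=0.5):
    """Smallest n with margin of error <= me; p_hat=0.5 is the conservative default."""
    z_star = z_critical(confidence)
    return math.ceil((z_star / me) ** 2 * p_hat * (1 - p_hat))

def sample_size_for_mean(me, sigma, confidence=0.95):
    """Smallest n with margin of error <= me, given an estimate of sigma."""
    z_star = z_critical(confidence)
    return math.ceil((z_star * sigma / me) ** 2)

# Exercise example: ME = 0.03 at 95% confidence, no prior estimate.
print(sample_size_for_proportion(0.03))    # 1068
# Illustrative values only: sigma = 15, ME = 5.
print(sample_size_for_mean(5, sigma=15))   # 35
```

The code uses an unrounded \( z^* \), so its intermediate value (about 1067.07) differs slightly from the hand calculation (1067.11), but both round up to 1068.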
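For confidence intervals to be valid, certain conditions must be met: the data should come from a random sample that represents the population; for a mean based on the t-distribution, the population should be roughly normal or the sample reasonably large; and for a proportion, both \( n\hat{p} \) and \( n(1-\hat{p}) \) should be at least 10.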
Always check these conditions before constructing a confidence interval. If they're not met, the interval may not be reliable.
The choice of confidence level depends on the consequences of being wrong. In medical research or engineering applications where safety is critical, researchers often use 99% confidence. In social science research, 95% is standard. In preliminary or exploratory studies, 90% might suffice.
Remember the trade-off: higher confidence means wider intervals (less precision), and lower confidence means narrower intervals (more precision but less certainty).
When reporting a confidence interval, always include the confidence level, the parameter being estimated, and the interval endpoints with their units.
For example: "The 95% confidence interval for the average battery life is (94, 98) hours."
Confidence intervals assume your sampling method is sound. No statistical technique can fix a biased sample. If your sample systematically excludes certain groups or over-represents others, your confidence interval won't accurately reflect the population, no matter how large your sample.
A confidence interval built on biased data is like a precise map of the wrong location: the precision doesn't help if you're looking in the wrong place.