In the real world, many things vary in predictable patterns. Heights of adults, test scores, measurement errors, and even the weights of apples from an orchard tend to cluster around an average value, with fewer and fewer observations as you move farther away from that average. The normal distribution, also called the Gaussian distribution or bell curve, is a mathematical model that describes this pattern precisely. Understanding normal distributions allows you to make predictions, calculate probabilities, and interpret data in fields ranging from biology and psychology to quality control and economics.
A normal distribution is a continuous probability distribution that is symmetric and bell-shaped. It is completely determined by two parameters: the mean (denoted by the Greek letter μ, pronounced "mu") and the standard deviation (denoted by the Greek letter σ, pronounced "sigma").
The mean μ tells you the center of the distribution: the value around which data tends to cluster. The standard deviation σ measures how spread out the data is. A small σ means the data is tightly packed around the mean; a large σ means the data is more spread out.
We write that a variable \( X \) follows a normal distribution as:
\[ X \sim N(\mu, \sigma^2) \]
The symbol \( \sim \) means "is distributed as," and \( N(\mu, \sigma^2) \) indicates a normal distribution with mean μ and variance σ².
The shape of a normal distribution is defined mathematically by its probability density function (PDF):
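\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}} \]
In this formula, μ is the mean, σ is the standard deviation, π ≈ 3.14159, and e ≈ 2.71828 is the base of the natural logarithm.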
You do not need to compute this formula by hand in practice. Instead, you will use tables, calculators, or software to find probabilities. However, understanding that the formula exists helps you appreciate why the distribution has its characteristic shape.
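To make the formula concrete, here is a minimal Python sketch that evaluates the density term by term and compares it with the standard library's `statistics.NormalDist`. The function name `normal_pdf` and the sample values (mean 65, standard deviation 3) are illustrative assumptions, not part of the lesson.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x, written directly from the formula."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Illustrative values (an assumption): mean 65, standard deviation 3.
peak = normal_pdf(65, 65, 3)      # density at the mean, the top of the bell
one_sd = normal_pdf(68, 65, 3)    # density one standard deviation above the mean
library_peak = NormalDist(65, 3).pdf(65)  # same quantity from the standard library

print(peak, one_sd, library_peak)
```

The density is highest at the mean and takes the same value at equal distances on either side, which is where the symmetric bell shape comes from.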
A special case of the normal distribution is the standard normal distribution, which has mean μ = 0 and standard deviation σ = 1. We write this as:
\[ Z \sim N(0, 1) \]
A value of the variable \( Z \) is often called a z-score or standard score. The standard normal distribution is extremely useful because any normal distribution can be converted to it using a simple transformation.
To convert a value \( x \) from a normal distribution with mean μ and standard deviation σ to a z-score, use the formula:
\[ z = \frac{x - \mu}{\sigma} \]
This formula tells you how many standard deviations \( x \) is above or below the mean. A positive z-score means \( x \) is above the mean; a negative z-score means \( x \) is below the mean.
Example: The heights of adult women in a population are normally distributed with a mean of 65 inches and a standard deviation of 3 inches.
What is the z-score for a woman who is 71 inches tall?
Solution:
Given: μ = 65 inches, σ = 3 inches, x = 71 inches
Use the z-score formula:
\( z = \frac{x - \mu}{\sigma} = \frac{71 - 65}{3} = \frac{6}{3} = 2 \)
The z-score is 2, meaning this woman's height is 2 standard deviations above the mean.
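The same calculation can be sketched in a few lines of Python. The helper name `z_score` is an assumption chosen for illustration; the second call uses a hypothetical height of 62 inches to show a negative result.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Heights example: mean 65 inches, standard deviation 3 inches.
z_tall = z_score(71, 65, 3)    # 6 inches above the mean
z_short = z_score(62, 65, 3)   # hypothetical value, 3 inches below the mean

print(z_tall, z_short)  # 2.0 -1.0
```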
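For any normal distribution, a simple rule describes how data is spread around the mean. This is called the Empirical Rule or the 68-95-99.7 Rule:
About 68% of the data lies within one standard deviation of the mean (between μ - σ and μ + σ).
About 95% of the data lies within two standard deviations of the mean (between μ - 2σ and μ + 2σ).
About 99.7% of the data lies within three standard deviations of the mean (between μ - 3σ and μ + 3σ).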
This rule is a powerful tool for quickly estimating probabilities without using tables or technology.
Example: SAT math scores are normally distributed with a mean of 500 and a standard deviation of 100.
What percentage of students score between 400 and 600?
Solution:
400 is one standard deviation below the mean: 500 - 100 = 400
600 is one standard deviation above the mean: 500 + 100 = 600
By the Empirical Rule, approximately 68% of students score between 400 and 600.
Example: IQ scores are normally distributed with a mean of 100 and a standard deviation of 15.
What percentage of people have IQ scores between 70 and 130?
Solution:
70 is two standard deviations below the mean: 100 - 2(15) = 100 - 30 = 70
130 is two standard deviations above the mean: 100 + 2(15) = 100 + 30 = 130
By the Empirical Rule, approximately 95% of people have IQ scores between 70 and 130.
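The percentages in the Empirical Rule are rounded values of exact areas under the curve. As a rough check, the sketch below computes those areas with Python's `statistics.NormalDist`; the helper name `within` is an assumption made for this illustration.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal distribution


def within(k: float) -> float:
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(k, round(within(k), 4))  # 0.6827, 0.9545, 0.9973

# The same areas apply on any scale, e.g. the SAT and IQ examples above.
sat = NormalDist(500, 100)
iq = NormalDist(100, 15)
sat_400_600 = sat.cdf(600) - sat.cdf(400)
iq_70_130 = iq.cdf(130) - iq.cdf(70)
print(round(sat_400_600, 4), round(iq_70_130, 4))
```

The exact two-standard-deviation figure is closer to 95.45% than 95%, which is why the rule is described as an approximation.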
The standard normal table (also called the z-table) provides the area under the standard normal curve to the left of a given z-score. This area represents the probability that \( Z \) is less than or equal to that z-score.
Most z-tables give \( P(Z \leq z) \), the cumulative probability from the far left tail up to \( z \).
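To find \( P(Z \leq z) \), locate the row that matches the ones and tenths digits of \( z \), then the column that matches the hundredths digit. The entry where that row and column meet is the cumulative probability.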
Example: Find the probability that a standard normal variable \( Z \) is less than 1.25.
What is \( P(Z \leq 1.25) \)?
Solution:
Locate z = 1.25 in the standard normal table.
Row for 1.2, column for 0.05.
The table gives approximately 0.8944.
Therefore, \( P(Z \leq 1.25) = \) 0.8944, or about 89.44%.
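Each entry in a z-table is a value of the standard normal cumulative distribution function. One common way to compute it is through the error function, using the identity \( P(Z \leq z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right) \). The sketch below applies it with Python's `math.erf`; the function name `phi` is an assumption for illustration.

```python
import math


def phi(z: float) -> float:
    """Cumulative probability P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


p = phi(1.25)
print(round(p, 4))  # 0.8944, matching the table lookup above
```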
Because the total area under the curve is 1, we have:
\[ P(Z > z) = 1 - P(Z \leq z) \]
Example: Find the probability that \( Z \) is greater than 0.75.
What is \( P(Z > 0.75) \)?
Solution:
From the z-table, \( P(Z \leq 0.75) \approx 0.7734 \).
Use the complement rule:
\( P(Z > 0.75) = 1 - 0.7734 = 0.2266 \)
Therefore, \( P(Z > 0.75) = \) 0.2266, or about 22.66%.
To find \( P(a < Z < b) \), subtract the cumulative probability at \( a \) from the cumulative probability at \( b \):
\[ P(a < Z < b) = P(Z \leq b) - P(Z \leq a) \]
Example: Find the probability that \( Z \) is between -1.0 and 1.5.
What is \( P(-1.0 < Z < 1.5) \)?
Solution:
From the z-table, \( P(Z \leq 1.5) \approx 0.9332 \).
From the z-table, \( P(Z \leq -1.0) \approx 0.1587 \).
\( P(-1.0 < Z < 1.5) = 0.9332 - 0.1587 = 0.7745 \)
Therefore, \( P(-1.0 < Z < 1.5) = \) 0.7745, or about 77.45%.
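Both the complement rule and the subtraction rule translate directly into code. The following sketch uses Python's `statistics.NormalDist` in place of the table; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # defaults to mean 0, standard deviation 1

# Right tail: P(Z > 0.75) = 1 - P(Z <= 0.75)
upper_tail = 1 - Z.cdf(0.75)

# Interval: P(-1.0 < Z < 1.5) = P(Z <= 1.5) - P(Z <= -1.0)
between = Z.cdf(1.5) - Z.cdf(-1.0)

print(round(upper_tail, 4), round(between, 4))  # 0.2266 0.7745
```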
When working with a normal distribution that is not standard (μ ≠ 0 or σ ≠ 1), convert the values to z-scores first, then use the z-table.
Example: The amount of soda dispensed by a machine is normally distributed with a mean of 12 ounces and a standard deviation of 0.5 ounces.
What is the probability that a randomly selected cup contains more than 13 ounces?
Solution:
Given: μ = 12, σ = 0.5, x = 13
Convert to a z-score: \( z = \frac{13 - 12}{0.5} = \frac{1}{0.5} = 2.0 \)
From the z-table, \( P(Z \leq 2.0) \approx 0.9772 \).
\( P(Z > 2.0) = 1 - 0.9772 = 0.0228 \)
The probability that a cup contains more than 13 ounces is 0.0228, or about 2.28%.
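As a cross-check of the soda example, the sketch below computes the probability two ways: by standardizing first, as in the solution, and by working on the original scale. It assumes Python's `statistics.NormalDist`, which takes the standard deviation (not the variance) as its second argument.

```python
from statistics import NormalDist

mu, sigma, x = 12, 0.5, 13

# Route 1: convert to a z-score, then use the standard normal distribution.
z = (x - mu) / sigma
p_via_z = 1 - NormalDist(0, 1).cdf(z)

# Route 2: evaluate the original distribution directly.
p_direct = 1 - NormalDist(mu, sigma).cdf(x)

print(z, round(p_via_z, 4), round(p_direct, 4))  # 2.0 0.0228 0.0228
```

The two routes agree because standardizing only relabels the horizontal axis; the area in the tail is unchanged.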
Sometimes you know the probability and need to find the corresponding value or z-score. This is called an inverse normal problem or finding a percentile.
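To find the z-score corresponding to a cumulative probability \( P(Z \leq z) = p \), search the body of the z-table for the entry closest to \( p \), then read the z-score from that entry's row and column.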
Example: Find the z-score such that 90% of the standard normal distribution lies to the left of it.
What is the value of \( z \) such that \( P(Z \leq z) = 0.90 \)?
Solution:
Look in the body of the z-table for the probability closest to 0.90.
The closest value is 0.8997, corresponding to z = 1.28.
Alternatively, 0.9015 corresponds to z = 1.29.
Using linear interpolation or a calculator, the z-score is approximately 1.28.
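The table search and the interpolation step can be sketched as follows. The code assumes Python's `statistics.NormalDist`, whose `inv_cdf` method returns the value with a given cumulative probability; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)

# The two neighbouring table entries quoted in the example.
p_128 = Z.cdf(1.28)
p_129 = Z.cdf(1.29)

# Linear interpolation between the rounded table entries 0.8997 and 0.9015.
z_interp = 1.28 + (0.90 - 0.8997) / (0.9015 - 0.8997) * 0.01

# Direct inverse lookup, the software equivalent of reading the table backwards.
z_exact = Z.inv_cdf(0.90)

print(round(p_128, 4), round(p_129, 4))
print(round(z_interp, 4), round(z_exact, 4))
```

Interpolation and the direct inverse agree to about three decimal places, and both round to the 1.28 used in the example.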
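To find a value \( x \) given a probability when \( X \sim N(\mu, \sigma^2) \), first find the z-score for that probability, then convert back to the original scale with \( x = \mu + z\sigma \).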
Example: Test scores are normally distributed with a mean of 75 and a standard deviation of 10.
What score represents the 85th percentile? Find the score such that 85% of students score below it.
Solution:
Find the z-score such that \( P(Z \leq z) = 0.85 \).
From the z-table, the closest probability is 0.8508, corresponding to z ≈ 1.04.
Convert to the original scale: \( x = \mu + z\sigma = 75 + 1.04(10) = 75 + 10.4 = 85.4 \)
The 85th percentile score is approximately 85.4.
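The two-step percentile calculation can be sketched in Python as shown here, again assuming `statistics.NormalDist`. It compares the table-based route (with z rounded to 1.04) against a direct inverse lookup on the original scale.

```python
from statistics import NormalDist

mu, sigma, p = 75, 10, 0.85

# Step 1: z-score with 85% of the standard normal distribution to its left.
z = NormalDist(0, 1).inv_cdf(p)

# Step 2: convert back to the original scale, x = mu + z * sigma.
x_from_z = mu + z * sigma

# Same answer in one step on the original distribution.
x_direct = NormalDist(mu, sigma).inv_cdf(p)

# Table-based value from the worked solution, using z rounded to 1.04.
x_table = mu + 1.04 * sigma

print(round(z, 4), round(x_from_z, 2), round(x_direct, 2), round(x_table, 1))
```

The unrounded z-score gives a score near 85.36, which agrees with the table-based 85.4 to one decimal place.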
Normal distributions appear throughout statistics and real-world phenomena. Here are some common applications:
Manufacturing processes often produce items whose measurements (length, weight, diameter) vary normally around a target value. Companies use normal distribution models to set acceptable tolerance ranges and identify defective products.
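For example, if bolts produced by a machine have lengths normally distributed with a mean of 5.0 cm and a standard deviation of 0.1 cm, a quality control engineer can calculate the percentage of bolts that fall outside the acceptable limits, say shorter than 4.8 cm or longer than 5.2 cm. Because these limits sit two standard deviations from the mean, the Empirical Rule puts that share at roughly 5%.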
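A quality control calculation of this kind might look like the sketch below. The tolerance limits of 4.8 cm and 5.2 cm come from the example above; the variable names and the use of `statistics.NormalDist` are assumptions for illustration.

```python
from statistics import NormalDist

bolts = NormalDist(5.0, 0.1)   # bolt length in cm: mean 5.0, standard deviation 0.1
lower, upper = 4.8, 5.2        # tolerance limits from the example

within_limits = bolts.cdf(upper) - bolts.cdf(lower)
outside_limits = 1 - within_limits

print(round(within_limits, 4), round(outside_limits, 4))  # 0.9545 0.0455
```

Under this model, about 4.55% of bolts would be rejected, close to the 5% suggested by the Empirical Rule.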
Many standardized tests (SAT, ACT, IQ tests) are designed so that scores follow a normal distribution. This allows educators and psychologists to compare individual performance to the population and identify percentiles.
Heights, weights, blood pressure readings, and many biological traits in large populations approximate normal distributions. This enables researchers to model populations and make predictions.
Measurement errors in scientific experiments often follow a normal distribution centered at zero. Understanding this helps scientists quantify uncertainty and improve experimental design.
If \( X \sim N(\mu, \sigma^2) \) and you apply a linear transformation \( Y = aX + b \), where \( a \) and \( b \) are constants, then \( Y \) is also normally distributed:
\[ Y \sim N(a\mu + b, a^2\sigma^2) \]
This property means that adding a constant \( b \) shifts the mean without changing the spread, while multiplying by a constant \( a \) scales the mean by \( a \) and the standard deviation by \( |a| \).
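Python's `statistics.NormalDist` supports multiplying by and adding a constant, so the linear transformation rule can be checked directly. The numbers below are hypothetical: a temperature in degrees Celsius with mean 20 and standard deviation 5, converted to Fahrenheit with \( a = 1.8 \) and \( b = 32 \), plus a second case with a negative multiplier.

```python
from statistics import NormalDist

# Hypothetical example: temperature in Celsius, X ~ N(20, 5^2).
X = NormalDist(20, 5)

# Y = 1.8 X + 32 converts to Fahrenheit.
Y = X * 1.8 + 32
print(Y.mean, Y.stdev)        # mean 1.8 * 20 + 32 = 68, standard deviation 1.8 * 5 = 9

# A negative multiplier flips the mean's sign but not the spread's.
W = X * -2 + 1
print(W.mean, W.stdev, W.variance)   # mean -39, standard deviation 10, variance 100
```

The variance of the second result is \( a^2\sigma^2 = 4 \times 25 = 100 \), so the standard deviation is 10 rather than -10.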
If \( X_1 \sim N(\mu_1, \sigma_1^2) \) and \( X_2 \sim N(\mu_2, \sigma_2^2) \) are independent, then their sum is also normal:
\[ X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2) \]
Notice that variances add, not standard deviations.
Example: Two independent machines produce parts.
Machine A produces parts with weights \( N(50, 4) \) grams.
Machine B produces parts with weights \( N(30, 9) \) grams.
What is distribution of the total weight of one part from each machine? That is, what is the distribution of the total weight?
Solution:
Let \( X_A \sim N(50, 4) \) and \( X_B \sim N(30, 9) \).
Total weight \( W = X_A + X_B \).
Mean: \( \mu_W = 50 + 30 = 80 \) grams
Variance: \( \sigma_W^2 = 4 + 9 = 13 \), so \( \sigma_W = \sqrt{13} \approx 3.61 \) grams
The total weight is distributed as \( N(80, 13) \), with mean 80 grams and standard deviation approximately 3.61 grams.
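The machine example can be verified with a short sketch. One caution, stated as an assumption about notation: the text writes \( N(\mu, \sigma^2) \) with the variance as the second parameter, whereas Python's `statistics.NormalDist` expects the standard deviation, so the variances 4 and 9 are entered as 2 and 3. Adding two `NormalDist` objects treats them as independent.

```python
import math
from statistics import NormalDist

# N(50, 4) and N(30, 9) in the text's notation: variances 4 and 9,
# so the standard deviations passed to NormalDist are 2 and 3.
machine_a = NormalDist(50, math.sqrt(4))
machine_b = NormalDist(30, math.sqrt(9))

# Sum of independent normal variables.
total = machine_a + machine_b

print(total.mean, round(total.variance, 6), round(total.stdev, 2))  # 80.0 13.0 3.61

# Adding standard deviations instead would give 2 + 3 = 5, which is too large.
wrong_stdev = machine_a.stdev + machine_b.stdev
print(wrong_stdev)
```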
While z-tables are useful for understanding the mechanics of normal distribution calculations, modern practice relies heavily on technology. Graphing calculators, spreadsheet software (like Excel or Google Sheets), and statistical programs (such as R or Python) have built-in functions to find normal probabilities and inverse values quickly and accurately.
For example, many calculators have functions like normalcdf(lower, upper, μ, σ) to find probabilities and invNorm(probability, μ, σ) to find values corresponding to percentiles.
Learning to use these tools efficiently is an important skill, but understanding the underlying concepts ensures you can interpret results correctly and catch errors.
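For readers working in Python, the sketch below defines two hypothetical helpers, `normalcdf` and `inv_norm`, that mimic the argument order of the calculator-style functions mentioned above. They are illustrative wrappers around the standard library's `statistics.NormalDist`, not the calculators' own implementations.

```python
import math
from statistics import NormalDist


def normalcdf(lower: float, upper: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Probability that a N(mu, sigma^2) variable falls between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(probability: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Value with the given cumulative probability to its left."""
    return NormalDist(mu, sigma).inv_cdf(probability)


# Reworking earlier examples with the helpers.
p_sat = normalcdf(400, 600, 500, 100)      # SAT scores between 400 and 600
p_soda = normalcdf(13, math.inf, 12, 0.5)  # more than 13 ounces (open upper end)
score_85 = inv_norm(0.85, 75, 10)          # 85th percentile test score

print(round(p_sat, 4), round(p_soda, 4), round(score_85, 1))  # 0.6827 0.0228 85.4
```

Passing `math.inf` or `-math.inf` as a bound is one way to express a one-sided probability with a two-sided function.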
The normal distribution is a fundamental model in statistics characterized by its symmetric, bell-shaped curve. It is completely determined by its mean μ and standard deviation σ. The standard normal distribution \( N(0, 1) \) serves as a reference, and any normal variable can be converted to a z-score for easier probability calculations.
The Empirical Rule provides quick estimates: about 68% of data lies within one standard deviation of the mean, 95% within two, and 99.7% within three. For more precise probabilities, we use z-tables or technology.
Understanding normal distributions allows you to model real-world variation, make informed predictions, and interpret data across countless applications in science, business, education, and beyond.