Normal (Gaussian) Distribution
The Normal distribution, often called the "bell curve," is one of the most important probability distributions in statistics. It approximates many naturally occurring and measured quantities, and it underpins much of machine learning, hypothesis testing, and inferential statistics.
Core Concepts
If a random variable $X$ follows a Normal distribution, we write:
$$X \sim \mathcal{N}(\mu, \sigma^2)$$
Where:
- $\mu$ (mu) is the mean (which dictates the center of the bell).
- $\sigma^2$ (sigma squared) is the variance (which dictates the spread). The square root, $\sigma$, is the standard deviation.
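To make the notation concrete, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. Note that it is parameterized by the standard deviation, not the variance. The values `mu = 10` and `sigma = 2` are illustrative assumptions, not taken from the text.

```python
from statistics import NormalDist, fmean, stdev

mu, sigma = 10.0, 2.0  # hypothetical parameters chosen for illustration

# NormalDist takes the standard deviation sigma, not the variance sigma**2.
X = NormalDist(mu=mu, sigma=sigma)

print(X.mean)      # center of the bell
print(X.stdev)     # standard deviation (square root of the variance)
print(X.variance)  # sigma squared

# Draw random samples; their sample mean and sample standard deviation
# should land close to mu and sigma.
samples = X.samples(100_000, seed=42)
print(round(fmean(samples), 2), round(stdev(samples), 2))
```

With a large sample, the two printed estimates should sit close to the chosen mean and standard deviation, though they will not match exactly.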
Probability Density Function (PDF)
Don't panic! The equation below looks terrifying, but it defines the smooth shape of the bell curve exactly.
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$$
The formula looks complex, so let's break down what each piece does:
- The $e^{-\frac{1}{2}(\cdot)^2}$ part creates the exponential decay that forms the "bell" shape.
- The $\frac{x - \mu}{\sigma}$ part centers the curve at $\mu$ and scales its width by $\sigma$.
- The $\frac{1}{\sigma\sqrt{2\pi}}$ part is a normalizing constant. It ensures that the total area under the entire curve equals exactly $1$.
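The three pieces of the density (the standardized distance $(x - \mu)/\sigma$, the exponential decay, and the normalizing constant $1/(\sigma\sqrt{2\pi})$) can be written out one per line. The sketch below is an illustration rather than a reference implementation: the function names and the parameters `mu = 0`, `sigma = 1.5` are assumptions, and the area check uses a simple midpoint sum over $\mu \pm 10\sigma$.

```python
import math
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x, built term by term."""
    z = (x - mu) / sigma                                    # center at mu, scale by sigma
    bell = math.exp(-0.5 * z * z)                           # exponential decay (the bell)
    norm_const = 1.0 / (sigma * math.sqrt(2.0 * math.pi))   # makes the total area 1
    return norm_const * bell

def area_under_curve(mu: float, sigma: float, n_steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the area over mu +/- 10 sigma."""
    lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma
    dx = (hi - lo) / n_steps
    return sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(n_steps)) * dx

mu, sigma = 0.0, 1.5  # hypothetical parameters for illustration

# Compare against the standard library's implementation at one point.
ours = normal_pdf(1.0, mu, sigma)
reference = NormalDist(mu, sigma).pdf(1.0)
print(math.isclose(ours, reference, rel_tol=1e-9))

# The numerical area should be very close to 1.
area = area_under_curve(mu, sigma)
print(round(area, 6))
```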
The Empirical Rule (68-95-99.7)
One of the most useful heuristics in statistics applies to all Normal distributions:
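- Approximately $68\%$ of data falls within 1 standard deviation ($\mu \pm \sigma$) of the mean.
- Approximately $95\%$ of data falls within 2 standard deviations ($\mu \pm 2\sigma$) of the mean.
- Approximately $99.7\%$ of data falls within 3 standard deviations ($\mu \pm 3\sigma$) of the mean.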
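The rounded figures in the rule can be checked against exact values. For any Normal distribution, the probability of landing within $k$ standard deviations of the mean is $\operatorname{erf}(k/\sqrt{2})$, which does not depend on $\mu$ or $\sigma$. A minimal sketch, where the helper name `prob_within` is an assumption for illustration:

```python
import math

def prob_within(k: float) -> float:
    """P(|X - mu| < k * sigma) for any Normal distribution."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.2%}")
```

The exact values are roughly 68.27%, 95.45%, and 99.73%, which the rule rounds to 68, 95, and 99.7.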
Real-World Examples
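The Normal distribution shows up so often largely because of the Central Limit Theorem (covered later): quantities that are sums or averages of many small independent effects tend toward a Normal shape.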
- Human Heights: The heights of adult males in a country cluster around a mean and taper off symmetrically.
- Measurement Error: The tiny errors a machine makes when precisely measuring weights or lengths.
- Test Scores: Standardized tests like the SAT or GRE are scaled so that scores are approximately normally distributed.
Interactive Visualization
Below is an interactive graph.
- Drag the Mean ($\mu$) to shift the entire curve left or right along the x-axis. The shape does not change.
- Drag the Standard Deviation ($\sigma$) to see how variance impacts the curve. A smaller $\sigma$ makes the curve tall and spiked (data is highly predictable). A larger $\sigma$ makes the curve flat and wide (data is highly spread out).
[Interactive graph: Normal Distribution, with draggable $\mu$ and $\sigma$ controls that shift the curve and change its spread.]
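The two slider effects can also be checked numerically. The sketch below is an illustration under assumed values (the `sigma` values, the means 0 and 5, and the helper name `peak_height` are all hypothetical): it prints the height of the curve at its center for several standard deviations, then confirms that moving the mean only slides the curve without reshaping it.

```python
import math
from statistics import NormalDist

def peak_height(sigma: float, mu: float = 0.0) -> float:
    """Height of the N(mu, sigma^2) density at its center, x = mu."""
    return NormalDist(mu, sigma).pdf(mu)

# Changing sigma: a smaller sigma gives a taller, narrower curve.
for sigma in (0.5, 1.0, 2.0):
    print(f"sigma={sigma}: peak height = {peak_height(sigma):.4f}")

# Changing mu: the curve slides along the x-axis with the same shape.
# The density 0.7 units to the right of the center matches for both means.
base = NormalDist(0.0, 1.0).pdf(0.7)
shifted = NormalDist(5.0, 1.0).pdf(5.0 + 0.7)
print(math.isclose(base, shifted))
```

The peak height equals $1/(\sigma\sqrt{2\pi})$, so doubling $\sigma$ halves the height of the curve while the total area stays at 1.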
Test Your Knowledge
Example: Normal Curve Empirical Rule
A population of adult male heights is normally distributed with a mean $\mu = 70$ inches and a standard deviation $\sigma = 3$ inches. What percentage of men are between 64 inches and 76 inches tall?
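Step-by-Step Solution
Notice that 64 inches is exactly $\mu - 2\sigma$ ($70 - 2 \times 3 = 64$). Notice that 76 inches is exactly $\mu + 2\sigma$ ($70 + 2 \times 3 = 76$).
According to the Empirical Rule (68-95-99.7), approximately 95% of normally distributed data falls within 2 standard deviations of the mean, so approximately 95% of men are between 64 and 76 inches tall.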
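As a cross-check on the rule of thumb, the same probability can be computed from the cumulative distribution function. This sketch assumes $\mu = 70$ and $\sigma = 3$, the values implied by 64 and 76 inches each lying two standard deviations from the mean, and uses the standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

# Assumed parameters: mean 70 inches, standard deviation 3 inches.
heights = NormalDist(mu=70, sigma=3)

# P(64 < X < 76) is the CDF at the upper bound minus the CDF at the lower bound.
exact = heights.cdf(76) - heights.cdf(64)
print(f"{exact:.2%}")
```

The result is about 95.45%, which the Empirical Rule rounds to 95%.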