Student’s t-Distribution


The Student's t-distribution is one of the most widely used distributions in statistics, specifically designed for situations where you have a small sample size and you do not know the population's true standard deviation.

Fun Fact: It is called "Student's" because it was published by William Sealy Gosset in 1908 under the pseudonym "Student" while he was working for the Guinness brewery. He developed it to monitor the quality of small batches of stout!

Core Concepts

When you have a massive amount of data, the Central Limit Theorem allows you to use the Normal (Z) Distribution to calculate confidence intervals and p-values.

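However, when your sample size is small (typically $n < 30$) and you have to estimate the standard deviation from the sample itself, using the Normal distribution gives you false confidence. The sample standard deviation is a noisy estimate, and the Normal distribution ignores that extra uncertainty.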


The t-distribution compensates for this uncertainty by having "heavier tails" (more probability mass in the extremes) and a lower peak than the Normal distribution. This makes your confidence intervals wider and more conservative.
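To see the "false confidence" in action, here is a minimal simulation sketch. It assumes Normally distributed data with a true mean of 0 and standard deviation of 1, samples of size $n = 5$, and the standard table values 1.960 (Normal) and 2.776 (t with 4 degrees of freedom) for a 95% interval. The function name `coverage` and all of its parameters are made up for illustration.

```python
import random
import statistics

def coverage(n, critical_value, trials=10000, seed=0):
    """Fraction of intervals xbar +/- critical_value * s / sqrt(n)
    that actually contain the true mean."""
    rng = random.Random(seed)
    true_mean, true_sd = 0.0, 1.0  # assumed population, for illustration only
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # statistics.stdev uses the n - 1 denominator (sample standard deviation)
        half_width = critical_value * statistics.stdev(sample) / n ** 0.5
        if abs(xbar - true_mean) <= half_width:
            hits += 1
    return hits / trials

Z_CRIT = 1.960      # 97.5th percentile of the Standard Normal
T_CRIT_DF4 = 2.776  # 97.5th percentile of t with df = 4 (table value)

z_cov = coverage(n=5, critical_value=Z_CRIT)
t_cov = coverage(n=5, critical_value=T_CRIT_DF4)
print(f"Nominal 95% interval using z: covers the mean {z_cov:.1%} of the time")
print(f"Nominal 95% interval using t: covers the mean {t_cov:.1%} of the time")
```

Under these assumptions the z-based interval should cover the true mean noticeably less often than the advertised 95% (in theory about 88%), while the wider t-based interval should land close to 95%.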

The t-statistic

Similar to the Z-score, you calculate a t-score:

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

Here $\bar{x}$ is the sample mean, $\mu$ is the hypothesized population mean, $n$ is the sample size, and $s$ is your sample standard deviation (rather than the true population $\sigma$).
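The formula above can be sketched in a few lines of Python. This is only an illustration: the helper name `t_statistic` and the measurements are made up, and `statistics.stdev` is used because it divides by $n - 1$, which is the sample standard deviation $s$ the formula calls for.

```python
import math
import statistics

def t_statistic(sample, mu):
    """One-sample t-statistic: (xbar - mu) / (s / sqrt(n))."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    standard_error = s / math.sqrt(n)
    return (xbar - mu) / standard_error

# Hypothetical measurements, invented for this example
measurements = [9.8, 10.2, 10.4, 9.9, 10.7]
t = t_statistic(measurements, mu=10.0)
print(f"t = {t:.3f} with df = {len(measurements) - 1}")
```

A positive result means the sample mean sits above the hypothesized mean, and a negative one means it sits below.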

Degrees of Freedom (df)

The exact shape of the t-distribution is controlled entirely by a single parameter: degrees of freedom ($df$). For a one-sample test like the one above, this is simply your sample size minus one ($n - 1$).

Low df (e.g., 1 or 2): Very heavy tails. High uncertainty.
High df (e.g., 30+): The t-distribution becomes very close to the Standard Normal Distribution, although the two only coincide exactly in the limit as $df \to \infty$.
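You can check the effect of $df$ numerically. The sketch below evaluates the standard t-distribution density, $f(x) = \frac{\Gamma((df+1)/2)}{\sqrt{df\,\pi}\,\Gamma(df/2)} \left(1 + \frac{x^2}{df}\right)^{-(df+1)/2}$, at the peak ($x = 0$) and out in the tail ($x = 3$), and compares it with the Standard Normal density. The function names are illustrative, and the choice of $x = 3$ as "the tail" is arbitrary.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # Work in log space so large df values do not overflow the gamma function
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the Standard Normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

print(f"{'df':>6} {'peak f(0)':>10} {'tail f(3)':>10}")
for df in (1, 2, 5, 30, 1000):
    print(f"{df:>6} {t_pdf(0, df):>10.4f} {t_pdf(3, df):>10.4f}")
print(f"{'normal':>6} {normal_pdf(0):>10.4f} {normal_pdf(3):>10.4f}")
```

The peak column should rise toward the Normal value and the tail column should shrink toward it as $df$ grows, with a small gap still visible at $df = 30$.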

Key Applications

A/B Testing (t-tests): Comparing the average conversion rate of Design A vs. Design B when you only have traffic data from a few days.
Medical Trials: Testing the efficacy of a new drug on a small cohort of 15 patients against a control group.

Interactive Visualization

Use the slider below to adjust the Degrees of Freedom.

The solid purple curve is the t-Distribution.
The dashed grey line is the Standard Normal Distribution (Z) for reference.

Notice how at $df = 1$, the purple t-distribution has a much lower peak and much thicker tails. As you slide $df$ towards 30, watch the t-distribution morph upwards and inwards until it almost overlaps the Normal distribution!

[Interactive chart: Student’s t-Distribution, with a Degrees of Freedom (df) slider. As df increases, it approaches the normal distribution.]

Test Your Knowledge

Example: Calculating a t-Statistic

A sample of $n=16$ batteries has a mean lifespan of $\bar{x} = 95$ hours and a sample standard deviation of $s = 8$ hours. Test the hypothesis that the true population mean is $\mu = 100$ hours by calculating the t-statistic.

Step-by-Step Solution

Because the population standard deviation is unknown and the sample is small ( $n<30$ ), we use the t-statistic.

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

$t = \frac{95 - 100}{8 / \sqrt{16}} = \frac{-5}{8 / 4} = \frac{-5}{2} = -2.5$

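The sample mean is 2.5 standard errors below the hypothesized mean. With $df = 15$, the two-sided critical value at the 5% level is about 2.131, so $|t| = 2.5$ is statistically significant at the 5% level. It is not significant at the 1% level, where the critical value is about 2.947, so the two-sided p-value lies between 0.02 and 0.05.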
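If you want an actual p-value rather than a table lookup, one option is to integrate the t density numerically. The sketch below uses only the Python standard library and Simpson's rule. The function names and the step count are illustrative choices, and in practice a statistics library would normally do this for you.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on the density over [0, |t|].
    steps must be even for Simpson's rule."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t|)
    return 1 - 2 * central_half   # both tails, by symmetry

# Battery example from above
n, xbar, mu, s = 16, 95, 100, 8
t = (xbar - mu) / (s / math.sqrt(n))
p = two_sided_p_value(t, df=n - 1)
print(f"t = {t:.2f}, df = {n - 1}, two-sided p = {p:.4f}")
```

The printed p-value should come out at roughly 0.025, which is below 0.05 but above 0.01.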
