The Mathematical Descriptions

To work with probability distributions, statisticians have developed three core mathematical functions. Depending on whether your random variable is discrete (counting) or continuous (measuring), you will use different functions.

1. Probability Mass Function (PMF)

Used for: Discrete Random Variables.

The PMF gives the probability that a discrete random variable is exactly equal to some value. Because the variable is discrete, you can think of the PMF as assigning a "mass" of probability to specific, isolated points.

Notation:

$$P(X = x) = p(x)$$

Rules:

  1. Every individual probability must be between 0 and 1: $0 \le p(x) \le 1$
  2. The sum of all probabilities across all possible outcomes must equal exactly 1: $\sum p(x) = 1$

Example: The PMF of rolling a fair 6-sided die is $p(x) = 1/6$ for $x \in \{1, 2, 3, 4, 5, 6\}$.
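
The die example and the two PMF rules can be checked with a short script. This is a minimal sketch, not a definitive implementation: the names `die_pmf` and `is_valid_pmf` are hypothetical, and exact fractions are used only so that the sum comes out to exactly 1 without floating-point rounding.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face gets probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_pmf(pmf):
    """Check the two PMF rules: each p(x) lies in [0, 1] and all p(x) sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return each_in_range and sums_to_one

print(is_valid_pmf(die_pmf))        # True
print(die_pmf[3])                   # 1/6
print(die_pmf.get(7, Fraction(0)))  # 0, because 7 is not a possible outcome
```

A table that breaks either rule, such as a single outcome with probability 1/2, fails the check.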


2. Probability Density Function (PDF)

Used for: Continuous Random Variables.

Warning: The PDF does not output a direct probability! A density value $f(x)$ can even be greater than 1; only areas under the curve are probabilities.

The PDF is the continuous equivalent of the PMF, but it requires a paradigm shift. For a continuous variable (like exact human height), the probability of someone being exactly 180.00000000... cm tall is technically zero!

Instead of looking at exact points, the PDF describes the probability density. To find an actual probability, you must calculate the area under the curve over a specific range using calculus (integration).

Notation:

$$f(x)$$

To find the probability of falling between $a$ and $b$:

$$P(a \le X \le b) = \int_{a}^{b} f(x) \, dx$$

Rules:

  1. Density can never be negative: $f(x) \ge 0$
  2. The total area under the entire curve must equal exactly 1: $\int_{-\infty}^{\infty} f(x) \, dx = 1$
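
To see why a density is not a probability, it can help to integrate one numerically. The sketch below assumes a normal distribution with mean 0 and standard deviation 0.1 (an illustrative choice, not taken from the text) and uses a simple midpoint rule; `normal_pdf` and `integrate` are hypothetical helper names.

```python
import math

def normal_pdf(x, mu=0.0, sigma=0.1):
    """Density of a normal distribution; mu and sigma are illustrative assumptions."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

peak_density = normal_pdf(0.0)                 # about 3.99: a density, not a probability
total_area = integrate(normal_pdf, -2.0, 2.0)  # about 1: tails beyond 20 sigma are negligible
p_within = integrate(normal_pdf, -0.1, 0.1)    # about 0.6827: P(-0.1 <= X <= 0.1)
p_point = integrate(normal_pdf, 0.05, 0.05)    # 0.0: a single point has zero width, so zero area

print(peak_density, total_area, p_within, p_point)
```

The peak density is well above 1, yet the total area is still 1, and the probability of any single exact value is zero.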

3. Cumulative Distribution Function (CDF)

Used for: Both Discrete and Continuous Random Variables.

The CDF is arguably the most practical function. It answers the question: "What is the probability that my variable will be less than or equal to a specific value $x$?"

It is the running total (cumulative sum or integral) of the probabilities up to that point.

Notation:

$$F(x) = P(X \le x)$$

Rules:

  1. It always starts at 0: $\lim_{x \to -\infty} F(x) = 0$
  2. It always ends at 1: $\lim_{x \to \infty} F(x) = 1$
  3. It is monotonically non-decreasing (it never goes down, it only goes up or stays flat).
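
For a discrete variable, the "running total" is just a cumulative sum of the PMF. Here is a minimal sketch for the fair die; the names `faces`, `pmf`, `cdf_values`, and `die_cdf` are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from itertools import accumulate

faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# Running total of the PMF: the CDF evaluated at each face.
cdf_values = list(accumulate(pmf))

def die_cdf(x):
    """F(x) = P(X <= x) for a fair die, defined for any real x."""
    total = Fraction(0)
    for face, p in zip(faces, pmf):
        if face <= x:
            total += p
    return total

# Rule 3: the running total never decreases.
is_non_decreasing = all(a <= b for a, b in zip(cdf_values, cdf_values[1:]))

print(die_cdf(0))         # 0 (below every outcome)
print(die_cdf(3))         # 1/2
print(die_cdf(3.5))       # 1/2 (flat between outcomes)
print(die_cdf(6))         # 1
print(is_non_decreasing)  # True
```

The discrete CDF is a step function: it jumps by $p(x)$ at each outcome and stays flat in between.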

Interactive Visualization: PDF vs CDF

Switch between the PDF and CDF views below. Notice that the area under the curve up to the cutoff $x$ in the PDF view equals the height of the curve at $x$ in the CDF view: both are $P(X \le x)$.

[Interactive figure: normal distribution, PDF and CDF views. With the upper bound at x = 0.0, P(X ≤ 0.0) = 50.00%; in the PDF view this is the shaded area under the curve up to x.]
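
The link between the two views can also be checked numerically. The sketch below assumes a standard normal distribution (mean 0, standard deviation 1), which is consistent with P(X ≤ 0.0) = 50.00% but is an assumption about the visualization. It compares the area under the PDF up to a cutoff with the CDF computed from the error function; all function names are hypothetical.

```python
import math

def std_normal_pdf(x):
    """Density of the standard normal distribution (assumed here)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    """CDF of the standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def area_up_to(x, lower=-10.0, steps=20_000):
    """Midpoint-rule area under the PDF from `lower` to x.

    `lower` stands in for minus infinity; the area below -10 is negligible.
    """
    width = (x - lower) / steps
    return width * sum(std_normal_pdf(lower + (i + 0.5) * width) for i in range(steps))

for cutoff in (-1.0, 0.0, 1.5):
    # The two numbers on each row should agree to several decimal places.
    print(cutoff, round(area_up_to(cutoff), 4), round(std_normal_cdf(cutoff), 4))
```

Moving the cutoff changes both numbers together, which is exactly what the PDF area and the CDF height have in common.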

Test Your Knowledge

Example: PDF vs CDF

Let the continuous random variable $X$ have a probability density function (PDF) $f(x) = 2x$ for $0 \le x \le 1$. Find the cumulative distribution function (CDF) $F(x)$ and calculate $P(X \le 0.5)$.

Step-by-Step Solution:

The CDF $F(x)$ is the integral of the PDF from the lower bound of its range to $x$. For $0 \le x \le 1$:

$$F(x) = \int_{0}^{x} 2t \, dt = \left[ t^2 \right]_0^x = x^2$$

Outside this range, $F(x) = 0$ for $x < 0$ and $F(x) = 1$ for $x > 1$.

To find the probability $P(X \le 0.5)$, we just plug $0.5$ into the CDF: $F(0.5) = (0.5)^2 = 0.25$

There is a 25% probability that $X$ is less than or equal to 0.5.
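
As a sanity check, the closed-form answer can be compared with a direct numerical integration of the density. This is a minimal sketch; `pdf`, `cdf`, and `area` are hypothetical names, and the midpoint rule is just one simple way to approximate the integral.

```python
def pdf(x):
    """f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def cdf(x):
    """F(x) = x^2 on [0, 1], clamped to 0 below the range and 1 above it."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x * x

def area(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

print(cdf(0.5))                       # 0.25
print(round(area(pdf, 0.0, 0.5), 6))  # 0.25
print(round(area(pdf, 0.0, 1.0), 6))  # 1.0 (total area under the density)
```

Note that $f(1) = 2$, so this density exceeds 1 near the top of its range even though every probability it produces stays between 0 and 1.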