Geometric Distribution

The Geometric distribution models waiting time until the first success in repeated independent Bernoulli trials.

Each trial succeeds with probability $p$ (same $p$ every time)
Trials are independent
The random variable $X$ is the trial index of the first success ($X = 1, 2, 3, \dots$)

PMF

$$P(X = k) = (1-p)^{k-1}p,\qquad k = 1,2,3,\dots$$
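
The PMF is simple enough to evaluate directly. The following is a minimal Python sketch; the function name `geometric_pmf` and the value `p = 0.3` are illustrative choices, not part of any library.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k) for the trial-count Geometric distribution (support 1, 2, 3, ...)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if k < 1:
        return 0.0  # the first success cannot occur before trial 1
    return (1 - p) ** (k - 1) * p


p = 0.3  # example success probability (arbitrary choice)
probs = [geometric_pmf(k, p) for k in range(1, 200)]

print(probs[:3])   # probabilities for k = 1, 2, 3
print(sum(probs))  # very close to 1, since the PMF sums to 1 over all k
```

Truncating the sum at 199 terms is harmless here because the remaining tail, $(1-p)^{199}$, is negligible for this value of $p$.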

Intuition

Think of “keep trying until it works”. The probability shrinks geometrically as $k$ grows, because you must fail $k-1$ times in a row, then succeed.

When to use

Use Geometric when:

You are counting how many attempts until the first success
The success probability is constant each attempt
Attempts are independent

Common examples: first sale, first defect, first conversion, first time an event occurs.
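
To see the three conditions in action, one can simulate the "keep trying until it works" process and compare the observed frequencies with the PMF. This is only a sketch: the helper name `trials_until_first_success`, the seed, and the value `p = 0.25` are assumptions made for illustration.

```python
import random


def trials_until_first_success(p: float, rng: random.Random) -> int:
    """Run independent Bernoulli(p) trials; return the index of the first success.

    Requires p > 0, otherwise the loop never terminates.
    """
    k = 1
    while rng.random() >= p:  # rng.random() < p counts as a success
        k += 1
    return k


rng = random.Random(0)  # fixed seed so the run is reproducible
p = 0.25                # hypothetical per-attempt success probability
samples = [trials_until_first_success(p, rng) for _ in range(100_000)]

for k in range(1, 5):
    empirical = samples.count(k) / len(samples)
    exact = (1 - p) ** (k - 1) * p
    print(k, round(empirical, 4), round(exact, 4))
```

With this many samples the empirical and exact columns should agree to roughly two decimal places.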

The Memoryless Property: "No Progress"

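The memoryless property says that $P(X > s + t \mid X > s) = P(X > t)$: having already waited $s$ trials tells you nothing about how much longer you will wait. The Geometric distribution is the only discrete distribution with this property.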

The Intuition

Think of it this way: the coin doesn't "remember" its previous flips. If you have already flipped a coin 10 times and gotten Tails every time, the probability that your next flip is Heads is still $p$. The probability that you will have to wait another 3 flips for your first Heads is exactly the same as if you were starting from scratch.

The Mathematical Proof

To keep the proof short, we use the Survival Function: the probability that no success has occurred in the first $k$ trials.

$$P(X > k) = (1-p)^k$$

(Intuition: $X > k$ means the first $k$ trials all failed.)
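
The same formula also follows from the PMF by summing a geometric series. A short derivation, assuming $0 < p < 1$ and using $j$ and $i$ as summation indices introduced here:

$$
\begin{aligned}
P(X > k) &= \sum_{j=k+1}^{\infty} (1-p)^{j-1}\,p \\
&= p\,(1-p)^{k} \sum_{i=0}^{\infty} (1-p)^{i} \\
&= p\,(1-p)^{k} \cdot \frac{1}{1-(1-p)} \\
&= (1-p)^{k}
\end{aligned}
$$

The second line substitutes $j = k + 1 + i$, and the third uses the geometric series sum with ratio $1-p$.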

We want to prove that the probability of waiting more than $t$ additional trials, given that you have already waited more than $s$ trials without a success, equals the unconditional probability of waiting more than $t$ trials.

Step 1: Set up the Conditional Probability

$$P(X > s + t \mid X > s) = \frac{P(X > s + t \text{ and } X > s)}{P(X > s)}$$

Step 2: Simplify the Numerator
Since $X > s + t$ implies $X > s$, the joint event in the numerator is just $X > s + t$:

$$P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)}$$

Step 3: Plug in the Survival Function

$$P(X > s + t \mid X > s) = \frac{(1-p)^{s+t}}{(1-p)^s}$$

Step 4: Use Exponent Rules

$$\frac{(1-p)^s \cdot (1-p)^t}{(1-p)^s} = (1-p)^t$$

Conclusion:

$$P(X > s + t \mid X > s) = P(X > t)$$

The trials you have already "wasted" ($s$) have no impact on your future waiting time.
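
The identity can also be checked numerically, both from the survival function and by simulation. The sketch below is illustrative only: the values of `p`, `s`, and `t`, the seed, and the helper names `survival` and `draw` are assumptions chosen for the example.

```python
import math
import random


def survival(k: int, p: float) -> float:
    """P(X > k): the first k trials all fail."""
    return (1 - p) ** k


def draw(p: float, rng: random.Random) -> int:
    """Simulate one Geometric(p) waiting time (index of the first success)."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k


p, s, t = 0.2, 10, 3  # arbitrary example values

# Exact check: conditional survival equals unconditional survival.
conditional = survival(s + t, p) / survival(s, p)  # P(X > s + t | X > s)
unconditional = survival(t, p)                     # P(X > t)
print(math.isclose(conditional, unconditional))

# Simulation check: among runs that already failed s times,
# what fraction go on to fail at least t more times?
rng = random.Random(1)
samples = [draw(p, rng) for _ in range(200_000)]
waited = [x for x in samples if x > s]
empirical = sum(x > s + t for x in waited) / len(waited)
print(round(empirical, 3), round(unconditional, 3))
```

Only the runs with $X > s$ enter the conditional estimate, so the simulated figure is noisier than the exact one, but it should land close to $(1-p)^t$.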

The "Continuous" Cousin

The continuous counterpart to this property is found in the Exponential Distribution, which models time between events in a continuous process.
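
The same steps carry over to the continuous case. As a sketch, assume a waiting time $T$ that is Exponential with rate $\lambda > 0$, so that $P(T > t) = e^{-\lambda t}$ for $t \ge 0$ (the symbols $T$ and $\lambda$ are notation introduced here). Then for any $s, t \ge 0$:

$$
P(T > s + t \mid T > s) = \frac{P(T > s + t)}{P(T > s)} = \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(T > t)
$$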

Common pitfall

Some books define $X$ as the number of failures before the first success (support $0,1,2,\dots$). In that parameterization:

$$P(X = k) = (1-p)^k p,\qquad k = 0,1,2,\dots$$

This guide uses trial count (support starts at 1).
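
The two conventions differ only by a shift of one: if $Y$ counts failures and $X$ counts trials, then $Y = X - 1$. A minimal sketch of the conversion, where the function names `pmf_trials` and `pmf_failures` and the value `p = 0.4` are hypothetical choices for illustration:

```python
import math


def pmf_trials(k: int, p: float) -> float:
    """P(X = k) when X counts trials up to and including the first success (k >= 1)."""
    return (1 - p) ** (k - 1) * p if k >= 1 else 0.0


def pmf_failures(k: int, p: float) -> float:
    """P(Y = k) when Y counts failures before the first success (k >= 0)."""
    return (1 - p) ** k * p if k >= 0 else 0.0


p = 0.4  # arbitrary example value
for k in range(0, 4):
    # Y = X - 1, so k failures corresponds to k + 1 trials
    print(k, pmf_failures(k, p), pmf_trials(k + 1, p))
```

Statistical libraries do not all pick the same convention, so it is worth checking the documented support (starting at 0 or at 1) before using a built-in Geometric function.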

Test Your Knowledge

Example: Geometric waiting time

A basketball player makes 80% of his free throws ($p=0.8$). What is the probability that he misses his first two shots, but makes his first successful shot on the 3rd attempt?

Step-by-Step Solution

This is a Geometric distribution where we want the first success on trial $k=3$.

Formula: $P(X = k) = (1-p)^{k-1}p$

$P(X = 3) = (1 - 0.8)^{3-1}(0.8) = (0.2)^2 (0.8)$

$P(X = 3) = 0.04 \times 0.8 = 0.032$

There is a 3.2% chance his first success happens exactly on the 3rd shot.
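
The arithmetic is easy to confirm in a few lines of Python. This is just a check of the numbers above, written two ways: with the PMF formula and as the explicit miss, miss, make sequence.

```python
p = 0.8  # free-throw success probability from the example
k = 3    # first make on the 3rd attempt

prob_formula = (1 - p) ** (k - 1) * p   # (1-p)^(k-1) * p
prob_sequence = (1 - p) * (1 - p) * p   # miss, miss, make

print(round(prob_formula, 4))   # 0.032
print(round(prob_sequence, 4))  # 0.032
```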
