Covariance & Correlation

While a joint distribution tells us where the probability mass lies, covariance and correlation summarize how two variables move together.


1. Covariance: The "Raw" Movement

Covariance measures the direction of the linear relationship between two variables.

$$\text{Cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$$

The Computational Formula (The Shortcut)

In practice, we rarely use the formula above. We use this much simpler "shortcut":

$$\text{Cov}(X, Y) = E[XY] - E[X]E[Y]$$
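The shortcut follows from the definition by expanding the product and applying linearity of expectation, treating $\mu_x = E[X]$ and $\mu_y = E[Y]$ as constants. A sketch of the steps:

$$
\begin{aligned}
\text{Cov}(X, Y) &= E[(X - \mu_x)(Y - \mu_y)] \\
&= E[XY - \mu_y X - \mu_x Y + \mu_x \mu_y] \\
&= E[XY] - \mu_y E[X] - \mu_x E[Y] + \mu_x \mu_y \\
&= E[XY] - \mu_x \mu_y - \mu_x \mu_y + \mu_x \mu_y \\
&= E[XY] - E[X]E[Y]
\end{aligned}
$$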

Interpretation:

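  • Positive Covariance: When $X$ is above its mean, $Y$ tends to be above its mean as well.
  • Negative Covariance: When $X$ is above its mean, $Y$ tends to be below its mean.
  • Zero Covariance: No linear relationship (the variables may still be dependent in a nonlinear way).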
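Zero covariance rules out only a linear relationship. As a minimal sketch, assume a made-up example where $X$ is uniform on $\{-1, 0, 1\}$ and $Y = X^2$: $Y$ is completely determined by $X$, yet the covariance comes out to zero. The helper `expectation` and the PMF are illustrative, not part of the original lesson.

```python
from fractions import Fraction

def expectation(pmf, g):
    """E[g(X)] for a discrete PMF given as {value: probability}."""
    return sum(p * g(x) for x, p in pmf.items())

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

e_x = expectation(pmf_x, lambda x: x)           # E[X]
e_y = expectation(pmf_x, lambda x: x**2)        # E[Y] = E[X^2]
e_xy = expectation(pmf_x, lambda x: x * x**2)   # E[XY] = E[X^3]

# Shortcut formula: Cov(X, Y) = E[XY] - E[X]E[Y]
cov_xy = e_xy - e_x * e_y
print(cov_xy)  # 0
```

Here the symmetry of $X$ around zero makes both $E[X]$ and $E[X^3]$ vanish, so the covariance is zero even though knowing $X$ tells you $Y$ exactly.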

Properties of Covariance:

  1. Symmetry: $\text{Cov}(X, Y) = \text{Cov}(Y, X)$.
  2. Covariance with Self: $\text{Cov}(X, X) = \text{Var}(X)$.
  3. Linearity: $\text{Cov}(aX + b, cY + d) = ac \cdot \text{Cov}(X, Y)$.
  4. Variance of a Sum: $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X, Y)$.
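The four properties above can be checked numerically. The sketch below assumes a small made-up data set in which each $(x, y)$ pair is equally likely, and uses exact fractions so the equalities hold without rounding error. The data and the constants `a, b, c, d` are arbitrary illustrations.

```python
from fractions import Fraction

def mean(values):
    """Exact average of a list of integers."""
    return Fraction(sum(values), len(values))

def cov(xs, ys):
    """Population covariance via E[XY] - E[X]E[Y], each pair equally likely."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def var(xs):
    """Variance, computed as the covariance of a variable with itself."""
    return cov(xs, xs)

# Hypothetical data and constants, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 1, 4, 3]
a, b, c, d = 3, 5, -2, 7

# 1. Symmetry
symmetry_holds = cov(xs, ys) == cov(ys, xs)

# 3. Linearity: shifts b and d drop out, scales a and c multiply through.
scaled = cov([a * x + b for x in xs], [c * y + d for y in ys])
linearity_holds = scaled == a * c * cov(xs, ys)

# 4. Variance of a sum
sum_var = var([x + y for x, y in zip(xs, ys)])
sum_rule_holds = sum_var == var(xs) + var(ys) + 2 * cov(xs, ys)

print(symmetry_holds, linearity_holds, sum_rule_holds)  # True True True
```

Property 2 is built into the sketch: `var` is defined as `cov(xs, xs)`.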

2. Correlation: The "Standardized" Movement

The problem with Covariance is that its units are "units of $X$ times units of $Y$." If you switch height from meters to centimeters, the covariance is multiplied by 100, even though the underlying relationship has not changed.

Correlation ($\rho$, often written $r$ when estimated from a sample) solves this by dividing Covariance by the standard deviations, creating a unitless number between $-1$ and $1$.

$$\rho_{X,Y} = \text{Corr}(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

| Value | Meaning |
|-------|---------|
| +1 | Perfect positive linear relationship. |
| 0 | No linear relationship (they are uncorrelated). |
| -1 | Perfect negative linear relationship. |
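To see the unit problem and its fix concretely, the sketch below computes covariance and correlation for the same heights expressed in meters and in centimeters. The height and weight values are made up for illustration, and the helper functions are hypothetical names, not a library API.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance E[(X - mu_x)(Y - mu_y)], each pair equally likely."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    sigma_x = math.sqrt(covariance(xs, xs))
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# Hypothetical data, made up for illustration only.
heights_m = [1.5, 1.6, 1.7, 1.8]
weights_kg = [50.0, 60.0, 65.0, 80.0]
heights_cm = [100 * h for h in heights_m]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
corr_m = correlation(heights_m, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)

print(cov_m, cov_cm)    # covariance grows by a factor of about 100
print(corr_m, corr_cm)  # correlation is the same up to rounding error
```

The factor of 100 in the numerator is cancelled by the same factor in $\sigma_X$, which is why the correlation does not depend on the choice of units.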

3. The Covariance Matrix ($\Sigma$)

In fields like Finance and AI, we deal with vectors of random variables. We organize all their relationships into a square matrix:

$$\Sigma = \begin{bmatrix} \text{Var}(X) & \text{Cov}(X, Y) \\ \text{Cov}(Y, X) & \text{Var}(Y) \end{bmatrix}$$

In a larger matrix $\Sigma$, the diagonal entries are the variances of the individual variables, and each off-diagonal entry is the covariance between a pair of variables. Because $\text{Cov}(X, Y) = \text{Cov}(Y, X)$, the matrix is symmetric.
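A covariance matrix for more than two variables can be built by applying the covariance formula to every pair. The sketch below assumes three made-up variables with equally likely observations; `covariance_matrix` is a hypothetical helper written for this illustration.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance, treating each observation as equally likely."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def covariance_matrix(variables):
    """Square matrix whose (i, j) entry is Cov(variables[i], variables[j])."""
    return [[covariance(u, v) for v in variables] for u in variables]

# Hypothetical observations of three variables, for illustration only.
a = [1, 2, 3, 4]
b = [2, 4, 6, 8]   # b = 2a, so it moves with a
c = [4, 3, 2, 1]   # c decreases as a increases

sigma = covariance_matrix([a, b, c])
for row in sigma:
    print(row)
# [1.25, 2.5, -1.25]
# [2.5, 5.0, -2.5]
# [-1.25, -2.5, 1.25]
```

The diagonal holds the three variances, and the matrix reads the same across its diagonal, as the symmetry property of covariance requires.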

[Interactive figure: Correlation Explorer. A scatter plot whose point cloud tightens or loosens as the correlation $r$ is adjusted between $-1$ (perfect negative) and $+1$ (perfect positive). An optional heteroscedasticity setting makes the variance increase with $|x|$, producing a cone shape.]

Test Your Knowledge

Example: Calculating Covariance from a Table

Given the results from our previous example: $E[X] = 1.5$, $E[Y] = 1.5$, $E[XY] = 2.5$. Calculate the Covariance $\text{Cov}(X, Y)$.

Step-by-Step Solution

Using the computational formula:

$$\text{Cov}(X, Y) = E[XY] - E[X]E[Y]$$
$$\text{Cov}(X, Y) = 2.5 - (1.5 \times 1.5)$$
$$\text{Cov}(X, Y) = 2.5 - 2.25 = 0.25$$

Since the covariance is positive, the variables tend to move together: above-average values of $X$ tend to occur with above-average values of $Y$.
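The same calculation can be carried out directly from a joint probability table. The table from the previous example is not reproduced here, so the sketch below assumes a hypothetical joint PMF that happens to give the same three expectations. It is one table consistent with those numbers, not necessarily the original one.

```python
from fractions import Fraction

# Hypothetical joint PMF {(x, y): P(X=x, Y=y)}; an assumption for illustration.
joint_pmf = {
    (1, 1): Fraction(1, 2),
    (1, 2): Fraction(0),
    (2, 1): Fraction(0),
    (2, 2): Fraction(1, 2),
}

def expect(pmf, g):
    """E[g(X, Y)]: sum g(x, y) weighted by the joint probability."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

e_x = expect(joint_pmf, lambda x, y: x)        # E[X]
e_y = expect(joint_pmf, lambda x, y: y)        # E[Y]
e_xy = expect(joint_pmf, lambda x, y: x * y)   # E[XY]

# Shortcut formula: Cov(X, Y) = E[XY] - E[X]E[Y]
cov_xy = e_xy - e_x * e_y
print(e_x, e_y, e_xy, cov_xy)  # 3/2 3/2 5/2 1/4
```

The result $1/4$ matches the $0.25$ obtained by hand.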