Machine Learning
Linear Classification
Fisher's Linear Discriminant

While linear regression aims to predict continuous values, Linear Discriminant Analysis (LDA) aims to project data into a lower-dimensional space while preserving as much class separability as possible. Fisher's Linear Discriminant is a specific approach to finding this projection.


1. The Goal: Separation and Compactness

The idea is to find a projection vector $\mathbf{w}$ such that when we project the data points onto it:

  1. The distance between the class means is as large as possible.
  2. The variance within each class is as small as possible.

2. Fisher's Criterion

We define the between-class scatter $\mathbf{S}_B = (\mathbf{m}_2 - \mathbf{m}_1)(\mathbf{m}_2 - \mathbf{m}_1)^T$ and the within-class scatter $\mathbf{S}_W = \sum_{k} \sum_{n \in C_k} (\mathbf{x}_n - \mathbf{m}_k)(\mathbf{x}_n - \mathbf{m}_k)^T$, and form the ratio:

$$J(\mathbf{w}) = \frac{\mathbf{w}^T \mathbf{S}_B \mathbf{w}}{\mathbf{w}^T \mathbf{S}_W \mathbf{w}}$$

Fisher's goal is to find the vector $\mathbf{w}$ that maximizes this ratio $J(\mathbf{w})$.
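As a concrete check, the criterion can be evaluated directly from data. Below is a minimal NumPy sketch for the two-class case; the function name `fisher_criterion` and the 0/1 label encoding are illustrative assumptions, not from the original text:

```python
import numpy as np

def fisher_criterion(X, y, w):
    """Evaluate J(w) = (w^T S_B w) / (w^T S_W w) for a two-class dataset.

    Assumes labels y are 0/1 (an encoding chosen for this sketch)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Between-class scatter: outer product of the mean difference.
    S_B = np.outer(m1 - m0, m1 - m0)
    # Within-class scatter: sum of the per-class scatter matrices.
    S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    return (w @ S_B @ w) / (w @ S_W @ w)
```

A direction aligned with the mean difference scores high; a direction orthogonal to it scores zero, matching the intuition in Section 1.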

For two classes, the maximizing direction has a closed form:

$$\mathbf{w} \propto \mathbf{S}_W^{-1} (\mathbf{m}_2 - \mathbf{m}_1)$$

where $\mathbf{m}_1, \mathbf{m}_2$ are the class means.
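This closed form translates to a few lines of NumPy. The helper name `fisher_direction` and the per-class input matrices are assumptions made for this sketch:

```python
import numpy as np

def fisher_direction(X0, X1):
    """Closed-form two-class Fisher direction: w proportional to S_W^{-1} (m1 - m0)."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Solve S_W w = (m1 - m0) instead of forming the explicit inverse.
    w = np.linalg.solve(S_W, m1 - m0)
    return w / np.linalg.norm(w)  # normalise; only the direction matters
```

Solving the linear system rather than inverting $\mathbf{S}_W$ is the standard numerically safer choice.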


3. Projection and Classification

Once we find the optimal $\mathbf{w}$, we project any input $\mathbf{x}$ onto it: $y = \mathbf{w}^T \mathbf{x}$. We then set a threshold on $y$ to classify the point.
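A simple sketch of this decision rule, assuming the threshold is placed at the midpoint of the projected class means (a common choice under equal priors) and that $\mathbf{w}$ points from class 0 toward class 1; the function name is illustrative:

```python
import numpy as np

def classify(x, w, m0, m1):
    """Project x onto w and threshold at the midpoint of the projected class means.

    Assumes w is oriented from class 0 toward class 1."""
    threshold = w @ (m0 + m1) / 2.0  # midpoint of the two projected means
    return 1 if w @ x > threshold else 0
```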

Figure (Fisher's Linear Discriminant): Data points are projected onto a line $\mathbf{w}$ that maximizes class separation while minimizing within-class spread.


LDA vs. PCA

| Feature  | PCA (Principal Component Analysis) | LDA (Linear Discriminant Analysis) |
|----------|------------------------------------|------------------------------------|
| Type     | Unsupervised                       | Supervised                         |
| Goal     | Maximize variance (signal)         | Maximize class separability        |
| Labels   | Ignores labels                     | Uses class labels                  |
| Use case | General dimensionality reduction   | Pre-processing for classification  |

Multiple Classes: Fisher's discriminant can be generalized to $K > 2$ classes by seeking a projection into a $(K-1)$-dimensional space; the dimension is bounded by $K-1$ because $\mathbf{S}_B$, built from $K$ class means, has rank at most $K-1$.
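One way to sketch this generalization is the standard construction of taking the leading eigenvectors of $\mathbf{S}_W^{-1}\mathbf{S}_B$; the function name and argument conventions below are illustrative assumptions:

```python
import numpy as np

def multiclass_lda(X, y, n_components=None):
    """Multi-class Fisher projection: leading eigenvectors of S_W^{-1} S_B.

    At most K-1 directions carry discriminative information, since
    S_B (built from K class means) has rank at most K-1."""
    classes = np.unique(y)
    K = len(classes)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_W += (Xc - mc).T @ (Xc - mc)
        S_B += len(Xc) * np.outer(mc - mean_all, mc - mean_all)
    # Eigen-decomposition of S_W^{-1} S_B, sorted by descending eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(eigvals.real)[::-1]
    k = n_components or (K - 1)
    return eigvecs.real[:, order[:k]]
```

For three classes in three dimensions, this returns a $3 \times 2$ projection matrix, consistent with the $(K-1)$-dimensional target space.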