The Perceptron
The Perceptron is one of the oldest neural network models, introduced in 1958. It is a linear classifier that attempts to find a hyperplane separating two classes of data points.
1. Core Concepts
If a random variable $\mathbf{x} \in \mathbb{R}^n$ represents our input vector and $y \in \{-1, +1\}$ our target label, the Perceptron learns a mapping:

$$f: \mathbb{R}^n \to \{-1, +1\}$$
Where:
- $\mathbf{w} \in \mathbb{R}^n$ is the Parameter Vector (determines the orientation of the hyperplane).
- $b \in \mathbb{R}$ is the Bias (determines the offset from the origin).
The Hypothesis Function
The final prediction is determined by passing a linear combination of the inputs through a Heaviside Step Function:

$$\hat{y} = \begin{cases} +1 & \text{if } \mathbf{w}^\top \mathbf{x} + b \ge 0 \\ -1 & \text{otherwise} \end{cases}$$

The decision boundary is the set of points where $\mathbf{w}^\top \mathbf{x} + b = 0$, forming a linear hyperplane.
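The hypothesis function above can be sketched in a few lines of Python (the function name and the plain-list representation of vectors are my own choices, not from the original):

```python
def predict(w, b, x):
    """Return +1 if w.x + b >= 0, else -1 (the Heaviside step mapped to {-1, +1})."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation >= 0 else -1
```

For example, with weights `[1.0, 1.0]` and bias `-1.5`, the point `[1, 1]` lands on the positive side of the boundary and `[0, 0]` on the negative side.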
2. The Perceptron Learning Rule
The Perceptron uses an Online Learning approach, updating its weights for each misclassified sample $(\mathbf{x}_i, y_i)$.
The Update Formula
Whenever $\hat{y}_i \neq y_i$:

$$\mathbf{w} \leftarrow \mathbf{w} + \eta\, y_i\, \mathbf{x}_i$$
$$b \leftarrow b + \eta\, y_i$$

Where $\eta$ is the Learning Rate.
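The update rule translates directly to code. Here is a minimal sketch, assuming the same $\{-1, +1\}$ label convention; the function name and default learning rate are illustrative:

```python
def perceptron_update(w, b, x, y, lr=0.1):
    """Apply one Perceptron update for a misclassified sample (x, y)."""
    w = [wi + lr * y * xi for wi, xi in zip(w, x)]  # w <- w + lr * y * x
    b = b + lr * y                                  # b <- b + lr * y
    return w, b
```

Note how the update nudges the hyperplane toward the misclassified point: for $y = +1$ the weights move in the direction of $\mathbf{x}$, for $y = -1$ away from it.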
Interactive Visualization
Visualizing the Update: Imagine a 2D plane where points are red (+1) and blue (-1). Every time the Perceptron misclassifies a point, it "tilts" the decision boundary towards that point to correct the error.
Perceptron Learning (Step-by-Step)
Watch the hyperplane shift to correct its mistakes. Each 'Train' click identifies a misclassified point and rotates the boundary.
Test Your Knowledge
Example: Manual Perceptron Update
Given initial weights $\mathbf{w}$, bias $b$, and learning rate $\eta$, you receive a training point $\mathbf{x}$ with label $y$. What are the updated weights and bias?
View Step-by-Step Solution
1. Calculate the current prediction: $\hat{y} = \operatorname{sign}(\mathbf{w}^\top \mathbf{x} + b)$.
2. If $\hat{y} \neq y$ (a misclassification), apply the update rule: $\mathbf{w} \leftarrow \mathbf{w} + \eta\, y\, \mathbf{x}$ and $b \leftarrow b + \eta\, y$.
3. The new decision boundary is the set of points where the updated parameters satisfy $\mathbf{w}^\top \mathbf{x} + b = 0$.
3. Convergence Theorem
A key mathematical property of the Perceptron is its Convergence Theorem: If the data is Linearly Separable (i.e., a hyperplane exists that can perfectly separate the two classes), the Perceptron algorithm is guaranteed to converge to a solution in a finite number of steps.
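The theorem can be illustrated with a minimal training loop, shown here as a pure-Python sketch (names and the one-error-free-pass stopping criterion are my own choices):

```python
def train_perceptron(points, labels, lr=1.0, max_epochs=100):
    """Cycle through the data, applying the Perceptron update on every
    misclassified sample, until a full pass produces zero errors."""
    w, b = [0.0] * len(points[0]), 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(points, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
            if pred != y:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:
            break  # converged: the hyperplane separates the two classes
    return w, b

# AND is linearly separable, so the theorem guarantees convergence.
and_points = [[0, 0], [0, 1], [1, 0], [1, 1]]
and_labels = [-1, -1, -1, 1]
w, b = train_perceptron(and_points, and_labels)
```

After training, every point in the AND dataset is classified correctly, which is exactly the finite-step guarantee the theorem promises for separable data.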
4. Limitations: The XOR Problem
The Perceptron is a linear model. It can only learn patterns that can be separated by a straight line or hyperplane.
A famous failure of the Perceptron is the XOR (Exclusive OR) Problem, which is not linearly separable. This realization in the 1960s led to the "AI Winter" until multi-layer neural networks (which could solve non-linear problems) were developed.
| $x_1$ | $x_2$ | $x_1 \oplus x_2$ (XOR) |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
Non-Separable Data: If the data is NOT linearly separable, the Perceptron never terminates; the weights oscillate indefinitely as the algorithm keeps trying to fix misclassifications.
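Running the learning rule on XOR makes this failure concrete: no error-free pass is ever reached, no matter how many epochs we allow. A minimal sketch, using the $\{-1, +1\}$ label encoding and a learning rate of 1 (both my own illustrative choices):

```python
# XOR truth table encoded with {-1, +1} labels: not linearly separable.
points = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [-1, 1, 1, -1]

w, b = [0.0, 0.0], 0.0
converged = False
for epoch in range(1000):
    errors = 0
    for x, y in zip(points, labels):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
        if pred != y:  # apply the Perceptron update
            w = [wi + y * xi for wi, xi in zip(w, x)]
            b += y
            errors += 1
    if errors == 0:
        converged = True
        break
# converged stays False: since no hyperplane separates XOR, every pass
# contains at least one misclassification and the weights oscillate forever.
```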