Supervised and Unsupervised Learning

Machine learning algorithms are broadly categorized by how they use data, in particular by whether the training examples come with target labels. The two most prominent paradigms are Supervised Learning and Unsupervised Learning.

The Learning Lifecycle

Before focusing on specific paradigms, it's essential to understand the general lifecycle of a learning agent. This cycle involves building knowledge from data to enable informed actions.

1. The Learning Loop

  • Knowledge: what the model currently knows.
  • Learning: updates parameters from data, refining knowledge.
  • Action: makes predictions and decisions based on that knowledge.

Feedback Loop: Outcomes from actions refine future knowledge.

"Learning builds knowledge, and knowledge enables action. The results of actions refine future learning."

Phase A: Training
Input Dataset (Labeled) → Learning Algorithm → Learned Model (weights, parameters, rules)

Phase B: Testing
New / Unseen Data → Learned Model → Predictions (labels, values, decisions)

Supervised Learning

In Supervised Learning, models are trained on labeled data. This means that for every input example $\mathbf{x}^{(i)}$, the algorithm is also given a "ground truth" target label $y^{(i)}$.

The goal is to learn a mapping $f: \mathbf{x} \to y$ that generalizes well to unseen data.

PAC Learning (Probably Approximately Correct): This is a framework for analyzing the learnability of tasks. A task is PAC-learnable if, given enough training examples, a learning algorithm produces with high probability (Probably) a model whose error rate on new data drawn from the same distribution is low (Approximately Correct).
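The PAC condition is usually written with two tolerance parameters. The symbols below are not from the text above and are introduced here as assumptions: $\mathcal{D}$ is the data distribution, $S$ is a training sample of $m$ examples, $h_S$ is the model learned from $S$, $\epsilon$ is the allowed error ("approximately"), and $\delta$ is the allowed failure probability ("probably"). A sketch of the standard statement:

```latex
\Pr_{S \sim \mathcal{D}^m}\!\left[\, \operatorname{err}_{\mathcal{D}}(h_S) \le \epsilon \,\right] \;\ge\; 1 - \delta,
\qquad
\operatorname{err}_{\mathcal{D}}(h) = \Pr_{(\mathbf{x},\, y) \sim \mathcal{D}}\!\left[\, h(\mathbf{x}) \ne y \,\right]
```

In the usual definition this must hold for every $\epsilon, \delta \in (0, 1)$, with the number of examples $m$ growing at most polynomially in $1/\epsilon$ and $1/\delta$.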

Key Tasks:

  1. Regression: Predicting a continuous value (e.g., predicting house prices based on square footage).
  2. Classification: Predicting a discrete category (e.g., identifying if an email is "Spam" or "Not Spam").
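To make the regression task concrete, here is a minimal sketch that fits a line to labeled examples (Phase A) and then measures its error on held-out examples (Phase B). The house data is made up for illustration, and the helper names `fit_line`, `predict`, and `mean_squared_error` are hypothetical, not taken from any library.

```python
# Illustrative (square footage, price in thousands) pairs; not real market data.
train = [(1000, 205.0), (1500, 295.0), (2000, 410.0), (2500, 490.0)]
test = [(1200, 240.0), (1800, 360.0)]


def fit_line(pairs):
    """Ordinary least squares for y = w * x + b with a single feature."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def predict(model, x):
    """Apply the learned mapping f(x) = w * x + b."""
    w, b = model
    return w * x + b


def mean_squared_error(model, pairs):
    """Average squared gap between predictions and true labels."""
    return sum((predict(model, x) - y) ** 2 for x, y in pairs) / len(pairs)


# Phase A: learn parameters from labeled training data.
model = fit_line(train)

# Phase B: evaluate on examples the model has not seen.
test_mse = mean_squared_error(model, test)
```

The error on `test`, not on `train`, is what indicates how well the learned mapping generalizes.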

Common Algorithms:

  • Linear Regression
  • Logistic Regression
  • Decision Trees & Random Forests
  • Support Vector Machines (SVM)
  • Neural Networks
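As one example from this list, the following is a rough sketch of logistic regression trained with batch gradient descent on the spam task mentioned earlier. The single feature (a count of suspicious words per email), the toy data, and the learning rate and epoch count are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every example under the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradient of the mean log loss with respect to w and b.
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


def classify(model, x):
    """Map the predicted probability to a discrete category."""
    w, b = model
    return "Spam" if sigmoid(w * x + b) >= 0.5 else "Not Spam"


# Hypothetical labeled data: suspicious-word counts and labels (1 = spam).
counts = [0, 1, 2, 3, 6, 7, 8, 9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

model = train_logistic(counts, labels)
low = classify(model, 2.5)
high = classify(model, 7.5)
```

Here the model outputs a probability, and the 0.5 threshold turns it into one of the two discrete categories.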

Unsupervised Learning

Unsupervised Learning works with unlabeled data. The algorithm is only given inputs $\mathbf{x}_i$ and must discover hidden patterns, structures, or distributions within the data $p(\mathbf{x})$.

There is no "right answer" provided; the model focuses on finding similarities or differences.

Key Tasks:

  1. Clustering: Grouping similar data points together (e.g., customer segmentation).
  2. Dimensionality Reduction: Simplifying data by reducing features while preserving essential information (e.g., PCA).
  3. Anomaly Detection: Finding unusual data points that deviate from the norm.
  4. Density Estimation: Estimating the underlying distribution of the data.
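Density estimation and anomaly detection often go together: fit a simple distribution to the data, then flag points that are unlikely under it. The sketch below assumes a single Gaussian is a reasonable model and uses a cutoff of two standard deviations; both choices, along with the sensor readings, are illustrative assumptions.

```python
import statistics


def fit_gaussian(values):
    """Estimate the mean and sample standard deviation of a 1-D Gaussian."""
    return statistics.mean(values), statistics.stdev(values)


def find_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean, std = fit_gaussian(values)
    return [v for v in values if abs(v - mean) / std > threshold]


# Hypothetical sensor readings with one obviously unusual value.
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1, 25.0, 10.0]
anomalies = find_anomalies(readings)
```

No labels are involved: the unusual reading is identified only from the shape of the data. In a sample this small a single outlier inflates the standard deviation, so a stricter cutoff such as 3 would miss it.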

Common Algorithms:

  • k-means Clustering
  • Principal Component Analysis (PCA)
  • Autoencoders
  • Gaussian Mixture Models (GMM)
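As an illustration of the first algorithm in this list, here is a minimal k-means sketch in plain Python. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. Initializing from the first k points is a simplification for reproducibility (practical implementations usually use random or k-means++ initialization), and the customer data is invented.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iters=100):
    """Cluster points into k groups; returns (centroids, assignments)."""
    centroids = [tuple(p) for p in points[:k]]  # naive init: first k points
    assignments = None
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for each point.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, assignments


# Hypothetical customers as (visits per month, average basket size).
customers = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.5, 9.5), (2.0, 1.5), (9.5, 8.5)]
centroids, assignments = k_means(customers, k=2)
```

The cluster indices carry no meaning of their own; interpreting a group as, say, "occasional shoppers" is left to whoever inspects the result.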

Visualizing the Difference

Observe how supervised learning utilizes labels (colors) to understand class distinctions, while unsupervised learning sees only raw data distribution.

  • Supervised Learning (Labeled Data): Data points are associated with known target classes.
  • Unsupervised Learning (Unlabeled Data): Raw data points without any target labels or categories.

Semi-Supervised Learning: A hybrid approach where the model is trained on a small amount of labeled data and a large amount of unlabeled data. This is common when labeling data is expensive but raw data is abundant.
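One common way to use the unlabeled portion is self-training (pseudo-labeling): train on the labeled examples, label the unlabeled points the model is confident about, and retrain. The sketch below uses a nearest-centroid classifier on one-dimensional data; the classifier, the confidence rule (`margin`), and the data are all assumptions chosen to keep the example small, and it expects at least two classes.

```python
def centroids_by_label(labeled):
    """Mean feature value for each class label."""
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in groups.items()}


def predict(centroids, x):
    """Label of the centroid closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def self_train(labeled, unlabeled, rounds=5, margin=1.0):
    """Grow the labeled set with confident pseudo-labels, then refit."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        centroids = centroids_by_label(labeled)
        confident, remaining = [], []
        for x in pool:
            dists = sorted(abs(x - c) for c in centroids.values())
            # Confident when the nearest centroid beats the runner-up by `margin`.
            if dists[1] - dists[0] >= margin:
                confident.append((x, predict(centroids, x)))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
    return centroids_by_label(labeled)


# Two labeled examples and a larger pool of unlabeled ones (illustrative data).
labeled = [(1.0, "A"), (9.0, "B")]
unlabeled = [1.5, 2.0, 2.5, 8.0, 8.5, 7.5, 5.2]

final_centroids = self_train(labeled, unlabeled)
```

The ambiguous point 5.2 never receives a pseudo-label, which is the purpose of the confidence rule: wrong pseudo-labels would otherwise be fed back into training.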