Supervised and Unsupervised Learning

Machine learning algorithms are broadly categorized based on how they utilize data. The two most prominent paradigms are Supervised Learning and Unsupervised Learning.

The Learning Lifecycle

Before focusing on specific paradigms, it's essential to understand the general lifecycle of a learning agent. This cycle involves building knowledge from data to enable informed actions.

1. The Learning Loop

Knowledge: What the model currently knows.
Learning (Refining): Updates parameters from data.
Action (Acting): Makes predictions and decisions.

Feedback Loop: Outcomes from actions refine future knowledge.
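The loop can be made concrete with a toy sketch. Everything here is an illustrative assumption rather than part of the lesson: the agent's knowledge is a single number, its action is to predict the next observation, and `learning_rate` controls how strongly feedback changes that knowledge.

```python
def run_learning_loop(observations, learning_rate=0.5):
    """Toy agent whose knowledge is one number: its guess for the next value."""
    knowledge = 0.0                          # what the model currently knows
    errors = []
    for outcome in observations:
        prediction = knowledge               # acting: predict from current knowledge
        error = outcome - prediction         # feedback: compare the action with the outcome
        knowledge += learning_rate * error   # learning: update the parameter
        errors.append(abs(error))
    return knowledge, errors


# Hypothetical data: the true value is always 10.
knowledge, errors = run_learning_loop([10.0] * 8)
print(knowledge, errors)
```

Each pass through the loop halves the remaining error in this setup, so the estimate moves steadily toward the observed value.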

"Learning builds knowledge, and knowledge enables action. The results of actions refine future learning."

Phase A: Training

Input Dataset (Labeled) → Learning Algorithm → LEARNED MODEL (Weights, Parameters, Rules)

Phase B: Testing

New / Unseen Data → LEARNED MODEL → PREDICTIONS (Labels, Values, Decisions)
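The two phases can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: it assumes a nearest-centroid classifier as the learning algorithm, and the function names `train` and `predict` and the data are hypothetical.

```python
from collections import defaultdict


def train(examples):
    """Phase A: learn one centroid per label from (features, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for j, value in enumerate(features):
            sums[label][j] += value
        counts[label] += 1
    # The learned model is just a set of parameters: label -> centroid.
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(model, features):
    """Phase B: assign the label whose centroid is closest to the input."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical labeled training set.
labeled = [([1.0, 1.2], "small"), ([0.8, 1.0], "small"),
           ([4.0, 4.1], "large"), ([4.2, 3.9], "large")]
model = train(labeled)

# New, unseen inputs are passed through the already-learned model.
predictions = [predict(model, x) for x in ([0.9, 1.1], [3.8, 4.0])]
print(predictions)
```

Note that `predict` never sees a label: everything it knows comes from the parameters produced during training.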

Supervised Learning

In Supervised Learning, models are trained on labeled data. This means for every input example $\mathbf{x}^{(i)}$, the algorithm is also given a "ground truth" target label $y^{(i)}$.

The goal is to learn a mapping $f: \mathbf{x} \to y$ that generalizes well to unseen data.


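PAC Learning (Probably Approximately Correct): This is a framework for analyzing the learnability of tasks. A learning algorithm is PAC if, for any chosen accuracy $\epsilon$ and confidence $\delta$, it returns with probability at least $1 - \delta$ (Probably) a model whose error on new data is at most $\epsilon$ (Approximately Correct), using a number of training examples that grows only polynomially in $1/\epsilon$ and $1/\delta$.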
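To give a feel for what such a guarantee looks like, here is a small sketch of one textbook sample-size bound. It holds only under assumptions the lesson does not state: the hypothesis class is finite with `hypothesis_count` members, some hypothesis in it is perfectly correct, and the learner returns a hypothesis consistent with the training data. Under those assumptions, $m \geq \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)$ examples suffice for error at most $\epsilon$ with probability at least $1 - \delta$.

```python
import math


def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite hypothesis class.

    Assumes the realizable case: some hypothesis in the class has zero error.
    """
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


# 1000 candidate hypotheses, at most 10% error, with 95% confidence.
m = pac_sample_bound(1000, epsilon=0.1, delta=0.05)
print(m)  # 100
```

Halving $\epsilon$ roughly doubles the requirement, while tightening $\delta$ costs only logarithmically.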

Key Tasks:

Regression: Predicting a continuous value (e.g., predicting house prices based on square footage).
Classification: Predicting a discrete category (e.g., identifying if an email is "Spam" or "Not Spam").
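The difference between the two tasks shows up in the type of the output. The sketch below is illustrative only: the data is made up, and the helper names `fit_line` and `fit_logistic` are hypothetical. It fits a least-squares line for a continuous target and a one-feature logistic regression (trained by plain gradient descent) for a binary target.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_logistic(xs, labels, steps=2000, lr=0.1):
    """Logistic regression for one feature, trained by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted P(label = 1)
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b


# Regression: square footage (thousands) -> price (thousands of dollars).
slope, intercept = fit_line([1.0, 1.5, 2.0, 2.5], [200.0, 275.0, 350.0, 425.0])
predicted_price = slope * 3.0 + intercept          # a continuous value

# Classification: count of suspicious words -> 1 ("Spam") or 0 ("Not Spam").
w, b = fit_logistic([0, 1, 2, 6, 7, 8], [0, 0, 0, 1, 1, 1])
predicted_labels = [1 if w * x + b > 0 else 0 for x in (1, 7)]  # discrete categories

print(predicted_price, predicted_labels)
```

The regression model returns a number on a continuous scale, while the classifier thresholds a probability to return one of a fixed set of categories.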

Common Algorithms:

Linear Regression
Logistic Regression
Decision Trees & Random Forests
Support Vector Machines (SVM)
Neural Networks

Unsupervised Learning

Unsupervised Learning works with unlabeled data. The algorithm is given only the inputs $\mathbf{x}^{(i)}$ and must discover hidden patterns or structure in the data, or model the distribution $p(\mathbf{x})$ that generated it.

There is no "right answer" provided; instead, the model looks for similarities and differences among the examples themselves.

Key Tasks:

Clustering: Grouping similar data points together (e.g., customer segmentation).
Dimensionality Reduction: Simplifying data by reducing the number of features while preserving essential information (e.g., PCA).
Anomaly Detection: Finding unusual data points that deviate from the norm.
Density Estimation: Estimating the underlying distribution of the data.
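As an example of the clustering task, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The data is made up, and the deterministic initialization (the first `k` points become the starting centroids) is a simplifying assumption; practical implementations usually use randomized or smarter seeding.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.8, 8.2), (0.8, 1.2), (8.2, 7.8)]
centroids, assignments = kmeans(points, k=2)
print(centroids, assignments)
```

No labels appear anywhere in the input; the cluster indices are arbitrary names the algorithm invents for the groups it finds.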

Common Algorithms:

k-means Clustering
Principal Component Analysis (PCA)
Autoencoders
Gaussian Mixture Models (GMM)
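To illustrate one of these algorithms, the sketch below finds the first principal component of a small dataset. It is a simplified stand-in for full PCA: it uses power iteration on the covariance matrix instead of the eigendecomposition or SVD a library would typically use, and the data and function name are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return the top principal direction and the 1-D projection of each row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected


# Hypothetical 2-D data lying close to the line y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
direction, projected = first_principal_component(data)
print(direction, projected)
```

Each two-feature point is replaced by a single coordinate along `direction`, which keeps most of the variation because the points already lie near a line.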

Visualizing the Difference

Figure: supervised learning uses labels (shown as colors) to understand class distinctions, while unsupervised learning sees only the raw data distribution.

Supervised Learning (Labeled Data): Data points are associated with known target classes.
Unsupervised Learning (Unlabeled Data): Raw data points without any target labels or categories.

Semi-Supervised Learning: A hybrid approach where the model is trained on a small amount of labeled data and a large amount of unlabeled data. This is common when labeling data is expensive but raw data is abundant.
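One common way to combine the two kinds of data is self-training (pseudo-labeling), sketched below. The lesson does not name a specific method, so treat this as one possible approach among several; the one-dimensional data, the nearest-centroid model, and the function names are all illustrative assumptions.

```python
def fit_centroids(examples):
    """Average the feature value of each label in (x, label) pairs."""
    groups = {}
    for x, label in examples:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def nearest_label(model, x):
    """Label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def self_train(labeled, unlabeled, rounds=3):
    """Fit on labeled data, pseudo-label the unlabeled data, then refit on both."""
    model = fit_centroids(labeled)
    pseudo = []
    for _ in range(rounds):
        pseudo = [(x, nearest_label(model, x)) for x in unlabeled]
        model = fit_centroids(list(labeled) + pseudo)
    return model, pseudo


# Two expensive labeled examples and six cheap unlabeled ones (hypothetical).
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [2.0, 3.0, 4.0, 7.0, 8.0, 10.0]
model, pseudo = self_train(labeled, unlabeled)
print(model, pseudo)
```

The unlabeled points shift the centroids away from the two hand-labeled examples, so the final model reflects the shape of all eight points. Self-training can also reinforce its own mistakes, which is why practical versions often keep only high-confidence pseudo-labels.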
