Machine Learning

Machine learning is the discipline of developing algorithms that enable computational systems to learn from data and make informed predictions.

This course provides a rigorous survey of machine learning techniques, from fundamental supervised and unsupervised paradigms to advanced probabilistic and sequential models.

Prerequisites

To fully grasp the concepts in this course, a solid foundation in the following mathematical areas is recommended:

Calculus: Multivariable functions, continuity, derivatives, partial derivatives, the chain rule, and understanding convexity/concavity for optimization.
Linear Algebra: Matrix manipulation, vector operations, and understanding the Jacobian and Hessian matrices.
Probability: Probability density functions (PDFs), independence, conditional probability, and Bayes' theorem.

Historical Evolution

The field of Machine Learning has evolved through several distinct eras, driven by the availability of data and computing power:

1980s: The Connectionist Era: Early Neural Networks and the popularization of Backpropagation. Decision Trees (ID3, later extended by C4.5 in 1993) also became popular for symbolic learning.
1990s: The Statistical Era: A shift towards rigorous mathematical foundations.
- 1992: Introduction of Statistical Learning Theory.
- 1995-1997: The rise of Support Vector Machines (SVMs) and Kernel Methods.
- 1996: Ensemble methods based on Bagging (bootstrap aggregating), which later became the basis of Random Forests (2001).
- Late 1990s: Boosting algorithms (AdaBoost, introduced in 1995-1997).
2000s - Present: The Deep Learning Revolution: The "Big Data" era. The combination of the Internet (massive datasets), powerful GPUs (parallel compute), and improved algorithms fueled the dominance of Deep Neural Networks.

Performance vs. Data Size

This section illustrates why Deep Learning (Neural Networks) became dominant as data volume exploded.

Key Insight: Traditional algorithms tend to plateau in performance once they reach their capacity to represent complexity. In contrast, Neural Networks (especially deep ones) continue to improve as they are fed more data, leveraging modern computing power.
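
The plateau effect described above can be sketched with a toy experiment. This is only an illustration under stated assumptions, not a benchmark: it uses a straight-line fit as the fixed-capacity model and a k-nearest-neighbor regressor (with k growing as the square root of the sample size) as a stand-in for a model whose capacity grows with the data. It is not a neural network. The target function sin(x), the noise level, the sample sizes, and all function names are hypothetical choices made for the example.

```python
import math
import random


def make_data(n, rng, noise=0.1):
    """Sample n points from y = sin(x) + Gaussian noise on [0, 2*pi]."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Closed-form least squares for y = a*x + b (fixed, low capacity)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_knn(xs, ys):
    """k-NN regression with k ~ sqrt(n), so flexibility grows with the data."""
    k = max(1, int(math.sqrt(len(xs))))
    points = list(zip(xs, ys))

    def predict(x):
        # Average the targets of the k training points closest to x.
        nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def learning_curves(sizes=(20, 200, 2000), seed=0):
    """Return (n, linear_mse, knn_mse) for each training-set size."""
    rng = random.Random(seed)
    # Evaluate against the noise-free target on an evenly spaced grid.
    grid = [2.0 * math.pi * i / 199 for i in range(200)]
    results = []
    for n in sizes:
        xs, ys = make_data(n, rng)
        row = [n]
        for fit in (fit_linear, fit_knn):
            model = fit(xs, ys)
            mse = sum((model(x) - math.sin(x)) ** 2 for x in grid) / len(grid)
            row.append(mse)
        results.append(tuple(row))
    return results


if __name__ == "__main__":
    for n, lin, knn in learning_curves():
        print(f"n={n:5d}  linear MSE={lin:.4f}  k-NN MSE={knn:.4f}")
```

In this setup the straight line cannot represent the curve no matter how many samples it sees, so its error stays near a fixed floor, while the flexible model keeps improving as the training set grows. The same qualitative pattern is what the key insight attributes to deep networks at much larger scale.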

Topics Covered

Supervised and Unsupervised Learning: The two primary paradigms of learning.
Issues in Machine Learning: Fundamental challenges like overfitting and the curse of dimensionality.
Linear Models for Regression: Predicting continuous values using maximum likelihood and Bayesian approaches.
Linear Models for Classification: Separating classes using projections and generative models.
Probabilistic Discriminative Models: Directly modeling posterior probabilities with logistic regression.
Kernel Methods and SVMs: Mapping data to higher dimensions to solve non-linear problems.
Clustering and Mixture Models: Finding patterns in unlabeled data using EM and GMMs.
Sequential Data: Modeling time-dependent data with Markov Chains and HMMs.
