Naive Bayes
Naive Bayes is a probabilistic generative model for multi-class classification. It treats both the inputs $x$ and the output $y$ as random variables.
1. Bayes' Rule and Generative ML
For generative models, we use Bayes' Rule to obtain the posterior from the likelihood and the prior:

$$P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)}$$

In classification, we want to find the class that maximizes the posterior. Since the evidence $P(x)$ does not depend on $y$, it can be dropped:

$$\hat{y} = \arg\max_y P(y \mid x) = \arg\max_y P(x \mid y)\,P(y)$$
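As a quick numeric sketch of the rule above (the prior and likelihood numbers are hypothetical, chosen only for illustration):

```python
# Hypothetical spam-filter numbers, made up to illustrate Bayes' Rule.
p_spam = 0.3                 # prior P(spam)
p_ham = 0.7                  # prior P(ham)
p_word_given_spam = 0.8      # likelihood P(word | spam)
p_word_given_ham = 0.1       # likelihood P(word | ham)

# Evidence P(word) via the law of total probability
p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham

# Posterior P(spam | word) by Bayes' Rule
posterior_spam = p_word_given_spam * p_spam / p_word
print(round(posterior_spam, 3))  # → 0.774
```

Note that the evidence term only rescales the scores; for picking the argmax class it can be skipped entirely.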
2. The Naive Assumption: Feature Independence
The "Naive" part comes from the assumption that given the class , all features are independent:
Example: Likelihood from Data
Consider a dataset with features $x_1, x_2$ and target $y$:

| # | $x_1$ | $x_2$ | $y$ |
|---|---|---|---|
| 1 | 3 | 5 | 0 |
| 2 | 6 | 1 | 1 |
| 3 | 1 | 9 | 0 |
| 4 | 10 | 1 | 1 |
To classify a new point $x$, we calculate $P(y = 0 \mid x)$ and $P(y = 1 \mid x)$ and pick the class with the larger value.
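Using the toy table above with raw (unsmoothed) counts, classifying a hypothetical new point $x = (3, 9)$ can be sketched as:

```python
# Rows from the table above: ((x1, x2), y)
data = [((3, 5), 0), ((6, 1), 1), ((1, 9), 0), ((10, 1), 1)]

def score(x, c):
    """Unnormalized posterior: P(y=c) * prod_j P(x_j | y=c), by counting."""
    rows = [feats for feats, y in data if y == c]
    s = len(rows) / len(data)  # prior P(y = c)
    for j, v in enumerate(x):
        s *= sum(1 for r in rows if r[j] == v) / len(rows)  # MLE of P(x_j = v | y = c)
    return s

x_new = (3, 9)
scores = {c: score(x_new, c) for c in (0, 1)}
print(scores, max(scores, key=scores.get))  # class 0 wins: 0.5 * 0.5 * 0.5 = 0.125
```

Class 1 scores exactly zero here, because the values 3 and 9 never occur together with $y = 1$ in the training data; this is precisely the problem that smoothing (next section) addresses.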
Class-Conditional Densities
P(x | Class): The distribution of the features within each class (e.g., Spam vs. Ham in a spam filter). Naive Bayes estimates each feature's class-conditional distribution independently.
Posterior Probabilities
P(Class | x): The final probability used for classification, obtained via Bayes' Rule.
3. Laplace Smoothing
If a feature value $v$ never appears with a class $c$ in the training set (e.g., $P(x_j = v \mid y = c) = 0$), the entire product becomes zero. We fix this with smoothing:

$$P(x_j = v \mid y = c) = \frac{\mathrm{count}(x_j = v,\, y = c) + \alpha}{\mathrm{count}(y = c) + \alpha K_j}$$

where $K_j$ is the number of possible values for feature $j$, and $\alpha$ is the smoothing parameter (usually $\alpha = 1$).
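The smoothed estimate can be sketched in a few lines (the counts below are hypothetical, e.g. a class seen twice and a feature with three possible values):

```python
def smoothed_prob(count_value_and_class, count_class, K_j, alpha=1.0):
    """Laplace-smoothed estimate of P(x_j = v | y = c)."""
    return (count_value_and_class + alpha) / (count_class + alpha * K_j)

# A value never seen with the class (count 0) no longer yields probability 0:
print(smoothed_prob(0, 2, K_j=3))  # (0 + 1) / (2 + 1*3) = 0.2
print(smoothed_prob(1, 2, K_j=3))  # (1 + 1) / (2 + 1*3) = 0.4
```

Note that for a fixed class, the smoothed probabilities over all $K_j$ values still sum to 1.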
4. Summary of Generative Approach
| Generative Feature | Description |
|---|---|
| Probability | Models the joint probability $P(x, y) = P(x \mid y)\,P(y)$ |
| Independence | Assumes features are conditionally independent given the class |
| Rule | Uses $\hat{y} = \arg\max_y P(x \mid y)\,P(y)$ |
| Smoothing | Necessary to handle unseen feature values |
Goal: We want to find the class $y$ that is most likely to have generated the observed features $x$.
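Putting all the pieces together, a minimal categorical Naive Bayes with Laplace smoothing might look like the sketch below (all names are our own; log-probabilities are used so the product of many small per-feature terms does not underflow):

```python
from collections import Counter, defaultdict
from math import log

class NaiveBayes:
    """Sketch of categorical Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.n = len(y)
        self.class_counts = Counter(y)            # for the prior P(y = c)
        n_features = len(X[0])
        # K_j: number of distinct values observed for feature j
        self.K = [len({x[j] for x in X}) for j in range(n_features)]
        # (class, feature index) -> Counter of feature values
        self.value_counts = defaultdict(Counter)
        for x, c in zip(X, y):
            for j, v in enumerate(x):
                self.value_counts[(c, j)][v] += 1
        return self

    def predict(self, x):
        def log_score(c):
            s = log(self.class_counts[c] / self.n)        # log prior
            for j, v in enumerate(x):
                num = self.value_counts[(c, j)][v] + self.alpha
                den = self.class_counts[c] + self.alpha * self.K[j]
                s += log(num / den)                       # log smoothed likelihood
            return s
        return max(self.classes, key=log_score)

# Fit on the toy table from the example above and classify a new point
nb = NaiveBayes().fit([(3, 5), (6, 1), (1, 9), (10, 1)], [0, 1, 0, 1])
print(nb.predict((3, 9)))  # → 0
```

Thanks to smoothing, the previously zero-probability class now receives a small positive score, but class 0 still wins for this point.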