Logistic Regression
Despite its name, Logistic Regression is a classification algorithm. Unlike generative models, which model the class-conditional distributions, logistic regression directly models the posterior probability of the classes.
1. The Probabilistic Model
Logistic Regression takes a discriminative approach. We model the probability of a binary outcome $y \in \{0, 1\}$ as a Bernoulli distribution:

$$P(y = 1 \mid \mathbf{x}) = \sigma(\mathbf{w}^\top \mathbf{x} + b)$$

where $\sigma$ is the Sigmoid (Logistic) Function:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

The probability for a single sample $(\mathbf{x}_i, y_i)$ is:

$$P(y_i \mid \mathbf{x}_i) = \sigma(\mathbf{w}^\top \mathbf{x}_i + b)^{\,y_i} \bigl(1 - \sigma(\mathbf{w}^\top \mathbf{x}_i + b)\bigr)^{1 - y_i}$$
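The sigmoid and the resulting Bernoulli probability can be sketched in a few lines of plain Python (the weights and bias below are hypothetical values chosen for illustration):

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """P(y = 1 | x) under the Bernoulli model, for weight vector w and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical parameters: z = 0.5*1.0 - 0.25*2.0 + 0.1 = 0.1
p = predict_proba([1.0, 2.0], w=[0.5, -0.25], b=0.1)  # sigmoid(0.1) ≈ 0.525
```

Note that `sigmoid(0)` is exactly 0.5, which is why the decision boundary falls where $\mathbf{w}^\top \mathbf{x} + b = 0$.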
[Interactive figure: logistic sigmoid. Adjust the weight and bias to see how the decision boundary (p = 0.5) shifts and how the model's certainty changes.]
2. Training: Maximum Likelihood
Given a dataset $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, the Likelihood is the product of individual probabilities (assuming i.i.d.):

$$L(\mathbf{w}, b) = \prod_{i=1}^{N} p_i^{\,y_i} (1 - p_i)^{1 - y_i}, \quad \text{where } p_i = \sigma(\mathbf{w}^\top \mathbf{x}_i + b)$$
Log-Likelihood Derivation
To find the optimal parameters $(\mathbf{w}, b)$, we maximize the Log-Likelihood:

$$\ell(\mathbf{w}, b) = \sum_{i=1}^{N} \bigl[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \bigr]$$

In practice, we often minimize the Negative Log-Likelihood (NLL), which is the same as the Binary Cross-Entropy Loss:

$$\mathrm{NLL}(\mathbf{w}, b) = -\sum_{i=1}^{N} \bigl[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \bigr]$$
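A minimal sketch of the binary cross-entropy loss, averaged over samples (the clipping constant `eps` is an implementation detail added here to avoid `log(0)`):

```python
import math

def bce_loss(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy = average negative log-likelihood."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip predictions away from 0 and 1
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

loss = bce_loss([1, 0, 1], [0.9, 0.1, 0.8])  # confident, mostly correct -> small loss
```

Confident correct predictions drive the loss toward 0, while confident wrong ones are penalized heavily, exactly as the log terms in the NLL dictate.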
3. Decision Boundary
The decision boundary is the set of points where the probability of both classes is equal:

$$\sigma(\mathbf{w}^\top \mathbf{x} + b) = 0.5 \iff \mathbf{w}^\top \mathbf{x} + b = 0$$

Since $\mathbf{w}^\top \mathbf{x} + b = 0$ defines a hyperplane, this results in a linear decision boundary.
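We can verify this numerically: any point on the hyperplane $\mathbf{w}^\top \mathbf{x} + b = 0$ gets probability exactly 0.5 (the 2-D parameters below are hypothetical):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 2-D parameters for illustration
w = [2.0, -1.0]
b = 0.5

# Pick x1 = 1 and solve w[0]*x1 + w[1]*x2 + b = 0 for x2
x1 = 1.0
x2 = -(w[0] * x1 + b) / w[1]      # x2 = 2.5
z = w[0] * x1 + w[1] * x2 + b     # z = 0 on the boundary
p = sigmoid(z)                    # exactly 0.5: maximal uncertainty
```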
Goal: We find the parameters that maximize the likelihood of our observed data. There is no closed-form solution, so we use iterative optimization like Gradient Descent.
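The iterative optimization can be sketched with plain batch gradient descent, using the standard NLL gradient $\partial \mathrm{NLL}/\partial \mathbf{w} = \sum_i (p_i - y_i)\,\mathbf{x}_i$ (the toy dataset, learning rate, and epoch count below are illustrative choices, not prescriptions):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Batch gradient descent on the averaged NLL (a sketch, not production code)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi                  # (p_i - y_i) term of the gradient
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy 1-D data: label 1 roughly when x > 2
X = [[0.0], [1.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
w, b = fit_logistic(X, y)
```

After training, the learned boundary ($-b / w_0$ in 1-D) sits between the two classes, so points near 0 get probability below 0.5 and points near 4 get probability above it.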