Gaussian discriminant analysis
(1.5 hours to learn)
Summary
Gaussian discriminant analysis (GDA) is a generative model for classification where the distribution of each class is modeled as a multivariate Gaussian.
Context
This concept has the prerequisites:
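- binary linear classifiers (With two classes and a shared covariance matrix, GDA is a binary linear classifier.)
- mixture of Gaussians models (The GDA generative model is a mixture of Gaussians in which the component labels are observed.)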
- maximum likelihood (GDA is fit using maximum likelihood.)
- covariance matrices (The GDA solution is given in terms of covariance matrices.)
- dot product
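As a rough illustration of how the maximum likelihood and covariance matrix prerequisites fit together, here is a minimal Python sketch of fitting GDA with a covariance matrix shared across classes. The function names `fit_gda` and `predict_gda`, the toy data, and the use of NumPy are assumptions made for this example and do not come from any of the listed resources.

```python
import numpy as np

def fit_gda(X, y):
    """Maximum-likelihood fit of GDA with one covariance matrix shared by all classes.

    Hypothetical helper for illustration; X has shape (n, d), y has shape (n,).
    """
    classes = np.unique(y)
    n = X.shape[0]
    # Class priors: fraction of training points in each class.
    priors = np.array([np.mean(y == k) for k in classes])
    # Class means: average of the points in each class.
    means = np.array([X[y == k].mean(axis=0) for k in classes])
    # Pooled within-class covariance; the MLE divides by n (not n - K).
    centered = X - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / n
    return classes, priors, means, cov

def predict_gda(X, classes, priors, means, cov):
    """Assign each row of X to the class with the largest linear discriminant score."""
    # Score for class k: x^T S^-1 mu_k - 0.5 * mu_k^T S^-1 mu_k + log(pi_k)
    cov_inv_means = np.linalg.solve(cov, means.T)  # shape (d, K)
    offsets = -0.5 * np.sum(means.T * cov_inv_means, axis=0) + np.log(priors)
    scores = X @ cov_inv_means + offsets
    return classes[np.argmax(scores, axis=1)]

# Toy data (an assumption for this sketch): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(200, 2))
X1 = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(200, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

params = fit_gda(X, y)
accuracy = np.mean(predict_gda(X, *params) == y)
print(f"training accuracy: {accuracy:.3f}")
```

Because the covariance is shared, the quadratic term in x is the same for every class and is dropped from the scores, which is why the classifier above only needs dot products with the vectors S^-1 mu_k.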
Goals
- Derive the form of the decision boundary in the two-class case when the within-class covariance matrices are shared. In particular, show that it is always a hyperplane.
- What do the decision regions look like when there are more than two classes?
- Show that the decision boundary is quadratic when the covariance matrices are not shared between classes.
- Optional: show that the decision boundary in the two-class case is equivalent to performing classification with linear least squares.
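As a hedged starting point for the first and third goals, the two-class log-odds can be written out as follows. The notation is an assumption of this sketch and is not fixed by the listed resources: \pi_k is the prior of class k, \mu_k its mean, and \Sigma the shared covariance matrix.

```latex
\begin{align*}
\log \frac{p(y = 1 \mid x)}{p(y = 0 \mid x)}
  &= \log \frac{\pi_1}{\pi_0}
     - \tfrac{1}{2} (x - \mu_1)^\top \Sigma^{-1} (x - \mu_1)
     + \tfrac{1}{2} (x - \mu_0)^\top \Sigma^{-1} (x - \mu_0) \\
  &= \underbrace{(\mu_1 - \mu_0)^\top \Sigma^{-1}}_{w^\top} x
     + \underbrace{\log \frac{\pi_1}{\pi_0}
       - \tfrac{1}{2} \mu_1^\top \Sigma^{-1} \mu_1
       + \tfrac{1}{2} \mu_0^\top \Sigma^{-1} \mu_0}_{b}
\end{align*}
```

The Gaussian normalizing constants and the x^\top \Sigma^{-1} x terms cancel because both classes use the same \Sigma, so the boundary w^\top x + b = 0 is a hyperplane whenever \mu_1 \neq \mu_0. With class-specific covariances \Sigma_0 and \Sigma_1, the term -\tfrac{1}{2} x^\top (\Sigma_1^{-1} - \Sigma_0^{-1}) x and a log-determinant ratio no longer cancel, which is where the quadratic boundary comes from.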
Core resources (read/watch one of the following)
-Free-
→ The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
Location:
Section 4.3, "Linear discriminant analysis," up through 4.3.2, "Computations for LDA," pages 106-113
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 4.2, "Probabilistic generative models," pages 196-203
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location:
Section 4.2, "Gaussian discriminant analysis," up through 4.2.5, "Strategies for preventing overfitting," pages 101-106
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ Stanford's Machine Learning lecture notes
Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.
See also
- GDA is an example of a generative model.
- GDA is closely related to Fisher's linear discriminant, a method for data visualization.
- In the two-class case with a shared covariance matrix, the GDA decision boundary has the same direction as the one found by linear regression on the class labels.