SVM optimality conditions
(1.1 hours to learn)
Summary
Using Lagrange duality, we can formulate a set of conditions that characterize the optimal solution to the SVM objective. These conditions show that the optimal weight vector is a linear combination of a (hopefully small) subset of the training points — the support vectors — namely those for which the margin constraint is tight.
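The support-vector expansion can be checked numerically. The sketch below (a minimal example, assuming scikit-learn is available; the toy dataset is made up for illustration) fits a linear SVM and verifies that the primal weight vector equals the linear combination of the support vectors weighted by their dual coefficients:

```python
import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable toy dataset (hypothetical illustration data)
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0],
              [4.0, 4.0], [5.0, 5.0], [4.0, 6.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# Large C approximates the hard-margin SVM
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# dual_coef_ stores alpha_i * y_i for the support vectors only,
# so the primal weight vector is their linear combination:
#   w = sum_i alpha_i y_i x_i  (sum over support vectors)
w_from_duals = clf.dual_coef_ @ clf.support_vectors_

assert np.allclose(w_from_duals, clf.coef_)
print("support vectors used:", len(clf.support_vectors_), "of", len(X))
```

Note that only the support vectors appear in `dual_coef_` and `support_vectors_`; points with a slack margin constraint have zero dual coefficient and drop out of the expansion, which is exactly the sparsity the optimality conditions predict.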
Context
This concept has the prerequisites:
- the support vector machine
- Lagrange duality (The optimality conditions for SVMs are derived from the Lagrange dual.)
- KKT conditions (The optimality conditions are an instance of the KKT conditions.)
Core resources (read/watch one of the following)
-Free-
→ Stanford's Machine Learning lecture notes
Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 7.1, up to 7.1.1, pages 326-331
See also
-No Additional Notes-