soft margin SVM
(50 minutes to learn)
The standard SVM objective, which maximizes the margin, is only well defined when the training set is linearly separable. The soft margin SVM relaxes this by introducing slack variables that let some training points violate the margin or be misclassified, with each violation penalized in the objective; a penalty parameter C trades off margin width against training errors. Beyond handling non-separable training sets, it is also more robust to outliers and mislabeled data.
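As a concrete illustration, here is a minimal sketch using scikit-learn's SVC on a toy dataset that is not linearly separable (the dataset and parameter values are invented for illustration). The C parameter controls the softness of the margin: small C tolerates many margin violations, while large C approaches the hard-margin behavior.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two overlapping Gaussian blobs: no hyperplane separates them perfectly,
# so a hard-margin SVM would have no feasible solution.
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
               rng.normal(1.5, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Smaller C typically yields more support vectors (more points
    # inside or violating the margin); training accuracy stays below 1.
    print(f"C={C}: train acc={clf.score(X, y):.2f}, "
          f"support vectors={len(clf.support_)}")
```

The fit succeeds for every C precisely because the slack variables absorb the overlap between the classes; a hard-margin formulation would be infeasible on this data.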
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ Stanford's Machine Learning lecture notes
Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 7.1.1, pages 331-336
Supplemental resources (the following are optional, but you may find them useful)
→ The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
Location: Section 12.2, "The support vector classifier," pages 417-422, not including subsection "Computing the support vector classifier"
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: 14.5-14.5.2, pages 496-502