(1.8 hours to learn)
The Lasso is a form of regularized linear regression. Unlike ridge regression, it puts an L1 penalty on the weights, which encourages sparsity, i.e. it encourages most of the weights to be exactly zero. The general trick of using L1 norms to encourage sparsity is widely used in machine learning.
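The sparsity-inducing effect of the L1 penalty can be seen directly in its proximal (shrinkage) operator. Below is a minimal NumPy sketch (not from any of the listed resources) comparing soft-thresholding, which the L1 penalty induces, against ridge-style L2 shrinkage: the former sets small weights exactly to zero, the latter only scales them down.

```python
import numpy as np

def soft_threshold(w, lam):
    # Proximal operator of the L1 penalty: shrinks every weight toward
    # zero and sets weights with |w| <= lam exactly to zero.
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w = np.array([3.0, -0.5, 0.2, -2.0])
lam = 0.6

l1 = soft_threshold(w, lam)  # two coordinates become exactly zero
l2 = w / (1.0 + lam)         # ridge-style shrinkage: nothing becomes zero

print(l1, np.count_nonzero(l1))
print(l2, np.count_nonzero(l2))
```

Running this shows the L1 operator zeroing out the two small coordinates while the L2 operator merely rescales all four, which is the mechanism behind the lasso's sparse solutions.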
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Sections 13.3-13.3.4, pgs. 429-438
Supplemental resources (the following are optional, but you may find them useful)
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 3.1.4, pgs. 144-146
- Ridge regression is another regularized version of linear regression, using an L2 penalty instead of L1.
- The lasso encourages sparsity of the weight vector. If we believe certain features are likely to be important as a group, we can use group sparsity instead.
- Some algorithms for optimizing the lasso objective include:
- stochastic gradient descent
- least angle regression (LARS)
- Fast Iterative Shrinkage-Thresholding Algorithm (FISTA)
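Of the algorithms above, ISTA (the non-accelerated precursor of FISTA) is the simplest to sketch: it alternates a gradient step on the squared-error term with a soft-thresholding step for the L1 penalty. The following is an illustrative NumPy implementation on synthetic data (the function name, data sizes, and hyperparameters are my own choices, not from the resources above); FISTA adds Nesterov-style momentum on top of the same update.

```python
import numpy as np

def ista_lasso(X, y, lam, n_iters=500):
    # ISTA (proximal gradient descent) on the lasso objective:
    #   (1/2n) * ||y - Xw||^2 + lam * ||w||_1
    n, d = X.shape
    w = np.zeros(d)
    L = np.linalg.norm(X, ord=2) ** 2 / n   # Lipschitz constant of the gradient
    step = 1.0 / L
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / n        # gradient of the smooth part
        w = w - step * grad                 # gradient step
        # proximal step: soft-threshold, zeroing out small coordinates
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -3.0, 1.5]               # sparse ground truth
y = X @ true_w + 0.01 * rng.normal(size=100)

w_hat = ista_lasso(X, y, lam=0.1)
print(np.round(w_hat, 2))  # the irrelevant coefficients are driven to exactly zero
```

Note that the recovered nonzero coefficients are slightly shrunk toward zero relative to the ground truth; this bias is the price the lasso pays for sparsity.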