linear regression as maximum likelihood
(45 minutes to learn)
One way to solve a standard linear regression problem, y = w*x, is to assume that the likelihood of each observed y, p(y; w*x, sigma^2), is Gaussian with mean w*x and variance sigma^2. This assumption means that we believe each observed y is the deterministic value w*x plus random noise: y = w*x + e, where e is drawn independently from a zero-mean Gaussian with variance sigma^2. For a fixed sigma, maximizing the likelihood with respect to w is equivalent to minimizing the sum-of-squares error, Sum[(y-w*x)^2] over all (x, y) pairs, which has a closed-form solution.
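The equivalence between maximum likelihood and least squares can be checked numerically. The following is a minimal sketch, assuming scalar x and w with no intercept term; the function names fit_w and neg_log_likelihood, the synthetic data, and the values true_w = 2.0 and sigma = 0.5 are illustrative choices, not part of the original material. Setting the derivative of Sum[(y-w*x)^2] with respect to w to zero gives w = Sum[x*y] / Sum[x^2], which is what fit_w computes.

```python
import math
import random


def fit_w(xs, ys):
    """Closed-form minimizer of Sum[(y - w*x)^2] for a scalar w (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def neg_log_likelihood(w, xs, ys, sigma):
    """Negative Gaussian log-likelihood of the data under y = w*x + e, e ~ N(0, sigma^2)."""
    n = len(xs)
    sse = sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    # The first term does not depend on w, so minimizing over w only involves sse.
    return 0.5 * n * math.log(2 * math.pi * sigma ** 2) + sse / (2 * sigma ** 2)


# Synthetic data generated from the assumed model (illustrative values).
random.seed(0)
true_w, sigma = 2.0, 0.5
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [true_w * x + random.gauss(0, sigma) for x in xs]

w_hat = fit_w(xs, ys)
print("estimated w:", w_hat)
print("NLL at w_hat:", neg_log_likelihood(w_hat, xs, ys, sigma))
print("NLL at w_hat + 0.1:", neg_log_likelihood(w_hat + 0.1, xs, ys, sigma))
```

Because sigma only scales the squared-error term and adds a constant, the value of w that minimizes the negative log-likelihood is the same for any fixed sigma, so the least-squares estimate does not require knowing the noise level.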
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ Mathematical Monk: Machine Learning (2011)
Online videos on machine learning.
Location: Lecture 9.4: MLE for linear regression
- detailed derivation of maximum likelihood estimator
→ Stanford's Machine Learning lecture notes
Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.
- quick summary of maximum likelihood estimator
Supplemental resources (the following are optional, but you may find them useful)
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Sections 3.1.1-3.1.2, pgs. 140-144