gradient descent

(50 minutes to learn)

Summary

Gradient descent, also known as steepest descent, is an iterative optimization algorithm for finding a local minimum of a differentiable function. At each iteration, it moves the current solution a small step in the direction of the negative gradient of the function (the direction of "steepest descent").
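
For concreteness, here is a minimal sketch of the update rule x_{k+1} = x_k - α ∇f(x_k) in Python; the quadratic objective, step size α = 0.1, and stopping tolerance below are illustrative choices, not prescribed by the algorithm:

    import numpy as np

    def gradient_descent(grad_f, x0, step_size=0.1, tol=1e-6, max_iters=1000):
        """Minimize a differentiable function, given a function grad_f for its gradient."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iters):
            g = grad_f(x)
            if np.linalg.norm(g) < tol:      # gradient near zero: at a stationary point
                break
            x = x - step_size * g            # step in the direction of steepest descent
        return x

    # Illustrative objective: f(x, y) = (x - 3)^2 + 2(y + 1)^2, minimized at (3, -1).
    grad = lambda v: np.array([2.0 * (v[0] - 3.0), 4.0 * (v[1] + 1.0)])
    print(gradient_descent(grad, x0=[0.0, 0.0]))  # approximately [ 3. -1.]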

Context

This concept has the prerequisites:

Goals

  • Be able to apply gradient descent to functions of several variables
  • Why is gradient descent not guaranteed to find the global optimum? (See the sketch after this list.)
  • Under what conditions is gradient descent guaranteed to converge, and what can we say about the solution it obtains?
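
As an illustration of the second question above, the following sketch runs plain gradient descent on a non-convex function f(x) = x^4/4 - x^2/2, which has local minima at x = -1 and x = +1; which one is found depends entirely on the starting point. The function and step size here are our own illustrative choices, not taken from the resources below:

    # Illustrative non-convex function: f(x) = x**4/4 - x**2/2,
    # with local minima at x = -1 and x = +1 and a local maximum at x = 0.
    def grad(x):
        return x**3 - x            # derivative of f

    def descend(x, step=0.1, iters=200):
        for _ in range(iters):
            x -= step * grad(x)    # plain gradient descent update
        return x

    print(descend(0.5))            # converges to +1: the minimum in the starting basin
    print(descend(-0.5))           # converges to -1: a different local minimum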

Core resources (read/watch one of the following)

-Free-

Convex Optimization
A graduate-level textbook on convex optimization.
Location: pages 467–475
Authors: Stephen Boyd, Lieven Vandenberghe
Coursera: Machine Learning (2013)
An online machine learning course aimed at a broad audience.
Author: Andrew Y. Ng
Other notes:
  • Click on "Preview" to see the videos.

Supplemental resources (the following are optional, but you may find them useful)

-Free-

Coursera: Machine Learning
An online machine learning course aimed at advanced undergraduates.
Author: Pedro Domingos
Additional dependencies:
  • perceptron algorithm
Other notes:
  • Click on "Preview" to see the videos.
Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
Location: A.6
Author: David Barber

See also