restricted Boltzmann machines

(2.8 hours to learn)


Restricted Boltzmann machines (RBMs) are a type of undirected graphical model typically used for learning binary feature representations. The structure is a bipartite graph with a layer of visible units that represent the inputs and a layer of hidden units that represent more abstract features. Exact maximum likelihood training is intractable, but approximations such as contrastive divergence work well in practice. RBMs are a building block of many models in deep learning.
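
For reference, the standard binary RBM can be written down in a few lines. The symbols are notation chosen for this sketch rather than taken from the page: v is the visible vector, h the hidden vector, W the weight matrix, a and b the visible and hidden biases, and \sigma the logistic sigmoid.

```latex
E(v, h) = -a^\top v - b^\top h - v^\top W h,
\qquad
p(v, h) = \frac{\exp(-E(v, h))}{Z},
\qquad
Z = \sum_{v, h} \exp(-E(v, h))

p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i W_{ij} v_i\Big),
\qquad
p(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big)
```

Because the graph is bipartite, the units within a layer are conditionally independent given the other layer, which is why both conditionals factorize. The partition function Z sums over every joint configuration of v and h, which is the source of the intractability.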


Goals


  • Know what an RBM is and what distributions it can represent.
  • Understand why training an RBM is intractable. In particular,
    • why is it intractable to compute the gradient?
    • why does the likelihood function have local optima?
  • Know about the contrastive divergence training criterion and understand what approximation is being made.
  • Understand why the bipartite structure of the model simplifies the Gibbs sampling update.
  • Be able to implement an RBM training algorithm such as contrastive divergence.
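
As a rough illustration of the last goal, here is a minimal NumPy sketch of one-step contrastive divergence (CD-1) for a binary RBM. The exact log-likelihood gradient with respect to the weights is the difference between the expectation of v h^T under the data and under the model. CD-1 approximates the model term with samples from a single block Gibbs step started at the data. The function names (`cd1_update`, `reconstruct`), the learning rate, and the toy data are all assumptions made for this example, not part of any referenced resource.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd1_update(v0, W, a, b, rng, lr=0.1):
    """Apply one CD-1 update using a batch of binary visible vectors.

    v0: (batch, n_visible) array of 0/1 values
    W:  (n_visible, n_hidden) weights
    a:  (n_visible,) visible biases; b: (n_hidden,) hidden biases
    """
    # Positive phase: hidden units are conditionally independent given v,
    # so the whole layer is sampled in one vectorized step.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Negative phase: one block Gibbs step (sample v given h, then p(h | v)).
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)

    # Approximate gradient: data statistics minus one-step sample statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    a = a + lr * (v0 - v1).mean(axis=0)
    b = b + lr * (ph0 - ph1).mean(axis=0)
    return W, a, b


def reconstruct(v, W, a, b):
    """Deterministic reconstruction: p(v = 1 | h) with h set to p(h = 1 | v)."""
    return sigmoid(sigmoid(v @ W + b) @ W.T + a)


# Toy usage (hypothetical data): two repeated 6-bit patterns, 2 hidden units.
rng = np.random.default_rng(0)
data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 10, dtype=float)
W = 0.01 * rng.standard_normal((6, 2))
a = np.zeros(6)
b = np.zeros(2)
for _ in range(200):
    W, a, b = cd1_update(data, W, a, b, rng)
print(np.mean((data - reconstruct(data, W, a, b)) ** 2))
```

The sketch uses hidden probabilities rather than sampled states in the weight statistics, a common variance-reduction choice. Reconstruction error is printed only as a rough progress signal, since it is not the quantity that CD-1 optimizes.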

Core resources (read/watch one of the following)


Learning deep architectures for AI (2009)
A review paper on deep learning techniques written by one of the leaders in the field.
Author: Yoshua Bengio
Other notes:
  • Skim chapters 3 and 4 for motivation.
Coursera: Neural Networks for Machine Learning (2012)
An online course by Geoff Hinton, who invented many of the core ideas behind neural nets and deep learning.
Author: Geoffrey E. Hinton
Other notes:
  • You may want to first skim the lectures on Boltzmann machines.

Supplemental resources (the following are optional, but you may find them useful)


The Elements of Statistical Learning
A graduate-level statistical learning textbook with a focus on frequentist methods.
Authors: Trevor Hastie, Robert Tibshirani, Jerome Friedman


See also