Boltzmann machines
(1.5 hours to learn)
Summary
Boltzmann machines are a kind of probabilistic neural network used in density modeling. They can be viewed as an MRF with only pairwise connections between units, and where the units are typically binary-valued. Restricted Boltzmann machines (RBMs) are a widely used special case.
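Concretely, in the standard formulation (the notation below is the usual convention, not fixed by this page), a Boltzmann machine over binary units x with symmetric weights W (zero diagonal) and biases b defines an energy and a Gibbs distribution:

```latex
E(\mathbf{x}) = -\tfrac{1}{2}\,\mathbf{x}^\top W \mathbf{x} - \mathbf{b}^\top \mathbf{x},
\qquad
p(\mathbf{x}) = \frac{1}{Z}\, e^{-E(\mathbf{x})},
\qquad
Z = \sum_{\mathbf{x}'} e^{-E(\mathbf{x}')} .
```

Low-energy configurations are assigned high probability; the normalizer Z sums over all 2^N binary states, which is why exact inference is intractable for large networks and sampling methods such as Gibbs sampling are used instead.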
Context
This concept has the prerequisites:
- Hopfield networks (Boltzmann machines are a probabilistic version of Hopfield networks.)
- maximum likelihood (Boltzmann machines are trained using maximum likelihood.)
- gradient (The gradient is needed for the maximum likelihood update.)
- Gibbs sampling (Gibbs sampling can be used to approximately sample from the equilibrium distribution.)
Goals
- Know the definition of a Boltzmann machine (i.e. what distribution it represents)
- Be able to (approximately) sample from a Boltzmann machine using Gibbs sampling.
- Derive the fact that the model correlations must match the data correlations at the maximum likelihood solution.
- Understand why it can be beneficial to add hidden units to the network.
- Be aware of the analogies between Boltzmann machine updates and Hopfield network updates.
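The sampling goal above can be sketched in code. This is a minimal illustration, not from the resources below: a hypothetical 3-unit fully visible Boltzmann machine, sampled by repeatedly resampling each unit from its logistic conditional given the others.

```python
import numpy as np

# Hypothetical small Boltzmann machine: 3 binary units,
# symmetric weight matrix W with zero diagonal, biases b.
# Energy: E(x) = -(1/2) x^T W x - b^T x.
rng = np.random.default_rng(0)

W = np.array([[ 0.0,  1.0, -0.5],
              [ 1.0,  0.0,  0.5],
              [-0.5,  0.5,  0.0]])
b = np.array([0.0, -0.5, 0.5])

def gibbs_sweep(x, W, b, rng):
    """One full Gibbs sweep: resample each unit given all the others."""
    x = x.copy()
    for i in range(len(x)):
        # p(x_i = 1 | rest) is a logistic function of the unit's total input
        # (the zero diagonal of W means x_i itself does not contribute).
        activation = W[i] @ x + b[i]
        p_on = 1.0 / (1.0 + np.exp(-activation))
        x[i] = 1.0 if rng.random() < p_on else 0.0
    return x

# Run the chain; after burn-in, samples approximate the equilibrium distribution.
x = rng.integers(0, 2, size=3).astype(float)
samples = []
for t in range(5000):
    x = gibbs_sweep(x, W, b, rng)
    if t >= 1000:  # discard burn-in sweeps
        samples.append(x.copy())
samples = np.array(samples)
print(samples.mean(axis=0))  # empirical marginals of the three units
```

Note the connection to Hopfield networks: replacing the stochastic update with a deterministic threshold on the same activation recovers the Hopfield update rule. And for the maximum likelihood goal above, the weight gradient is the difference between data and model pairwise statistics, so at the optimum the model's correlations match the data's.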
Core resources (read/watch one of the following)
Free:
→ Information Theory, Inference, and Learning Algorithms
A graduate-level textbook on machine learning and information theory.
→ Coursera: Neural Networks for Machine Learning (2012)
An online course by Geoff Hinton, who invented many of the core ideas behind neural nets and deep learning.
See also
- Restricted Boltzmann machines (RBMs) are a special case of Boltzmann machines often used in practice.
- The model distribution can also be approximated using the mean field approximation.