deep belief networks
(50 minutes to learn)
Deep belief networks (DBNs) are deep, multilayer graphical models that contain both directed and undirected edges. The bottom layer represents the inputs, and the higher layers are meant to represent increasingly abstract features of the data. DBNs can be trained in a layerwise fashion, and are often used to initialize deep discriminative neural networks, a procedure known as generative pre-training.
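To make the mix of directed and undirected edges concrete, here is one common way to write the joint distribution. The notation is an assumption introduced for illustration and does not come from this page: v = h^0 is the input layer, h^1, ..., h^L are the hidden layers, W^{k+1} is the weight matrix between layers k and k+1, b^k is the bias vector of layer k, and all units are binary.

```latex
% Joint distribution: an undirected RBM over the top two layers,
% followed by directed, top-down conditionals for the lower layers.
P(v, h^1, \dots, h^L)
  = P(h^{L-1}, h^L) \prod_{k=0}^{L-2} P(h^k \mid h^{k+1}),
  \qquad h^0 = v

% Top two layers (undirected, an RBM):
P(h^{L-1}, h^L)
  = \frac{1}{Z} \exp\!\Big( (b^{L-1})^\top h^{L-1} + (b^{L})^\top h^{L}
      + (h^{L-1})^\top W^{L} h^{L} \Big)

% Lower layers (directed, sigmoid belief net):
P(h^k_i = 1 \mid h^{k+1})
  = \sigma\!\Big( b^k_i + \sum_j W^{k+1}_{ij} h^{k+1}_j \Big),
  \qquad \sigma(x) = \frac{1}{1 + e^{-x}}
```

Read top-down, the RBM acts as a prior over the highest-level features, and each directed layer generates the layer below it.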
This concept has the prerequisites:
- Markov random fields (Part of a deep belief net is an MRF.)
- Bayesian networks (Part of a deep belief net is a Bayes net.)
- restricted Boltzmann machines (The top two layers of a DBN form an RBM, and DBNs can be trained by training a sequence of RBMs.)
Goals
- Know the graphical model structure of a DBN and understand what the combination of directed and undirected edges represents.
- Understand why the explaining away effect makes exact inference in a DBN intractable.
- Know how to train a DBN in a layerwise fashion.
- Optional: understand mathematically why layerwise training is guaranteed to improve a variational lower bound on the log-likelihood.
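The layerwise training goal above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the procedure from any of the resources below: the `RBM` class, the `train_dbn` function, and every hyperparameter are hypothetical names and values chosen here. Each RBM is trained with one step of contrastive divergence (CD-1), and hidden-unit probabilities, rather than samples, are passed up as the training data for the next layer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary RBM trained with CD-1 (hypothetical helper for this sketch)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_update(self, v0, lr):
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back down and up again.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

def train_dbn(data, layer_sizes, epochs=10, lr=0.1, batch_size=20, seed=0):
    """Greedy layerwise training: fit one RBM per layer, bottom to top."""
    rng = np.random.default_rng(seed)
    rbms = []
    layer_input = data
    for n_hidden in layer_sizes:
        rbm = RBM(layer_input.shape[1], n_hidden, rng)
        for _ in range(epochs):
            perm = rng.permutation(len(layer_input))
            for start in range(0, len(perm), batch_size):
                batch = layer_input[perm[start:start + batch_size]]
                rbm.cd1_update(batch, lr)
        rbms.append(rbm)
        # The trained layer's hidden probabilities become the next layer's data.
        layer_input = rbm.hidden_probs(layer_input)
    return rbms

# Toy usage on random binary data (illustrative only).
toy_data = (np.random.default_rng(1).random((100, 12)) < 0.3).astype(float)
dbn = train_dbn(toy_data, layer_sizes=[8, 4], epochs=5)
print([r.W.shape for r in dbn])
```

In the resulting stack, only the last RBM keeps its undirected interpretation; the lower RBMs' weights are reused as the directed, top-down weights of the DBN.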
Core resources (read/watch one of the following)
→ Learning deep architectures for AI (2009)
A review paper on deep learning techniques written by one of the leaders in the field.
- Skim chapters 3 and 4 for motivation
→ Coursera: Neural Networks for Machine Learning (2012)
An online course by Geoff Hinton, who invented many of the core ideas behind neural nets and deep learning.
- You may want to skim the lectures on learning sigmoid belief nets (Lecture 13)
→ A fast learning algorithm for deep belief nets (2006)
The research paper which introduced layerwise training of DBNs.
Supplemental resources (the following are optional, but you may find them useful)
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Section 28.2, "Deep generative models," pages 995-998
See also
- Deep belief nets are commonly used for unsupervised pre-training, where one first trains a generative model and then uses it to initialize a discriminative model.
- Deep Boltzmann machines are another closely related deep architecture.