unsupervised pre-training

(1.3 hours to learn)


Training deep feed-forward neural networks can be difficult because the objective function has poor local optima and because complex models are prone to overfitting. Unsupervised pre-training initializes a discriminative neural net from one that was trained with an unsupervised criterion, such as a deep belief network or a deep autoencoder. This can sometimes help with both the optimization and the overfitting problems.
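As a rough illustration of the unsupervised stage, the sketch below greedily trains a stack of autoencoders, one layer at a time, which is one common way to build a deep autoencoder. Everything specific here is an assumption made for the example rather than something this page prescribes: the function names train_autoencoder and pretrain_stack, the tied weights, the sigmoid units, the squared-error loss, the toy data, and all hyperparameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, epochs=200, lr=0.5, seed=0):
    """Fit a one-hidden-layer sigmoid autoencoder with tied weights.

    Returns the encoder weights W (n_in x n_hidden), the hidden bias b,
    and the mean squared reconstruction error recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n, n_in = X.shape
    W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
    b = np.zeros(n_hidden)   # hidden (encoder) bias
    c = np.zeros(n_in)       # reconstruction (decoder) bias
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W + b)        # encode
        R = sigmoid(H @ W.T + c)      # decode with the transposed weights
        err = R - X
        losses.append(np.mean(err ** 2))
        dR = err * R * (1 - R) / n    # gradient at the decoder pre-activation
        dH = (dR @ W) * H * (1 - H)   # gradient at the encoder pre-activation
        # Tied weights receive a gradient from both the encoder and the decoder.
        W -= lr * (X.T @ dH + dR.T @ H)
        b -= lr * dH.sum(axis=0)
        c -= lr * dR.sum(axis=0)
    return W, b, losses

def pretrain_stack(X, layer_sizes, **kw):
    """Greedy layer-wise pre-training: each layer learns to encode the codes below it."""
    params, inp = [], X
    for n_hidden in layer_sizes:
        W, b, _ = train_autoencoder(inp, n_hidden, **kw)
        params.append((W, b))
        inp = sigmoid(inp @ W + b)    # codes become the next layer's input
    return params

# Toy, unlabeled data (an assumption for the demo): two binary prototypes.
rng = np.random.default_rng(1)
protos = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                   [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)
X = protos[rng.integers(0, 2, size=40)]
stack = pretrain_stack(X, [6, 3])     # list of (W, b) pairs, one per hidden layer
```

No labels are used anywhere in this stage. The resulting (W, b) pairs are what would later initialize the hidden layers of the discriminative net.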


Goals


  • Understand why training a deep neural network discriminatively with backpropagation is difficult.
  • Know how a DBN (or a deep autoencoder) can be converted to a discriminative neural net.
  • Understand the justifications for generative pre-training: that it is supposed to find better local optima and prevent overfitting.
    • What evidence supports these claims?
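The conversion mentioned in the second bullet above can be sketched roughly as follows: the pretrained hidden-layer weights are copied into a feed-forward net, a freshly initialized output layer is added on top, and the whole net is then fine-tuned with backpropagation on labeled data. This is a minimal sketch under stated assumptions, not a reference implementation. The names to_classifier, forward and finetune_step are hypothetical, the softmax output and cross-entropy loss are choices made for the example, and the "pretrained" weights are random stand-ins for the result of an unsupervised stage.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def to_classifier(pretrained, n_classes, seed=0):
    """Copy pretrained (W, b) pairs and append a new softmax output layer."""
    rng = np.random.default_rng(seed)
    layers = [(W.copy(), b.copy()) for W, b in pretrained]
    n_top = layers[-1][0].shape[1]
    layers.append((rng.normal(0.0, 0.01, size=(n_top, n_classes)),
                   np.zeros(n_classes)))
    return layers

def forward(layers, X):
    """Return all activations: the input first, class probabilities last."""
    acts = [X]
    for W, b in layers[:-1]:
        acts.append(sigmoid(acts[-1] @ W + b))
    W, b = layers[-1]
    acts.append(softmax(acts[-1] @ W + b))
    return acts

def finetune_step(layers, X, y, lr=0.5):
    """One full-batch backprop step on cross-entropy; returns the loss before the update."""
    n = X.shape[0]
    acts = forward(layers, X)
    probs = acts[-1]
    loss = -np.mean(np.log(probs[np.arange(n), y] + 1e-12))
    delta = probs.copy()
    delta[np.arange(n), y] -= 1.0          # softmax + cross-entropy gradient
    delta /= n
    for i in range(len(layers) - 1, -1, -1):
        W, b = layers[i]
        gW, gb = acts[i].T @ delta, delta.sum(axis=0)
        if i > 0:                          # propagate through the sigmoid layer below
            delta = (delta @ W.T) * acts[i] * (1 - acts[i])
        W -= lr * gW                       # in-place update of this layer
        b -= lr * gb
    return loss

# Stand-in "pretrained" weights and toy labeled data (both assumptions for the demo).
rng = np.random.default_rng(1)
pretrained = [(rng.normal(0.0, 0.5, size=(8, 6)), np.zeros(6)),
              (rng.normal(0.0, 0.5, size=(6, 3)), np.zeros(3))]
protos = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                   [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)
y = rng.integers(0, 2, size=40)
X = protos[y]
net = to_classifier(pretrained, n_classes=2)
losses = [finetune_step(net, X, y) for _ in range(200)]
```

Only the output layer starts from scratch. Every hidden layer starts from its pretrained values and is then adjusted by the supervised gradient, which is the sense in which pre-training acts as an initialization rather than a fixed feature extractor.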

Core resources (read/watch one of the following)


Learning deep architectures for AI (2009)
A review paper on deep learning techniques written by one of the leaders in the field.
Author: Yoshua Bengio

Coursera: Neural Networks for Machine Learning (2012)
An online course by Geoff Hinton, who invented many of the core ideas behind neural nets and deep learning.
Author: Geoffrey E. Hinton

To recognize shapes, first learn to generate images (2006)
The research paper which introduced unsupervised pre-training.
Author: Geoffrey Hinton

Supplemental resources (the following are optional, but you may find them useful)


See also