Unsupervised pre-training
(1.3 hours to learn)
Summary
Training deep feed-forward neural networks can be difficult because of local optima in the objective function and because complex models are prone to overfitting. Unsupervised pre-training initializes a discriminative neural net with the weights of a model trained using an unsupervised criterion, such as a deep belief network (DBN) or a deep autoencoder, and then fine-tunes the whole network on the supervised task. This method can sometimes help with both the optimization and the overfitting issues.
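As a rough illustration of the two-stage pipeline, here is a minimal PyTorch sketch on toy data. The layer sizes, learning rates, and iteration counts are arbitrary choices for this example, not settings from the literature: each layer is first trained as an autoencoder on the previous layer's outputs (the unsupervised criterion), and the pretrained encoders then initialize a classifier that is fine-tuned with backpropagation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in data: 256 inputs of dimension 64 with 10 class labels.
X = torch.rand(256, 64)
y = torch.randint(0, 10, (256,))

sizes = [64, 32, 16]  # layer widths, chosen arbitrarily for illustration

# --- Step 1: unsupervised pre-training (greedy layer-wise autoencoders) ---
encoders = []
inputs = X
for n_in, n_out in zip(sizes[:-1], sizes[1:]):
    enc, dec = nn.Linear(n_in, n_out), nn.Linear(n_out, n_in)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)
    for _ in range(200):  # train this layer to reconstruct its own input
        opt.zero_grad()
        loss = nn.functional.mse_loss(dec(torch.sigmoid(enc(inputs))), inputs)
        loss.backward()
        opt.step()
    encoders.append(enc)
    inputs = torch.sigmoid(enc(inputs)).detach()  # codes feed the next layer

# --- Step 2: fine-tuning (supervised training with backpropagation) ---
layers = []
for enc in encoders:  # pretrained weights initialize the classifier
    layers += [enc, nn.Sigmoid()]
layers.append(nn.Linear(sizes[-1], 10))  # new output layer, random init
net = nn.Sequential(*layers)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(X), y)
    loss.backward()
    opt.step()
```

Classic DBN-based pre-training replaces the per-layer autoencoders with RBMs trained by contrastive divergence; the overall initialize-then-fine-tune structure is the same.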
Context
This concept has the prerequisites:
- feed-forward neural nets (Unsupervised pre-training is a way of training feed-forward neural nets.)
- deep belief networks (DBNs are learned in the pre-training step.)
- backpropagation (The fine-tuning step is done with backpropagation.)
Goals
- Understand why training a deep neural network discriminatively with backpropagation is difficult.
- Know how a DBN (or a deep autoencoder) can be converted to a discriminative neural net (see the sketch after this list).
- Understand the justifications for generative pre-training: that it is supposed to find better local optima and prevent overfitting.
- Know what evidence supports these claims.
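For the conversion step, the sketch below shows the standard recipe in NumPy. The weights here are random stand-ins rather than parameters actually learned by contrastive divergence: each RBM's recognition weights and hidden biases become an ordinary feed-forward layer computing sigmoid(vW + c), and a new output layer is stacked on top before fine-tuning.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in parameters for a two-layer DBN; in practice each (W, c) pair
# would come from an RBM trained with contrastive divergence.
rbms = [
    {"W": rng.normal(scale=0.1, size=(64, 32)), "c": np.zeros(32)},
    {"W": rng.normal(scale=0.1, size=(32, 16)), "c": np.zeros(16)},
]

def recognition_pass(v, rbms):
    """Each RBM becomes a deterministic feed-forward layer computing
    h = sigmoid(v @ W + c), replacing stochastic hidden units with
    their real-valued activation probabilities."""
    h = v
    for p in rbms:
        h = sigmoid(h @ p["W"] + p["c"])
    return h

v = rng.random((5, 64))               # a batch of 5 visible vectors
features = recognition_pass(v, rbms)  # shape (5, 16)
# A randomly initialized softmax layer is then stacked on top, and the
# whole network is fine-tuned with backpropagation on labeled data.
```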
Core resources (read/watch one of the following)
-Free-
→ Learning deep architectures for AI (2009)
A review paper on deep learning techniques written by one of the leaders in the field.
→ Coursera: Neural Networks for Machine Learning (2012)
An online course by Geoff Hinton, who invented many of the core ideas behind neural nets and deep learning.
→ To recognize shapes, first learn to generate images (2006)
A paper by Geoff Hinton arguing that learning a generative model of images is an effective way to obtain features for recognition.
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Reducing the dimensionality of data with neural networks (2006)
The Science paper by Hinton and Salakhutdinov showing that deep autoencoders can be trained effectively when initialized with layer-wise RBM pre-training.
See also
- The advantages of unsupervised pre-training relate to the distinction between generative and discriminative models