hidden Markov models
(1.5 hours to learn)
Summary
Hidden Markov models (HMMs) are a kind of probabilistic model widely used in speech and language processing. There is a discrete latent state which evolves over time as a Markov chain, and the current observations depend stochastically on the current latent state. HMMs are popular because they support efficient exact inference algorithms.
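The generative process described above can be sketched in a few lines. This is a minimal illustration only: the two-state weather model, its probabilities, and the names `draw` and `sample_hmm` are all assumptions made up for the example, not taken from any of the resources listed below.

```python
import random

# Hypothetical two-state model; every name and number here is illustrative.
states = ["rainy", "sunny"]
symbols = ["umbrella", "no umbrella"]
initial = {"rainy": 0.5, "sunny": 0.5}
transition = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.2, "sunny": 0.8},
}
emission = {
    "rainy": {"umbrella": 0.9, "no umbrella": 0.1},
    "sunny": {"umbrella": 0.2, "no umbrella": 0.8},
}

def draw(dist, rng):
    """Draw one outcome from a dict mapping outcomes to probabilities."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def sample_hmm(length, rng):
    """Return (latent_states, observations), each a list of `length` items."""
    latent, observed = [], []
    state = draw(initial, rng)  # the chain starts from the initial distribution
    for _ in range(length):
        latent.append(state)
        # The observation depends only on the current latent state.
        observed.append(draw(emission[state], rng))
        # The next state depends only on the current state (Markov property).
        state = draw(transition[state], rng)
    return latent, observed

rng = random.Random(0)
latent, observed = sample_hmm(5, rng)
```

In a real application only `observed` would be available; `latent` is the part that inference algorithms try to recover.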
Context
This concept has the prerequisites:
- random variables (HMMs are a way of organizing information about random variables.)
- Markov chains (HMMs are based on Markov chains.)
- matrix multiplication (HMMs can be conveniently represented using transition matrices.)
- conditional distributions (HMMs are defined in terms of the transition distribution, which is a conditional distribution.)
- conditional independence (HMMs can be defined in terms of a conditional independence property.)
- Bayesian networks (An HMM can be represented in terms of a Bayes net.)
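The last few prerequisites meet in the factorization of the joint distribution. As a sketch, write z_t for the latent state and x_t for the observation at time t, for a sequence of length T (these symbols are a notational assumption, not fixed by this page). The Markov property and the conditional independence of each observation given its state then give:

```latex
p(z_{1:T}, x_{1:T}) = p(z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1}) \prod_{t=1}^{T} p(x_t \mid z_t)
```

Here p(z_t | z_{t-1}) is the transition distribution, which can be stored as a transition matrix when the state is discrete, and p(x_t | z_t) is the observation distribution.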
Core resources (read/watch one of the following)
-Free-
→ A Revealing Introduction to Hidden Markov Models
→ Mathematical Monk: Machine Learning (2011)
Online videos on machine learning.
→ A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition
-Paid-
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Sections 13.1-13.2 (up to 13.2.1), pages 607-615
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location:
Sections 17.2-17.2.1 (pages 589-590) and 17.3-17.3.1 (pages 603-606)
→ Artificial Intelligence: a Modern Approach
A textbook giving a broad overview of all of AI.
Location:
Sections 15.1-15.3, pages 537-551
See also
- Some common applications of HMMs:
- Hidden semi-Markov models are an elaboration of HMMs which explicitly model the duration distribution of each state.
- We can perform exact inference efficiently in an HMM using the forward-backward algorithm.
- We can learn the parameters of an HMM using the Baum-Welch algorithm.
- HMMs can be seen as a kind of Bayesian network.
- Dynamic Bayes nets are a kind of time series model with multiple variables at each time step, which influence each other.
- The particle filter is a way of performing inference in HMMs when the state space is too large to represent exactly.
- Kalman filters are a widely used inference method for the continuous-state analogue of HMMs, in which the dynamics are linear and all of the variables are Gaussian.
- Recurrent neural networks are another class of model often applied to sequence data.
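To make the efficient exact inference mentioned in the list above concrete, here is a minimal sketch of the forward pass, the half of the forward-backward algorithm that computes the likelihood of an observation sequence. The toy model and the function name `forward_likelihood` are assumptions made up for illustration.

```python
# Hypothetical two-state model; every name and number here is illustrative.
initial = {"rainy": 0.5, "sunny": 0.5}
transition = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.2, "sunny": 0.8},
}
emission = {
    "rainy": {"umbrella": 0.9, "no umbrella": 0.1},
    "sunny": {"umbrella": 0.2, "no umbrella": 0.8},
}

def forward_likelihood(observations, initial, transition, emission):
    """Return p(observations), summing over all latent state paths."""
    # alpha[s] holds p(x_1, ..., x_t, z_t = s) for the current time step t.
    alpha = {s: initial[s] * emission[s][observations[0]] for s in initial}
    for obs in observations[1:]:
        # Propagate one step through the chain, then weight by the emission.
        alpha = {
            s: emission[s][obs] * sum(alpha[r] * transition[r][s] for r in alpha)
            for s in alpha
        }
    # Marginalize out the final latent state.
    return sum(alpha.values())

likelihood = forward_likelihood(
    ["umbrella", "umbrella", "no umbrella"], initial, transition, emission
)
```

The cost grows linearly with the sequence length and quadratically with the number of states, rather than exponentially as a sum over all state paths would. For long sequences the `alpha` values underflow, so practical implementations rescale them at each step or work in log space.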