Markov chains
(45 minutes to learn)
Summary
In a Markov chain, a system transitions stochastically from one state to another. It is a memoryless process, in the sense that the distribution over the next state depends only on the current state, and not on the state at any past time. Markov chains are useful models of many natural processes and the basis of powerful techniques in probabilistic inference and randomized algorithms.
Context
This concept has the prerequisites:
- conditional distributions (Markov chains are defined in terms of conditional distributions.)
- conditional independence (Markov chains are defined in terms of conditional independence.)
- matrix multiplication (The transition operator is conveniently represented in terms of matrix multiplication.)
Core resources (read/watch one of the following)
-Free-
→ Coursera: Probabilistic Graphical Models (2013)
An online course on probabilistic graphical models.
Location:
Lecture "Markov Chain Monte Carlo"
Other notes:
- Click on "Preview" to see the videos.
→ Information Theory, Inference, and Learning Algorithms
A graudate-level textbook on machine learning and information theory.
-Paid-
→ Probabilistic Graphical Models: Principles and Techniques
A very comprehensive textbook for a graduate-level course on probabilistic AI.
Location:
Section 12.3.2, pages 507-512
Supplemental resources (the following are optional, but you may find them useful)
-Paid-
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graudate-level machine learning textbook.
Location:
Sections 17.2-17.2.1 (pages 589-590) and 17.2.3 (pages 596-600)
→ Randomized Algorithms
A graduate-level textbook on randomized algorithms.
Location:
Section 6.2, pages 129-132
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location:
Section 11.2.1, pages 539-541
See also
- Markov models are a probabilistic model which models the data as Markov chains.
- Markov chain Monte Carlo (MCMC) is a powerful class of techniques for approximate inference in probabilistic models.