(2.1 hours to learn)
Bayesian networks are a graphical formalism for representing the structure of a probabilistic model, i.e. the ways in which the random variables may depend on each other. Intuitively, they are good at representing domains with a causal structure, and the edges in the graph determine which variables directly influence which other variables. They can be equivalently viewed as representing a factorization structure of the joint probability distribution, or as encoding a set of conditional independence assumptions about the distribution.
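As a minimal sketch of the factorization view, consider a hypothetical three-variable network, Rain → WetGrass ← Sprinkler, with all variables binary. The variable names and every probability here are made up for illustration. The joint distribution is the product of one factor per node, each conditioned on that node's parents:

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler (all variables binary).
# Every number below is an illustrative assumption, not real data.
p_rain = {True: 0.2, False: 0.8}        # P(Rain)
p_sprinkler = {True: 0.1, False: 0.9}   # P(Sprinkler)
# P(WetGrass=True | Rain, Sprinkler), keyed by (rain, sprinkler).
p_wet_given = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) as a product of one factor per node."""
    p_wet = p_wet_given[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_wet if wet else 1.0 - p_wet)


# The factors define a valid distribution: the joint sums to 1
# over all eight assignments.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))
```

In this sketch the three local tables need 1 + 1 + 4 = 6 free parameters, whereas an unrestricted joint table over three binary variables needs 7. The saving grows quickly as the number of variables increases and the graph stays sparse.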
This concept has the prerequisites:
Core resources (read/watch one of the following)
→ Coursera: Machine Learning
→ Coursera: Probabilistic Graphical Models (2013)
→ Artificial Intelligence II (IIT video lectures)
- A YouTube comment from user "SiddharthBhaiVideos" provides a nice outline of the lecture.
→ Probabilistic Graphical Models: Principles and Techniques
A very comprehensive textbook for a graduate-level course on probabilistic AI.
Location: Sections 3.1-3.2.2, pages 45-60
→ Artificial Intelligence: A Modern Approach
A textbook giving a broad overview of all of AI.
Location: Sections 14.1-14.2, pages 492-499
Supplemental resources (the following are optional, but you may find them useful)
→ Machine Learning: A Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Sections 10.1-10.2.4, pages 307-318
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Sections 8-8.1.3, pages 359-369
- linear regression
- Bayes nets are closely related to Markov random fields (MRFs), a graphical model formalism which is good at representing soft constraints between variables.
- Neither Bayes nets nor MRFs are strictly more powerful than the other.
- Bayes nets can also be characterized in terms of their conditional independencies, which can be found using d-separation.
- Often we are interested in:
  - inferring the conditional probabilities of some variables given others
  - learning the parameters of the network from data
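The last few points can be made concrete with a small sketch. It assumes a hypothetical binary network, Rain → WetGrass ← Sprinkler, with made-up probabilities, and the helper names `prob_rain` and `fit_prior` are invented for illustration. Inference is done by brute-force enumeration of the joint, which is only feasible for tiny networks. Parameter learning is shown for a single root node, assuming fully observed data.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler, all variables binary.
# All probabilities are made-up illustrative numbers.
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.8, (False, False): 0.0}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of one factor per node."""
    p_r = P_RAIN if rain else 1.0 - P_RAIN
    p_s = P_SPRINKLER if sprinkler else 1.0 - P_SPRINKLER
    p_w = P_WET[(rain, sprinkler)] if wet else 1.0 - P_WET[(rain, sprinkler)]
    return p_r * p_s * p_w


def prob_rain(**evidence):
    """P(Rain=True | evidence), summing the joint over all assignments."""
    num = den = 0.0
    for rain, sprinkler, wet in product([True, False], repeat=3):
        world = {"rain": rain, "sprinkler": sprinkler, "wet": wet}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # assignment is inconsistent with the evidence
        p = joint(rain, sprinkler, wet)
        den += p
        if rain:
            num += p
    return num / den


def fit_prior(samples):
    """Maximum-likelihood estimate of P(Rain=True): the observed frequency."""
    return sum(1 for rain in samples if rain) / len(samples)


prior = prob_rain()                               # no evidence
given_sprinkler = prob_rain(sprinkler=True)       # same as the prior
given_wet = prob_rain(wet=True)                   # higher than the prior
given_both = prob_rain(wet=True, sprinkler=True)  # lower than given_wet
estimate = fit_prior([True, False, False, False, True])
```

With these numbers, observing the sprinkler alone leaves the belief in rain unchanged, because the two causes are d-separated when WetGrass is unobserved. Once WetGrass is observed, learning that the sprinkler was on lowers the probability of rain (the "explaining away" effect). For a node with parents, the same counting estimate would be computed separately for each configuration of the parents.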