(1.7 hours to learn)
Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. In other words, it is possible, for example, that variations in three or four observed variables mainly reflect the variations in fewer unobserved variables. Factor analysis searches for such joint variations in response to unobserved latent variables. The observed variables are modelled as linear combinations of the potential factors, plus "error" terms. [from Wikipedia]
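The generative model described above can be sketched in a few lines of NumPy: observed vectors are produced as a linear combination of latent factors plus per-dimension Gaussian noise. The dimensions, loadings, and noise levels below are illustrative choices, not values from any particular dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: n samples, d observed variables, k latent factors
n, d, k = 20000, 5, 2

# Factor analysis generative model: x = W z + mu + eps,
# with z ~ N(0, I_k) and eps ~ N(0, Psi), Psi diagonal.
W = rng.normal(size=(d, k))          # factor loading matrix
mu = rng.normal(size=d)              # mean of the observed variables
psi = rng.uniform(0.1, 0.5, size=d)  # diagonal noise variances

z = rng.normal(size=(n, k))                   # latent factors
eps = rng.normal(size=(n, d)) * np.sqrt(psi)  # independent per-dimension noise
X = z @ W.T + mu + eps                        # observed data

# The implied marginal covariance of x is W W^T + Psi;
# for large n the empirical covariance should be close to it.
model_cov = W @ W.T + np.diag(psi)
emp_cov = np.cov(X, rowvar=False)
print(np.abs(model_cov - emp_cov).max())  # small for large n
```

Integrating out the latent factors gives the marginal covariance `W W^T + diag(psi)`, which is why the sketch checks the empirical covariance against that matrix rather than against the factors directly.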
Goals:
- Understand the probabilistic interpretation of factor analysis and how the model is closely related to a number of other common probabilistic models used in machine learning.
- Understand how latent variables can capture higher-order correlations, and how this applies to factor analysis.
Core resources (read/watch one of the following)
→ Bayesian Reasoning and Machine Learning
A textbook for a graduate machine learning course.
→ Machine Learning: a Probabilistic Perspective
A very comprehensive graduate-level machine learning textbook.
Location: Section 12.1, pages 381-387
Supplemental resources (the following are optional, but you may find them useful)
→ Pattern Recognition and Machine Learning
A textbook for a graduate machine learning course, with a focus on Bayesian methods.
Location: Section 12.2.4, pages 583-586
- probabilistic PCA
- Other models related to factor analysis:
  - principal component analysis (PCA), which finds the maximum variance directions by solving an eigenvalue problem
  - probabilistic PCA, a similar generative model, but where the noise covariance is spherical rather than diagonal
  - probabilistic matrix factorization (PMF), a Bayesian model for predicting missing entries of a matrix
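The distinction between factor analysis and probabilistic PCA, diagonal versus spherical noise covariance, can be seen directly in scikit-learn (an assumed library choice, not one named by the resources above): `FactorAnalysis` fits one noise variance per observed dimension, while `PCA.noise_variance_` is the single shared variance of the probabilistic PCA model. The toy data below puts much larger noise on one dimension to make the difference visible.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)

# Toy data: 2 latent factors, 5 observed dimensions,
# with one dimension far noisier than the rest.
n, d, k = 500, 5, 2
W = rng.normal(size=(d, k))
noise_scale = np.array([0.1, 0.1, 0.1, 0.1, 2.0])
X = rng.normal(size=(n, k)) @ W.T + rng.normal(size=(n, d)) * noise_scale

fa = FactorAnalysis(n_components=k).fit(X)
ppca = PCA(n_components=k).fit(X)  # PCA.noise_variance_ is the PPCA sigma^2

# Factor analysis estimates a diagonal Psi: one variance per dimension,
# so the noisy 5th dimension gets its own large estimate.
print(fa.noise_variance_.shape)  # one entry per observed variable
# Probabilistic PCA assumes spherical noise: a single shared variance.
print(float(ppca.noise_variance_))
```

Because factor analysis can absorb the noisy dimension into its diagonal noise term, its factors are not dragged toward that dimension, whereas PPCA must average all residual variance into one scalar.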