# probabilistic PCA

(1.6 hours to learn)

## Summary

Probabilistic principal component analysis (pPCA) is a formulation of PCA as a latent variable model. Each data point is assumed to be generated as a linear function of Gaussian latent variables, plus spherical Gaussian noise. Like PCA, it has a closed-form maximum likelihood solution, given by a truncated eigendecomposition of the sample covariance matrix.
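As a rough sketch of the model and its closed-form fit (following the Tipping & Bishop maximum likelihood solution; the dimensions and noise level below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Sample synthetic data from the assumed pPCA model (illustrative sizes) ---
d, q, n = 5, 2, 2000                  # observed dim, latent dim, number of points
W_true = rng.normal(size=(d, q))      # linear map from latent to observed space
sigma2_true = 0.1                     # spherical noise variance
Z = rng.normal(size=(n, q))           # Gaussian latent variables
X = Z @ W_true.T + rng.normal(scale=np.sqrt(sigma2_true), size=(n, d))

# --- Closed-form maximum likelihood fit ---
S = np.cov(X, rowvar=False, bias=True)              # sample covariance
eigvals, eigvecs = np.linalg.eigh(S)                # ascending eigenvalues
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending

sigma2_ml = eigvals[q:].mean()        # noise = average of discarded eigenvalues
W_ml = eigvecs[:, :q] * np.sqrt(eigvals[:q] - sigma2_ml)  # rotation R taken as I

# The model covariance W W^T + sigma^2 I should approximate S
C = W_ml @ W_ml.T + sigma2_ml * np.eye(d)
print(np.abs(C - S).max())            # small residual from the discarded components
```

The ML solution is only identified up to an arbitrary rotation of the latent space, which is why taking R = I above is a valid choice; the implied covariance W W^T + sigma^2 I is rotation-invariant.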

## Context

This concept has the prerequisites:

- principal component analysis
- computations on multivariate Gaussians (Probabilistic PCA involves performing inference in a model involving multivariate Gaussians.)
- maximum likelihood (Fitting probabilistic PCA is done using maximum likelihood.)
- principal component analysis (proof) (The maximum likelihood solution is derived using a variation of the proof of PCA correctness.)
- optimization problems (Finding the maximum likelihood solution requires solving an optimization problem.)

## Core resources (read/watch one of the following)

## -Paid-

→ Pattern Recognition and Machine Learning

A textbook for a graduate machine learning course, with a focus on Bayesian methods.

Location:
Sections 12.2-12.2.1, pages 570-577

## Supplemental resources (the following are optional, but you may find them useful)

## -Free-

→ Bayesian Reasoning and Machine Learning

A textbook for a graduate machine learning course.

Location:
Section 21.4, page 436

Additional dependencies:

- factor analysis

## -Paid-

→ Machine Learning: a Probabilistic Perspective

A very comprehensive graduate-level machine learning textbook.

Location:
Section 12.2.4, pages 395-396

## See also

- Probabilistic latent semantic analysis (pLSA) is another probabilistic model similar to PCA, but better geared toward discrete data.
- Bayesian PCA is a Bayesian version of pPCA.