A
- A* search
- abstract data types
- AdaBoost
- adaptive rejection sampling
- agglomerative clustering
- Akaike information criterion
- alpha-beta pruning
- analyzing recursive algorithms
- annealed importance sampling
- arc consistency
- asymptotic complexity
- asymptotics of maximum likelihood
- automatic differentiation
- autoregressive generative models
- AVL trees
- Axiom of Choice
B
- B-trees
- backpropagation
- backpropagation for second-order methods
- bagging
- bases
- basis function expansions
- Baum-Welch algorithm
- Bayes Ball
- Bayes net parameter learning
- Bayes net structure learning
- Bayes' rule
- Bayesian decision theory
- Bayesian estimation of Bayes net parameters
- Bayesian information criterion
- Bayesian linear regression
- Bayesian logistic regression
- Bayesian model averaging
- Bayesian model comparison
- Bayesian naive Bayes
- Bayesian networks
- Bayesian parameter estimation
- Bayesian parameter estimation in exponential families
- Bayesian parameter estimation: Gaussian distribution
- Bayesian parameter estimation: multinomial distribution
- Bayesian parameter estimation: multivariate Gaussians
- Bayesian PCA
- Bellman equations
- Bellman-Ford algorithm
- beta distribution
- beta process
- BFGS
- bias-variance decomposition
- binary linear classifiers
- binary search
- binary search trees
- binomial distribution
- Bloom filter
- Boltzmann machines
- Boolean algebras
- boosting as optimization
- bootstrap
- breadth-first search
- Brouwer's fixed point theorem
C
- C generics
- C pointers
- C strings
- C struct representation
- call stack
- cardinality
- Central Limit Theorem
- Chain Rule
- change of basis
- Chernoff bounds
- Chinese restaurant franchise
- Chinese restaurant process
- Cholesky decomposition
- Chow-Liu trees
- Church-Turing thesis
- classes (programming)
- collapsed Gibbs sampling
- column space and nullspace
- commuting vector fields
- compactness of first-order logic
- compactness of propositional logic
- comparing Gaussian mixtures and k-means
- comparing normal populations
- completeness of first-order logic
- complex numbers
- complex vectors and matrices
- computational complexity of graphical model inference
- computations on multivariate Gaussians
- computing matrix inverses
- computing probabilities by counting
- computing the nullspace
- conditional distributions
- conditional expectation
- conditional independence
- conditional probability
- conditional random fields
- conjugate gradient
- conservative vector fields
- constraint satisfaction problems
- constructing kernels
- constructing the integers
- constructing the rationals
- constructing the reals
- context-free grammars
- context-free languages
- convergence of conjugate gradient
- convergence of gradient descent
- converting between graphical models
- convex functions
- convex optimization
- convex sets
- convolutional neural nets
- cotangent bundle
- countable sets
- covariance
- covariance matrices
- Cramer's rule
- Cramér-Rao bound
- cross product
- cross validation
- CRP clustering
- cumulative distribution function
- curse of dimensionality
D
- d-separation
- decidability
- decision trees
- deep belief networks
- defining the cardinals
- depth-first search
- determinant
- determinant and volume
- diagonalization
- differentiable manifolds
- differentiable maps between manifolds
- differential entropy
- differential forms
- Dijkstra's algorithm
- Dirichlet diffusion trees
- Dirichlet distribution
- Dirichlet process
- Divergence Theorem
- dot product
- DPLL procedure
- dropout
- dynamic memory allocation in C
E
- early stopping
- eigenvalues and eigenvectors
- EM algorithm for PCA
- entropy
- equivalence relations
- Euler's formula
- evaluating multiple integrals: change of variables
- evaluating multiple integrals: polar coordinates
- evidence approximation
- exceptions (programming)
- expectation and variance
- expectation propagation
- Expectation-Maximization algorithm
- expectimax search
- exploding and vanishing gradients
- exponential distribution
- exponential families
- exterior derivative
F
- F measure
- factor analysis
- factor graphs
- feed-forward neural nets
- fields
- finite automata
- finite-difference approximations to derivatives
- first-order logic
- first-order resolution
- first-order unification
- Fisher information
- Fisher information matrix
- Fisher kernel
- Fisher metric
- Fisher's linear discriminant
- fitting logistic regression with iteratively reweighted least squares
- floating point representation
- flows on manifolds
- forward-backward algorithm
- four fundamental subspaces
- function pointers in C
- functions and relations as sets
- functions of several variables
G
- gamma distribution
- gamma function
- Gauss-Newton algorithm
- Gaussian BP on trees
- Gaussian discriminant analysis
- Gaussian distribution
- Gaussian elimination
- Gaussian MRFs
- Gaussian process classification
- Gaussian process regression
- Gaussian processes
- Gaussian variable elimination
- Gaussian variable elimination as Gaussian elimination
- generalization
- generalized linear models
- generative vs. discriminative models
- generic collections in Java
- Gibbs sampling
- Gibbs sampling as a special case of Metropolis-Hastings
- Gödel numbering
- Gödel's Incompleteness Theorems
- GP classification with the Laplace approximation
- gradient
- gradient descent
- graph representations
- Green's Theorem
H
- Hamiltonian flows
- Hamiltonian Monte Carlo
- hash tables
- heap (data structure)
- heavy-tailed distributions
- hidden Markov models
- hierarchical Dirichlet process
- higher-order partial derivatives
- HMM inference as belief propagation
- Hopfield networks
I
- IBP linear-Gaussian model
- importance sampling
- incompleteness of set theory
- independent component analysis
- independent events
- independent random variables
- Indian buffet process
- inference in MRFs
- information form for multivariate Gaussians
- inheritance (programming)
- inner product
- integration on manifolds
- interfaces and abstract classes in Java
- interpretations between theories
- Isomap
- iterators
J
K
- K nearest neighbors
- k-means
- k-means++
- Kalman filter
- Kalman filter derivation
- Kalman smoother
- Kalman smoothing as forward-backward
- kernel PCA
- kernel ridge regression
- kernel SVM
- kernel trick
- KKT conditions
- KL divergence
L
- Lagrange duality
- Lagrange multipliers
- Laplace approximation
- LASSO
- latent Dirichlet allocation
- latent semantic analysis
- learning Bayes net parameters with missing data
- learning GP hyperparameters
- learning invariances in neural nets
- learning linear dynamical systems
- Lie derivatives
- limited memory BFGS
- limits and continuity in R^n
- line integrals
- line search
- linear approximation
- linear dynamical systems
- linear least squares
- linear regression
- linear regression as maximum likelihood
- linear regression with multiple outputs
- linear regression: closed-form solution
- linear systems as matrices
- linear transformations as matrices
- linear-Gaussian models
- linked lists
- Löb's Theorem
- local search
- log-linear MRFs
- logistic regression
- long short-term memory (LSTM)
- loopy belief propagation
- loopy BP as variational inference
- loss function
- Löwenheim-Skolem theorems
- lower bound on sorting
- LU factorization
M
- machine representations of integers and characters
- MAP parameter estimation
- Markov and Chebyshev inequalities
- Markov chain Monte Carlo
- Markov chains
- Markov decision process (MDP)
- Markov models
- Markov random fields
- matrix inverse
- matrix multiplication
- matrix transpose
- max-product on trees
- maximum likelihood
- maximum likelihood in exponential families
- maximum likelihood: multivariate Gaussians
- MCMC convergence
- mean field approximation
- merge sort
- method of moments
- Metropolis-Hastings algorithm
- minimax search
- minimum spanning trees
- mixture of Bernoullis
- mixture of Gaussians models
- moment generating functions
- Monte Carlo estimation
- MRF parameter learning
- multi-dimensional arrays in C
- multidimensional scaling
- multinomial coefficients
- multinomial distribution
- multiple integrals
- multiplicity of eigenvalues
- multivariate CDF
- multivariate distributions
- multivariate Gaussian distribution
- mutual information
N
- n-gram language models
- naive Bayes
- natural gradient
- natural numbers as sets
- neural probabilistic language models
- Newton's method (optimization)
- nondeterministic finite automata
- nondeterministic Turing machines
- nonlinear conjugate gradient
- nonnegative matrix factorization
- nonparametric density estimation
- NP complexity class
- NP-completeness
O
- optimization problems
- order relations
- ordinal numbers
- oriented manifolds
- orthogonal subspaces
- orthonormal bases
P
- PAC learning
- parameterizing lines and planes
- parametric curves
- parametric surfaces
- partial derivatives
- particle filter
- PDFs of functions of random variables
- Peano axioms
- perceptron algorithm
- Pitman-Yor process
- Poisson distribution
- policy iteration
- positive definite matrices
- precision and recall
- preconditioned conjugate gradient
- principal component analysis
- principal component analysis (proof)
- probabilistic latent semantic analysis
- probabilistic matrix factorization
- probabilistic PCA
- probability
- probit function
- probit regression
- projection onto a subspace
- proofs in first-order logic
- propositional logic
- propositional proofs
- propositional resolution
- propositional satisfiability
- pullback
- pushdown automata
Q
R
- random forests
- random variables
- real numbers
- reasoning with Horn clauses
- recurrent neural networks
- recursion (programming)
- recursion theorem
- recursive backtracking
- recursive functions
- red-black trees
- register machines
- regular expressions
- regular languages
- regularization
- rejection sampling
- representability in arithmetic
- representation invariants
- restricted Boltzmann machines
- reversible generative models
- reversible jump MCMC
- ridge regression
- ridge regression as SVD
- Riemann integral
- Riemannian metrics
- roots of polynomials
- Russell's Paradox
S
- sampling from a Gaussian
- Schur product theorem
- search problems
- second derivative test
- semantics of first-order logic
- sequential minimal optimization
- sequential Monte Carlo
- set operations
- simulated annealing
- singular value decomposition
- slice sampling
- soft margin SVM
- soft weight sharing in neural nets
- softmax regression
- solution sets of linear systems
- solving difference equations with matrices
- sorting
- sparse coding
- specifications (programming)
- spectral decomposition
- stack
- statistical hypothesis testing
- statistical manifolds
- Stirling's approximation
- stochastic gradient descent
- Stokes' Theorem (three dimensions)
- strong law of large numbers
- strongly connected components
- structural induction
- structural risk minimization
- structured mean field
- Student-t distribution
- subspaces
- sufficient statistics
- sum-product on trees
- support vector machine
- support vector regression
- surface integrals
- SVM optimality conditions
- SVM vs. logistic regression
- Swendsen-Wang algorithm
- symplectic manifolds
T
- tangent bundle
- tangent propagation
- Taylor approximations
- tensor fields on manifolds
- Tikhonov regularization
- topological sort
- topology of R^n
- transformation method
- tree (data structure)
- trie (data structure)
- truncated Newton
- trust regions
- Turing machines
U
- ultrafilters
- ultraproduct
- undefinability of truth
- uninformative priors
- uninformed search
- unions of events
- unit testing
- unitary matrices
- unsupervised pre-training
V
- value iteration
- variable elimination
- variational Bayes
- variational Bayes EM
- variational characterization of eigenvalues
- variational inference
- variational inference and convex duality
- variational inference and exponential families
- variational interpretation of EM
- variational linear regression
- variational logistic regression
- variational mixture of Gaussians
- VC dimension
- vector fields
- vector spaces
- vectors
- Viterbi algorithm
- von Mises distribution