# PAC learning

(2 hours to learn)

## Summary

Probably approximately correct (PAC) learning is a theoretical framework for analyzing the generalization error of a learning algorithm in terms of its error on a training set and some measure of complexity. The goal is typically to show that an algorithm achieves low generalization error with high probability.
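The guarantee is usually stated along the following lines (a standard formulation for a finite hypothesis class in the realizable setting; the symbols $H$, $\epsilon$, $\delta$, $m$ are chosen here for illustration, since the summary does not fix notation):

```latex
% A hypothesis class H is PAC-learnable if for every epsilon, delta in (0, 1)
% there is a sample size m(epsilon, delta) such that, for any distribution D,
% after m i.i.d. training examples the learner outputs a hypothesis h with
\Pr\left[\operatorname{err}_{\mathcal{D}}(h) \le \epsilon\right] \ge 1 - \delta .
% For a finite class H, a consistent learner achieves this whenever
m \ge \frac{1}{\epsilon}\left(\ln |H| + \ln \frac{1}{\delta}\right).
```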

## Context

This concept has the prerequisites:

- generalization (PAC learning is a way of analyzing the generalization performance of learning algorithms.)
- unions of events (The union bound is an important tool for analyzing PAC learning.)
- independent events (The analysis assumes that the training examples are independent draws from the distribution.)
- Chernoff bounds (Chernoff bounds are an important tool for analyzing PAC learning.)

## Goals

- Know what it means for a hypothesis class to be PAC-learnable.

- Derive the bound on the probability that the generalization error exceeds epsilon, in terms of the number of training examples and the size of the hypothesis class.
- Try rearranging the terms of the formula to view each variable as a function of the others. E.g., how many training examples do you need to guarantee a certain generalization accuracy?

- Carry out the analysis for a particular hypothesis class, such as boolean functions.

- Try to get a sense of how large hypothesis spaces are. E.g., what is the size of the hypothesis space for:
  - classifying based on a single Boolean variable (from a set of D)
  - all possible functions of D Boolean variables
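The bound and its rearrangement can be sketched numerically. This is a minimal sketch of the standard union-bound result for a finite hypothesis class in the realizable setting; the function names are illustrative, not from any particular course:

```python
import math

# With m i.i.d. examples, a hypothesis consistent with the training set has
# generalization error > epsilon with probability at most |H| * exp(-epsilon * m)
# (union bound over H combined with an exponential tail bound).

def failure_probability(h_size, epsilon, m):
    """Upper bound on P(some consistent hypothesis has error > epsilon)."""
    return h_size * math.exp(-epsilon * m)

def sample_complexity(h_size, epsilon, delta):
    """Rearrange the bound for m: the smallest m with |H| e^{-epsilon m} <= delta,
    i.e. m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

m = sample_complexity(h_size=1000, epsilon=0.05, delta=0.01)
print(m)  # examples sufficient for error <= 0.05 with 99% confidence
assert failure_probability(1000, 0.05, m) <= 0.01
```

Note how m grows only logarithmically in |H| and in 1/delta, but linearly in 1/epsilon.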
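For the hypothesis-space sizes above, the counts can be computed directly. This is a sketch under common conventions (counting each variable and its negation for the single-variable case; exact counts vary with the precise definition):

```python
def single_literal_count(D):
    # Classify by one Boolean variable or its negation: 2 * D hypotheses.
    return 2 * D

def all_boolean_functions_count(D):
    # A Boolean function assigns a label to each of the 2^D possible inputs,
    # so there are 2^(2^D) distinct functions.
    return 2 ** (2 ** D)

for D in (2, 5, 10):
    print(D, single_literal_count(D), all_boolean_functions_count(D))
```

Since the sample complexity scales with ln|H|, the single-literal class needs O(ln D) examples while the class of all functions needs O(2^D): learning is only feasible when the hypothesis space is restricted.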

## Core resources (read/watch one of the following)

### Free

→ Coursera: Machine Learning

An online machine learning course aimed at advanced undergraduates.

Other notes:

- Click on "Preview" to see the videos.

→ Stanford's Machine Learning lecture notes

Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.

### Paid

→ Artificial Intelligence: a Modern Approach

A textbook giving a broad overview of all of AI.

Location:
Section 18.5, "Why learning works: computational learning theory," pages 668-673

→ An Introduction to Computational Learning Theory (1994)

A graduate level textbook on computational learning theory.

- Section 1.1, "A rectangle learning game," pages 1-6
- Section 1.2, "A general model," pages 6-16
- Section 1.3, "Learning Boolean conjunctions," pages 16-18

## See also

- The Occam learning formalism justifies why simple classifiers should be expected to perform well.
- VC dimension is a measure of complexity which allows these results to be generalized to continuous hypothesis spaces.
- As an example of PAC-style reasoning, PAC-Bayesian analysis allows us to derive PAC-style bounds for Bayesian classifiers.