# Perceptron algorithm

(55 minutes to learn)

## Summary

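The perceptron is a simple algorithm for training a binary linear classifier. It cycles through the training examples and, each time one is misclassified, adjusts the weights in the direction of that example (signed by its label), which moves the decision boundary toward classifying it correctly.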
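
The update described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes labels in {-1, +1} and NumPy arrays, the function name `perceptron_train` is hypothetical, and the `max_epochs` cap is an added safeguard so the loop stops even when the data are not linearly separable.

```python
import numpy as np

def perceptron_train(X, y, max_epochs=100):
    """Train a perceptron. X: (n, d) array; y: labels in {-1, +1}.

    Returns (w, b, converged), where converged is True if a full pass
    over the data produced no mistakes. (Hypothetical helper.)
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(max_epochs):  # assumption: cap on passes over the data
        mistakes = 0
        for xi, yi in zip(X, y):
            # An example is misclassified (or on the boundary) when
            # its label disagrees with the sign of the score.
            if yi * (np.dot(w, xi) + b) <= 0:
                w += yi * xi  # move the weights toward the example
                b += yi
                mistakes += 1
        if mistakes == 0:
            return w, b, True
    return w, b, False

# Tiny linearly separable toy dataset (made up for illustration).
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b, converged = perceptron_train(X, y)
predictions = np.sign(X @ w + b)
```

On data that are not linearly separable (for example the XOR labelling of the four corners of the unit square), the same function never sees a mistake-free pass, so it returns `converged == False` once `max_epochs` is reached.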

## Context

This concept has the prerequisites:

- binary linear classifiers (The perceptron is a binary linear classifier.)

## Goals

- Know the perceptron update rule

- Optional: show that the algorithm terminates if the data are separated by some margin

- Why can't the algorithm terminate if the data are not linearly separable?
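
As a reference point for these goals, the update rule and the standard convergence statement can be written out as follows. The notation here is an assumption, not taken from the resources below: labels $y_i \in \{-1, +1\}$, weights initialized to $\mathbf{w} = \mathbf{0}$, and the bias absorbed into $\mathbf{w}$ by appending a constant feature to each $\mathbf{x}_i$.

$$
\text{if } y_i \, \mathbf{w}^\top \mathbf{x}_i \le 0: \qquad \mathbf{w} \leftarrow \mathbf{w} + y_i \, \mathbf{x}_i
$$

If $\|\mathbf{x}_i\| \le R$ for all $i$ and there is a unit vector $\mathbf{w}^*$ with $y_i \, \mathbf{w}^{*\top} \mathbf{x}_i \ge \gamma > 0$ for all $i$, then the number of updates $k$ satisfies

$$
k \le \frac{R^2}{\gamma^2}.
$$

The algorithm stops only once a full pass produces no misclassified example. If no separating hyperplane exists, every weight vector misclassifies at least one example, so that stopping condition is never met.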

## Core resources (read/watch one of the following)

## -Free-

→ Stanford's Machine Learning lecture notes

Lecture notes for Stanford's machine learning course, aimed at graduate and advanced undergraduate students.

→ Coursera: Machine Learning

An online machine learning course aimed at advanced undergraduates.

Other notes:

- Click on "Preview" to see the videos.

## -Paid-

→ Pattern Recognition and Machine Learning

A textbook for a graduate machine learning course, with a focus on Bayesian methods.

Location:
Section 4.1.7, "The perceptron algorithm," pages 192-196

## Supplemental resources (the following are optional, but you may find them useful)

## -Paid-

→ Machine Learning: a Probabilistic Perspective

A very comprehensive graduate-level machine learning textbook.

Location:
Section 8.5.4, "The perceptron algorithm," pages 265-266

## See also

- The perceptron was proposed in the 1950s and is still in use. More modern algorithms have a similar form but rest on a firmer mathematical footing:
    - logistic regression, which is formulated as a probabilistic model
    - support vector machines (SVMs), which are formulated as an optimization problem

- The perceptron convergence proof requires the assumption that the data are linearly separable by a nonzero margin. Support vector machines (SVMs) are geared towards the same case: they explicitly look for the separating hyperplane with the largest margin.
- The perceptron can be kernelized in order to capture nonlinear dependencies.
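
To make the last point concrete, here is a rough sketch of a kernelized perceptron in Python. It is one possible formulation under stated assumptions, not the version from any of the resources above: labels are in {-1, +1}, there is no bias term, the RBF kernel and its `gamma` value are arbitrary choices, and the function names are hypothetical. Instead of a weight vector, it keeps a count `alpha[i]` of the mistakes made on each training example and scores a point by a kernel-weighted sum over those examples.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel; gamma=1.0 is an arbitrary illustrative choice."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def kernel_perceptron_train(X, y, kernel, max_epochs=100):
    """Return alpha, where alpha[i] counts the mistakes made on example i."""
    n = len(X)
    alpha = np.zeros(n)
    # Precompute the Gram matrix K[i, j] = kernel(X[i], X[j]).
    K = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
    for _ in range(max_epochs):  # assumption: cap on passes over the data
        mistakes = 0
        for i in range(n):
            score = np.sum(alpha * y * K[:, i])
            if y[i] * score <= 0:
                alpha[i] += 1  # kernelized analogue of w += y_i * x_i
                mistakes += 1
        if mistakes == 0:
            break
    return alpha

def kernel_perceptron_predict(X_train, y_train, alpha, kernel, x):
    """Classify x by the sign of the kernel-weighted sum over training examples."""
    score = sum(a * yi * kernel(xi, x)
                for a, yi, xi in zip(alpha, y_train, X_train))
    return 1 if score > 0 else -1

# XOR labelling: not linearly separable in the original two features.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, 1, 1, -1])
alpha = kernel_perceptron_train(X, y, rbf_kernel)
predictions = [kernel_perceptron_predict(X, y, alpha, rbf_kernel, x) for x in X]
```

With the RBF kernel, this sketch fits the XOR labelling that a plain linear perceptron cannot, which is the sense in which kernelization captures nonlinear dependencies.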