precision and recall

(50 minutes to learn)

Summary

In pattern recognition and information retrieval, precision (also called positive predictive value) is the fraction of retrieved instances that are relevant, while recall (also called sensitivity) is the fraction of relevant instances that are retrieved (Wikipedia). For instance, if a corpus contained 50 documents relevant to a user's query, and an information retrieval (IR) system returned 20 documents, of which 6 were relevant, the recall would be 6/50 = 0.12, and the precision would be 6/20 = 0.3.
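
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `precision_recall` and the document IDs are hypothetical, chosen only so that the counts match the example (50 relevant, 20 retrieved, 6 in common). Returning 0.0 when a set is empty is an assumed convention; the ratio is strictly undefined in that case.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were returned
    # Assumed convention: report 0.0 instead of dividing by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical IDs reproducing the counts in the example.
relevant = set(range(50))        # 50 relevant documents: IDs 0..49
retrieved = set(range(44, 64))   # 20 returned documents: IDs 44..63, 6 of them relevant

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Using sets keeps the definitions literal: precision divides the overlap by the size of the retrieved set, and recall divides the same overlap by the size of the relevant set.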

Context

-this concept has no prerequisites-

Goals

  • Understand the definition of precision and recall and why optimizing an information retrieval system solely for precision or recall is a bad idea.
  • Be able to explain the difference between precision and recall.
  • Know whether precision or recall is the appropriate measure for ranked retrieval sets.

Core resources (read/watch one of the following)

-Free-

Introduction to Information Retrieval
A textbook on information retrieval techniques.
Authors: Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze

Supplemental resources (the following are optional, but you may find them useful)

See also

  • The F measure is the weighted harmonic mean of precision and recall, and provides a single measure of the overall quality of an information retrieval system.
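
As a rough sketch of how the F measure combines the two numbers, the snippet below uses the common parameterization F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where beta = 1 weights precision and recall equally. The function name `f_measure` is hypothetical, and the 1000-document corpus in the second call is an assumed figure used only to illustrate why maximizing recall alone is not enough.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F_beta)."""
    b2 = beta * beta
    denom = b2 * precision + recall
    # Assumed convention: report 0.0 when the denominator is zero.
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


# Precision and recall from the summary example (6/20 and 6/50).
f1_example = f_measure(0.3, 0.12)

# Hypothetical degenerate system: it returns every document in an assumed
# 1000-document corpus that holds 50 relevant ones, so recall is 1.0
# but precision is only 50/1000.
f1_return_all = f_measure(50 / 1000, 1.0)

print(round(f1_example, 4), round(f1_return_all, 4))
```

Because the harmonic mean is dominated by the smaller of its two inputs, the system that returns everything scores poorly despite its perfect recall.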