F measure
(50 minutes to learn)
Summary
The F measure (F1 score or F score) is a measure of a test's accuracy and is defined as the weighted harmonic mean of the precision and recall of the test.
Context
This concept has the prerequisites:
- precision and recall (the F measure is defined as the weighted harmonic mean of the precision and recall)
Goals
- Understand why the harmonic mean of the precision and recall is used instead of the arithmetic mean
- Should a web search engine, such as Google, favor high precision or high recall for its top 10 search results? What weights should they then use for the F measure?
Core resources (read/watch one of the following)
-Free-
→ Introduction to Information Retrieval
Supplemental resources (the following are optional, but you may find them useful)
-Free-
→ Wikipedia
See also
- accuracy is the "traditional" way to measure the performance of a system but equally weights the positive and negative results, which may not be desirable in an information retrieval system, as the number of negative results (non relevant results) can vastly outweigh the number of positive results (relevant results). However, the F measure is not invariant to label swapping (switching the positive and negative classes) which may not be desirable when e.g. using the F measure with a classifier.