
I. Precision and recall

Precision and recall are two metrics widely used in information retrieval and statistical classification to evaluate the quality of results.

Precision is the proportion of samples that the classifier labels as positive which are truly positive. It is computed over the classifier's positive predictions, so it can be seen as a measure of exactness: of the tuples marked positive, what percentage actually are positive.

Recall is the proportion of truly positive samples that the classifier correctly labels as positive. It is computed over the real positive samples, so it measures completeness: of the actual positive tuples, what percentage are marked positive. Recall is also known as sensitivity.

F1 = 2 × (precision × recall) / (precision + recall). It is a single score that combines the two metrics above, reflecting overall performance.

All of these metrics take values between 0 and 1, and the closer a value is to 1, the better the result.
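As a quick illustration, here is a minimal Python sketch of these three metrics. The function name and the counts `tp`/`fp`/`fn` (the standard confusion-matrix entries for the positive class) are chosen just for this example; the call at the bottom uses the numbers from Example 1 below.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts.

    tp: true positives  (positive samples correctly predicted positive)
    fp: false positives (negative samples wrongly predicted positive)
    fn: false negatives (positive samples wrongly predicted negative)
    """
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1

# Numbers from Example 1 below: 700 carp caught out of 1,400,
# plus 300 non-carp (shrimp and turtles) in the net.
print(precision_recall_f1(tp=700, fp=300, fn=700))
# -> (0.7, 0.5, 0.5833333333333334)
```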

Example 1:

A pond contains 1,400 carp, 300 shrimp, and 300 turtles. The goal is to catch carp. A large net is cast and brings in 700 carp, 200 shrimp, and 100 turtles. The metrics then work out as follows:
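Taking carp as the positive class, TP = 700 (carp caught), FP = 300 (the 200 shrimp and 100 turtles), and FN = 700 (carp left in the pond):

Precision = 700 / (700 + 300) = 70%
Recall = 700 / 1,400 = 50%
F1 = 2 × 0.7 × 0.5 / (0.7 + 0.5) ≈ 58.3%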

Example 2:

The same pond contains 1,400 carp, 300 shrimp, and 300 turtles, and the goal is again to catch carp. This time a huge net is cast and catches every carp, shrimp, and turtle in the pond:
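Now TP = 1,400 (every carp), FP = 600 (all the shrimp and turtles), and FN = 0:

Precision = 1,400 / (1,400 + 600) = 70%
Recall = 1,400 / 1,400 = 100%
F1 = 2 × 0.7 × 1.0 / (0.7 + 1.0) ≈ 82.4%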

Ideally we would like both the precision and the recall of the retrieved results to be as high as possible, but in some cases the two are in tension. In the extreme, if we return only a single result and it is correct, precision is 100% but recall is very low; if we return all results, recall is 100% but precision is low. Depending on the situation, you therefore have to decide whether higher precision or higher recall matters more.
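To make the trade-off concrete, here is a toy sketch on synthetic data; the score distribution is invented purely for illustration, and only the counting, via scikit-learn's precision_score and recall_score, is standard. Raising the decision threshold makes the classifier more selective, which tends to raise precision while lowering recall.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)          # ground-truth labels
scores = 0.5 * y_true + 0.8 * rng.random(1000)  # hypothetical noisy confidence scores

for threshold in (0.3, 0.6, 0.9):
    y_pred = (scores >= threshold).astype(int)
    print(f"threshold={threshold}: "
          f"precision={precision_score(y_true, y_pred):.2f}, "
          f"recall={recall_score(y_true, y_pred):.2f}")
```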

II. Comprehensive evaluation metrics

Precision and recall are sometimes in tension, so they need to be considered together. The most common way to do this is the F-measure (also known as the F-score):
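In its general form (with P for precision and R for recall):

Fβ = (1 + β²) × P × R / (β² × P + R)

Here β weights recall relative to precision: β > 1 favors recall, β < 1 favors precision, and β = 1 treats them equally.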

F1-Score

The F1-score is a widely used evaluation metric, mainly for binary classification problems. It is calculated as follows:
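F1 = 2 × (precision × recall) / (precision + recall)

That is, the F1-score is the harmonic mean of precision and recall, i.e. the F-measure with β = 1.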