In my work on recommendation systems for Internet advertising and on anti-cheating, I keep running into precision, recall, and the F value. When I first started in the industry I often mixed them up, and it took a while before they made sense. Here's an example!
We use a model to judge cheating traffic over a period of time. Assume the traffic in this period is 100 samples, of which 25 are cheating and 75 are not. Assume that the positive samples here are the traffic without cheating. We then ran an LSTM model, and it predicted 70 samples as positive (not cheating). After checking, 69 of those 70 really were not cheating, and 1 was a negative sample (cheating) that was wrongly predicted as not cheating.
In this example, P = 69/70 (the number of samples correctly predicted as positive / the number of samples predicted as positive), and R = 69/75 (the number of samples correctly predicted as positive / the number of samples that are actually positive). Then there is the F value. Many people may ask: precision and recall already look like good evaluation metrics, so why does the F value need to exist?
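To make the arithmetic concrete, here is a minimal Python sketch of the same calculation. It assumes the reading of the example in which 75 of the 100 samples are truly not cheating (positive), the model labels 70 samples as positive, and 69 of those labels are correct. The variable names are my own and are only illustrative.

```python
# Counts from the example; names are illustrative, not from any library.
actual_positive = 75      # traffic that is really not cheating
predicted_positive = 70   # traffic the model labeled as not cheating
true_positive = 69        # labeled not cheating and really not cheating

# Cheating traffic that slipped through as "not cheating".
false_positive = predicted_positive - true_positive
# Honest traffic that the model wrongly flagged as cheating.
false_negative = actual_positive - true_positive

precision = true_positive / (true_positive + false_positive)  # 69 / 70
recall = true_positive / (true_positive + false_negative)     # 69 / 75

print(f"precision = {precision:.4f}")  # precision = 0.9857
print(f"recall    = {recall:.4f}")     # recall    = 0.9200
```

Precision asks "of everything I called positive, how much was right?", while recall asks "of everything that was really positive, how much did I find?".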
As my high school politics teacher used to say, whatever exists is reasonable. Since the F value exists, there must be a need for it. Ha ha ha!
When evaluating a model, of course we hope that both the Precision and the Recall of the results are as high as possible, but in practice the two conflict in some cases: pushing one up often pulls the other down.
My example is such a case: P = 69/70 and R = 69/75 do not agree. To evaluate the model with a single overall number, we use the F value, F = 2PR / (P + R). To be honest, I can never remember this formula; most of the time you can simply look it up in a reference.
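As a rough sketch, the F value can be computed as below. The function name `f_value` is hypothetical, and returning 0.0 when both inputs are zero is a convention I assume here to avoid dividing by zero. The second call uses made-up numbers (P = 1.0, R = 0.1) purely to show how F behaves when the two metrics conflict.

```python
def f_value(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: F = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the degenerate case
    return 2 * precision * recall / (precision + recall)


# Values from the example: P = 69/70, R = 69/75.
f_example = f_value(69 / 70, 69 / 75)
print(f"F = {f_example:.4f}")  # F = 0.9517

# Hypothetical conflicting case: perfect precision, very poor recall.
f_conflict = f_value(1.0, 0.1)
print(f"F = {f_conflict:.4f}")  # F = 0.1818
```

Because F is a harmonic mean, it stays close to the smaller of the two inputs, so a model cannot get a high F by doing well on only one of precision or recall.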