The PR curve and the ROC curve are both important tools for evaluating model performance in machine learning. This post walks through the two curves.
Preliminary knowledge
- Familiarity with the basic definitions of TP, FN, FP, and TN.
- An understanding of Precision, Recall, FNR, FPR, TPR, TNR, and so on (a quick refresher in code follows this list).
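As a refresher, here is a minimal sketch of these definitions in plain Python (the function names are mine, not from any particular library):

```python
# Confusion counts and the derived metrics used throughout this post.
# Labels: 1 = Positive, 0 = Negative. Function names are illustrative.
def confusion_counts(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives
    return tp, fp, tn, fn

def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else None  # undefined if nothing is predicted positive

def recall(tp, fn):  # a.k.a. TPR, the hit probability
    return tp / (tp + fn) if tp + fn else None

def fpr(fp, tn):  # the false alarm probability
    return fp / (fp + tn) if fp + tn else None
```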
Sample data
Sample number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
True class | P | P | P | P | P | P | N | N | N | N |
Prediction 1
Sample number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Predicted probability of positive | 0.9 | 0.8 | 0.7 | 0.6 | 0.6 | 0.4 | 0.5 | 0.4 | 0.3 | 0.2 |
Prediction 2
Sample number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Predicted probability of positive | 0.8 | 0.7 | 0.7 | 0.8 | 0.7 | 0.4 | 0.6 | 0.3 | 0.3 | 0.1 |
PR curve
Meaning: in "PR curve", P stands for Precision and R for Recall. The PR curve is drawn with Recall on the horizontal axis and Precision on the vertical axis. But a fixed classifier yields only a single Precision value and a single Recall value, so how do we draw a whole curve?
Origin of the curve
In fact, when a trained model predicts on test samples, its raw output is the probability that each sample is Positive, not the Positive/Negative label we usually talk about. The probability is turned into a label by a threshold: when the probability is greater than the threshold, the sample is judged positive; otherwise, it is judged negative.
However, this critical threshold is not fixed. As it changes, the model's predicted class for each sample changes with it, and therefore so do the model's Precision and Recall. Connecting the resulting points into a line gives the PR curve.
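To make that concrete, here is a minimal pure-Python sketch (the data and thresholds are the ones used in this post; the variable names are my own) that sweeps the threshold over Prediction 1 and prints the (Recall, Precision) pairs tabulated in the next section:

```python
# Sweep thresholds over Prediction 1 and collect (Recall, Precision) points;
# connecting these points yields the PR curve.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]                      # true classes of samples 1-10
scores = [0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.5, 0.4, 0.3, 0.2]  # Prediction 1

for i in range(10):
    threshold = 0.05 + 0.1 * i                 # 0.05, 0.15, ..., 0.95
    # strict ">" matches the tables below: at threshold 0.45 the positive
    # sample scored 0.4 becomes a false negative
    y_pred = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, t in zip(y_pred, y_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(y_pred, y_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(y_pred, y_true) if p == 0 and t == 1)
    prec = tp / (tp + fp) if tp + fp else None  # undefined at the highest thresholds
    rec = tp / (tp + fn)
    print(f"threshold={threshold:.2f}  Recall={rec:.2f}  Precision={prec}")
```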
Drawing the PR curve for the sample data
To show the drawing process intuitively, the PR curve is computed from the sample data, with thresholds running from 0.05 to 0.95 in steps of 0.1 for simplicity.
Threshold | Prediction 1 TP | FP | TN | FN | Precision | Recall | Prediction 2 TP | FP | TN | FN | Precision | Recall |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.05 | 6 | 4 | 0 | 0 | 0.6 | 1 | 6 | 4 | 0 | 0 | 0.6 | 1 |
0.15 | 6 | 4 | 0 | 0 | 0.6 | 1 | 6 | 3 | 1 | 0 | 0.67 | 1 |
0.25 | 6 | 3 | 1 | 0 | 0.67 | 1 | 6 | 3 | 1 | 0 | 0.67 | 1 |
0.35 | 6 | 2 | 2 | 0 | 0.75 | 1 | 6 | 1 | 3 | 0 | 0.86 | 1 |
0.45 | 5 | 1 | 3 | 1 | 0.83 | 0.83 | 5 | 1 | 3 | 1 | 0.83 | 0.83 |
0.55 | 5 | 0 | 4 | 1 | 1 | 0.83 | 5 | 1 | 3 | 1 | 0.83 | 0.83 |
0.65 | 3 | 0 | 4 | 3 | 1 | 0.5 | 5 | 0 | 4 | 1 | 1 | 0.83 |
0.75 | 2 | 0 | 4 | 4 | 1 | 0.33 | 2 | 0 | 4 | 4 | 1 | 0.33 |
0.85 | 1 | 0 | 4 | 5 | 1 | 0.17 | 0 | 0 | 4 | 6 | – | 0 |
0.95 | 0 | 0 | 4 | 6 | – | 0 | 0 | 0 | 4 | 6 | – | 0 |
Resulting coordinates
Serial number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Prediction 1 Recall | 1 | 1 | 1 | 1 | 0.83 | 0.83 | 0.5 | 0.33 | 0.17 | 0 |
Prediction 1 Precision | 0.6 | 0.6 | 0.67 | 0.75 | 0.83 | 1 | 1 | 1 | 1 | – |
Prediction 2 Recall | 1 | 1 | 1 | 1 | 0.83 | 0.83 | 0.83 | 0.33 | 0 | 0 |
Prediction 2 Precision | 0.6 | 0.67 | 0.67 | 0.86 | 0.83 | 0.83 | 1 | 1 | – | – |
PR curve:
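If the original figure does not display, the curves can be reproduced from the coordinates above with matplotlib (a sketch; the undefined Precision values shown as "–" are simply dropped):

```python
# Plot both PR curves from the coordinate table above (requires matplotlib).
import matplotlib.pyplot as plt

recall_1    = [1, 1, 1, 1, 0.83, 0.83, 0.5, 0.33, 0.17]
precision_1 = [0.6, 0.6, 0.67, 0.75, 0.83, 1, 1, 1, 1]
recall_2    = [1, 1, 1, 1, 0.83, 0.83, 0.83, 0.33]
precision_2 = [0.6, 0.67, 0.67, 0.86, 0.83, 0.83, 1, 1]

plt.plot(recall_1, precision_1, marker="o", label="Prediction 1")
plt.plot(recall_2, precision_2, marker="s", label="Prediction 2")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("PR curves for the sample data")
plt.legend()
plt.show()
```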
- If one learner's PR curve is completely enclosed by another learner's, the latter can be asserted to perform better than the former. From the PR curves for the sample data, Prediction 2 is better than Prediction 1.
- If the two curves cross, compare the areas under them; the larger area generally indicates the better learner (see the sketch below).
- The Break-Even Point (BEP) is the value at which Precision = Recall; a larger BEP indicates a better-performing learner.
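For the area comparison, scikit-learn's average_precision_score is a common single-number summary of the PR curve, assuming scikit-learn is installed (note it is a step-wise approximation, not the exact trapezoidal area):

```python
# Compare the two predictions by (approximate) area under the PR curve.
from sklearn.metrics import average_precision_score

y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
pred_1 = [0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.5, 0.4, 0.3, 0.2]
pred_2 = [0.8, 0.7, 0.7, 0.8, 0.7, 0.4, 0.6, 0.3, 0.3, 0.1]

print(average_precision_score(y_true, pred_1))  # Prediction 2 should come out higher,
print(average_precision_score(y_true, pred_2))  # consistent with the curves above
```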
The ROC curve
ROC is short for "Receiver Operating Characteristic" curve, also known as the sensitivity curve. The name comes from signal detection: each point on the curve reflects the same kind of sensitivity, a response to the same signal stimulus, only measured under different decision criteria.
The ROC curve is drawn, as the threshold changes, with the false alarm probability (FPR) on the horizontal axis and the hit probability (TPR) on the vertical axis.
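The computation mirrors the PR sweep; only the two plotted quantities change. A plain-Python sketch for Prediction 1:

```python
# Sweep thresholds and collect (FPR, TPR) points; connecting them gives the ROC curve.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.5, 0.4, 0.3, 0.2]  # Prediction 1

for i in range(10):
    threshold = 0.05 + 0.1 * i
    y_pred = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, t in zip(y_pred, y_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(y_pred, y_true) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(y_pred, y_true) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(y_pred, y_true) if p == 0 and t == 1)
    # FPR = FP / (FP + TN), TPR = TP / (TP + FN); the denominators are the fixed
    # class sizes (4 negatives, 6 positives), so there is no division by zero
    print(f"threshold={threshold:.2f}  FPR={fp / (fp + tn):.2f}  TPR={tp / (tp + fn):.2f}")
```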
Drawing the ROC curve for the sample data
To show the drawing process intuitively, the ROC curve is computed from the sample data, again with thresholds running from 0.05 to 0.95 in steps of 0.1.
Threshold | Prediction 1 TP | FP | TN | FN | FPR | TPR | Prediction 2 TP | FP | TN | FN | FPR | TPR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.05 | 6 | 4 | 0 | 0 | 1 | 1 | 6 | 4 | 0 | 0 | 1 | 1 |
0.15 | 6 | 4 | 0 | 0 | 1 | 1 | 6 | 3 | 1 | 0 | 0.75 | 1 |
0.25 | 6 | 3 | 1 | 0 | 0.75 | 1 | 6 | 3 | 1 | 0 | 0.75 | 1 |
0.35 | 6 | 2 | 2 | 0 | 0.5 | 1 | 6 | 1 | 3 | 0 | 0.25 | 1 |
0.45 | 5 | 1 | 3 | 1 | 0.25 | 0.83 | 5 | 1 | 3 | 1 | 0.25 | 0.83 |
0.55 | 5 | 0 | 4 | 1 | 0 | 0.83 | 5 | 1 | 3 | 1 | 0.25 | 0.83 |
0.65 | 3 | 0 | 4 | 3 | 0 | 0.5 | 5 | 0 | 4 | 1 | 0 | 0.83 |
0.75 | 2 | 0 | 4 | 4 | 0 | 0.33 | 2 | 0 | 4 | 4 | 0 | 0.33 |
0.85 | 1 | 0 | 4 | 5 | 0 | 0.17 | 0 | 0 | 4 | 6 | 0 | 0 |
0.95 | 0 | 0 | 4 | 6 | 0 | 0 | 0 | 0 | 4 | 6 | 0 | 0 |
Resulting coordinates
Serial number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Prediction 1 FPR | 1 | 1 | 0.75 | 0.5 | 0.25 | 0 | 0 | 0 | 0 | 0 |
Prediction 1 TPR | 1 | 1 | 1 | 1 | 0.83 | 0.83 | 0.5 | 0.33 | 0.17 | 0 |
Prediction 2 FPR | 1 | 0.75 | 0.75 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 |
Prediction 2 TPR | 1 | 1 | 1 | 1 | 0.83 | 0.83 | 0.83 | 0.33 | 0 | 0 |
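If scikit-learn is available, roc_curve computes the same points directly from the scores; unlike the manual sweep above, it only reports thresholds where the curve actually changes:

```python
# Cross-check the hand-computed ROC coordinates for Prediction 2.
from sklearn.metrics import roc_curve

y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
pred_2 = [0.8, 0.7, 0.7, 0.8, 0.7, 0.4, 0.6, 0.3, 0.3, 0.1]

fpr, tpr, thresholds = roc_curve(y_true, pred_2)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th}  FPR={f:.2f}  TPR={t:.2f}")
```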
ROC curve:
On the ROC curve, the point (0, 0) means the false alarm probability is 0 (no negative sample is judged positive) and the recall is also 0 (no positive sample is judged positive); that is, every sample is judged negative at this point. This happens because the threshold is chosen close to 1, so no sample falls into the positive region; it does not mean the model is bad. In fact, every model's curve passes through this point.
Similarly, the point (1, 1) indicates a threshold close to 0, where every sample is judged positive.
The point (0, 1) represents a perfect classifier, one that correctly identifies every positive sample while raising no false alarms; this is what we machine learning people pursue for a lifetime. Therefore, the closer the ROC curve gets to the top-left corner, the better the model performs.
AUC
AUC is short for "Area Under Curve". Here it denotes the area under the ROC curve, i.e. the integral of the ROC curve over [0, 1].
It is an indicator for evaluating a model's classification performance: between models, a larger AUC means better classification.
AUC takes values in [0, 1]. An effective classifier scores above 0.5, while a poor one sits close to 0.5, i.e. no better than random guessing.
Why isn't an AUC close to zero the worst case? Because if you ever obtain a classifier with an AUC of 0, you can simply invert its output and get a perfect classifier.
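A quick sketch with scikit-learn illustrates both points, assuming it is installed (on ten samples the numbers are easy to verify by hand):

```python
# AUC for both predictions, plus the "invert a bad classifier" trick.
from sklearn.metrics import roc_auc_score

y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
pred_1 = [0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.5, 0.4, 0.3, 0.2]
pred_2 = [0.8, 0.7, 0.7, 0.8, 0.7, 0.4, 0.6, 0.3, 0.3, 0.1]

auc_1 = roc_auc_score(y_true, pred_1)
auc_2 = roc_auc_score(y_true, pred_2)
print(auc_1, auc_2)  # Prediction 2 should score higher, matching the ROC curves

# Negating the scores flips the AUC to 1 - AUC, so an AUC-0 classifier
# becomes a perfect one once its output is inverted.
print(roc_auc_score(y_true, [1 - s for s in pred_1]))  # equals 1 - auc_1
```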