Model evaluation methods
Evaluation methods fall into two groups: those for prediction (regression) problems and those for classification problems
-
What are the common evaluation methods for prediction problems?
MSE (Mean Squared Error): the expected squared deviation of the estimates from the true values, (parameter estimate − true parameter value)²; the smaller it is, the more accurate the model
MSE = Σ(estimated − true)² / N
RMSE (Root Mean Squared Error): the arithmetic square root of the MSE
MAE (Mean Absolute Error): the average absolute difference between the predicted and true values
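The three regression metrics above can be sketched in a few lines of pure Python (the sample values are made up for illustration):

```python
import math

def mse(y_true, y_pred):
    # Mean squared error: average of the squared residuals.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error: square root of MSE, in the same units as the target.
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    # Mean absolute error: average magnitude of the residuals.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
print(mse(y_true, y_pred))   # 1.3125
print(mae(y_true, y_pred))   # 0.875
print(rmse(y_true, y_pred))
```

Note that MSE squares the residuals, so it punishes large errors much more heavily than MAE does.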
-
What are the common evaluation methods for binary classification? Classification problems fall into binary classification and multi-class classification
-
Explain precision and recall – important evaluation metrics for binary classification
Precision: the number of samples judged positive that are actually positive / the number of samples judged positive – TP/(TP+FP)
Recall: the number of samples judged positive that are actually positive / the number of all actually positive samples – TP/(TP+FN)
| | N (predicted negative) | P (predicted positive) |
|---|---|---|
| F (prediction wrong) | FN – predicted N, but the prediction is wrong (actually positive) | FP – predicted P, but the prediction is wrong (actually negative) |
| T (prediction correct) | TN – predicted N, and the prediction is correct | TP – predicted P, and the prediction is correct |
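Given the four cells of the confusion matrix, precision and recall fall out directly; a minimal sketch with made-up counts:

```python
def precision(tp, fp):
    # Of everything the model judged positive, the fraction that really is positive.
    return tp / (tp + fp)

def recall(tp, fn):
    # Of all actual positives, the fraction the model found.
    return tp / (tp + fn)

# Hypothetical confusion-matrix counts, purely for illustration.
tp, fp, fn, tn = 40, 10, 20, 30
print(precision(tp, fp))  # 0.8
print(recall(tp, fn))     # 0.666...
```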
PR curve – a plot of the model's precision against its recall, visualizing their trade-off: fix one metric and read off how high the other can go
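The PR curve is traced by sweeping a decision threshold over the model's scores and recomputing precision and recall at each step. A minimal pure-Python sketch (assumes distinct scores and 0/1 labels; the sample data is invented):

```python
def pr_points(scores, labels):
    # One (precision, recall) point per threshold, thresholds taken at each score,
    # scanned from the highest score down.
    pairs = sorted(zip(scores, labels), reverse=True)
    total_pos = sum(labels)
    tp = fp = 0
    points = []
    for _, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_pos))
    return points

scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
for p, r in pr_points(scores, labels):
    print(p, r)
```

As the threshold is lowered, recall can only rise while precision tends to fall, which is exactly the trade-off the curve visualizes.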
-
Briefly explain accuracy, and the difference between accuracy and precision
Accuracy: the fraction of all samples judged correctly – (TP+TN)/(TP+TN+FP+FN) – it takes predictions on both positive and negative samples into account
Precision is the more frequently used of the two
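Accuracy in code, using the same kind of hypothetical confusion-matrix counts as above:

```python
def accuracy(tp, tn, fp, fn):
    # Fraction of all predictions, positive and negative alike, that are correct.
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for illustration.
print(accuracy(tp=40, tn=30, fp=10, fn=20))  # 0.7
```

Because accuracy averages over both classes, it can look good on imbalanced data even when the positive class is handled poorly, which is one reason precision and recall are often preferred.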
-
Explain precision and recall concisely with an example
Police catching thieves: the focus is on the thieves, so thief samples are the positive class
Precision: of the people caught, the proportion who actually are thieves
Recall: of all the thieves, the proportion who were caught
ROC curve: plots the TPR (True Positive Rate, i.e. recall) = TP/(TP+FN) on the y-axis against the FPR (False Positive Rate) = FP/(FP+TN) on the x-axis
-
Briefly introduce ROC and AUC and how they relate. The ROC curve always passes through the two points (0, 0) and (1, 1); on that basis, the goal is to maximize the area under the curve – the AUC
AUC – a single-number measure of how effective a binary classification model is
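The ROC curve can be traced the same way as the PR curve, by sweeping the threshold from high to low; the AUC then follows from the trapezoid rule. A minimal sketch assuming distinct scores and 0/1 labels (the data is invented):

```python
def roc_auc(scores, labels):
    # Returns the list of (FPR, TPR) points from (0, 0) to (1, 1),
    # plus the area under the curve via the trapezoid rule.
    pairs = sorted(zip(scores, labels), reverse=True)
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    auc = sum((x2 - x1) * (y1 + y2) / 2
              for (x1, y1), (x2, y2) in zip(points, points[1:]))
    return points, auc

points, auc = roc_auc([0.9, 0.8, 0.6, 0.4, 0.2], [1, 1, 0, 1, 0])
print(auc)  # 0.8333...
```

An AUC of 1.0 means the model ranks every positive above every negative; 0.5 is no better than random guessing.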
-
What are the evaluation methods for multiple classification problems?
- Reduce the multi-class problem to a binary one: take the class of most interest as the positive class and all others as negative, then draw a PR curve
- The confusion matrix: the 2×2 matrix of predicted vs. actual results extends to an n×n matrix, where n is the number of classes
The diagonal holds the correct results
In a multi-class problem: if a particular class is of interest, transform it into a binary classification problem; if the overall classification performance of the model matters, describe it with accuracy
Binary classification is a special case of multi-class classification
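The n×n confusion matrix and its diagonal can be sketched directly (the label vectors are made up for illustration):

```python
def confusion_matrix(y_true, y_pred, n_classes):
    # n x n matrix: rows are actual classes, columns are predicted classes.
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]
m = confusion_matrix(y_true, y_pred, 3)
for row in m:
    print(row)

# The diagonal holds the correctly classified samples,
# so overall accuracy is the diagonal sum over the total count.
correct = sum(m[i][i] for i in range(3))
print(correct / len(y_true))
```

With n = 2 this reduces to the TP/FP/FN/TN table above, which is the sense in which binary classification is a special case.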