Model Evaluation
Confusion Matrix
| Actual\Predicted | C | ¬C |
|---|---|---|
| C | True Positives (TP) | False Negatives (FN) |
| ¬C | False Positives (FP) | True Negatives (TN) |
- Accuracy = (TP + TN) / All
- Error Rate = (FP + FN) / All
- Sensitivity = True Positive Rate = TP / P, where P = TP + FN (actual positives)
- Specificity = True Negative Rate = TN / N, where N = FP + TN (actual negatives)
- Precision = TP / (TP + FP)
- Recall = TP / (TP + FN) = TP / P
- F Measure / F Score = harmonic mean of precision and recall
  F = (β² + 1)PR / (β²P + R)
- F1-measure = F score with β=1
  F₁ = 2PR / (P + R)
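The metrics above can be computed directly from the four confusion-matrix counts. A minimal sketch (the counts in the usage line are made-up toy values):

```python
# Sketch: confusion-matrix metrics from raw counts TP, FP, FN, TN.
def evaluate(tp, fp, fn, tn, beta=1.0):
    """Return the metrics defined above as a dict."""
    total = tp + fp + fn + tn
    p = tp + fn                # actual positives (P)
    n = fp + tn                # actual negatives (N)
    precision = tp / (tp + fp)
    recall = tp / p            # = sensitivity
    b2 = beta ** 2
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "sensitivity": recall,
        "specificity": tn / n,
        "precision": precision,
        "recall": recall,
        "f_beta": (b2 + 1) * precision * recall / (b2 * precision + recall),
    }

m = evaluate(tp=40, fp=10, fn=20, tn=30)
```

With β = 1, `f_beta` reduces to the F₁ measure, the harmonic mean of precision and recall.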
ROC Curve
- ROC = Receiver Operating Characteristics
- Shows how a classifier performs at different decision thresholds
- Visualizes the tradeoff between the true positive rate (sensitivity) and the false positive rate (1 − specificity)
- Procedure
  - Rank the test tuples by their likelihood of being positive, in decreasing order
  - Plot False Positive Rate on the horizontal axis and True Positive Rate on the vertical axis
- Interpretation
  - The area under the ROC curve (AUC) summarizes the model's ranking quality: 1.0 is perfect, 0.5 is random guessing
- Similarly, a precision-recall curve plots precision against recall across thresholds
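The ranking procedure above can be sketched as follows: walk down the score-sorted tuples, emit an (FPR, TPR) point at each cut, and integrate with the trapezoidal rule. The labels and scores in the usage line are made-up toy values:

```python
# Sketch: ROC points from ranked prediction scores, plus AUC.
def roc_points(labels, scores):
    """Rank tuples by score descending; record (FPR, TPR) at each cut."""
    ranked = sorted(zip(scores, labels), reverse=True)
    p = sum(labels)            # number of actual positives
    n = len(labels) - p        # number of actual negatives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n, tp / p))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

pts = roc_points([1, 1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6, 0.5])
```

This does not handle tied scores; for those, all tuples at the same score should be processed before emitting a point.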
MAE and RMSE
- Mean Absolute Error (MAE) = ∑ᵢ |sᵢ − cᵢ| / n
- Root Mean Squared Error (RMSE) = √(∑ᵢ (sᵢ − cᵢ)² / n)
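Both error measures are one-liners over the predicted scores sᵢ and true values cᵢ. A minimal sketch with made-up toy data:

```python
# Sketch: MAE and RMSE for predictions s against true values c.
import math

def mae(s, c):
    """Mean of absolute errors |s_i - c_i|."""
    return sum(abs(si - ci) for si, ci in zip(s, c)) / len(s)

def rmse(s, c):
    """Square root of the mean squared error (s_i - c_i)^2."""
    return math.sqrt(sum((si - ci) ** 2 for si, ci in zip(s, c)) / len(s))

s, c = [2.0, 4.0, 6.0], [1.0, 4.0, 8.0]
```

Because RMSE squares the errors before averaging, it penalizes large deviations more heavily than MAE does.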
Kendall’s Tau
- τ = (# concordant pairs − # discordant pairs) / total number of pairs
- A pair is concordant when a positive tuple is ranked above a negative one in the prediction score ranking
- The total number of pairs is C(n, 2) = n(n − 1)/2
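The definition above can be checked by brute force: compare every one of the C(n, 2) pairs across two rankings and count how many agree in order. A minimal sketch with made-up toy rankings (no tie handling):

```python
# Sketch: brute-force Kendall's tau over all n-choose-2 pairs.
from itertools import combinations

def kendall_tau(x, y):
    """tau = (concordant - discordant) / C(n, 2); assumes no ties."""
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if (xi - xj) * (yi - yj) > 0:   # same order in both rankings
            concordant += 1
        else:                            # opposite order
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
```

τ ranges from +1 (identical rankings) to −1 (exactly reversed rankings).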