Author: Rhiannon Wadhams

...visualizing classifier performance
Tobias Sing, Dept. of Modeling & Simulation, Novartis Pharma AG
Joint work with Oliver Sander (MPI for Informatics, Saarbrücken)

2 | ROCR | Tobias Sing | July 2, 2007

Classification
- Binary classification (instances, class labels): (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), with each y_i {1, -1}-valued
- Classifier: provides a class prediction Ŷ for an instance
- Outcomes for a prediction:

                        True class
                        1                      -1
  Predicted class   1   True positive (TP)     False positive (FP)
                   -1   False negative (FN)    True negative (TN)

Some basic performance measures
- P(Ŷ = Y): accuracy
- P(Ŷ = 1 | Y = 1): true positive rate
- P(Ŷ = 1 | Y = -1): false positive rate
- P(Y = 1 | Ŷ = 1): precision

                        True class
                        1                      -1
  Predicted class   1   True positive (TP)     False positive (FP)
                   -1   False negative (FN)    True negative (TN)
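The four measures above follow directly from the confusion-matrix counts. A minimal sketch in base R, assuming two made-up vectors `y` (true class) and `y_hat` (predicted class); neither the names nor the data come from the slides.

```r
# Hypothetical toy data: true classes and hard class predictions, both in {1, -1}.
y     <- c( 1,  1,  1,  1, -1, -1, -1, -1, -1, -1)
y_hat <- c( 1,  1,  1, -1,  1, -1, -1, -1, -1, -1)

# Cells of the confusion matrix.
tp <- sum(y_hat ==  1 & y ==  1)  # true positives
fp <- sum(y_hat ==  1 & y == -1)  # false positives
fn <- sum(y_hat == -1 & y ==  1)  # false negatives
tn <- sum(y_hat == -1 & y == -1)  # true negatives

accuracy  <- (tp + tn) / (tp + fp + fn + tn)  # P(Yhat = Y)
tpr       <- tp / (tp + fn)                   # P(Yhat = 1 | Y = 1)
fpr       <- fp / (fp + tn)                   # P(Yhat = 1 | Y = -1)
precision <- tp / (tp + fp)                   # P(Y = 1 | Yhat = 1)
```

Note that the true and false positive rates condition on the true class, while precision conditions on the predicted class, so they use different denominators.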

Performance trade-offs
- Often: improvement in measure X → measure Y becomes worse
- Idea: visualize the trade-off in a two-dimensional plot
- Examples:
  - True pos. rate vs. false pos. rate
  - Precision vs. recall
  - Lift charts
  - …

Scoring classifiers
- Output: continuous score f(x) (instead of an actual class prediction)
- Discretized by choosing a cut-off c:
    f(x) ≥ c → class "1"
    f(x) < c → class "-1"
- Trade-off visualizations: cutoff-parameterized curves
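To make "cutoff-parameterized" concrete, the sketch below sweeps the cut-off over a set of scores and records one (false positive rate, true positive rate) point per cut-off. It uses base R only; the data and the helper names `classify` and `rates_at` are hypothetical and serve only as illustration.

```r
# Hypothetical scores f(x) and true classes in {1, -1}.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(  1,   1,  -1,   1,   -1,   1,  -1,  -1)

# Discretize at a cutoff: f(x) >= cutoff -> class 1, otherwise class -1.
classify <- function(scores, cutoff) ifelse(scores >= cutoff, 1, -1)

# False and true positive rate of the discretized classifier at one cutoff.
rates_at <- function(cutoff) {
  pred <- classify(scores, cutoff)
  c(cutoff = cutoff,
    fpr = sum(pred == 1 & labels == -1) / sum(labels == -1),
    tpr = sum(pred == 1 & labels ==  1) / sum(labels ==  1))
}

# Sweep from "predict nothing as positive" (Inf) down to the lowest score.
cutoffs <- c(Inf, sort(unique(scores), decreasing = TRUE))
curve   <- t(sapply(cutoffs, rates_at))  # one row per cutoff
```

Plotting `curve[, "tpr"]` against `curve[, "fpr"]` traces the curve: lowering the cut-off can only add positive predictions, so both rates are non-decreasing along the sweep.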

ROCR
- Only three commands
    pred
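The slide's code is cut off in this transcript. As a rough sketch of what a three-command workflow looks like with ROCR's documented `prediction()`, `performance()` and `plot()` functions, the example below draws an ROC curve. The `scores` and `labels` vectors are made up for illustration, and the ROCR package is assumed to be installed; this is not necessarily the code shown on the slide.

```r
library(ROCR)  # assumption: the ROCR package is installed

# Hypothetical scores and true classes; not taken from the slides.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(  1,   1,  -1,   1,   -1,   1,  -1,  -1)

pred <- prediction(scores, labels)       # 1. pair the scores with the true classes
perf <- performance(pred, "tpr", "fpr")  # 2. pick the y-axis and x-axis measures
plot(perf)                               # 3. draw the cutoff-parameterized curve
```

Swapping the two measure names in step 2 is enough to obtain a different trade-off plot from the same `pred` object.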

Examples (1/8): ROC curves
    pred

Examples (2/8): Precision/recall curves
    pred
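The code for this example is truncated in the transcript. A possible way to draw a precision/recall curve with ROCR is sketched below, using the package's bundled `ROCR.simple` example data and the measure names `"prec"` and `"rec"`; the choice of data set is an assumption made here for illustration, not taken from the slide.

```r
library(ROCR)  # assumption: the ROCR package is installed

# Example data shipped with ROCR: a list with $predictions and $labels.
data(ROCR.simple)

pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred, "prec", "rec")  # precision on the y-axis, recall on the x-axis
plot(perf)
```

Precision is undefined at the cut-off where no instance is predicted positive, so the first point of the curve may be missing.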

Examples (6/8): Cutoff labeling – multiple runs

    plot(perf,
         print.cutoffs.at = seq(0, 1, by = 0.2),
         text.cex = 0.8,
         text.y = lapply(as.list(seq(0, 0.5, by = 0.05)),
                         function(x) { rep(x, length(perf@x.values[[1]])) }),
         col        = as.list(terrain.colors(10)),
         text.col   = as.list(terrain.colors(10)),
         points.col = as.list(terrain.colors(10)))

Examples (8/8): Some other examples
    perf

Extending ROCR: An example
- Extend environments

    assign("auc", "Area under the ROC curve", envir = long.unit.names)
    assign("auc", ".performance.auc", envir = function.names)
    assign("auc", "fpr.stop", envir = optional.arguments)
    assign("auc:fpr.stop", 1, envir = default.values)

- Implement performance measure (predefined signature)

    .performance.auc
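From the user's side, a measure registered this way is requested by its short name, and its optional argument is passed through `performance()`. The sketch below calls the `"auc"` measure with and without `fpr.stop`; the data are hypothetical and the ROCR package is assumed to be installed. It illustrates usage only and does not reproduce the truncated `.performance.auc` implementation.

```r
library(ROCR)  # assumption: the ROCR package is installed

# Hypothetical scores and true classes; not taken from the slides.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(  1,   1,  -1,   1,   -1,   1,  -1,  -1)
pred   <- prediction(scores, labels)

# Full area under the ROC curve (fpr.stop keeps its default).
auc_full <- performance(pred, "auc")@y.values[[1]]

# Partial area, integrating only up to a false positive rate of 0.5.
auc_part <- performance(pred, "auc", fpr.stop = 0.5)@y.values[[1]]
```

A scalar measure such as the AUC is returned in the `y.values` slot of the performance object, which is why it is extracted with `@y.values[[1]]`.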

Thank you!
- http://rocr.bioinf.mpi-sb.mpg.de
- Sing et al. (2005) Bioinformatics