Volume 31, Number 4—April 2025
Research
Attribution of Salmonella enterica to Food Sources by Using Whole-Genome Sequencing Data
Table 2
Out-of-bag and train-test performance statistics for a random forest model trained on Salmonella isolates collected from single food sources and using the top 7,360 loci determined from feature selection*
Characteristic | Out-of-bag | Train-test, 75–25† |
---|---|---|
Accuracy | 0.81 | 0.74 |
κ | 0.77 | 0.68 |
Balanced accuracy | 0.61 | 0.52 |
AUC-ROC, uniform distribution‡ | 0.93 | 0.92 |
AUC-ROC, a priori distribution‡ | 0.97 | 0.95 |
*AUC-ROC, area under the receiver operating characteristic curve. †Indicates that 75% of the data were used to train the model and the remaining 25% were used to test it. ‡Calculation described at https://www.math.ucdavis.edu/~saito/data/roc/ferri-class-perf-metrics.pdf.
¹These first authors contributed equally to this article.
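The gap between accuracy and balanced accuracy in Table 2 is the pattern expected when food-source classes are unevenly represented. The following is a minimal sketch of how accuracy, Cohen's κ, and balanced accuracy can be derived from a multiclass confusion matrix. It rests on several assumptions: the function name `metrics_from_confusion` and the 3-class toy matrix are hypothetical and are not the study's data, and balanced accuracy is taken here to be the unweighted mean of per-class recall, which may differ from the definition used by the authors' software.

```python
def metrics_from_confusion(cm):
    """Return (accuracy, kappa, balanced_accuracy) for a square confusion matrix.

    cm[i][j] is the number of isolates whose true class is i and whose
    predicted class is j. Hypothetical helper for illustration only.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)

    # Observed agreement: fraction of isolates on the diagonal.
    accuracy = sum(cm[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)

    # Cohen's kappa rescales observed agreement relative to chance.
    kappa = (accuracy - expected) / (1 - expected)

    # Assumed definition: unweighted mean of per-class recall,
    # skipping classes with no true members.
    recalls = [cm[i][i] / row_totals[i] for i in range(k) if row_totals[i] > 0]
    balanced_accuracy = sum(recalls) / len(recalls)

    return accuracy, kappa, balanced_accuracy


# Toy, imbalanced example: one large class, one medium, one small.
toy_cm = [
    [90, 5, 5],    # true class A (100 isolates)
    [10, 30, 10],  # true class B (50 isolates)
    [6, 2, 2],     # true class C (10 isolates)
]
acc, kap, bal = metrics_from_confusion(toy_cm)
print(f"accuracy={acc:.3f} kappa={kap:.3f} balanced_accuracy={bal:.3f}")
```

In this toy matrix the large class is predicted well and the small class poorly, so overall accuracy (about 0.76) exceeds balanced accuracy (about 0.57). That is the same direction of difference seen in the table, although the example says nothing about which classes drive the difference in the study.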