Volume 31, Number 4—April 2025
Research
Attribution of Salmonella enterica to Food Sources by Using Whole-Genome Sequencing Data
Table 2
Out-of-bag and train-test performance statistics for a random forest model trained on Salmonella isolates collected from single food sources and using the top 7,360 loci determined from feature selection*
Characteristic | Out-of-bag | Train-test, 75–25† |
---|---|---|
Accuracy | 0.81 | 0.74 |
κ | 0.77 | 0.68 |
Balanced accuracy | 0.61 | 0.52 |
AUC-ROC, uniform distribution‡ | 0.93 | 0.92 |
AUC-ROC, a priori distribution‡ | 0.97 | 0.95 |
*AUC-ROC, area under the receiver operating characteristic curve. †Indicates 75% of the data was used to train the model, whereas 25% was used to test the model. ‡Calculation described at https://www.math.ucdavis.edu/~saito/data/roc/ferri-class-perf-metrics.pdf.
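The accuracy, κ, and balanced accuracy rows can all be derived from the same set of true and predicted labels. The following is a minimal sketch, not the authors' code: it assumes balanced accuracy means the unweighted mean of per-class recall (the table does not state which definition was used), the function name `classification_summary` is hypothetical, and the labels are made-up toy values rather than study data. It illustrates how the three statistics can diverge when one class is much more common than another.

```python
from collections import Counter


def classification_summary(y_true, y_pred):
    """Return accuracy, Cohen's kappa, and balanced accuracy for two label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    true_counts = Counter(y_true)   # how often each class truly occurs
    pred_counts = Counter(y_pred)   # how often each class is predicted
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    # Accuracy: fraction of isolates assigned to the correct class.
    accuracy = sum(correct.values()) / n

    # Chance agreement expected from the marginal label frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    # Cohen's kappa: agreement beyond chance, scaled to a maximum of 1.
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    # Balanced accuracy (assumed definition): unweighted mean of per-class recall.
    balanced = sum(correct[c] / true_counts[c] for c in true_counts) / len(true_counts)

    return {"accuracy": accuracy, "kappa": kappa, "balanced_accuracy": balanced}


# Hypothetical toy labels (not from the study): one common class, one rare class.
y_true = ["chicken"] * 8 + ["beef"] * 2
y_pred = ["chicken"] * 8 + ["chicken", "beef"]
summary = classification_summary(y_true, y_pred)
print(summary)
```

On these toy labels, accuracy is 0.90, κ is about 0.62, and balanced accuracy is 0.75, because the rare class is recovered only half the time. A gap of this kind between accuracy and balanced accuracy is typical when less common classes are predicted less well than common ones.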
¹These first authors contributed equally to this article.
Page created: March 02, 2025
Page updated: March 24, 2025
Page reviewed: March 24, 2025