Challenges in Forecasting Antimicrobial Resistance

Antimicrobial resistance (AMR) is a major threat to human health. Since the 2000s, computational tools for predicting infectious diseases have advanced greatly; however, efforts to develop real-time forecasting models for antimicrobial-resistant organisms (AMROs) have been absent. In this perspective, we discuss the utility of AMRO forecasting at different scales, highlight the challenges in this field, and suggest future research priorities. We also discuss challenges in scientific understanding, access to high-quality data, model calibration, and implementation and evaluation of forecasting models. We further highlight the need to initiate research on AMRO forecasting using currently available data and resources to galvanize the research community and address initial practical questions.

Forecasting AMROs at Different Scales
Operational forecasting of AMR could have implications for public health and patient care. Depending on the intended use, AMRO forecasting can be done at population-level and facility-level scales. Forecasting at the population level aims to predict the trend of infection or carriage prevalence in the general population for relatively long periods of months to years. For AMR pathogens, the forecast target might be the number of AMR infections or the proportion of isolates exhibiting resistance. Those predictions would estimate future AMR burden (e.g., deaths, hospitalization, days of work lost, or direct and indirect economic costs) and the evolution of resistance. If used in real time, those predictions would support situational awareness and inform public health policies such as antimicrobial drug stewardship and more targeted antimicrobial prescription guidelines to slow down AMR spread.
At the facility level, the forecast target of interest might be the number of AMR infections with clinical symptoms within a hospital or hospital system. Such predictions would support control of nosocomial AMRO transmission and resource planning for equipment, medications, staffing, and space in response to potential patient surges. Depending on the clinical relevance, the forecast horizon might be days or months. Of note, predictive models connecting multiple healthcare facilities in a region could elucidate the risk for AMR introduction through interhospital patient transfer and support decision making for preemptive measures in facilities without ongoing transmission.

Challenges in AMRO Forecasting
Although models and data differ considerably for forecasts at various scales, some common challenges impede the development and operational use of predictive models for AMR. Here we summarize these issues and highlight several research priorities to address these challenges in future studies.

Scientific Understanding
For forecasting using mathematical and statistical models, it is critical to understand the key processes affecting AMR spread. Those processes are often represented as nonlinear effects in forecasting models and, if not properly specified, will produce forecasts that quickly diverge from the truth. As of 2023, many questions on AMR remain open (27). For instance, the role of antibiotic use in driving AMR is not fully understood, particularly the effects of coselection (i.e., selection of resistance that is broader than the specific target of an antimicrobial prescription) (28) and the relationship between outpatient use of antimicrobial drugs and resistant infections of hospitalized patients. More generally, it is not yet known which type of antimicrobial drug use (e.g., community use, hospital use, or veterinary use) has the greatest effect on AMR emergence (29). After the emergence of AMROs, it is unclear how competition with susceptible strains affects the incidence of resistant strains and how to explain their coexistence over long time periods (30). Likewise, the issue of spillover (i.e., transmission of AMR across locations) is arguably a substantial challenge for forecasting that has not been addressed (31).
In healthcare facilities, it is unknown how contact networks and heterogeneity of exposure to antibiotics shape the spread of AMR; it is hard to disentangle the roles of community importation and nosocomial transmission; and it is difficult to quantify the relative transmissibility among classes of persons (patients, healthcare workers) and the environment. In addition, individual-level causal relationships between the type and duration of therapy and resistance emergence remain unknown in most instances. The human microbiome serves as a reservoir of antimicrobial resistance (32-34); however, many outstanding scientific questions on microbiome effects are still under active research as of January 2023. Further studies are needed to examine the role of bystander selection (i.e., selection of resistance on microbes that are not the target pathogen) in AMR emergence (35,36), the reason treatment with cephalosporins is a risk factor for vancomycin-resistant Enterococcus colonization (37), and the difference between detectable colonization and high-level colonization.
To date, infectious disease forecasting has primarily focused on acute viral infections for which the pathogen and its disease or clinical outcome can be directly linked. For instance, viral load is generally correlated with infectivity and disease phenotype (mild to severe) and, therefore, with illness and death rates. Those correlations make definition of the forecasting target (e.g., incidence rates of cases, hospital admissions, or deaths) relatively straightforward. However, for bacterial or fungal species, relationships between pathogen load and clinical outcomes are unclear. Because many bacterial species are commensal with their human host and have varying probabilities of presence across body sites, it is challenging to definitively determine whether a person is colonized. Without accurate observation of colonization, AMR burden is not well resolved and, consequently, is more difficult to forecast.

Accessing High-Quality Data
Forecasting is fundamentally a data-driven task. Without sufficient data, predictive models cannot be properly trained and evaluated. As of 2023, data that can inform operationally useful forecasts of AMR remain scarce. At the population level, several surveillance systems do exist. For instance, the US National Antimicrobial Resistance Monitoring System for Enteric Bacteria tracks changes in antimicrobial susceptibility for certain enteric bacteria in ill persons, retail meats, and food animals (38). However, consistent long-term records of AMR pathogen profiles are lacking in most countries, particularly in low- and middle-income countries and for emerging AMROs with limited cases (39). In addition, several major pathogens responsible for healthcare-associated infections have not been included in surveillance.
At the facility level, AMR data from electronic health records (EHRs) have become increasingly available to researchers in recent years. In healthcare settings, more attention has been given to infected patients with clinical manifestations. Surveillance for asymptomatic AMRO carriage is not prioritized because it is not of immediate clinical interest, although such carriers play an important role in onward transmission (20). Such incomplete observation hinders estimation of overall AMRO prevalence and may lead to biased prediction targets. In addition, data on nonbiologic processes driving AMR pathogen transmission, such as patient behavior and interactions with healthcare workers, are difficult to collect. In cases for which relevant data are available, data quality may be poor because records can include errors and misclassification. Even for structured EHR data, both predictive variables and outcomes (e.g., colonization) can suffer from missing data.

Model Calibration
Model calibration is the process by which a mathematical model is tuned to reproduce empirical observations. Although this process does not guarantee accurate prediction, model calibration provides an initial check that the model can closely replicate historical data. Studies that calibrate AMR models to empirical data have been published (23,24,40-42). However, as the structure of AMR models becomes increasingly complex, computational difficulties arise in fitting these models to observations of different types and at various scales. For instance, population-level prevalence, individual-level test results, and genomic sequences of pathogens convey different pieces of information on AMRO transmission, and calibrating AMR models to these observations simultaneously is a challenge. AMRO transmission is intrinsically stochastic with large uncertainty. Quantifying the uncertainty of predictions generated by complex AMR models is difficult, especially for models that track individual persons and their contacts. Calibration approaches, and their success, usually depend on the specific model construct and the form of observations.
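As a minimal illustration of what calibration means in practice, the sketch below tunes a two-parameter logistic trend for the proportion of resistant isolates to yearly surveillance counts by maximizing a binomial likelihood over a parameter grid. It is not a method taken from the studies cited above; the logistic form, the grid search, the function names, and the isolate counts are all assumptions chosen for brevity, and a realistic AMR model would be mechanistic and far more complex.

```python
import math

def resistant_fraction(t, p0, r):
    """Assumed logistic trend: fraction of resistant isolates at time t (years)."""
    odds = p0 / (1.0 - p0) * math.exp(r * t)
    return odds / (1.0 + odds)

def neg_log_likelihood(params, data):
    """Binomial negative log-likelihood (constant terms dropped)."""
    p0, r = params
    nll = 0.0
    for t, n_resistant, n_tested in data:
        p = resistant_fraction(t, p0, r)
        nll -= (n_resistant * math.log(p)
                + (n_tested - n_resistant) * math.log(1.0 - p))
    return nll

def calibrate(data, p0_grid, r_grid):
    """Exhaustive grid search; returns the (p0, r) pair with the lowest NLL."""
    candidates = ((p0, r) for p0 in p0_grid for r in r_grid)
    return min(candidates, key=lambda params: neg_log_likelihood(params, data))

# Hypothetical yearly data: (year index, resistant isolates, isolates tested).
observations = [(0, 50, 500), (1, 62, 500), (2, 80, 500),
                (3, 97, 500), (4, 120, 500)]

p0_grid = [i / 100 for i in range(1, 50)]   # initial resistant fraction
r_grid = [i / 100 for i in range(0, 101)]   # yearly growth rate of the odds

p0_hat, r_hat = calibrate(observations, p0_grid, r_grid)
print(f"calibrated p0={p0_hat:.2f}, r={r_hat:.2f}")
```

A grid search is used here only because it is transparent and dependency-free. The calibration difficulties described above arise precisely because such brute-force approaches do not scale to stochastic, individual-level models fitted to several data types at once.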

Implementation and Evaluation
One prominent challenge for AMRO forecasting is the operational implementation and prospective evaluation of predictive models (i.e., generating forecasts in real time and evaluating those forecasts once prediction targets are observed). There are no guidelines on such implementation for AMRO forecasting, such as appropriate data collection and forecast targets. Questions remain open on the proper time scale of forecast horizon, the frequency at which models need to be updated, and the fundamental limit of predictability of models. For long-lead forecasting, evaluation requires data collection in a consistent manner over a long time period. In healthcare facilities, the practice of testing and reporting AMR infections may change over time, which further complicates the use of such data records and the evaluation of forecasts. A collaborative effort that standardizes training datasets, forecast targets, forecast horizons, and proper scoring rules for evaluating forecast performance (e.g., the FluSight influenza forecasting challenge [43-45], the dengue forecasting challenge [4], and the RAPIDD Ebola forecasting challenge [46]) can potentially stimulate advances in operational AMR forecasting. A particular challenge for implementing AMRO forecasting is to handle uncertainty in predictions; uncertainty exists because of imperfect data and a notable degree of variability in many AMR-related processes. Quantifying such uncertainty is critical in other predictive fields, such as numerical weather prediction. For AMROs, whether at the facility level (e.g., determining which patients need to be on contact precautions) or the community level (e.g., public health officials making recommendations for prescribing guidelines because of AMR), decision makers must make decisions that leverage uncertain information. This truism holds for observations as well as forecasts. Designing optimal decision frameworks and architectures that best use forecasts, given their uncertainty, is a needed long-term goal.
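To make the notion of a proper scoring rule for uncertain forecasts concrete, the following sketch computes the interval score for central prediction intervals, together with the empirical coverage of those intervals. The interval score is one commonly used proper scoring rule; it is shown here as an example, not as a rule prescribed for AMRO forecasting, and the weekly counts and interval bounds are hypothetical.

```python
def interval_score(lower, upper, observed, alpha):
    """Interval score for a central (1 - alpha) prediction interval.

    Lower is better: the score is the interval width plus a penalty,
    scaled by 2 / alpha, for observations falling outside the interval.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    width = upper - lower
    penalty_below = (2.0 / alpha) * max(lower - observed, 0.0)
    penalty_above = (2.0 / alpha) * max(observed - upper, 0.0)
    return width + penalty_below + penalty_above

def mean_interval_score(forecasts, alpha):
    """Average interval score over (lower, upper, observed) triples."""
    scores = [interval_score(lo, hi, obs, alpha) for lo, hi, obs in forecasts]
    return sum(scores) / len(scores)

def empirical_coverage(forecasts):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(1 for lo, hi, obs in forecasts if lo <= obs <= hi)
    return hits / len(forecasts)

# Hypothetical weekly forecasts of AMR infections in one facility:
# (lower bound, upper bound, observed count) for 90% intervals (alpha = 0.1).
weekly_forecasts = [(2, 9, 5), (3, 10, 12), (1, 8, 4), (4, 12, 3)]

print("mean interval score:", mean_interval_score(weekly_forecasts, alpha=0.1))
print("empirical coverage:", empirical_coverage(weekly_forecasts))
```

Reporting coverage alongside the score helps distinguish forecasts that are sharp but overconfident from forecasts that are well calibrated, a distinction that matters when decisions depend on the stated uncertainty.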
Effective communications between modelers and stakeholders such as public health officials, healthcare institutions, and individual practitioners are critical to learn their practical needs from AMR modeling. However, formal reports recording such communications are lacking in scientific journals, which is another factor limiting the generation and use of operational forecasts in real-world settings.
To illustrate the interconnected challenges faced by AMRO forecasting across scales, we use methicillin-resistant Staphylococcus aureus (MRSA) as a concrete example (Figure). Several key issues on MRSA forecasting at the facility scale and population scale and across scales are unresolved; one is that the specific data needed for modeling at those different scales are unknown, as is the role of co-selection and competition with methicillin-susceptible S. aureus (MSSA) in affecting the dynamics of MRSA. Answering those questions would improve methods to reduce MRSA burden in both community and hospital settings.
In this perspective, we focus on real-time forecasting of AMROs. A parallel line of research is scenario-based simulations that project AMR infections conditional on postulated changes in prescribed interventions or expected conditions. Previous studies for HIV and tuberculosis control show that such scenario-based projections can substantially affect health policies and save lives (47,48). For AMR, scenario-based projections should be designed to address practical questions in public health and inform operational policy decision-making in real time, possibly using ensemble approaches that combine multiple models to reflect cross-model variation. Real-time forecasting and scenario-based projections complement each other and should be developed in tandem to control AMR burden and improve human health.

What Can Be Done Now?
Despite all those challenges, research can still be conducted using currently available data and resources. For instance, the feasibility and utility of real-time forecasting of population-level AMR prevalence could be tested using existing surveillance data. Such an exercise might galvanize the research community to address initial practical questions on forecast design (e.g., What variables should be included? What is an appropriate forecast horizon? How should forecast skill be evaluated?).
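One way such a feasibility exercise could begin is with simple baselines evaluated retrospectively. The sketch below compares a persistence forecast with a linear-trend extrapolation of the yearly proportion of resistant isolates, using rolling-origin evaluation at several horizons. The series is hypothetical, and the two baselines and the mean absolute error metric are illustrative assumptions, not recommendations drawn from existing AMRO forecasting practice.

```python
def persistence_forecast(history, horizon):
    """Naive baseline: the future value equals the last observed value."""
    return history[-1]

def linear_trend_forecast(history, horizon):
    """Extrapolate an ordinary least-squares line fitted to the history."""
    n = len(history)
    mean_t = (n - 1) / 2.0
    mean_y = sum(history) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(history))
    slope = sxy / sxx
    prediction = mean_y + slope * ((n - 1 + horizon) - mean_t)
    return min(max(prediction, 0.0), 1.0)  # keep the proportion in [0, 1]

def rolling_origin_mae(series, forecaster, horizon, min_history=3):
    """Mean absolute error when forecasting from every admissible origin."""
    errors = []
    for origin in range(min_history, len(series) - horizon + 1):
        history = series[:origin]               # data available at forecast time
        target = series[origin + horizon - 1]   # value observed `horizon` steps later
        errors.append(abs(forecaster(history, horizon) - target))
    return sum(errors) / len(errors)

# Hypothetical yearly proportion of isolates that are resistant.
resistant_proportion = [0.10, 0.12, 0.15, 0.17, 0.20, 0.22, 0.25, 0.27]

for horizon in (1, 2, 3):
    mae_naive = rolling_origin_mae(resistant_proportion, persistence_forecast, horizon)
    mae_trend = rolling_origin_mae(resistant_proportion, linear_trend_forecast, horizon)
    print(f"horizon={horizon}: persistence MAE={mae_naive:.3f}, trend MAE={mae_trend:.3f}")
```

Comparing error across horizons against a naive baseline gives a first, rough indication of how far ahead a forecast remains informative, which bears directly on the forecast-design questions raised above.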
Increasing the availability of existing data could also accelerate progress. Electronic health records contain a wealth of AMR-related datasets, each of which reflects a certain aspect of AMR-related processes. Synthesizing previously siloed datasets into mathematical models can potentially answer scientific questions that are otherwise challenging to address using each dataset separately. Privacy-preserving data sharing across facilities can increase the amount of data for modeling and support the development of generalizable methods. When data sharing is not practical, models and algorithms can be shared, trained, and implemented with defined standards and quality control.

Future Opportunities
Given existing gaps in forecasting AMR, predictive models are still not mature enough for operational application. To push forward advances in this burgeoning area, several research directions should be prioritized. First, better communication among multiple sectors and stakeholders, including academic researchers, public health agencies, healthcare providers, and the public, will help identify key questions and the needs of end users of predictive models. Developing and applying AMR forecasting will be a collective effort that should address real-world questions in public health and patient care. Second, studies should make better use of existing data and guide the collection of new data that are essential to understand AMR. Investing in consistent surveillance and data collection is of utmost importance for improving understanding of the emergence, spread, and outcomes of AMR. Third, more effective, computationally efficient algorithms are needed to calibrate complex AMR models to multitype and multiscale data. Better interpretability of models can instill confidence in clinicians when using those tools. Further, research on computational methods that are tailored to AMR prediction could help bridge theoretical models and real-world applications. Fourth, predictive AMR models should be implemented in real-world settings in real time so that operational utility can be assessed by validating real-time operational predictions, as is done for numerical weather predictions. Forecasting skill, including forecast accuracy and uncertainty, should be evaluated to confirm that predictive models can produce useful predictions despite noisy and incomplete data.
In summary, despite lessons learned from recent advances in forecasting for other acute infectious diseases, AMRO prediction has its own set of challenges, including wide and prolonged asymptomatic carriage, longer time scales, continuing evolution due to strain competition and antimicrobial drug use, and poorly observed disease burden. It will be critical to set appropriate expectations for the performance of AMRO predictions and establish sensible criteria for successful forecasting.