Estimate of Burden and Direct Healthcare Cost of Infectious Waterborne Disease in the United States

Provision of safe drinking water in the United States is a great public health achievement. However, new waterborne disease challenges have emerged (e.g., aging infrastructure, chlorine-tolerant and biofilm-related pathogens, increased recreational water use). Comprehensive estimates of the health burden for all water exposure routes (ingestion, contact, inhalation) and sources (drinking, recreational, environmental) are needed. We estimated total illnesses, emergency department (ED) visits, hospitalizations, deaths, and direct healthcare costs for 17 waterborne infectious diseases. About 7.15 million waterborne illnesses occur annually (95% credible interval [CrI] 3.88 million–12.0 million), resulting in 601,000 ED visits (95% CrI 364,000–866,000), 118,000 hospitalizations (95% CrI 86,800–150,000), and 6,630 deaths (95% CrI 4,520–8,870) and incurring US $3.33 billion (95% CrI 1.37 billion–8.77 billion) in direct healthcare costs. Otitis externa and norovirus infection were the most common illnesses. Most hospitalizations and deaths were caused by biofilm-associated pathogens (nontuberculous mycobacteria, Pseudomonas, Legionella), costing US $2.39 billion annually.

activities and setting public health goals (14,15). Quantifying the burden of infectious waterborne disease in the United States would also be beneficial.
Previous studies have attempted to estimate the burden of gastrointestinal illness (16,17) or all illness associated with drinking water (18) and untreated recreational water (19) in the United States, but the burden of disease from all water sources (drinking, recreational, environmental) and exposure routes (ingestion, contact, inhalation) has not been estimated. We present an estimate of the burden of waterborne disease in the United States that includes gastrointestinal, respiratory, and systemic disease; accounts for underdiagnosis; and includes all water sources and exposure routes.

Methods
We defined waterborne disease as disease in which water was the proximate vehicle for exposure to an infectious pathogen. Thus, diseases such as Legionnaires' disease (typically transmitted via inhaled water droplets containing Legionella bacteria) were considered waterborne. In contrast, vector-borne diseases such as malaria, for which standing water can increase the population of mosquitoes that transmit the causative parasite, were not considered waterborne. Algal toxins and chemical exposures were not considered. We determined the proportion of disease totals that were attributed to domestic waterborne exposure.
For this estimate, we chose diseases for which surveillance data, administrative data, or literature reports indicated that waterborne transmission for the disease in the United States was plausible, the disease was likely to cause substantial illness or death, and data were available to quantify associated health outcomes. Diseases included in this analysis were campylobacteriosis, cryptosporidiosis, giardiasis, Legionnaires' disease, nontuberculous mycobacterial (NTM) infection, norovirus infection, acute otitis externa, Pseudomonas pneumonia and septicemia, Shiga toxin-producing Escherichia coli (STEC) infection serotype O157, non-O157 serotype STEC infection, salmonellosis, shigellosis, and vibriosis (including infection by Vibrio alginolyticus, V. parahaemolyticus, V. vulnificus, and other species). To aid in quantifying the burden of respiratory and enteric diseases separately, we considered Legionnaires' disease, NTM infection, and Pseudomonas pneumonia primarily respiratory diseases, whereas we considered campylobacteriosis, cryptosporidiosis, giardiasis, norovirus infection, salmonellosis, and shigellosis primarily enteric diseases.
Data were for 2000-2015. All estimates were based on the 2014 US population (318.6 million persons); 2014 was the most recent year for which data were available for all surveillance sources. Estimates were derived from statistical models; each model input had uncertainty represented by a distribution of plausible values. Inputs are described in Appendix 1 and more details on the modeling process are described in Appendix 2. All estimates were rounded to 3 significant figures.

Illnesses
The initial model input was the number of reported or documented cases of illness for each disease, selected hierarchically: data from active surveillance systems were preferred, passive surveillance data were used if active surveillance data were not available, and administrative data were used if no active or passive surveillance system for the disease existed (Table 1). Administrative data sources included the Healthcare Cost and Utilization Project (HCUP) National Inpatient Sample (HCUP NIS) hospitalization database, the HCUP Nationwide Emergency Department Sample (HCUP NEDS) ED visit database, and, in the case of otitis externa, the National Ambulatory Medical Care Survey (NAMCS), which surveys visits to physicians' offices. These administrative data sources use complex sample survey weighting methods and are considered nationally representative. We multiplied the initial reported or documented number of cases for each disease by a series of multipliers that accounted for underreporting and underdiagnosis (including illness severity, medical care-seeking, likelihood of specimen submission, proportion of laboratories capable of performing a diagnostic test, and test sensitivity).
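The multiplier approach described above can be illustrated with a minimal sketch. The function name, the multiplier labels, and every numeric value below are hypothetical and chosen only to make the arithmetic easy to follow; the study's actual inputs are distributions described in Appendix 1, not fixed values.

```python
def estimate_illnesses(reported_cases, multipliers):
    """Scale a reported case count by a chain of multipliers.

    `multipliers` maps a label to a factor >= 1 that corrects for one
    source of undercounting (for example, care-seeking or test sensitivity).
    """
    estimate = float(reported_cases)
    for label, factor in multipliers.items():
        if factor < 1:
            raise ValueError(f"multiplier {label!r} must be >= 1, got {factor}")
        estimate *= factor
    return estimate


# Hypothetical point values for illustration only; not taken from the study.
example_multipliers = {
    "underreporting": 2.0,
    "care_seeking_and_specimen_submission": 5.0,
    "laboratory_testing_and_sensitivity": 1.25,
}

if __name__ == "__main__":
    print(estimate_illnesses(1000, example_multipliers))
```

In the full model each factor would be drawn from its own uncertainty distribution on every iteration rather than fixed as it is here.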

Emergency Department Visits
The surveillance systems used do not tally treat-and-release ED visits but do capture the proportion of patients hospitalized with a given disease; we combined this proportion with the ratio of treat-and-release ED visits for each disease (reported in HCUP NEDS) to hospitalizations for that disease (in HCUP NIS) to calculate the estimated proportion of reported cases with an ED visit. Although not all patients who visited the ED would have been reported or received a diagnosis, they were assumed to be more likely to receive a diagnosis than patients without an ED visit. Instead of applying the higher underdiagnosis factor used for illness, we used an underdiagnosis factor with a modal value of 2, consistent with previous estimates, and supported by a recent analysis comparing the incidence of bacterial gastroenteritis captured in surveillance and hospital discharge data (14,22,23).
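As a rough illustration of this calculation, the sketch below assumes that "combined" means multiplying the proportion hospitalized by the ratio of treat-and-release ED visits to hospitalizations, and it uses a fixed underdiagnosis factor of 2 in place of the distribution used in the study. The function names and all input values are hypothetical.

```python
def ed_visit_proportion(prop_hospitalized, ed_visits, hospitalizations):
    """Estimated proportion of reported cases with a treat-and-release ED visit.

    Assumption: the surveillance proportion hospitalized is multiplied by the
    ratio of ED visits (e.g., from HCUP NEDS) to hospitalizations (e.g., from
    HCUP NIS).
    """
    if hospitalizations <= 0:
        raise ValueError("hospitalizations must be positive")
    return prop_hospitalized * (ed_visits / hospitalizations)


def estimated_ed_visits(reported_cases, prop_hospitalized, ed_visits,
                        hospitalizations, underdiagnosis=2.0):
    """Reported cases x ED-visit proportion x underdiagnosis factor.

    The default factor of 2.0 stands in for the modal value; the study
    samples this factor from a distribution instead.
    """
    proportion = ed_visit_proportion(prop_hospitalized, ed_visits, hospitalizations)
    return reported_cases * proportion * underdiagnosis


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(estimated_ed_visits(1000, 0.2, 300, 200))
```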

Hospitalizations
We applied the proportion of patients hospitalized according to surveillance data to the estimated number of reported cases to calculate the estimated number of reported hospitalized patients. If surveillance data were not available, the number of hospitalizations reported in HCUP NIS for a particular disease was used. Hospitalized case-patients were assumed to be more likely to have received a diagnosis than nonhospitalized case-patients. Instead of applying the higher underdiagnosis factor used for illness, we used an underdiagnosis factor with a modal value of 2, consistent with previous estimates, and, for some bacterial enteric diseases, supported by recent work (14,22,23).

Deaths
We applied the proportion of case-patients who died, as reported by surveillance data, to the estimated number of reported cases to calculate the estimated number of reported deaths. If surveillance data were not available, we used the method of Gargano et al. (24). In brief, we combined the number of in-hospital deaths for each disease reported in HCUP NIS with the number of out-of-hospital deaths reported in death certificate records. We assumed that patients who died were more likely to have received a diagnosis than patients who did not die. Instead of applying the higher underdiagnosis factor used for illness, we used an underdiagnosis factor with a modal value of 2, consistent with previous estimates (14,22).

Domestically Acquired Waterborne Disease
We used surveillance data, when available, to determine the proportion of persons with a given disease who traveled outside the United States during the incubation period. The remaining proportion of cases was considered domestically acquired. When this information was not available, we used literature estimates and expert consultation. We used recent attribution estimates for each disease (25; E.M. Beshearse, unpub. data), derived through structured expert judgment (SEJ), a formal process that answers questions for which data are sparse using expert opinions (26,27), to determine the proportion of disease attributable to waterborne transmission.
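The two attribution steps above reduce to simple arithmetic, sketched here under the assumption that the travel-associated and waterborne proportions are applied multiplicatively to the same total. The function name and example values are hypothetical and are not study inputs.

```python
def domestic_waterborne(total_cases, prop_travel, prop_waterborne):
    """Cases that are both domestically acquired and attributed to water.

    prop_travel: proportion of cases with international travel during the
        incubation period (the remainder is treated as domestically acquired).
    prop_waterborne: proportion of domestically acquired disease attributed
        to waterborne transmission (e.g., from structured expert judgment).
    """
    for name, value in (("prop_travel", prop_travel),
                        ("prop_waterborne", prop_waterborne)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return total_cases * (1.0 - prop_travel) * prop_waterborne


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(domestic_waterborne(10_000, 0.10, 0.50))
```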

Uncertainty Estimates
For each input and multiplier in the model, we used a distribution that accounted for low, high, and midpoint estimates. This distribution accounted for the uncertainty in each input and multiplier and facilitated calculation of uncertainty intervals for final estimates. For diseases with surveillance data available, we used the methods of Scallan et al. to produce model inputs (14). For diseases with administrative data only (e.g., NTM infection and Pseudomonas pneumonia and septicemia), we used the mean hospitalization count from HCUP NIS and computed the illness count as the ratio of hospitalization count to hospitalization rate. We assumed the distribution of the hospitalization count to be normal, with the SD calculated from the reported 95% CI. As we did with surveillance data, we included the variation of hospitalization count over time in the model and assumed that the distribution for each multiplier followed the 4-parameter Program Evaluation and Review Technique (PERT) distribution (28), with disease-specific parameter values based on available publications.
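The two distributional assumptions in this paragraph can be sketched as follows. The sketch assumes that the four PERT parameters are the minimum, mode, maximum, and a shape parameter (4 in the classic PERT), and that the SD is recovered from a symmetric normal 95% CI. The function names are illustrative and the code is not taken from the study's SAS or R programs.

```python
import random


def sample_pert(low, mode, high, shape=4.0, rng=random):
    """Draw one value from a PERT distribution via a rescaled beta draw."""
    if not (low <= mode <= high) or low == high:
        raise ValueError("require low <= mode <= high and low < high")
    span = high - low
    alpha = 1.0 + shape * (mode - low) / span
    beta = 1.0 + shape * (high - mode) / span
    return low + span * rng.betavariate(alpha, beta)


def pert_mean(low, mode, high, shape=4.0):
    """Analytic mean of the PERT distribution."""
    return (low + shape * mode + high) / (shape + 2.0)


def sd_from_ci95(lower, upper):
    """SD of a normal distribution implied by a symmetric 95% CI."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    return (upper - lower) / (2.0 * 1.959964)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical multiplier with minimum 1, mode 2, maximum 5.
    print(sample_pert(1.0, 2.0, 5.0, rng=rng), pert_mean(1.0, 2.0, 5.0))
```

A larger shape value concentrates draws around the mode, which is one way disease-specific parameter values could express more or less confidence in a midpoint estimate.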
Uncertainty in the final estimates is a cumulative effect of the uncertainty of each model input. Each multiplier was generated independently. Using 100,000 iterations, we obtained distributions of counts and used them to generate point estimates of means and the corresponding 95% credible interval (CrI, the 2.5th percentile through the 97.5th percentile of the empirical distribution). We generated all-disease totals for each outcome by sampling from the distributions generated for each individual disease, using SAS 9.4 (https://www.sas.com) and R 3.5.1 (29).
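A minimal sketch of the summary step is shown below: simulated counts are reduced to a mean and a 95% CrI taken as the 2.5th and 97.5th percentiles of the empirical distribution. How per-disease draws were combined into all-disease totals is not detailed here, so the per-iteration sum in `simulate_total` is an assumption, and all names are illustrative.

```python
import random
import statistics


def percentile(sorted_values, pct):
    """Linear-interpolation percentile of an already sorted sequence."""
    if not sorted_values:
        raise ValueError("no values supplied")
    position = (len(sorted_values) - 1) * pct / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def summarize(draws):
    """Return (mean, 2.5th percentile, 97.5th percentile) of the draws."""
    ordered = sorted(draws)
    return (statistics.fmean(ordered),
            percentile(ordered, 2.5),
            percentile(ordered, 97.5))


def simulate_total(per_disease_samplers, iterations=100_000, seed=0):
    """Assumed approach: sum one independent draw per disease per iteration.

    Each sampler is a callable that takes a random.Random instance and
    returns one simulated count for its disease.
    """
    rng = random.Random(seed)
    return [sum(sampler(rng) for sampler in per_disease_samplers)
            for _ in range(iterations)]


if __name__ == "__main__":
    # Two hypothetical diseases with made-up count distributions.
    samplers = [lambda r: r.gauss(1000.0, 50.0), lambda r: r.gauss(200.0, 20.0)]
    print(summarize(simulate_total(samplers, iterations=10_000)))
```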

Direct Healthcare Cost per ED Visit and Hospitalization
We used methods described previously (30,31).

Total Direct Healthcare Costs of Domestically Acquired Waterborne Hospitalizations and ED Visits
We estimated the total direct healthcare cost of ED visits and hospitalizations attributed to waterborne transmission in the United States by multiplying the estimated number of such ED visits and hospitalizations by the weighted average cost per ED visit or hospitalization, using 100,000 iterations, with uncertainty distributions as described (Appendix 1).

Emergency Department Visits
An estimated 601,000 (95% CrI 364,000-866,000) treat-and-release emergency department visits for the included diseases were attributed to waterborne transmission in the United States in 2014 (Table 3). Otitis externa caused the largest number of visits (567,000; 95% CrI 337,000-823,000).

Hospitalizations
We estimate that these diseases were responsible for 118,000 (95% CrI 86,800-150,000) hospitalizations attributed to waterborne transmission in the United States (   recreational, and environmental water exposures. Although the risk of illness from enteric pathogens readily controlled by water treatment processes still exists, this analysis highlights the expanding role of environmental pathogens (e.g., mycobacteria, Pseudomonas, Legionella) that can grow in drinking water distribution systems; plumbing in hospitals, homes, and other buildings; recreational water venues; and industrial water systems (e.g., cooling towers). This snapshot of waterborne disease transmission in the United States circa 2014 contrasts with historical waterborne disease transmission before the implementation of drinking water treatment and sanitation systems (e.g., cholera, typhoid fever, and other enteric pathogens) (1).
Few comparable waterborne disease burden estimates exist for the United States or other high-income countries. The World Health Organization (WHO) has estimated water, sanitation, and hygiene-related disease and injury (i.e., diarrhea, drowning, malnutrition) (32). WHO's estimate of 6,600 annual US deaths from nondiarrheal infectious diseases is within the range of our estimate, although the infectious diseases included were not specified, making direct comparison difficult. Work from Australia used the WHO estimates to calculate the waterborne burden of 5 enteric pathogens, whereas estimates from Canada assessed the burden of acute gastrointestinal illness (AGI) from drinking water and the burden of 5 enteric pathogens from private wells and small water systems (33-35). Work in Europe estimated the proportion of 9 primarily enteric diseases attributable to water (36). Prior estimates of the burden of waterborne disease in the United States focused on the burden of gastrointestinal illness associated with drinking water and estimated 4-32 million cases of illness each year (16-18). Our estimate differs from previous work because it focuses on specific pathogens, includes nongastrointestinal diseases, and considers all waterborne exposure routes.
A previous estimate of foodborne disease found fewer illnesses, hospitalizations, and deaths from foodborne disease due to known pathogens (14), although it found more illnesses when unspecified agents were considered (15). For pathogens included in both estimates, underdiagnosis multipliers did not differ substantially, except for decreases in STEC multipliers because of improved laboratory capacity. The higher totals in this analysis reflect the diseases selected for inclusion, some of which cause severe respiratory diseases more likely to result in hospitalization and death than the diseases with primarily enteric effects that were included in the foodborne estimate. When estimates for the enteric pathogens included in both analyses are compared, the waterborne burden is lower than the foodborne burden. This difference could be because drinking and treated recreational water systems were designed to prevent enteric illness, and the intervention (disinfection) is relatively simple compared with the manifold interventions needed to prevent foodborne illness.
This work is subject to several limitations. First, we used a series of multipliers to generate estimates of disease, and accuracy of these estimates relies on the accuracy of the multipliers. Although we attempted to account for the uncertainty of each data point using uncertainty intervals, any systematic errors in multipliers will produce a biased estimate. For example, waterborne transmission is not the sole route of transmission for any of the diseases in this work; many of the included diseases can be transmitted through multiple pathways (e.g., cryptosporidiosis can be waterborne, foodborne, or transmitted directly from animals or humans). We also relied on structured expert judgment (SEJ) to estimate the proportions of diseases attributed to waterborne transmission. SEJ is an approach used when primary data are not available, and is subject to limitations including expert bias (26,27). For norovirus infection, the uncertainty interval for the waterborne attribution percentage was large, reflecting a lack of consensus among experts, and resulting in an estimate of illness with a wide credible interval (1,330,000 [95% CrI 5,310-5,510,000] illnesses). Second, this analysis is limited to 17 infectious diseases with adequate surveillance or administrative data available and does not include all disease associated with waterborne transmission in the United States. Insufficient data were available to quantify the contribution of many viral diseases, including sapovirus, rotavirus, and astrovirus infections, or of free-living ameba infections, which cause deaths in the United States each year (5). Noninfectious diseases (e.g., from exposure to harmful algal blooms, heavy metals, disinfection byproducts) were not considered. Third, these estimates used administrative data and relied on coding from the International Classification of Diseases, 9th Revision, Clinical Modification, which might not accurately capture the actual disease of the ill person.
Fourth, the cost estimates consider only out-of-pocket and insurer payments and do not account for the total amount of time or wages lost to ill health, disability, early death, or other indirect costs. Physicians' office visits were not included, because data were not available. Payment totals might not reflect the actual cost incurred by healthcare providers. Fifth, this work did not make separate estimates for different age, demographic, or risk groups. Risks could differ by group (e.g., children swim more often and have higher rates of cryptosporidiosis), resulting in over- or underestimation of waterborne disease (37,38). Cost estimates did not consider the contribution of immunosuppressing conditions or other concurrent conditions to the healthcare costs incurred. Finally, some estimates used data from FoodNet. In 2007, Hispanic persons were underrepresented in FoodNet sites (39). Appendix 1 contains additional pathogen-specific limitations. Analytic strengths of these burden estimates include the use of active surveillance data when possible, estimates from a comprehensive structured expert judgment, and credible intervals to acknowledge the inherent uncertainty in the model inputs and outputs.
The data presented here reflect the changing picture of waterborne disease in the United States and underscore the role of environmental pathogens that grow in biofilms. An estimated 7.15 million (95% CrI 3.88 million-12.0 million) domestically acquired waterborne illnesses occur in the United States each year, highlighting the need to focus public health resources on the prevention and control of these diseases, including surveillance for the diseases in this estimate that do not have a dedicated national case surveillance system (e.g., NTM infections). These findings should serve as a foundation for improved disease surveillance, inform waterborne disease prevention priorities, and help measure progress in the prevention of waterborne disease in the United States.