Using laboratory-based surveillance data for prevention: an algorithm for detecting Salmonella outbreaks.

By applying cumulative sums (CUSUM), a quality control method commonly used in manufacturing, we constructed a process for detecting unusual clusters among reported laboratory isolates of disease-causing organisms. We developed a computer algorithm based on minimal adjustments to the CUSUM method, which cumulates sums of the differences between frequencies of isolates and their expected means; we used the algorithm to identify outbreaks of Salmonella Enteritidis isolates reported in 1993. By comparing these detected outbreaks with known reported outbreaks, we estimated the sensitivity, specificity, and false-positive rate of the method. Sensitivity by state in which the outbreak was reported was 0% (0/1) to 100%. Specificity was 64% to 100%, and the false-positive rate was 0 to 1.

Effective surveillance systems provide baseline information on incidence trends and geographic distribution of known infectious agents. The ability to provide such information is a prerequisite to detecting new or reemerging threats (1). Laboratory-based surveillance can provide data on the location and frequency of isolation of specific pathogens, which can be used to rapidly detect unusual increases or clusters. These data can be transmitted electronically from multiple public health sites to a central location for analysis.
Many acute outbreaks of infectious diseases are detected by astute clinical observers, local public health authorities, or the affected persons themselves. However, outbreaks dispersed over a broad geographic area, with relatively few cases in any one jurisdiction, are much more difficult to detect locally. Rapid analysis of data to detect unusual disease clusters is the first step in recognizing outbreaks. We developed an algorithm for the Public Health Laboratory Information System (PHLIS) (2) that detects unusual clusters by using a statistical quality control method called cumulative sums (CUSUM), a method commonly used in manufacturing. CUSUM has also been applied to medical audits of influenza surveillance in England and Wales (3,4).

The Algorithm
The statistical problem of detecting unusual disease clusters in public health surveillance is similar to that of detecting clusters of defective items in manufacturing. In both cases, the aim is to detect an unusual number of occurrences. Manufacturing operations use several existing quality control methods, e.g., Shewhart charts, moving average control, and CUSUM, to indicate abnormalities in data collected (5,6). Of these methods, CUSUM has two attributes that make it especially suitable for disease outbreak detection: it detects smaller shifts from the mean, and it detects shifts of a similar size more quickly (6-8). The computational simplicity of this method also makes it especially well suited for use on personal computers. Other published methods (9-11) require more interaction from the analyst, e.g., model building, and more intensive computation.

Applying the Algorithm to Surveillance Data
To evaluate how well the CUSUM algorithm detects unusual clusters of disease, we applied it to the Centers for Disease Control and Prevention (CDC) National Salmonella Surveillance System dataset. Since 1962, this surveillance system has collected reports of laboratory-confirmed Salmonella isolates from human sources from all U.S. state public health laboratories and the District of Columbia (12). The laboratories serotype clinical isolates of Salmonella by the Kauffmann-White methods, which subdivide this diverse bacterial genus into more than 2,000 named serotypes (13). Each week, laboratories report to CDC each Salmonella strain they have serotyped, along with the age, sex, and county of residence of the person from whom it was isolated, and date of specimen collection. The algorithm uses date of specimen collection, which we consider the nearest reliable date to the date the infection began.

Figure 1. Algorithm for outbreak detection for one serotype for 1 week.

Since we are interested in detecting only increases in the number of isolates of Salmonella serotypes, we based our algorithm on a one-sided CUSUM. The numbers vary by serotype, and we assume the numbers of individual serotypes to be normally distributed for any given week in the past 5 years. A one-sided CUSUM determines a positive shift from the expected mean. The CUSUM ($S_t$) is

$$S_t = \max\left(0,\; S_{t-1} + \frac{x_t - (\mu + k\sigma)}{\sigma}\right)$$

where $x_t$ is the count for week $t$, $\mu$ is the expected mean, and $\sigma$ is the standard deviation of the historical counts. This simplifies to

$$S_t = \max\left(0,\; S_{t-1} + \frac{x_t - \mu}{\sigma} - k\right), \qquad S_0 = 0,\; k > 0.$$

The standard deviation was used in our calculations instead of the standard error. $S_t$ cumulates the positive deviations of counts greater than k standard deviations from the mean and adds zero for negative deviations (8,10,14). The central reference, k, determines how many standard deviations are added to the mean. Setting k=1 helped control the variability in counts due to reporting errors, seasonality, and outbreaks.
To detect any count above delta standard deviations from the mean, a CUSUM decision value, h, was set to ensure an appropriate average run length (ARL). The values h=0.5, k=1, and delta=0.5 yielded an ARL=6 years. This ARL allowed consideration of 5 past years of counts and the count for the current year before the CUSUM signals become out of control (15,16).
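As a minimal sketch of the one-sided CUSUM recursion described above, the following Python function cumulates standardized deviations in excess of k and marks any week in which the running sum exceeds h. The function name, the toy counts, and the rule of flagging when S_t > h (the usual CUSUM decision rule) are illustrative assumptions, not the authors' PHLIS implementation.

```python
def one_sided_cusum(counts, mean, sd, k=1.0, h=0.5):
    """Return (sums, flags) for a one-sided CUSUM over weekly counts.

    mean and sd are the expected mean and standard deviation of the
    historical counts; k and h default to the values quoted in the text.
    Flagging when S_t > h is an assumption (standard CUSUM practice).
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    s = 0.0  # S_0 = 0
    sums, flags = [], []
    for x in counts:
        # S_t = max(0, S_{t-1} + (x_t - mean) / sd - k)
        s = max(0.0, s + (x - mean) / sd - k)
        sums.append(s)
        flags.append(s > h)
    return sums, flags


# Hypothetical weekly counts for one serotype at one site.
sums, flags = one_sided_cusum([2, 3, 8, 9, 2], mean=3.0, sd=2.0)
print(sums)   # [0.0, 0.0, 1.5, 3.5, 2.0]
print(flags)  # [False, False, True, True, True]
```

In this toy series the sum stays at zero while counts are at or below the mean plus one standard deviation, rises once counts exceed it, and decays only gradually afterward, which is why the last week is still marked.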
A one-sided CUSUM was calculated for every reported Salmonella serotype and week by using several values for the expected mean. Different expected means were used in the algorithm to identify which value most accurately represented the historical data. First, we calculated the mean of 5 weeks and the median of 5 weeks for each Salmonella serotype, i.e., the mean and median of the counts for the same week in each of the previous 5 years. We then calculated the mean of 15 weeks, which is the mean over a 3-week interval over the past 5 years. For example, for surveillance of the sixth week of 1993, we would use weeks 5 through 7 for each year from 1988 through 1992 to calculate the mean over a 3-week interval. The results of each calculation were compared to identify which value for the expected mean provided the best sensitivity, specificity, and false-positive rate. To minimize the time needed to process the outbreak detection algorithm for each reported serotype for each reported week, the algorithm was processed only for those Salmonella serotypes having a potential outbreak, i.e., an expected mean greater than zero and counts greater than the expected mean (Figure 1). Since the entire algorithm is processed whenever the count for a given serotype exceeds the expected mean, the probability structure of CUSUM is not affected.
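The three candidate values for the expected mean, and the pre-filter that decides whether the CUSUM is run at all, can be sketched as follows. The data layout (a dictionary keyed by year and week), the function names, the treatment of missing weeks as zero counts, and the clipping of the 3-week window at the ends of the year are assumptions made for illustration; the text does not specify them.

```python
from statistics import mean, median


def baselines(history, year, week, n_years=5):
    """Candidate expected means for one serotype at one reporting site.

    history maps (year, week) -> count; missing entries count as 0
    (an assumption).
    """
    years = range(year - n_years, year)
    same_week = [history.get((y, week), 0) for y in years]
    # 3-week window centred on the target week. Clipping to weeks 1..52
    # is an assumption; the text does not cover the first and last weeks.
    window = [w for w in (week - 1, week, week + 1) if 1 <= w <= 52]
    three_week = [history.get((y, w), 0) for y in years for w in window]
    return {
        "mean_5": mean(same_week),
        "median_5": median(same_week),
        "mean_15": mean(three_week),
    }


def needs_cusum(count, expected):
    """Pre-filter: run the CUSUM only for a potential outbreak."""
    return expected > 0 and count > expected


# Hypothetical history for weeks 5 through 7 of 1988 through 1992.
history = {}
for y, c in zip(range(1988, 1993), [2, 4, 3, 10, 1]):
    history[(y, 5)] = 1
    history[(y, 6)] = c
    history[(y, 7)] = 1

expected = baselines(history, year=1993, week=6)
print(expected)                              # mean_5 = 4, median_5 = 3, mean_15 = 2
print(needs_cusum(6, expected["mean_5"]))    # True
print(needs_cusum(3, expected["mean_5"]))    # False
```

The toy history includes one unusually high year (10 isolates), which pulls the 5-week mean above the median; a lower expected value makes flags more likely for the same current count.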

Testing the Algorithm
The outbreak detection algorithm was tested retrospectively to determine how well it discovered known outbreaks. To identify outbreaks, 52 weekly counts were calculated by serotype for each of the reporting sites over 5 years. The algorithm compared x t , the current weekly count of each Salmonella serotype reported to the National Salmonella Surveillance System, with summary information from the same week over the previous 5 years. The summary information includes N t , the total number of each Salmonella serotype reported over the past 5 years for a given week, and the expected mean over the past 5 years for a serotype for a given week. Each week, except week 52, was defined to contain 7 days. The first week of each year included January 1 through January 7; the last week contained 9 days on a leap year and 8 days otherwise.
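The week definition above (weeks 1 through 51 of 7 days each, starting January 1, with week 52 absorbing the remaining days) can be sketched in Python as follows. The function names are hypothetical.

```python
from datetime import date


def surveillance_week(d):
    """Map a specimen collection date to a week number from 1 to 52.

    Week 1 is January 1 through January 7; every later week has 7 days
    except week 52, which takes whatever days remain in the year.
    """
    day_of_year = d.timetuple().tm_yday
    return min((day_of_year - 1) // 7 + 1, 52)


def days_in_week_52(year):
    """Length of the last week: days in the year minus 51 full weeks."""
    return date(year, 12, 31).timetuple().tm_yday - 51 * 7


print(surveillance_week(date(1993, 1, 7)))    # 1
print(surveillance_week(date(1993, 1, 8)))    # 2
print(surveillance_week(date(1993, 12, 31)))  # 52
print(days_in_week_52(1993), days_in_week_52(1992))  # 8 9
```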

Dispatches
A rare or uncommon serotype, i.e., a serotype that had not been reported from a state during the past 5 years, was flagged immediately as a serotype of interest. We compared flags generated by the algorithm by state and week with occurrences of reported outbreaks. We considered the sensitivity, specificity, and false-positive rate for three outbreak sizes: 1) any isolates, 2) at least three isolates, and 3) at least five isolates. Data were limited to reports during 1993 and to Salmonella serotype Enteritidis (SE), because information about previously reported outbreaks involving this serotype was available from CDC's SE Outbreak Surveillance System (17). Sensitivity was calculated as the proportion of SE outbreaks reported to CDC that were matched, by state and by week, by a flag from the algorithm. Because an outbreak could have received several flags corresponding to different weeks, flags in consecutive weeks were each counted as correct. Specificity was defined as the proportion of weeks without reported outbreaks that also had no flags. The false-positive rate was defined as the proportion of flags that did not correspond to outbreaks.
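The three measures defined above can be illustrated with a short Python sketch over sets of (state, week) pairs. The function name and toy data are hypothetical, and the sketch matches flags to outbreaks by exact state and week only; it does not reproduce the handling of flags in consecutive weeks.

```python
def evaluate_flags(all_weeks, outbreak_weeks, flagged_weeks):
    """Return (sensitivity, specificity, false_positive_rate).

    Each argument is a set of (state, week) pairs. A measure is None
    when its denominator is zero. Exact state-week matching is a
    simplifying assumption.
    """
    non_outbreak = all_weeks - outbreak_weeks
    detected = outbreak_weeks & flagged_weeks
    quiet = non_outbreak - flagged_weeks
    false_flags = flagged_weeks - outbreak_weeks

    def ratio(num, den):
        return len(num) / len(den) if den else None

    sensitivity = ratio(detected, outbreak_weeks)
    specificity = ratio(quiet, non_outbreak)
    # As defined in the text: share of flags with no matching outbreak.
    false_positive_rate = ratio(false_flags, flagged_weeks)
    return sensitivity, specificity, false_positive_rate


# Hypothetical state with 10 weeks of surveillance.
all_weeks = {("OH", w) for w in range(1, 11)}
outbreaks = {("OH", 3), ("OH", 7)}
flags = {("OH", 3), ("OH", 5), ("OH", 9)}

sens, spec, fpr = evaluate_flags(all_weeks, outbreaks, flags)
print(sens, spec)  # 0.5 0.75
```

Here one of two outbreaks is flagged (sensitivity 0.5), six of eight non-outbreak weeks carry no flag (specificity 0.75), and two of three flags have no matching outbreak (false-positive rate 2/3).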

Results of the Test
The SE Outbreak Surveillance System had 63 outbreaks reported during 1993 from 20 states and one U.S. territory. Of these 63 outbreaks, 38 reports included date of collection. Two of the reported 38 SE outbreaks occurred in the same state in the same week, and multiple outbreaks occurred 1 week apart in the same state. Therefore, it is difficult to distinguish all 38 reported outbreaks as individual outbreaks.
When we used the mean of 5 weeks as the expected mean in the algorithm, 35 states had 230 flags for clusters with ≥3 isolates (Table 1). For clusters of ≥5 isolates, 25 states had 121 flags. Sensitivity calculations on these flags were 0% (0/1) to 100%, specificity was 64% to 100%, and the overall false-positive rate was 77% (Table 2).
When the median of 5 weeks was used for the expected mean in the algorithm, the algorithm flagged SE in 35 states with 380 unusual clusters with ≥3 isolates. Twenty-five states had 210 flags with ≥5 isolates (these states were the same ones that were flagged when the mean of 5 weeks and counts of ≥5 isolates were used). In each instance in which using the median of 5 weeks resulted in an unusual cluster being flagged that had not been flagged using the mean of 5 weeks, the median of 5 weeks was smaller than the mean of 5 weeks. Clusters flagged by using the median of 5 weeks but not flagged by using the mean of 5 weeks were three to 37 isolates, with a mean of seven per cluster. Three of these clusters with five or more isolates were known outbreaks. Thus, using the median of 5 weeks would have detected three more outbreaks than using the mean of 5 weeks, but at the expense of lower specificity.
Evaluating the algorithm by using the mean of 15 weeks for the expected mean, we found 125 SE flags in 25 states on clusters with ≥5 isolates. These were the same states flagged when the mean of 5 weeks was used for the expected mean. Each time a flag occurred using the mean of 15 weeks, while no corresponding flag occurred using the mean of 5 weeks, the mean of 15 weeks was smaller than the mean of 5 weeks. In this scenario, the sizes of the clusters were 3 to 8 isolates, with an average of 5 isolates per cluster. In comparison, the mean of 5 weeks was associated with a higher specificity than the mean of 15 weeks.
Without a way to calculate an overall specificity for all serotypes, the decision about which value to use as the expected mean in the algorithm was based on the data gathered about SE. Using the median of 5 weeks produced the largest number of flags and the lowest specificity; a mean of 15 weeks generated the second highest number of flags and the second lowest specificity; and using the mean of 5 weeks produced the fewest flags and the highest specificity. Even though using either the median of 5 weeks or the mean of 15 weeks produced additional early flags, this negligible increase in sensitivity was associated with a decrease in specificity. Therefore, we elected to use the mean of 5 weeks for the expected mean in the algorithm, to obtain the highest specificity.

An Assessment of the Algorithm
The CUSUM algorithm provides a simple method to evaluate surveillance data as they are being gathered and provides sensitive and rapid identification of unusual clusters of disease. In this algorithm, a mean of 5 weeks was a better value for the expected mean than a median of 5 weeks or a mean of 15 weeks. Using a mean of 5 weeks, the algorithm failed to flag reported outbreaks only three times. In addition, a median of 5 weeks and a mean of 15 weeks were associated with lower specificity than the mean of 5 weeks. Therefore, to achieve the best specificity we used a mean of 5 weeks. The sensitivity, specificity, and false-positive rate results indicate that the algorithm works well. However, there are several potential limitations to calculating sensitivity, specificity, and the false-positive rate as we did. Some of these include outbreak size, lack of reporting of isolates, duplicate isolate reports, and underreporting of outbreaks. Constraints on public health resources may limit investigation of small outbreaks of SE. Therefore, we did not include these in the calculation of sensitivity. Underreporting of isolates could cause the algorithm to miss an outbreak, regardless of its size. Underreporting of known SE outbreaks could also inflate our estimates of specificity.

Table 1 (partial). Number of flags by state, cluster size, and value used for the expected mean. M5 = mean of 5 weeks; M15 = mean of 15 weeks; Md5 = median of 5 weeks. The earlier rows of the table and the state name for the first row shown are missing.

                      Any isolates       ≥3 isolates       ≥5 isolates
State               M5   M15   Md5    M5   M15   Md5    M5   M15   Md5
(name missing)       7     7     8     0     0     0     0     0     0
Ohio                11    13    25    10    12    19     7     8    11
Oklahoma             3     3     3     0     0     0     0     0     0
Oregon              16    17    21     4     4     4     0     0     0
Pennsylvania         1     1     1     1     1     1     1     1     1
Rhode Island        18    18    22     7     7     8     2     2     2
South Carolina      10    10    13     6     6     6     1     1     1
South Dakota        15    15    18     1     1     1     0     0     0
Tennessee            4     4    14     3     3     6     0     0     0
Texas               15    16    24     9     9    10     5     5     5
Utah                13    14    16     2     2     2     0     0     0
Vermont             28    29    30    15    15    16     9     9    10
Virginia            12    13    37    12    13    32     8     9    16
West Virginia        3     3     5     1     1     1     0     0     0
Wisconsin           14    15    31    14    14    25    13    13    15
Total              468   494   763   230   238   380   121   125   210

An outbreak detection algorithm must have high specificity (i.e., few false flags). The algorithm can be adjusted to achieve better specificity, which would benefit state health departments that may choose to investigate small clusters.

Seasonal shifts in the incidence of Salmonella can interfere with the sensitivity of the outbreak detection algorithm. In our study, we examined only unusual clusters of Salmonella that were above the normal seasonal patterns. Thus, we may have missed smaller outbreaks that were obscured by seasonality. For example, we could have overlooked an outbreak of three cases if it occurred in a season with a high background number of reported cases.
The ability of the algorithm to detect outbreaks rapidly is also affected by the speed with which serotyping is done and the results reported by state public health laboratories.

In early spring 1995, we implemented the algorithm on a weekly basis, looking for unusual clusters at the state, regional, and national levels among Salmonella isolate data reported each week from state public health laboratories to CDC. An international outbreak of Salmonella serotype Stanley was flagged in May 1995 (Figure 2). S. Stanley is an unusual serotype in the United States, with only 219 cases reported in 1994. The ensuing epidemiologic investigation