Volume 23, Number 3—March 2017
Spatiotemporal Fluctuations and Triggers of Ebola Virus Spillover
Because the natural reservoir of Ebola virus remains unclear and disease outbreaks in humans have occurred only sporadically over a large region, forecasting when and where Ebola spillovers are most likely to occur constitutes a continuing and urgent public health challenge. We developed a statistical modeling approach that associates 37 human or great ape Ebola spillovers since 1982 with spatiotemporally dynamic covariates including vegetative cover, human population size, and absolute and relative rainfall over 3 decades across sub-Saharan Africa. Our model (area under the curve 0.80 on test data) shows that spillover intensity is highest during transitions between wet and dry seasons; overall, high seasonal intensity occurs over much of tropical Africa; and spillover intensity is greatest at high (>1,000/km²) and very low (<100/km²) human population densities compared with intermediate levels. These results suggest strong seasonality in Ebola spillover from wild reservoirs and indicate particular times and regions for targeted surveillance.
Emerging infectious diseases, a persistent threat to global public health, are often linked to rapid environmental change and increasing human mobility (1,2). Notable for its unprecedented size and geographic extent, the 2013–2015 West Africa Ebola epidemic was also the first major human Ebola outbreak outside central Africa and underscored the need for improved methods to forecast emergence in novel regions. Because the natural reservoir of the Ebola virus has not been identified (3) and spillovers present an irregular pattern (4,5), it remains unclear how the probability of Ebola virus disease (EVD) in human populations varies in space and time. Particularly, whether EVD follows a seasonal pattern (6,7) and which historically unaffected geographic regions may also be at risk for EVD outbreaks (8) are 2 important questions that remain largely unanswered. Likewise, how expanding human activities, changing settlement patterns, and increasing population density affect the probability of spillovers remains poorly resolved. Despite the absence of an obvious explanation for the timing and location of past EVD outbreaks, a set of associated social and environmental conditions that anticipate viral spillover may be broadly identifiable. Identifying the environmental correlates that bring us closer to forecasting when and where EVD risk is elevated is critical for improving surveillance and rapid response to future spillovers.
Research on Ebola during the past 2 decades has investigated spatiotemporal disease probability by using conventional time series analysis (9) and geostatistical models (10). By using time series of satellite imagery, multiple studies have suggested that Ebola spillover to humans is more likely to occur at the onset of the dry season (7,11,12). Noting that this pattern is not universal, Lash et al. (13) analyzed patterns in the time series of vegetation greenness and land surface moisture (by using a normalized difference vegetation index) for 5 spillover events and found anomalies (i.e., extreme climatic fluctuations) at a temporal scale of 20 days preceding this subset of spillover events. More recently, species distribution models have been used to map the potential geographic extent of disease probability, as in the work of Pigott et al. (4), who used these models to identify spatial covariates that associate with the occurrence of Ebola virus infection in humans, primates, and bats.
Despite these advances, notable knowledge gaps remain. For example, we know that spatiotemporal variation and seasonality are key characteristics of EVD regions, but we lack integrative models that reliably incorporate spatiotemporally varying indicators of interannual and intraannual fluctuations into the calculation of spillover probability. Further, although socioeconomic factors are believed to be important drivers of spillover for numerous zoonotic diseases, including Ebola (14,15), the relationship between human population growth and the increasing frequency of EVD outbreaks since the early 1990s remains largely unexplored. We also know that the biology of this region is strongly influenced by climatic seasonality (e.g., the timing of fruit and forage availability and animal migrations). Although such seasonality is widely suspected to affect viral amplification and transmission from wild reservoirs, time series of climate or vegetation have not been investigated across the region of documented EVD events. In this study, we combine spatial data on changing human population density and distribution during the past 4 decades, satellite-derived estimates of monthly rainfall for most of the same period, and summary measures of climate in a statistical model that dynamically captures the timing of past EVD events. Our model predicts human EVD outbreaks with an estimated accuracy of 80% and shows how EVD risk shifts seasonally as a function of environmental triggers and has varied over the last 3 decades because of increases in human population and changing settlement patterns.
Ebola Spillover Origin Points and Dates
We compiled a table of all known Ebola epizootics and human outbreaks from primary sources and filtered the entries to isolate primary dates and precise locations of distinct spillover events. For human Ebola spillovers, we began with chronological lists compiled by the World Health Organization and the Centers for Disease Control and Prevention. Key sources were Lahm et al. (16) and Leroy et al. (17), who compiled reports of wildlife mortality in Gabon and the Democratic Republic of the Congo; reports by ethologists observing great ape populations in other regions; coordinates of locations from Mylne et al. (18); and locations and associated information from Kuhn’s compendium (5).
To divide incident reports into discrete spillover events, we separated primary spillovers from secondary occurrences on the basis of widely accepted chronological, geographic, or genetic distances. For example, where viral sequence data indicated that multiple spillover events had occurred, we considered them as such even if they overlapped spatially or temporally. Most events were reported as points. When events were reported as polygons (3 cases), we used polygon centroids as point locations. In contrast to Pigott et al. (4), we excluded data derived from sampling of healthy bats not associated with a spillover event. Because we were seeking to identify potential climatologic triggers, the timing of the spillover was taken to be the earliest report (often unconfirmed) of either human or animal disease rather than the first date of confirmed infection in either humans or animals. Following this procedure, a primary list of 66 spatiotemporal candidate spillover points was reduced to a final list of 44 spillover events (Figure 1; Technical Appendix 1).
To exclude arid and semi-arid regions, which are unlikely to harbor potential Ebola reservoir species and differ sharply in climate from locations where human EVD has occurred, we defined the region of interest as the portion of Africa receiving >500 mm rainfall annually. For this region, we assembled spatial data that capture the major sources of variation in climate and landcover. Following Pigott et al. (4), we chose an enhanced vegetation index (19) and potential evapotranspiration (20) to represent composite axes of coarse environmental variation.
To characterize spatiotemporal variation at seasonal, interannual, and decadal scales, we compiled 3 datasets. First, we compiled population count grids for Africa for 1960, 1970, 1980, 2000, 2005, 2010, and 2015 at 2.5 arc-minutes scale (≈25 km² at the equator) from the Gridded Population of the World version 3, produced by the Columbia University Center for International Earth Science Information Network (http://sedac.ciesin.columbia.edu/data/collection/gpw-v3). After linearly interpolating counts by grid cell for intervening years, we log10-transformed values of human population and created 3 population bins according to x<10², 10²<x<10³, and x>10³.
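The interpolation and binning steps for a single grid cell can be sketched as follows. This is a minimal illustration, not the authors' R code (which is in Technical Appendix 3): the function names, the bin labels, and the handling of boundary values and empty cells are assumptions, because the text does not say where counts of exactly 10² or 10³ fall.

```python
import math

def interpolate_count(year, known):
    """Linearly interpolate a grid-cell population count for `year`
    from a dict mapping census years to counts."""
    years = sorted(known)
    if year <= years[0]:
        return float(known[years[0]])
    if year >= years[-1]:
        return float(known[years[-1]])
    for y0, y1 in zip(years, years[1:]):
        if y0 <= year <= y1:
            frac = (year - y0) / (y1 - y0)
            return known[y0] + frac * (known[y1] - known[y0])

def population_bin(count):
    """Assign a per-cell count to one of 3 bins on the log10 scale."""
    if count <= 0:
        return "low"  # assumption: empty cells (log10 undefined) count as low
    log_count = math.log10(count)
    if log_count < 2:
        return "low"
    if log_count < 3:  # assumption: a count of exactly 10^2 goes to this bin
        return "intermediate"
    return "high"

# Hypothetical cell with counts known for 2 years only.
cell = {1980: 400, 2000: 1200}
count_1990 = interpolate_count(1990, cell)  # halfway between 400 and 1200
bin_1990 = population_bin(count_1990)
```

In this toy cell the interpolated 1990 count is 800, which falls in the intermediate bin, whereas the 2000 count of 1,200 falls in the high bin.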
Second, we aggregated monthly rainfall from daily rainfall estimates obtained from the Rainfall Estimator (21). This data product was developed in 1998 by the Climate Prediction Center at the National Oceanic and Atmospheric Administration and is available at high (0.1°) spatial resolution for January 1983 to the present.
Third, in addition to actual monthly rainfall, we created a rainfall anomaly index as a means of incorporating the potential importance of relative rainfall. For the time series of 384 monthly rainfall rasters, we divided the value of each month-location by the maximum value for that location to create a set of 384 scaled raster images corresponding to the original monthly rainfall raster images.
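The scaling described above amounts to dividing each location's monthly series by that location's own maximum. A minimal sketch for a few locations follows; the dictionary layout, the site names, and the treatment of a location whose maximum is zero are illustrative assumptions rather than details from the study.

```python
def scale_by_location_max(series_by_location):
    """Scale each location's monthly rainfall (mm) by that location's
    maximum, giving relative rainfall between 0 and 1."""
    scaled = {}
    for location, series in series_by_location.items():
        peak = max(series)
        if peak == 0:
            # assumption: a location with no recorded rain scales to zeros
            scaled[location] = [0.0 for _ in series]
        else:
            scaled[location] = [value / peak for value in series]
    return scaled

# Hypothetical 3-month series for a wet site and a dry site.
rainfall = {"wet_site": [300, 150, 0], "dry_site": [40, 20, 10]}
relative = scale_by_location_max(rainfall)
```

Here 150 mm at the wet site and 20 mm at the dry site both scale to 0.5, which is why the index carries information that absolute rainfall does not.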
Model Fitting and Validation
We restricted analysis to the 37 (80.5%) of 44 EVD events occurring since 1982, the period for which monthly rainfall estimates across Africa were available. We sampled 100,000 random background points from within the portion of Africa receiving >500 mm rainfall annually and randomly assigned each to 1 of 384 months during 1983–2014. Spillover occurrence points and background points were divided into 2/3 training and 1/3 test sets. Because actual monthly rainfall at month and site of EVD outbreak varied considerably, we stratified by rainfall, first ranking points by rainfall amount and then assigning every third point to the test set.
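The rainfall-stratified split can be illustrated with a short sketch: rank the points by rainfall, then send every third ranked point to the test set. The tuple layout and the synthetic rainfall values are assumptions made for the example; only the rank-then-take-every-third rule comes from the text.

```python
def stratified_split(points, key):
    """Rank points by `key`, then assign every third ranked point to the
    test set and the remainder to the training set."""
    ranked = sorted(points, key=key)
    train, test = [], []
    for rank, point in enumerate(ranked, start=1):
        if rank % 3 == 0:
            test.append(point)
        else:
            train.append(point)
    return train, test

# 37 hypothetical (id, rainfall_mm) points with distinct rainfall values.
points = [(i, float((i * 7) % 37 * 10)) for i in range(37)]
train, test = stratified_split(points, key=lambda p: p[1])
```

Applied to 37 points, this rule yields 25 training points and 12 test points, and both sets span the full range of rainfall.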
We modeled Ebola spillover intensity, the average density or expected number of points per unit area and time, using bagged logistic regression models with main effects only (22). Bagging (bootstrap aggregating) is a machine learning approach that uses the predictive power generated from ensembles of models based on small subsets of the data (23). By using all 5 predictors described, we fit 1,000 models in which we randomly sampled 10 of the 25 outbreaks in the training dataset and 100 of 100,000 training background points. We predicted each of the 1,000 fitted models on both the training and test datasets. Taking the mean of predicted spillover intensity across model iterations, we compared average predicted spillover intensity for training and test points with labels at each point (known EVD event vs. otherwise) in each dataset to gauge the accuracy of the models.
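The ensemble procedure can be sketched in a few lines. This is a toy version under stated assumptions, not the authors' implementation: it uses a single synthetic covariate instead of the 5 predictors, fits each logistic regression by plain gradient descent, samples subsets without replacement, and builds 20 models rather than 1,000 to keep the run short. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    """Predicted probability for covariates x; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, steps=200, lr=0.5):
    """Main-effects logistic regression fitted by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def bagged_fit(events, background, n_models, n_events, n_background, rng):
    """Fit one model per random subset of events and background points."""
    models = []
    for _ in range(n_models):
        pos = rng.sample(events, n_events)  # assumption: no replacement
        neg = rng.sample(background, n_background)
        labels = [1] * len(pos) + [0] * len(neg)
        models.append(fit_logistic(pos + neg, labels))
    return models

def bagged_predict(models, x):
    """Average the predictions of all models in the ensemble."""
    return sum(predict(w, x) for w in models) / len(models)

# Synthetic data: events tend to have larger covariate values.
rng = random.Random(0)
events = [[rng.gauss(1.0, 0.5)] for _ in range(25)]
background = [[rng.gauss(0.0, 0.5)] for _ in range(300)]
models = bagged_fit(events, background, 20, 10, 100, rng)
high = bagged_predict(models, [1.5])
low = bagged_predict(models, [-0.5])
```

Because every model sees far more background points than events, the averaged output is better read as a relative intensity than as a calibrated probability.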
The set of known EVD events represents a spatiotemporal point process. Point processes are described by an intensity function (i.e., the average density or expected number of points per unit area and time). Therefore, after validation, we used the complete dataset (37 spillover and 100,000 background points) to retroactively predict Ebola spillover intensity across the entire portion of Africa receiving >500 mm rainfall annually for all 384 months for which gridded rainfall data were available (January 1983–December 2014) using human population estimates for 2015. We then averaged the resulting 384 monthly rasters to map seasonal shifts in predicted Ebola spillover intensity across Africa. To map the change in spillover intensity as a function of changes in human population size and distribution across 4 decades, we averaged predicted intensity across all months of 1975 and 2015, then took the difference between annual spillover intensity in 2015 and annual spillover intensity in 1975 across the region of Africa receiving >500 mm rainfall annually.
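One way to turn a long monthly series of predictions into a seasonal picture is to average by calendar month. The sketch below does this for a single grid cell and also shows an elementwise difference between 2 sets of values. Grouping by calendar month is an assumption about how the averaging was done, and the function names and numbers are hypothetical.

```python
def calendar_month_means(monthly_values, start_month=1):
    """Average a monthly series by calendar month (January first).
    `start_month` is the calendar month of the first value."""
    sums = [0.0] * 12
    counts = [0] * 12
    for i, value in enumerate(monthly_values):
        month = (start_month - 1 + i) % 12
        sums[month] += value
        counts[month] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

def intensity_change(later, earlier):
    """Elementwise difference between 2 equally sized lists of values."""
    return [b - a for a, b in zip(earlier, later)]

# Hypothetical 2-year series starting in January.
year_one = [float(m) for m in range(1, 13)]
year_two = [float(m + 2) for m in range(1, 13)]
seasonal = calendar_month_means(year_one + year_two)
change = intensity_change(year_two, year_one)
```

With 384 months starting in January 1983, each calendar month would be averaged over 32 years.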
Detailed methods are provided in Technical Appendix 2, and the R code used is provided in Technical Appendix 3. All data and code are available online (https://figshare.com/articles/ebola_spillover_intensity_final_Rmd/4234280).
Predictive accuracy of the bagged model of EVD intensity trained on the 2/3 training dataset was high. Area under the receiver-operator curve was 0.83 when evaluated on the training dataset and 0.80 when evaluated on 1/3 of the data that were withheld from model training. Overall accuracy ([true positives + true negatives] / total points) was 53% for prediction on the test set. Mean annual Ebola spillover intensity was highest where the enhanced vegetation index is highest in the wettest portions of tropical Africa. For locations within the humid tropics of Africa, predicted spillover intensity was generally, but not always, lowest in dry months (rainfall <50 mm) (Figure 2; Video). Across sites, modeled spillover intensity in months of intermediate rainfall (100–250 mm) was equal to or exceeded that in high (>250 mm) rainfall months (Figure 3). Whereas central Africa exhibits relatively constant spillover intensity throughout the year (particularly within the narrow equatorial region of 15°–30° longitude), we found spillover intensity to be highly seasonal in southern Africa and somewhat variable in West Africa. These results extend the proposed potential range of spillover far beyond the locations of past outbreaks. Compared with previously published spatial models of spillover that did not include temporally varying predictors, our results suggest that a much larger area of Africa is at moderate to high risk for spillover during some months of the year, including much of East Africa, Madagascar, and south central Africa (e.g., Angola and Zambia) and a large portion of West Africa (online video). Although Ebola spillover intensity in seasonally at-risk regions peripheral to central Africa is much lower than in high-intensity central Africa itself, predicted spillover intensity at sites in Angola, Mozambique, and Ethiopia is comparable to that predicted at known spillover locations in South Sudan and Gabon. 
Predicted spillover intensity over a large portion of Madagascar is similar to that of central Africa (Figure 3). Thus, within the African tropics, the potential for Ebola spillover appears to be geographically widespread.
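The 2 validation metrics reported above answer different questions, which a small sketch makes concrete. Area under the curve depends only on how events rank relative to background points, whereas overall accuracy depends on a classification threshold. The scores, labels, and the 0.5 threshold below are made up for illustration; the threshold used in the study is not stated here.

```python
def auc(scores, labels):
    """Probability that a randomly chosen event scores higher than a
    randomly chosen background point (ties count as one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def overall_accuracy(scores, labels, threshold):
    """(true positives + true negatives) / total points."""
    correct = sum((s >= threshold) == (l == 1)
                  for s, l in zip(scores, labels))
    return correct / len(labels)

# Hypothetical predictions: 2 events (label 1) and 6 background points.
scores = [0.9, 0.4, 0.35, 0.8, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 0, 0, 0, 0, 0]
toy_auc = auc(scores, labels)
toy_accuracy = overall_accuracy(scores, labels, threshold=0.5)
```

In this toy case 11 of the 12 event-background pairs are ordered correctly, yet only 6 of 8 points are classified correctly at the chosen threshold, so the 2 numbers need not agree.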
The effect of human population on Ebola spillover intensity is much smaller than climatic or seasonal effects. Average annual spillover intensity did not change markedly for much of Africa as population increased during 1975–2015 (Figure 4), whereas spillover intensity exhibited striking shifts with climate and seasonality (Video). Nevertheless, our model does show that spillover intensity differs by human population density. Mean annual spillover intensity was lowest where population size per 25 km² grid cell was intermediate (10²<x<10³) and highest where population density was low (x<100) (Technical Appendix 2 Figure 1). Large changes in spillover intensity (±5%) during 1975–2015 appear to result mainly from population increases. In comparing 2015 to 1975 population density, shifts from intermediate-to-high population densities have generated increased Ebola spillover intensity, particularly in West Africa and the region surrounding Lake Victoria, and shifts from low-to-medium population densities have reduced spillover risk. Similarly, settlements along transportation corridors have increased in population to intermediate densities, leading to substantial declines in predicted spillover intensity. However, as a result of population consolidation over large areas of central Africa, some remote districts have declined in population, typically increasing predicted spillover intensity (Figure 4; Figure 5).
These results quantify a spatiotemporal pattern in the risk for Ebola spillover in 2 specific ways: first, as raw estimates by the model algorithm that can be directly compared between months and across locations (Figure 3); and second, as percentile ranks of these estimates or relative spillover intensity (online video). Percentile ranking adjusts for model miscalibration because some spillovers may not have been observed and because of the overrepresentation of spillover events in base logistic regression models. As such, percentile ranking preserves discriminability (i.e., classification accuracy as measured by area under the receiver-operator curve performance), even when probabilities are not well calibrated. Whether raw or ranked, spillover intensities are a measurement of risk, with values proportional to the probability of a spillover that changes as a function of environmental conditions based on the best information available on the location and timing of unique spillover events. Although the transmission, dynamics, and possibly the seasonality of different viral strains may differ, our approach, constrained by the small number of spillovers, necessarily considers EVD as a syndrome caused by all known strains of the Ebola virus. By constructing models to compare the covariates associated with this set of known spillover events to the background possibilities from which they might have been drawn, which we approximate by using a sample of 100,000 random points, we have determined how the intensity of Ebola spillovers changes with observable covariates.
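Percentile ranking is a monotone transformation, so it changes the scale of the estimates but not their order, which is why rank-based measures such as area under the curve are unaffected. A minimal sketch follows; the "percent of values less than or equal to" definition is one common convention and an assumption here, and the raw values are invented.

```python
from bisect import bisect_right

def percentile_ranks(values):
    """Percent of values less than or equal to each value."""
    ordered = sorted(values)
    n = len(values)
    return [100.0 * bisect_right(ordered, v) / n for v in values]

def ordering(values):
    """Indices of the values from smallest to largest."""
    return sorted(range(len(values)), key=lambda i: values[i])

# Hypothetical raw intensities for 4 locations in 1 month.
raw = [0.002, 0.010, 0.004, 0.050]
ranked = percentile_ranks(raw)
same_order = ordering(raw) == ordering(ranked)
```

The raw values differ by more than an order of magnitude, while the ranks are evenly spaced from 25 to 100, and the ordering of locations is identical in both.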
These results indicate that 1) there is a geographic gradient of annual Ebola spillover intensity that peaks in central Africa but extends during at least some months of the year through a large portion of tropical Africa not previously considered to be at high risk (4,24), including the tropical/subtropical forest/woodland regions of Ethiopia, Angola, Zambia, East Africa, and Madagascar; 2) there is substantial seasonal fluctuation in the spatial pattern of Ebola spillover intensity; 3) there is a temporal gradient in spillover intensity in which the driest months show the lowest intensity and intensity peaks or plateaus in months of intermediate rainfall; and 4) increases in human population density may increase Ebola spillover risk in West and central Africa. Ebola spillover intensity is greatest when regions that are typically very wet make the transition to or from dry periods. This result corroborates the finding from previous studies (7,11) linking EVD events to preceding dry-to-wet transitions through time series analysis of data from a normalized difference vegetation index. Within predominantly or seasonally wet climate zones in particular, our results show Ebola spillover intensity to be highest in moderately dry months and lowest in extremely dry months.
Seasonal dynamics in spillover intensity are most pronounced where rainfall seasonality is greatest (i.e., outside the less seasonal and wetter rainforest biome of central Africa, where EVD events have been most frequent and spillover intensity is most steady throughout the year). Strong seasonal patterns may be related not only to seasonal drivers, such as rainfall, but to migration patterns and seasonal competence of wildlife reservoirs. Seasonal effects on resource availability may drive migrations or other changes in movement patterns that, in turn, may affect population density, social behaviors, and contact rates among hosts (25). Seasonal changes may also alter the frequency of host encounters with infective agents or material in the environment, and host immune defenses can shift with annual reproductive cycles (26). Seasonality is also likely to alter human behavior, including hunting effort, level of bushmeat consumption, or, more generally, the degree and kind of contact with wildlife.
Our model finds that Ebola spillover intensity varies temporally as a function of climate variables without explicitly incorporating sociocultural dimensions, such as land use, which was not available as a time series, or biotic features, such as the ranges of suspected reservoir hosts. Therefore, the degree of human disease intensity at locations far from documented EVD events may also depend on whether the range of a necessary reservoir also extends to these points. In recent work, species distribution models were used to predict the ranges of potential mammal reservoirs and the degree of overlap of predicted ranges with Ebola and Marburg spillovers to suggest likely mammalian reservoirs (27). Among the taxa that overlapped with all EVD sites were the sun squirrel genus (Heliosciurus) and the straw-colored fruit bat (Eidolon helvum), both of which had predicted ranges covering nearly all of tropical Africa (with the exception of Madagascar, where E. dupreanum is present), where our models predicted high Ebola spillover intensity at least seasonally. Thus, our predictions across continental Africa may adequately reflect the biotic component of risk. However, an important next step would be to assess whether the presence of suitable animal hosts or cultural or socioeconomic factors in Madagascar and East Africa make this region a priority for surveillance.
Our model was trained by using great ape and human EVD events. Great ape spillover events (usually observations made by primatologists and wildlife researchers within reserves, in this dataset restricted to Gabon and the Democratic Republic of the Congo) are associated with low human population density, whereas our model associates human spillovers with high human population densities (>10³ persons/25 km²). At low population densities, epidemic spread is less likely, and deaths in remote outposts may go unreported. The link between human population density and Ebola spillover intensity could be simply a function of increased reporting at high population densities (and some locations with very low population densities). Alternatively, increased contact with or consumption of wildlife as population density increases, or perhaps the increased abundance of reservoir or bridge reservoir species at either high or low human population density (or both) could drive the relationship. Substantially increased raw Ebola spillover intensity (>5%) as a result of population increases is most apparent in areas of West Africa but could eventually include central Africa if urbanization or population consolidation continues there. We note that human population was a much less important predictor than variables capturing climate and seasonality.
In conclusion, we developed a model that predicts a pattern of widespread but seasonally very dynamic Ebola spillover intensity in savannah and humid tropical regions of Africa from the set of known spatiotemporal EVD points (n = 37 since 1982) and spatially and temporally high-resolution rainfall and population data for Africa. Ebola virus, though not the strain that led to the recent outbreak, was known to be circulating in West Africa before 2014 (28–30). However, the potential for a major human outbreak, by far the most deadly Ebola outbreak to date, was not foreseen. Answering the need for improved forecasting, surveillance, and preparation for rapid response, our model uses the best available spatiotemporal predictors and an ensemble modeling approach to accurately identify geographic regions and seasons of elevated Ebola spillover intensity, and suggests that the socio-ecologic conditions that triggered the initial spillover in Guinea may prevail over a much larger area and at a higher temporal frequency. A key public health policy implication is that some level of Ebola surveillance should be extended to regions outside of central and West Africa. Furthermore, the spatiotemporal pattern of Ebola spillover intensity we report could be used as an early warning system to inform the design of surveillance activities.
Dr. Schmidt holds a PhD in ecology and is currently an assistant research scientist at the Odum School of Ecology at the University of Georgia. He is interested in the application of data science methods to ecological questions.
We especially thank Nicholas Novella for his help in accessing rainfall data for Africa.
A.W.P. acknowledges funding from the National Science Foundation under a Rapid Response Research grant (award no. 1515194). J.M.D. acknowledges funding from the Research and Policy for Infectious Disease Dynamics program of the Science and Technology Directorate of the US Department of Homeland Security and from the Fogarty International Center of the National Institutes of Health. This work was also supported by the National Institute of General Medical Sciences of the National Institutes of Health (award no. U01GM110744).
- Smith KF, Guégan JF. Changing geographic distributions of human pathogens. Annu Rev Ecol Evol Syst. 2010;41:231–50.
- Daszak P, Cunningham AA, Hyatt AD. Emerging infectious diseases of wildlife—threats to biodiversity and human health. Science. 2000;287:443–9.
- Heeney JL. Ebola: Hidden reservoirs. Nature. 2015;527:453–5.
- Pigott DM, Golding N, Mylne A, Huang Z, Henry AJ, Weiss DJ, et al. Mapping the zoonotic niche of Ebola virus disease in Africa. eLife. 2014;3:e04395.
- Kuhn JH. Filoviruses: a compendium of 40 years of epidemiological, clinical, and laboratory studies. New York: Springer; 2008.
- Groseth A, Feldmann H, Strong JE. The ecology of Ebola virus. Trends Microbiol. 2007;15:408–16.
- Pinzon JE, Wilson JM, Tucker CJ, Arthur R, Jahrling PB, Formenty P. Trigger events: enviroclimatic coupling of Ebola hemorrhagic fever outbreaks. Am J Trop Med Hyg. 2004;71:664–74.
- Changula K, Kajihara M, Mweene AS, Takada A. Ebola and Marburg virus diseases in Africa: increased risk of outbreaks in previously unaffected areas? Microbiol Immunol. 2014;58:483–91.
- Linthicum KJ, Anyamba A, Tucker CJ, Kelley PW, Myers MF, Peters CJ. Climate and satellite indicators to forecast Rift Valley fever epidemics in Kenya. Science. 1999;285:397–400.
- Anyamba A, Chretien JP, Small J, Tucker CJ, Formenty PB, Richardson JH, et al. Prediction of a Rift Valley fever outbreak. Proc Natl Acad Sci U S A. 2009;106:955–9.
- Tucker CJ, Wilson JM, Mahoney R, Anyamba A, Linthicum K, Myers MF. Climatic and ecological context of the 1994–1996 Ebola outbreaks. Photogramm Eng Remote Sensing. 2002;68:147–52.
- Pinzon E, Wilson JM, Tucker CJ. Climate-based health monitoring systems for eco-climatic conditions associated with infectious diseases. Bull Soc Pathol Exot. 2005;98:239–43.
- Lash RR, Brunsell NA, Peterson AT. Spatiotemporal environmental triggers of Ebola and Marburg virus transmission. Geocarto Int. 2008;23:451–66.
- Bausch DG, Schwarz L. Outbreak of ebola virus disease in Guinea: where ecology meets economy. PLoS Negl Trop Dis. 2014;8:e3056.
- Wolfe ND, Dunavan CP, Diamond J. Origins of major human infectious diseases. Nature. 2007;447:279–83.
- Lahm SA, Kombila M, Swanepoel R, Barnes RFW. Morbidity and mortality of wild animals in relation to outbreaks of Ebola haemorrhagic fever in Gabon, 1994–2003. Trans R Soc Trop Med Hyg. 2007;101:64–78.
- Leroy EM, Telfer P, Kumulungui B, Yaba P, Rouquet P, Roques P, et al. A serological survey of Ebola virus infection in central African nonhuman primates. J Infect Dis. 2004;190:1895–9.
- Mylne A, Brady OJ, Huang Z, Pigott DM, Golding N, Kraemer MU, et al. A comprehensive database of the geographic spread of past human Ebola outbreaks. Sci Data. 2014;1:140042.
- Huete A, Didan K, Miura T, Rodriguez EP, Gao X, Ferreira LG. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens Environ. 2002;83:195–213.
- Hijmans RJ, Cameron SE, Parra JL, Jones PG, Jarvis A. Very high resolution interpolated climate surfaces for global land areas. Int J Climatol. 2005;25:1965–78.
- Novella NS, Thiaw WM. African rainfall climatology version 2 for famine early warning systems. J Appl Meteorol Climatol. 2013;52:588–606.
- Valentini G, Dietterich TG. Low bias bagged support vector machines. In: Proceedings of the 20th International Conference on Machine Learning (ICML-2003), Washington DC; 2003. p. 752–9 [cited 2014 Dec 15]. http://www.aaai.org/Papers/ICML/2003/ICML03-098.pdf
- Breiman L. Bagging predictors. Mach Learn. 1996;24:123–40.
- Peterson AT, Bauer JT, Mills JN. Ecologic and geographic distribution of filovirus disease. Emerg Infect Dis. 2004;10:40–7.
- Altizer S, Bartel R, Han BA. Animal migration and infectious disease risk. Science. 2011;331:296–302.
- Altizer S, Dobson A, Hosseini P, Hudson P, Pascual M, Rohani P. Seasonality and the dynamics of infectious diseases. Ecol Lett. 2006;9:467–84.
- Peterson AT, Papeş M, Carroll DS, Leirs H, Johnson KM. Mammal taxa constituting potential coevolved reservoirs of filoviruses. J Mammal. 2007;88:1544–54.
- Hayman DT, Yu M, Crameri G, Wang LF, Suu-Ire R, Wood JL, et al. Ebola virus antibodies in fruit bats, Ghana, West Africa. Emerg Infect Dis. 2012;18:1207–9.
- Formenty P, Boesch C, Wyers M, Steiner C, Donati F, Dind F, et al. Ebola virus outbreak among wild chimpanzees living in a rain forest of Côte d’Ivoire. J Infect Dis. 1999;179(Suppl 1):S120–6.
- Formenty P, Hatz C, Le Guenno B, Stoll A, Rogenmoser P, Widmer A. Human infection due to Ebola virus, subtype Côte d’Ivoire: clinical and biologic presentation. J Infect Dis. 1999;179(Suppl 1):S48–53.