Population-Based Geospatial and Molecular Epidemiologic Study of Tuberculosis Transmission Dynamics, Botswana, 2012–2016

Tuberculosis (TB) elimination requires interrupting transmission of Mycobacterium tuberculosis. We used a multidisciplinary approach to describe TB transmission in 2 sociodemographically distinct districts in Botswana (Kopanyo Study). During August 2012–March 2016, all patients who had TB were enrolled, their sputum samples were cultured, and M. tuberculosis isolates were genotyped by using 24-locus mycobacterial interspersed repetitive units–variable number of tandem repeats. Of 5,515 TB patients, 4,331 (79%) were enrolled. Annualized TB incidence varied by geography (range 66–1,140 TB patients/100,000 persons). A total of 1,796 patient isolates had valid genotyping results and residential geocoordinates; 780 (41%) patients were involved in a localized TB transmission event. Residence in areas with a high burden of TB, age <24 years, being a current smoker, and unemployment were factors associated with localized transmission events. Patients with known HIV-positive status had lower odds of being involved in localized transmission.

emergency by the World Health Assembly in 2014, and the development of an ambitious global strategy to eliminate TB by 2035 soon followed (1,2). Five years later, progress toward elimination remains slow (3), in part because of the lack of effective interventions to interrupt the cycle of TB transmission (4). Ongoing transmission is the main driver of TB prevalence in high-burden communities (5,6). Historically, TB was believed to be the result of prolonged exposure to infectious TB patients, such as household contacts (7). More recently, molecular epidemiologic studies highlighted the possible role of casual exposures in the community (8,9). TB incidence and rates of TB transmission vary considerably across communities and might be dependent on high-risk behaviors, social determinants of disease (e.g., malnutrition, overcrowding, poverty), population dynamics, and transmission venues (10-12). Accordingly, interest has been renewed in increasing the yield and cost-effectiveness of geographically targeted interventions.
Incremental progress toward elimination is possible with careful evidence-guided policy development, planning, and implementation. The design of effective, targeted TB interventions should be tailored to local epidemiology and program performance. In this population-based study, named the Kopanyo Study, we used a multidisciplinary approach combining classic epidemiologic approaches (i.e., relying on the behavioral, clinical, demographical, geospatial, social, and temporal characteristics of cases) with mycobacterial genetics to describe TB transmission in 2 large districts in Botswana.

Study Objective and Design
Kopanyo means "people gathering together" in the local Tswana language in Botswana. Consistent with this name, the overarching goal of the Kopanyo Study was to use geospatial analysis and patient interviews to define TB transmission networks and locations of TB transmission in 2 districts in Botswana, a country characterized by high rates of TB and HIV. More precisely, we aimed to describe and compare the clinical and microbiological characteristics of TB patients given a diagnosis in Gaborone and Ghanzi districts; describe and compare the spatial clustering and genotype clustering of patients and strains across and within districts; and determine the factors associated with genotype clustering, spatial clustering, and combined genotype and spatial clustering across and within districts. The design and procedures of this population-based, prospective study have been described in detail elsewhere (8,13).

Setting
Botswana is an economically and politically stable sub-Saharan country that has a universal healthcare system for its citizens. We recruited participants from 2 distinct geographic areas: the capital city and surrounding suburbs, Gaborone district; and the rural district of Ghanzi (Figure 1). The study sites were purposefully selected because they were believed to represent disparate populations in Botswana in terms of demographic, environmental, epidemiologic (i.e., HIV prevalence, TB prevalence), and socioeconomic characteristics. With a population of 354,380 persons, Gaborone is the largest and most crowded urban area in the country. In 2013, at the start of this study, 17% of the general population in Gaborone was estimated to be living with HIV (14). Annual TB rates in Gaborone ranged from 440 to 470 cases/100,000 population during the 5 years before study implementation; ≈70% of TB patients were co-infected with HIV (15).
In contrast, Ghanzi is a rural district in northwestern Botswana, and most of the 44,100 persons in this district live in congregate housing; the town of Ghanzi has a population of 12,179 persons. Most of the population in the district is of San ethnicity. The San kept a traditionally hunter-gatherer lifestyle until the early 1990s, when they were forced to transition to farming as a result of government-mandated modernization programs. Since then, most San live on large and crowded private freehold farms most of the year. For short periods during the year (ranging from days to several weeks), they transition through mid-size villages and the town of Ghanzi. Migration between farms and villages in the district is the norm and is seasonal. However, for cultural and geographic reasons, there is little migration outside the district. Thus, despite major rotational migration between villages and farms, the community remains highly insular. Altogether, these unique cultural and social conditions contribute to the higher rates of TB transmission in Ghanzi. Over the past 2 decades, the TB notification rate in Ghanzi district has consistently been the highest in the country (722 cases/100,000 persons) (14,15). In 2013, the proportion of the general population in Ghanzi estimated to be living with HIV (17%) was similar to that for Gaborone; however, only 36% of TB patients were co-infected with HIV (14).

Recruitment
Participants were enrolled during August 2012-March 2016. All patients given a diagnosis of TB were eligible for enrollment. Participants were recruited from TB clinics and directly observed treatment centers in greater Gaborone (n = 24) and Ghanzi District (n = 6). Patients receiving TB treatment for >14 days before study screening, incarcerated persons, or those who did not consent were excluded from the study.

Data Sources, Measurements, and Variables
Behavioral, clinical, and demographic information was obtained by medical record abstraction and standardized interview at enrollment. Primary residential address, workplace address at diagnosis, and addresses of social gathering venues of patients were obtained through patient interview. All addresses were verified by site visit geotagging or through a reference layer created by manually locating addresses in satellite imagery by using OpenStreetMap (http://www.openstreetmap.org) (16), Google Maps, and ArcGIS (Environmental Systems Research Institute, https://www.esri.com) online geocoding services. WGS 84 projection system latitude and longitude coordinates (with 1.1-m precision) were exported for each address. Botswana population and housing data were used to define geographic boundaries and enumerate the localized populations needed to calculate TB incidence rates (17). We defined a geographic area as high burden if its estimated annualized TB incidence was >305 TB patients/100,000 persons, which is the estimated national TB incidence rate at the start of the study period.
HIV status was determined for all enrolled participants. Following national guidelines, we offered HIV testing to all participants who did not have documented HIV test results or had negative test results from >12 months before enrollment. Patients were asked to report the average number of alcoholic beverages consumed on the same occasion in the previous 30 days, the number of days on which they consumed alcohol in the previous 30 days, and whether they currently smoked tobacco. We defined excessive alcohol consumption as a self-report of ≥5 drinks on the same occasion or drinking on >5 days within the previous 30 days (18). Venues for social gathering were classified as alcohol-related (e.g., bars, liquor stores, pubs, shebeens), places of work, places of worship (e.g., churches, mosques, temples), and healthcare facilities.

Sputa Collection and Laboratory Methods
At least 1 expectorated sputum sample was obtained from each enrolled patient. Patients unable to produce enough sputum or high-quality sputum underwent sputum induction with inhaled nebulized hypertonic saline solution. Sputa were decontaminated by using the N-acetyl-L-cysteine and NaOH method with a final concentration of 1% NaOH, and then inoculated into 1 Mycobacterial Growth Indicator Tube (MGIT; Becton Dickinson, https://www.bd.com). MGIT cultures were incubated at 35°C-37°C in the MGIT960 instrument (Becton Dickinson) for <6 weeks. MGIT cultures scored as positive were examined by microscopy and Ziehl-Neelsen staining to identify acid-fast bacilli. The SD Bioline TB Ag MPT64 Rapid Test (Abbott, https://www.globalpointofcare.abbott/en/product-details/sd-bioline-tb-ag-mpt64-rapid.html) was used to identify the M. tuberculosis complex. Cultures positive for acid-fast bacilli but with negative TB Ag MPT64 results were classified as nontuberculous mycobacteria. Cultures with evidence of both Mycobacterium species and other potential contaminating species were redecontaminated by using the standard method described above. Drug susceptibility testing (DST) for first-line anti-TB drugs was performed by using MGIT DST. Susceptibility was set at 0.1 µg/mL for isoniazid and 1.0 µg/mL for rifampin. We used DST with Lowenstein-Jensen medium in instances for which MGIT DST results were not available.

M. tuberculosis Genotyping
The first culture isolate per patient was genotyped (Genoscreen, https://www.genoscreen.fr) by using 24-locus mycobacterial interspersed repetitive units-variable number of tandem repeats (MIRU-VNTR) and standardized methods (19). MIRU-VNTR results with >1 copy number at >1 loci (i.e., double alleles), as seen with mixtures of different clonal subpopulations, or with missing or indeterminate copy number at any locus, were considered noninterpretable for cluster assignment and were excluded from analysis (20). Two or more patient isolates that had valid, complete, and matching MIRU-VNTR results were classified as a genotype cluster.
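The cluster assignment rule described above can be illustrated with a minimal Python sketch. The data layout (a dictionary of patient identifiers mapped to 24 copy numbers), the function names, and the toy profiles are hypothetical and are not taken from the study; missing loci are represented as None and double alleles as tuples purely for illustration.

```python
from collections import defaultdict

N_LOCI = 24  # 24-locus MIRU-VNTR


def is_interpretable(profile):
    """Usable only if all 24 loci carry a single integer copy number."""
    return len(profile) == N_LOCI and all(isinstance(x, int) for x in profile)


def genotype_clusters(isolates):
    """Group patient IDs by identical profile; keep groups of 2 or more."""
    groups = defaultdict(list)
    for patient_id, profile in isolates.items():
        if is_interpretable(profile):
            groups[tuple(profile)].append(patient_id)
    return {p: sorted(ids) for p, ids in groups.items() if len(ids) >= 2}


# Hypothetical toy profiles (not study data).
base = [2, 4, 4, 2, 3, 3, 2, 5, 2, 4, 4, 2, 3, 3, 2, 2, 3, 1, 5, 3, 4, 2, 2, 3]
isolates = {
    "P1": base,
    "P2": list(base),            # identical to P1: same genotype cluster
    "P3": base[:-1] + [4],       # differs at one locus: unique genotype
    "P4": base[:-1] + [None],    # missing locus: noninterpretable
    "P5": [(2, 3)] + base[1:],   # double allele: noninterpretable
}
clusters = genotype_clusters(isolates)
print(list(clusters.values()))
```

In this sketch, only exact matches across all 24 loci form a cluster, which mirrors the definition in the text; any tolerance for single-locus variants would be a different rule.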

Localized Transmission Events
We used SaTScan (https://www.satscan.org) to identify geographic areas with a larger-than-expected rate of unique genotype clusters. We used data for all other culture-positive TB patients reported during the study as the background rate (20-22). In brief, each individual MIRU-VNTR result was assigned to the geocoordinates of the patient's residence. Each unique MIRU-VNTR result was then scanned separately by applying a purely spatial analysis in which the number of events in an area was assumed to be Poisson distributed; the scan generated circular zones of various sizes up to a maximum radius of 50 km. A log-likelihood ratio was calculated for each zone in comparison with all possible zones, and the zone with the maximum likelihood ratio was considered the most likely spatial concentration for each MIRU-VNTR result. Thus, by definition, localized transmission was characterized by genotypic and spatial clustering. A Monte Carlo simulation with 9,999 repetitions was used to determine the distribution of the scan statistic under the null hypothesis of spatial randomness; spatial clusters with p<0.05 were considered significant. No duplicative case counting occurred. The purpose of the spatial scan was to characterize each patient (based on residence) for a dichotomous outcome: member of a localized transmission event or not.
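As a rough illustration of the scan statistic described above, the following Python sketch computes the standard Poisson log-likelihood ratio for candidate zones and a rank-based Monte Carlo p-value. It is not SaTScan itself and does not reproduce the study analysis: the zones are given as precomputed lists of location indices rather than circles of increasing radius, and all names and counts are hypothetical.

```python
import math
import random


def poisson_llr(c, e, total):
    """Log-likelihood ratio for a zone with c observed and e expected cases."""
    if c <= e:
        return 0.0  # only zones with more cases than expected are of interest
    rest = total - c
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return c * math.log(c / e) + outside


def max_llr(cases, background, zones):
    """Largest LLR over all candidate zones (the scan statistic)."""
    total, total_bg = sum(cases), sum(background)
    best = 0.0
    for zone in zones:
        c = sum(cases[i] for i in zone)
        e = total * sum(background[i] for i in zone) / total_bg
        best = max(best, poisson_llr(c, e, total))
    return best


def monte_carlo_p(cases, background, zones, n_sim=9999, seed=0):
    """Rank-based p-value: cases are reallocated in proportion to background."""
    rng = random.Random(seed)
    observed = max_llr(cases, background, zones)
    locations = list(range(len(cases)))
    exceed = 0
    for _ in range(n_sim):
        sim = [0] * len(cases)
        for i in rng.choices(locations, weights=background, k=sum(cases)):
            sim[i] += 1
        if max_llr(sim, background, zones) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)


# Hypothetical counts for one genotype at 5 locations (not study data).
cases = [6, 1, 0, 0, 1]             # patients with this genotype
background = [20, 25, 30, 25, 20]   # culture-positive patients per location
zones = [[0], [0, 1], [2, 3], [4], [3, 4]]
llr, p_value = monte_carlo_p(cases, background, zones)
print(round(llr, 3), p_value < 0.05)
```

With 9,999 replications, the smallest attainable p-value in this sketch is 1/10,000, which is why the number of replications is conventionally chosen to end in 9.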

Statistical Methods
Annualized TB incidence per 100,000 persons and 95% CIs, assuming a Poisson distribution, were calculated for local geographies on the basis of the number of cases enrolled from each geography divided by the population estimates for each geography annualized to the duration of the study period. Estimates were assigned to the geographic centroid in ArcGIS. Isopleth cartographic images were produced by using a raster layer interpolated with inverse distance weighting (21). Differences in proportions between behavioral, clinical, and demographic variables by geographic location were assessed by using the χ² test. Multivariable logistic regression analysis was conducted to assess the association between involvement in a localized transmission event (coded as a binary yes/no variable) and select variables; results are reported as adjusted odds ratios (aORs) with 95% CIs. All variables statistically associated with the main outcome in bivariable analyses (p<0.1) were included in the multivariable model.
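The incidence calculation and the interpolation step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study code: the study duration is approximated as 44 months (August 2012-March 2016), the CI uses a log-scale normal approximation rather than an exact Poisson interval, the inverse distance weighting uses a power of 2, and the case counts, populations, and coordinates are hypothetical.

```python
import math

STUDY_YEARS = 44 / 12        # assumption: August 2012-March 2016 as 44 months
HIGH_BURDEN_THRESHOLD = 305  # TB patients/100,000 persons, from the Methods


def annualized_incidence(cases, population, years=STUDY_YEARS, per=100_000):
    """Cases per `per` persons per year."""
    return cases / population / years * per


def approx_poisson_ci(cases, rate, z=1.96):
    """Approximate 95% CI on the log scale (not an exact Poisson interval)."""
    factor = math.exp(z / math.sqrt(cases))
    return rate / factor, rate * factor


def idw(x, y, points, power=2):
    """Inverse distance weighted estimate at (x, y) from (px, py, value)."""
    num = den = 0.0
    for px, py, value in points:
        d = math.hypot(x - px, y - py)
        if d == 0:
            return value  # exactly on a centroid: use its own estimate
        w = 1 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical geography: 112 enrolled patients, 10,000 residents.
rate = annualized_incidence(112, 10_000)
low, high = approx_poisson_ci(112, rate)
is_high_burden = rate > HIGH_BURDEN_THRESHOLD

# Hypothetical centroids (x, y, incidence) in arbitrary planar units.
centroids = [(0.0, 0.0, 100.0), (2.0, 0.0, 300.0)]
midpoint_estimate = idw(1.0, 0.0, centroids)
print(round(rate, 1), is_high_burden, midpoint_estimate)
```

Because the numerator counts only enrolled patients, a rate computed this way is a lower bound on the true incidence for the same geography.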

Ethics
This study was approved by the Centers for Disease Control and Prevention Institutional Review Board; the Health Research and Development Committee, Ministry of Health and Wellness, Botswana; and the University of Pennsylvania Institutional Review Boards. Participants provided written informed consent.

Geographic Distribution of TB
The estimated annualized TB incidence for the overall study population was 306 TB patients/100,000 persons (95% CI 228-339) (Table 2). The incidence rate varied considerably by geography, ranging from 66 (95% CI 44-99) TB patients/100,000 persons in the suburban areas of Gaborone to 1,140 (95% CI 836-1,556) TB patients/100,000 persons in remote, rural villages of the Ghanzi District. The degree of heterogeneity was more pronounced in Gaborone than in Ghanzi District. For example, a 7.3-fold difference in annualized incidence was found between the highest (location A) and lowest (location J) burden areas in Gaborone. This difference was 1.3-fold in Ghanzi District (highest location W and lowest location KU). In this context, we observed that certain neighborhoods contributed disproportionately to the district-level burden of TB. Some locations that had the lowest TB prevalence had the highest number and proportion of patients co-infected with HIV (Table 1).

Patient Isolate Characteristics
A total of 2,462 (56%) patients had ≥1 positive culture result; 2,162 (88%) were classified as M. tuberculosis complex, whereas 300 (12%) were classified as nontuberculous mycobacteria and were excluded. MIRU-VNTR results were available for 2,137 patient isolates. We excluded 213 (10%) patient isolates that had incomplete or noninterpretable genotyping results; this exclusion was described elsewhere (23,24). Thus, 1,924 patients were included in phylogenetic analysis. Among these patients, 128 had no residential geocoordinates, which resulted in 1,796 patients available for localized transmission analysis (Figure 2).
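The exclusion cascade above can be checked with simple arithmetic. The short Python sketch below uses only the counts reported in this paragraph; the variable names are ours.

```python
# Counts as reported in the text.
culture_positive = 2462
nontuberculous = 300
genotyped = 2137
noninterpretable = 213
no_geocoordinates = 128

mtb_complex = culture_positive - nontuberculous        # M. tuberculosis complex
phylogenetic = genotyped - noninterpretable            # valid genotype results
localized_analysis = phylogenetic - no_geocoordinates  # with geocoordinates

# Percentages quoted in the text, rounded to whole numbers.
pct_mtb = round(100 * mtb_complex / culture_positive)
pct_ntm = round(100 * nontuberculous / culture_positive)
pct_noninterpretable = round(100 * noninterpretable / genotyped)

print(mtb_complex, phylogenetic, localized_analysis)
```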

Localized Transmission
A total of 780 (43%) patients were members of localized transmission events. Among these patients, 537 (69%) resided in Gaborone, 241 (31%) resided in the Ghanzi District, and 2 (0.3%) resided elsewhere. Localized transmission was independently associated with younger age. Patients who had a known HIV-positive status had lower odds of being a member of localized transmission (aOR 0.71, 95% CI 0.58-0.85). When we superimposed the SaTScan results over the inverse distance-weighted interpolation of TB incidence estimates, the spatial concentration for localized transmission coincided with higher TB incidence rates (Figure 3). The proportion matched by TB genotype was significantly larger in Ghanzi District (80%) than in Gaborone (64%) and other locations (4%; p<0.001) (Table 4). A total of 494 (22%) patients resided at the same address as another patient; among these, 29% matched by TB genotype. The proportion matched by TB genotype was significantly larger in the Ghanzi District (50%) than in Gaborone (18%; p<0.001). A total of 605 (32%) patients reported the same place of worship as another patient; among these, 11% matched by TB genotype. The proportion matched by TB genotype was significantly larger in Ghanzi District (24%) than in Gaborone (9%; p = 0.002). A total of 585 (30%) patients reported the same alcohol-related venue as another patient; among these, 28% matched by TB genotype. The proportion matched by TB genotype was significantly larger in Ghanzi District (57%) than in Gaborone (16%; p<0.001).

Discussion
Interrupting TB transmission is paramount for achieving TB elimination in high-burden settings. Accordingly, interest is increasing in determining where, when, and among whom TB transmission occurs. Our study helps clarify factors fueling the TB epidemic in Botswana and highlights the necessity of understanding local epidemiology to design effective interventions aimed at interrupting TB transmission. We combined isolate genotype and spatial clustering as an indicator consistent with localized transmission. Although we acknowledge that some misclassification might occur with this approach, it enabled us to broadly consider geographic and individual characteristics that might be associated with localized transmission. We found high incidence rates and substantial variation in TB incidence between neighborhoods and districts. Actual incidence rates are likely higher because we calculated estimates on the basis of numbers of enrolled patients, but not all persons with TB were enrolled in this study. Residing in a high-burden geographic area was associated with localized transmission, likely reflecting more cumulative exposures leading to more infections, reinfections, and opportunities to progress to TB. In addition, the local differences in TB incidence likely overlap with differences in the local distribution of social determinants of health (e.g., poverty, overcrowding, and housing conditions), which also influence TB epidemiology (12). Our finding that localized transmission was associated with young age might reflect differences in the frequency and intensity of social activities across the course of life (25). Younger patients might have had more social connections and relationships with nonfamily members (25).
In addition, older patients might have progressed to having TB with non-genotype clustered strains from infections in the distant past (26). The large number of patients living in the same neighborhood as another patient, and the high proportion matching another patient by TB genotype, suggest that targeted screening and treatment in high-burden neighborhoods might be cost-effective. Overall, a substantial proportion of patients (22%) resided at the same address as another patient; among these patients, 29% were matched by TB genotype. However, major differences occurred by geography. Among patients residing at the same address in Ghanzi, 50% were matched by genotyping, suggesting that household contact investigations in this district would be particularly effective at reducing transmission.
Social venue data suggested that community-based interventions might be effective for interrupting transmission. A substantial number of patients named the same places of worship (32%) or alcohol-related venues (30%) as another patient. Among persons naming the same alcohol-related venue in Ghanzi, 57% were matched by TB genotype. These findings might help prioritize resources and guide effective strategies to interrupt M. tuberculosis transmission, such as intensified TB case finding in higher-burden geographic areas and targeted screening of frequently named social gathering venues.
Our results also highlight the need for using local data for local solutions. Comparative differences in spatial and genotypic clustering within and between communities reinforce the relative role of local factors that drive TB transmission and incidence. The proportion of patients attributed to localized transmission was higher in Ghanzi than in Gaborone. This finding suggests that, although TB transmission is a serious issue in both communities, a relatively higher proportion of TB cases might be caused by recent exposure to an infectious TB case-patient in Ghanzi than in Gaborone. Differences in population density and behavioral factors, such as smoking, drinking, and social mixing, highlight the potential impact of targeting interventions for these vulnerable populations. However, in settings that have prevalent endemic strains, genotype clustering might not be caused by recent transmission; higher resolution molecular characterization, such as whole-genome sequencing, might help further distinguish recent transmission from reactivation of highly prevalent, closely related strains. The trend toward an inverse population- and individual-level association between HIV and localized transmission is consistent with findings from previous TB molecular epidemiologic studies in Africa and highlights the complex time-dependent interactions between the TB and HIV epidemics (27-32). At the population level, the Gaborone neighborhoods that had the highest proportion of HIV-co-infected TB patients also had the lowest TB incidence; at the individual level, HIV infection was negatively associated with localized transmission. HIV-co-infected TB patients progress more rapidly to active disease after M. tuberculosis infection and are generally less infectious and have higher mortality rates (33).
As HIV care improves and antiretroviral therapy becomes more widely available, TB incidence among HIV-infected persons decreases, reflecting decreasing rates of progression to active disease (34). Furthermore, being infected with HIV often means more visits to healthcare facilities in which TB screening is part of routine visits. Our population-based design and multidisciplinary approach enable a high degree of confidence in our results and conclusions. However, major limitations need to be considered. First, although our study was multiyear and covered a broad geographic area, it is possible that some members of the transmission networks were missed (e.g., they were given a diagnosis before the study period, resided in areas not covered by the study, or refused enrollment), leading to genotype clustering misclassification. Also, not all enrolled TB patients produced sputum samples, and not all samples led to M. tuberculosis isolation or valid genotype results. This limitation might result in missed transmission links. Second, our molecular techniques characterized only part of the M. tuberculosis genome (17). It is possible that genetic heterogeneity in loci not covered by this method might have been missed, resulting in misclassification (17). Moreover, the use of 1 isolate/patient, exclusion of mixed infections, missing data, and recall bias for naming potential transmission venues should also be acknowledged as potential limitations.
The Kopanyo Study adds to understanding of TB transmission dynamics in settings hyperendemic for TB and HIV by providing empirical data demonstrating the role of localized TB transmission during district-level epidemics. Thus, interrupting TB transmission in Botswana might warrant local solutions tailored for community differences and based on local epidemiology.

*Localized transmission was defined by SaTScan-identified (https://www.satscan.org) geographic areas with a larger-than-expected rate of unique genotype clustering compared with all other culture-positive tuberculosis patients as the background rate; excludes 128 patients with valid genotype results and no residential address.
†Residing in a geographic area that had an estimated annualized tuberculosis incidence >305 patients/100,000 persons.
‡Five or more drinks per session in the previous 30 d or drinking on >5 days in the previous 30 d.