Mycobacterium tuberculosis Complex Lineage 3 as Causative Agent of Pulmonary Tuberculosis, Eastern Sudan

Pathogen-based factors associated with tuberculosis (TB) in eastern Sudan are not well defined. We investigated genetic diversity, drug resistance, and possible transmission clusters of Mycobacterium tuberculosis complex (MTBC) strains by using a genomic epidemiology approach. We collected 383 sputum specimens at 3 hospitals in 2014 and 2016 from patients with symptoms suggestive of TB; of these, 171 grew MTBC strains. Whole-genome sequencing could be performed on 166 MTBC strains; phylogenetic classification revealed that most (73.4%; n = 122) belonged to lineage 3 (L3). Genome-based cluster analysis showed that 76 strains (45.9%) were grouped into 29 molecular clusters, comprising 2–8 strains/patients. Of the strains investigated, 9.0% (15/166) were multidrug resistant (MDR); 10 MDR MTBC strains were linked to 1 large MDR transmission network. Our findings indicate that L3 strains are the main causative agent of TB in eastern Sudan; MDR TB is caused mainly by transmission of MDR L3 strains.

Ongoing transmission is one of the key challenges for TB control programs, especially in countries with a high TB burden (1,11). In recent years, molecular techniques have been increasingly used to clarify and trace transmission of Mycobacterium tuberculosis complex (MTBC) strains and to direct and guide targeted TB control actions (12,13). However, availability of molecular techniques is limited in many countries in Africa with a high TB burden (11).
In Sudan, drug-resistant TB often goes undetected, resulting in inadequate treatment, illness, death, and ongoing transmission (1,14). Local laboratories have limited access to mycobacterial culture and drug susceptibility testing (DST) or DNA-based techniques (14). Therefore, MDR TB rates might be underestimated in eastern Sudan. In addition, mutations that mediate drug resistance have not been investigated.
Taken together, these factors indicate that, although TB is a huge health problem in eastern Sudan, precise data on the phylogeny and transmission dynamics of MTBC strains, as well as on resistance patterns, are sparse (2,3,7,8,15). Studies using molecular epidemiologic tools are rare and have used classical genotyping techniques, such as spoligotyping, which cannot deduce direct transmission events (5,15). New techniques, such as whole-genome sequencing (WGS), offer the highest resolution for MTBC genotyping and provide precise information on resistance mutations (16,17). We applied state-of-the-art phenotypic and molecular assays to investigate specimens collected from patients with symptoms suggestive of pulmonary TB, including new and retreatment cases, to analyze the MTBC population structure, putative transmission events, and DST profiles in eastern Sudan.

Study Design and Setting
We recruited patients with symptoms suggestive of pulmonary TB who had positive sputum smears and agreed to participate in this cross-sectional study. Patients had been treated in the outpatient departments at public hospitals in Kassala, Port Sudan, and El-Gadarif in eastern Sudan over 2 recruitment periods, June-October 2014 and January-July 2016. We collected spot and early morning sputum samples. If 1 sample was smear positive, the 2 samples were pooled and stored for <6 months at -20°C. Shortly before we shipped each sample to the National Reference Center (NRC) for Mycobacteria, Borstel, Germany, we transferred a volume of <2 mL to a screw-capped Eppendorf tube; the samples were shipped in 2 separate batches.

Mycobacterial Culture and Identification
Sample decontamination, smear microscopy, and mycobacterial culture were performed at the NRC (18,19). For quantitative PCR (qPCR), we extracted DNA using a QIAamp DNA Mini Kit 250 (QIAGEN, https://www.qiagen.com) according to the manufacturer's instructions. For line probe assays (LPAs), such as GenoType Mycobacterium CM and GenoType Mycobacterium MTBC (Hain Lifescience, https://www.hain-lifescience.de), we extracted DNA by the boiling/sonication method (19). For WGS, we extracted DNA using cetyl trimethylammonium bromide (20). We transferred the extracted DNA to new Eppendorf tubes and stored it at -20°C until used.
We used an in-house qPCR detecting MTBC and nontuberculous mycobacteria (NTM) DNA to test available culture-negative/contaminated samples (21). We ran the qPCR experiments using the Rotor-Gene 2000 (Corbett Research Pty Ltd, http://www.australianexporters.net). We used LPAs (Hain Lifescience, https://www.hain-lifescience.de) according to the manufacturer's instructions to classify isolated mycobacteria into MTBC or NTM and to differentiate the MTBC species.
We identified NTM species using 16S rRNA, internal transcribed spacer (ITS) DNA fragment sequencing, or both (22). We sequenced the complete PCR products on an automated DNA sequencer (ABI 377; Applied Biosystems, https://www.thermofisher.com) by cycle sequencing using the Big Dye RR Terminator Cycle Sequencing Kit (Applied Biosystems). We aligned the resulting sequences and compared them with the sequences of the International Nucleotide Sequence Database Collaboration.

WGS
We performed WGS using the Illumina Nextera (XT) kit (https://www.illumina.com) (26). We sequenced isolates with a minimum average genome coverage of 50×. We used single-nucleotide polymorphisms (SNPs) occurring in >4 forward and >4 reverse reads, 4 reads calling the allele with a Phred score >20, and a minimum variant frequency of 75% for a concatenated sequence alignment (27). In the comparative genomic analysis, we allowed 5% of all samples to miss these coverage and frequency thresholds at individual positions and called the majority allele (>50% variant frequency) to avoid losing sequence information in genome regions with lower average coverage. We excluded repetitive regions and drug resistance-associated genes from the phylogenetic reconstruction.
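The variant-calling thresholds described above can be sketched as a small filter. This is an illustrative sketch only, not the pipeline used in the study: the record type `SiteEvidence`, the function names, and the reading of the 5% tolerance (a position is kept if at most 5% of samples fall back to the majority-allele call) are assumptions, as is whether each bound is inclusive or exclusive. Counting reads with a Phred score above the cutoff is assumed to happen upstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteEvidence:
    """Read support for the called allele at one position in one sample (hypothetical record)."""
    forward_reads: int        # supporting reads on the forward strand
    reverse_reads: int        # supporting reads on the reverse strand
    high_quality_reads: int   # supporting reads with a Phred score above the cutoff
    variant_frequency: float  # fraction of reads at the site that show the allele


def passes_strict_filter(site, min_strand=4, min_hq=4, min_freq=0.75):
    """Strict thresholds, read literally from the text (strand counts must exceed min_strand)."""
    return (site.forward_reads > min_strand
            and site.reverse_reads > min_strand
            and site.high_quality_reads >= min_hq
            and site.variant_frequency >= min_freq)


def passes_relaxed_filter(site):
    """Fallback majority-allele call (>50% variant frequency)."""
    return site.variant_frequency > 0.5


def keep_position(sites, max_relaxed_fraction=0.05):
    """Keep an alignment position if every sample has at least a majority call
    and no more than max_relaxed_fraction of samples miss the strict thresholds."""
    if not sites:
        return False
    strict_failures = [s for s in sites if not passes_strict_filter(s)]
    if any(not passes_relaxed_filter(s) for s in strict_failures):
        return False
    return len(strict_failures) <= max_relaxed_fraction * len(sites)
```

Keeping the thresholds as parameters makes it easy to test how sensitive the resulting alignment is to each cutoff.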

Phylogenetic Inference
We calculated a maximum-likelihood tree with FastTree using the concatenated sequence alignment and a general time-reversible substitution model (28). We conducted inspection and rooting of the maximum-likelihood tree using FigTree software and performed the graphical presentation using the online tool EvolView (29). We calculated maximum parsimony trees with BioNumerics version 7.6 (Applied Maths, https://www.applied-maths.com) using the concatenated sequence alignment (30).

Molecular Drug Resistance Prediction
We screened the rpsL, rrs, and gidB genes for mutations that confer resistance to streptomycin and the katG and inhA genes and the fabG1-inhA promoter for mutations that confer resistance to isoniazid (31). We inferred rifampin resistance from mutations in the rpoB gene. We also noted putative compensatory mutations in the rpoA and rpoC genes (for rifampin resistance) and the ahpC gene (for isoniazid resistance). We investigated the embA, embB, and embC genes for mutations conferring resistance to ethambutol and screened the pncA gene for mutations associated with resistance to pyrazinamide (31). We investigated the gyrA and gyrB genes for resistance to fluoroquinolones and the rrs gene for resistance to kanamycin, amikacin, and capreomycin. In addition, we screened the eis promoter region for resistance to kanamycin and the tlyA gene for resistance to capreomycin. For ethionamide, we investigated the ethA and inhA genes and the fabG1-inhA promoter; for para-aminosalicylic acid, we investigated the ribD, thyA, thyX, and folC genes (31).
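The drug-to-target assignments listed above can be collected in a simple lookup table, which also makes shared targets (and thus possible cross-resistance) easy to spot. This is a minimal sketch; the names `RESISTANCE_LOCI`, `COMPENSATORY_LOCI`, and `drugs_for_locus` are hypothetical, and the table contains only the loci named in the text, not a complete resistance catalog.

```python
# Loci screened per drug, transcribed from the paragraph above.
RESISTANCE_LOCI = {
    "streptomycin": ["rpsL", "rrs", "gidB"],
    "isoniazid": ["katG", "inhA", "fabG1-inhA promoter"],
    "rifampin": ["rpoB"],
    "ethambutol": ["embA", "embB", "embC"],
    "pyrazinamide": ["pncA"],
    "fluoroquinolones": ["gyrA", "gyrB"],
    "kanamycin": ["rrs", "eis promoter"],
    "amikacin": ["rrs"],
    "capreomycin": ["rrs", "tlyA"],
    "ethionamide": ["ethA", "inhA", "fabG1-inhA promoter"],
    "para-aminosalicylic acid": ["ribD", "thyA", "thyX", "folC"],
}

# Loci checked for putative compensatory mutations.
COMPENSATORY_LOCI = {
    "rifampin": ["rpoA", "rpoC"],
    "isoniazid": ["ahpC"],
}


def drugs_for_locus(locus):
    """Return, in alphabetical order, all drugs for which the given locus is screened."""
    return sorted(drug for drug, loci in RESISTANCE_LOCI.items() if locus in loci)
```

For example, `drugs_for_locus("fabG1-inhA promoter")` returns both isoniazid and ethionamide, reflecting that this locus is screened for both drugs.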

Statistics
We used SPSS version 20.0 (https://www.ibm.com) for all appropriate statistical analyses. We obtained descriptive statistics of the variables, including frequencies and proportions. We analyzed differences between groups by using the χ² or Fisher exact test; p<0.05 denoted statistical significance (32).
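For readers who want to reproduce a group comparison without SPSS, a two-sided Fisher exact test on a 2×2 table can be sketched with the Python standard library. This is not the SPSS implementation; the function name `fisher_exact_two_sided` is hypothetical, and the two-sided p-value is defined here, by assumption, as the sum of the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables at least as extreme as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))
```

The four arguments would be the cell counts of a contingency table, for example clustered versus nonclustered strains in the 2 recruitment periods.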

Ethics Considerations
Scientific and ethics approval for the study was provided by the National Research Ethics Committee, Federal Ministry of Health, Khartoum, Sudan, and by the Institutional Review Board of the Institute of Endemic Diseases, University of Khartoum, Khartoum, Sudan (no. 85-03-09). We obtained written informed consent for participation in the study from participants or, in case of children or illiterate patients, their guardians.

Study Population
Sputum samples were provided by smear-positive patients with TB from 3 areas in eastern Sudan in 2014 (n = 101) and 2016 (n = 282) (Figure 1). Based on hospital records, we included 10%-20% of all patients who received diagnoses of TB during the study period. We collected patient-derived samples from 161 patients (42%) in El-Gadarif, 133 patients (34.7%) in Kassala, and 89 patients (23.3%) in Port Sudan hospitals. Patients who provided samples had a median age of 35 years (interquartile range 25-45 years); most (245/383; 66%) were male. In addition, 81.5% (312/383) were new and 5.5% (21/383) were retreatment TB cases; data on TB treatment history were unavailable for 13.0% (50/383) (Table 1). Comparison of the 2 patient cohorts revealed no significant difference between the proportions of L3 strains (p = 0.068 by Fisher exact test), but the 2014 cohort contained more drug-resistant (p = 0.019 by Fisher exact test) and clustered (p = 0.016 by Fisher exact test) strains (Table 2).

Mycobacterium Isolation and Species Identification
Of all collected specimens, 51.2% (196/383) were culture positive for mycobacteria; LPAs identified most (n = 171, 87.2%) as MTBC (Figure 1). The rest of the specimens were either culture negative or contaminated; we tested them with qPCR and Sanger sequencing for mycobacterial DNA detection and species identification (Figure 1) (14).

pDST and Genotypic DST
To determine resistance levels and related genomic variants, we performed phenotypic DST (pDST) and genomic resistance predictions or genotypic DST (gDST) and compiled detailed data on resistances and resistance-conferring mutations (Tables 3, 4; Appendix 1 Table). Overall, 21
We detected resistance to streptomycin in 19.9% (33/166) of the strains, mediated by mutations in the rpsL (Lys43Arg, Lys88Arg, and Lys88Met), gidB (e.g., Ala138Val), and rrs (514, a/c) genes. All isoniazid-resistant strains (10.2%, 17/166) had a mutation either in katG (Ser315Thr and Ser315Asn), which changes catalase-peroxidase activity, or in the promoter region of the drug target InhA, fabG1-inhA (-15 c/t), which also confers resistance to the second-line drug ethionamide. Resistance to rifampin was found in 10.2% (17/166) of the strains and was mediated by mutations in the rpoB gene (Ser450Leu, His445Tyr, His445Asn, and His445Asp). We found 1 ethambutol-resistant strain (0.6%) with the mutation embB Gln497Arg. However, we also detected 11 additional mutations associated with ethambutol resistance in the embB gene (10 Met306Ile and 1 Met306Val) but with MICs ranging from 1.25 to 5 µg/mL, classifying these strains as phenotypically susceptible based on the recommended critical concentration for ethambutol. With regard to pyrazinamide, we identified 1 strain (0.6%) with the mutation pncA Gln10Arg, coinciding with phenotypic pyrazinamide resistance. A detailed comparison of the pDST and gDST results revealed a high sensitivity and specificity for isoniazid, rifampin, and pyrazinamide resistance prediction by WGS (Table 4). For ethambutol, we determined high-confidence resistance SNPs at codon 306, 406, or 497; however, varying levels of ethambutol MICs in the strains with mutations resulted in a very low positive predictive value.
For streptomycin, we considered the gidB mutations (Phe12Ser, Arg39Pro, Trp45STP, Ser136STP, Iso114Ser, and deletions at positions 4408101, 4408017, and 4408116) to be mutations with an unclear effect. However, these strains eventually tested phenotypically resistant to streptomycin, leading to a reduced sensitivity.
All strains with resistances to >1 first-line antimicrobial drug were phenotypically and genotypically susceptible to ofloxacin, capreomycin, and amikacin. We identified no genotypic resistance marker mediating para-aminosalicylic acid resistance.
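The comparison of gDST with pDST rests on standard diagnostic accuracy measures, which can be sketched as follows. The function name `diagnostic_performance` is hypothetical, and pDST is taken as the reference standard. The ethambutol example in the usage note is reconstructed from the text above under 2 assumptions (each reported embB mutation corresponds to 1 strain, and the remaining 154 strains were wild type and phenotypically susceptible), so it may not match Table 4 exactly.

```python
def diagnostic_performance(pairs):
    """Compute accuracy measures from (gdst_resistant, pdst_resistant) boolean pairs,
    using the phenotypic result as the reference."""
    tp = fp = fn = tn = 0
    for predicted, reference in pairs:
        if predicted and reference:
            tp += 1
        elif predicted and not reference:
            fp += 1
        elif not predicted and reference:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined measures (empty denominator) are reported as None
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Under the stated assumptions, 1 concordant resistant strain, 11 strains with embB mutations but susceptible phenotypes, and 154 concordant susceptible strains give a positive predictive value of 1/12, which illustrates why the value for ethambutol is so low even when sensitivity is high.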
Based on a 12-bp SNP threshold between any 2 strains, 80.0% (12/15) of the MDR strains were clustered or connected (i.e., associated with recent transmission); based on <5 SNPs distance, 60% (9/15) of the MDR strains were clustered (Appendix 1 Figure 4, panel A). Most of the clustered strains at <12 SNPs were isolated from patients in Kassala and grouped in clusters 4 and 29. These strains also shared the same rpsL (Lys43Arg) and the katG (Ser315Thr) mutations but harbored different mutations in the rpoB gene; strains of cluster 4 had the Ser450Leu mutation, whereas strains of cluster 29 exhibited a His445Tyr mutation. This finding points toward a close relationship between the strains of both clusters that likely emerged from a common recent ancestor already being polyresistant to streptomycin and isoniazid (Appendix 1 Figure 4, panel B). Furthermore, all 5 strains of cluster 29 had the embB Met306Ile mutation, but 1 of them also had the mutation embB Gly406Asp. Within cluster 4, two strains acquired the mutation embB Met306Ile independently, and 1 acquired embB Met306Val, as judged by the tree topology (Appendix 1 Figure 4, panel B). Moreover, among all drug-resistant strains, only 1 strain was identified with resistance to pyrazinamide mediated by the mutation pncA Gln10Arg.
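Threshold-based clustering of this kind can be sketched as single-linkage grouping on pairwise SNP distances: 2 strains are connected if their distance is within the threshold, and a cluster is any connected group of at least 2 strains. This is an illustrative sketch, not the software used in the study; the function names, the input format (strain name mapped to its concatenated SNP sequence), and the inclusive threshold are assumptions.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned SNP sequences differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def cluster_by_threshold(alignment, max_snps=12):
    """Group strains by single linkage; alignment maps strain name to an
    equal-length concatenated SNP sequence. Returns clusters of 2 or more strains."""
    parent = {name: name for name in alignment}

    def find(name):
        # Union-find lookup with path compression
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(alignment, 2):
        if snp_distance(alignment[a], alignment[b]) <= max_snps:
            parent[find(a)] = find(b)

    groups = {}
    for name in alignment:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values() if len(group) >= 2)
```

Because linkage is transitive, 2 strains can share a cluster even if their direct distance exceeds the threshold, provided an intermediate strain connects them; rerunning with a stricter threshold, such as 5 SNPs, splits such chains.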

Discussion
By using conventional diagnostics and WGS, we showed that pulmonary TB in eastern Sudan is caused predominantly by L3 strains (Delhi/CAS). Drug resistance and recent transmission were associated with L3 strains, accentuating the key role of L3 strains in TB epidemiology in eastern Sudan. In addition, most
In addition to the general dominance of L3 strains in eastern Sudan, we found that all MDR TB cases were caused by L3 strains. Of major concern is that 10 of 15 MDR strains were part of 2 genetically related clusters isolated mainly from patients treated in 2014 in Kassala hospital. At first glance, this finding suggested nosocomial transmission. However, 2 strains of these clusters were isolated in 2016, including a strain from a patient treated in El-Gadarif hospital who had also acquired resistance to ethambutol and pyrazinamide. This patient was the first in our study cohort infected with a fully first-line drug-resistant strain. This finding emphasizes the importance of adopting focused TB control measures, including rapid detection and effective treatment of patients with MDR TB, to better contain transmission of MDR strains and prevent development of further drug resistances in the region. However, these measures are far from reality because proper TB diagnostics are virtually absent in eastern Sudan; other impeding factors are social stigma, lack of motivation, and poor awareness of TB treatment, with default rates of 14%-57% (38,39). This situation may further aggravate the drug resistance problem through selection of MDR clones with additional drug resistances in failing treatment regimens and further transmission of fully first-line resistant MDR strains (14,39). However, our WGS analysis revealed that MDR isolates did not exhibit mutations mediating resistances to second-line drugs (except for isoniazid/ethionamide cross-resistance), leaving reasonable therapeutic options for patients in eastern Sudan with MDR TB.
Considering the lack of pDST and the technical challenges associated with its implementation in Sudan, introduction of rapid molecular diagnostics to find patients with MDR TB is crucial for timely detection, treatment, and control. Moreover, rapid diagnostics will ultimately strengthen the national TB control program in Sudan. In line with previous studies, our data demonstrate an excellent performance of gDST for molecular resistance prediction (16,17,41). One example of the benefits of molecular assays is the correction of false ethambutol susceptibility results based on pDST in strains that harbor high-confidence embB resistance mutations (42). Previous studies already revealed a low performance of ethambutol pDST, attributable mainly to the small difference between the wild-type and mutant MIC levels: strains with canonical embB mutations show ethambutol MICs around the defined breakpoint of 5.0 µg/mL, which results in low reproducibility of phenotypic results (24,25,42). Therefore, classical Sanger sequencing of the embB codons 306, 354, and 406 was recently proposed to overrule phenotypic ethambutol susceptibility results when mutations are present in these codons (42). Furthermore, Cepheid GeneXpert and Hain MTBDRplus version 2.0 would have recognized all mutations mediating rifampin resistance in our study setting and, therefore, offer a rapid solution for identification of patients with MDR/rifampin-resistant TB in eastern Sudan.
This multisite study was conducted in 3 public hospitals in eastern Sudan, comprising 10%-20% of the TB cases in the region during the study period; it thus represents a snapshot of the population diversity and transmission dynamics of MTBC strains in eastern Sudan. An additional strength of this study is that cultures, DSTs, and WGS were done in a World Health Organization-certified NRC in a high-resource setting in Germany, enabling maximum resolution for characterization of MTBC strains.
This study had at least 2 limitations. First, the prolonged transit time of patient-derived samples from Sudan to the NRC in Germany affected the viability of the MTBC bacteria; therefore, no mycobacterial growth was detected for some samples. Second, the unavailability of clinical data, such as HIV status and treatment outcomes, prevented linking of bacteriological results to these clinical data.
In conclusion, L3 strains play a pivotal role in the epidemiology and transmission of TB, particularly MDR TB, in eastern Sudan. Transmission of MDR TB could possibly be an emerging concern for local TB departments and hospitals. Therefore, to contain MDR TB transmission, rapid molecular diagnostics, such as Cepheid GeneXpert or Hain MTBDRplus v2.0, are desirable in combination with focused tracing of contacts of patients with MDR TB. In addition, early onset of MDR TB therapy would be an ideal approach to reduce the number of secondary cases.