Hypervirulent emm59 Clone in Invasive Group A Streptococcus Outbreak, Southwestern United States

The hypervirulent emm59 genotype of invasive group A Streptococcus was identified in northern Arizona in 2015. Eighteen isolates belonging to a genomic cluster grouped most closely with recently identified isolates in New Mexico. The continued transmission of emm59 in the southwestern United States poses a public health concern.

Several cases of invasive group A Streptococcus (GAS) disease were detected in January 2015 in a northern Arizona hospital. A substantial percentage of the cases were associated with a homeless shelter and a local jail; outbreak case-patients were predominantly male and Native American. Other studies have shown an increased risk for invasive GAS infection in Native American/First Nations populations (1,2), and outbreaks within this population in Arizona have been previously documented (3). Whole-genome sequence analysis determined that the hypervirulent subtype emm59 was present among the first cases analyzed in early 2015. emm59 is known to have caused a nationwide outbreak of invasive GAS in Canada during 2006-2009 (4,5), and cases and outbreaks have been reported in the United States (6-8).

The Study
We identified isolates for sequencing from 29 invasive GAS cases diagnosed in patients in a northern Arizona hospital during January-July 2015 and randomly selected an additional 99 GAS isolates from a repository of >2,000 Arizona GAS isolates collected during 2002-2006 (no isolates from patients in Arizona were available for 2007-2014). Four additional isolates from central Arizona identified in 2015 were included in the analysis (online Technical Appendix Table, http://wwwnc.cdc.gov/EID/article/22/1/15-1200-Techapp1.pdf). All isolates were grown on 5% sheep blood tryptic soy agar plates (Hardy Diagnostics, Santa Maria, CA, USA) and incubated at 37°C with 5% CO2. DNA was extracted by using a DNeasy Blood and Tissue Kit (QIAGEN, Valencia, CA, USA) following the manufacturer's protocol. Genomic DNA libraries were prepared by using the Nextera XT library prep kit (Illumina, San Diego, CA, USA) and sequenced with paired-end reads (250 bp) on an Illumina MiSeq instrument, as previously described (9). The finished genome of the emm59 Canadian clone MGAS15252 (GenBank accession no. CP003116) and high-quality publicly available sequence-read data from 44 US isolates, obtained from the NCBI short read archive (BioProject #PRJNA194066), were included in the subsequent phylogenetic analyses. The final core genome (all nucleotide loci found in all genomes) for single-nucleotide polymorphism (SNP) detection was 1,636,024 bp (98.6% of reference).
We used the NASP SNP analysis pipeline (http://tgennorth.github.io/NASP/) for whole-genome SNP typing as previously described (10). SNP matrices were developed for both the whole-species and the emm59-only analyses. We used MEGA version 5.2.2 software (11) to generate maximum parsimony phylogenetic trees. Regions of high SNP density were identified as possible regions of recombination and were further analyzed for their impact on the consistency index. Genomes were assembled by using UGAP (https://github.com/jasonsahl/UGAP). GAS emm subtypes were assigned by using BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi), querying the study genome assemblies against the Centers for Disease Control and Prevention's (CDC) emm type-specific sequence database (http://www.cdc.gov/streplab/m-proteingene-typing.html). We resolved dual emm-type hits by using CDC's emm typing Sanger sequencing primers (http://www.cdc.gov/streplab/protocol-emm-type.html) as a BLAST query and noting hit locations.
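To illustrate how a SNP matrix translates into the pairwise SNP counts used to compare isolates, the following is a minimal sketch in Python. It does not reproduce NASP or MEGA; the isolate names and allele strings are hypothetical toy values, not data from this study, and the sketch assumes every isolate has a call at every position in the matrix.

```python
from itertools import combinations

# Hypothetical toy SNP matrix: one allele string per isolate, one column per
# variable core-genome position. These are NOT data from the study.
snp_matrix = {
    "isolate_A": "ACGTAC",
    "isolate_B": "ACGTAT",
    "isolate_C": "ATGTAT",
    "isolate_D": "GTCAAT",
}


def snp_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length allele strings differ."""
    if len(a) != len(b):
        raise ValueError("allele strings must cover the same positions")
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_distances(matrix: dict) -> dict:
    """Return {(name1, name2): SNP distance} for every unordered pair."""
    return {
        (m, n): snp_distance(matrix[m], matrix[n])
        for m, n in combinations(sorted(matrix), 2)
    }


distances = pairwise_distances(snp_matrix)
for pair, d in sorted(distances.items()):
    print(pair, d)
```

In this toy matrix, isolates A, B, and C differ from one another by 1-2 SNPs, whereas isolate D differs from each of them by 3-5 SNPs, the kind of contrast that separates a tight cluster from a more distant lineage.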
The 18 Arizona emm59 cases occurred during January-July 2015 (Table). An emm59-only phylogenetic analysis demonstrated the apparent presence of multiple lineages of emm59 in the 2015 Arizona isolates (Figure 1). The isolates in a distinct clone comprising 14 of the 18 emm59 isolates were separated from each other by only 0-4 SNPs, genomically supporting the presence of an ongoing outbreak; >8 of these patients were epidemiologically linked by physical contact, cohabitation, or both with 1 other person (data not shown). The remaining emm59 isolates make up additional lineages separated from one another by 8-28 SNPs. No recombination was identified among the Arizona isolates. A relatively large number of SNPs and indels were seen within an approximately 23-kilobase region (Figure 1). This region has been previously reported to contain mutational hotspots associated with virulence (12,13). Considering the presumptive positive selective force on this region, SNPs within the region were not included in the final phylogenetic analysis.
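One simple way to flag a region of unusually high SNP density, such as the approximately 23-kilobase region described above, is to count SNPs in sliding windows along the reference. The sketch below is illustrative only: the window size, step, threshold, genome length, and SNP coordinates are all assumptions chosen for the example and are not the parameters or data used in this study.

```python
def snp_density_windows(positions, genome_length, window=10_000, step=5_000):
    """Return (start, end, count) for each sliding window over the genome.

    positions are 0-based SNP coordinates; each window is half-open [start, end).
    """
    pos = sorted(positions)
    windows = []
    for start in range(0, max(genome_length - window, 0) + 1, step):
        end = start + window
        count = sum(1 for p in pos if start <= p < end)
        windows.append((start, end, count))
    return windows


def dense_windows(windows, min_count):
    """Keep only the windows that contain at least min_count SNPs."""
    return [w for w in windows if w[2] >= min_count]


# Hypothetical SNP coordinates on a toy 50-kb genome (NOT study data).
toy_positions = [1_200, 18_500, 21_000, 21_400, 22_900, 23_100, 24_800, 41_000]

windows = snp_density_windows(toy_positions, genome_length=50_000)
flagged = dense_windows(windows, min_count=4)  # threshold is an assumption
for start, end, count in flagged:
    print(f"{start}-{end}: {count} SNPs")
```

SNPs that fall inside flagged windows could then be masked before tree building, which mirrors the decision here to exclude SNPs from the high-density region in the final phylogenetic analysis.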
When compared with all other publicly available US emm59 isolate genomes, nearly all the genomes identified in the United States were closely related to each other and to the Canadian clone MGAS15252; individual isolate SNP branch lengths ranged from 0 to 10 (Figure 2). The Arizona outbreak isolates were separated from 2 New Mexico isolates by 4 and 5 SNPs, respectively; these isolates fell within the overall Arizona clade and were subsequently included in the Arizona-only phylogenetic analysis (Figure 1). Conversely, the isolate from patient M appears more distant from the larger Arizona population. The Arizona clades, with the exception of that of the isolate from patient M, all appear to arise from the large Minnesota polytomy. The previously estimated mutation rates for GAS of 1.3-2.1 SNPs/year (14,15) further support the Arizona outbreak as being caused by a single clone, likely originating from New Mexico and spreading over 6-12 months.
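The relationship between the published mutation rates and the observed SNP distances can be checked with simple arithmetic. The sketch below is not the authors' calculation; it assumes a strict molecular clock and that two isolates diverged from a common ancestor at the start of the period, so that differences accumulate along both lineages (expected pairwise SNPs = 2 × rate × time).

```python
def expected_pairwise_snps(rate_per_year: float, years: float) -> float:
    """Expected SNP differences between two isolates that diverged `years` ago.

    Assumes a strict clock, with both lineages accumulating SNPs at the same
    rate; this is a simplifying assumption, not a result from the study.
    """
    return 2 * rate_per_year * years


rates = (1.3, 2.1)      # SNPs/year, the range cited in the text (14,15)
durations = (0.5, 1.0)  # years, i.e., the 6-12 months cited in the text

estimates = {
    (rate, years): expected_pairwise_snps(rate, years)
    for rate in rates
    for years in durations
}
for (rate, years), snps in sorted(estimates.items()):
    print(f"rate={rate} SNPs/year, time={years} years: {snps:.1f} expected SNPs")
```

Under these assumptions, the expected pairwise divergence ranges from about 1.3 to 4.2 SNPs, which is comparable to the 0-4 SNPs observed among the isolates in the outbreak clone.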

Conclusions
The emm59 subtype of GAS, the etiologic agent of a substantial nationwide outbreak of invasive GAS in Canada during 2006-2009 (4), is now present in Arizona, causing at least 1 outbreak of epidemiologically and genomically linked cases and several additional epidemiologically unrelated cases.  (7), although no outbreaks were specifically described in these states (Arizona is not included in the ABCs system). Similar to this outbreak study, Olsen et al. (7), in an analysis of 60 Minnesota emm59 isolates from case-patients with identified race, determined that 25 (42%) were from Native Americans; of 5 isolates from New Mexico in that study, 3 were from Native Americans. Given the apparent distal nature of the Arizona/New Mexico isolates to the Minnesota population in our study, it is reasonable to propose an unidentified epidemiologic relationship between these case populations. However, caution must be used in drawing conclusions regarding the relationships of isolates from disparate geographic regions because only limited comparable sequence data from previous emm59 studies in the United States (7) were publicly available for comparison with the Arizona isolates. Epidemiologic investigations, along with healthcare provider and patient education activities, are ongoing in Arizona to further determine the extent of the current outbreak and the associated risk factors and to help mitigate effects and limit or prevent further spread to at-risk populations.
Dr. Engelthaler is an associate professor with the Translational Genomics Research Institute in Flagstaff, AZ. His research interests are in advancing epidemiology and clinical response through applied infectious disease genomics.