Global Escherichia coli Sequence Type 131 Clade with blaCTX-M-27 Gene

Increased extended-spectrum β-lactamase–producing E. coli in Japan resulted mainly from a clade containing this gene.

increase of CTX-M-27-producing ST131 in Japan since 2010 (6), we designed a study to characterize these isolates using WGS techniques.

WGS
We used the Nextera XT DNA sample preparation kit (Illumina, San Diego, CA, USA) to prepare libraries for sequencing. Samples were multiplexed and sequenced on an Illumina MiSeq for 600 cycles (300-bp paired-end) or NextSeq500 for 300 cycles (151-bp paired-end). The ST131 genomes were sequenced at an average depth of 44.03 (SD ± 14.70) and an average coverage of 97.73% (SD ± 0.93%) using the 5,109,767-bp EC958 chromosome as previously described (16).

Core Genome Analysis
We used a core genome single-nucleotide polymorphism (SNP)-based approach to create a phylogenetic tree. We identified SNPs using raw read mapping followed by duplicate read removal, realignment, quality score recalibration, and variant filtering (online Technical Appendix). Reads from 53 isolates sequenced in this study and 4 isolates (S100EC, S107EC, S108EC, and S135EC) (5) were aligned against a reference genome of EC958, and SNPs were called. The remaining 4 draft or complete genomes underwent whole-genome alignment against EC958 to make EC958-like pseudo-chromosomes that contained only SNPs. The SNP-only core genome was identified as the blocks of >500 bp common to all 61 study isolates to ensure that each block represented a common segment from good alignment in each isolate and that the block had enough length to enable identification (5). A maximum-likelihood tree was built using RAxML (19). A recombination-free tree was also built by excluding recombination sites identified using the Bayesian analysis software BRATNextGen (20).
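The core genome definition above (blocks of >500 bp common to all isolates) can be illustrated with a small interval-intersection sketch. This is not the pipeline used in the study; the function names, the half-open coordinate convention, and the toy intervals are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in reference coordinates


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, non-overlapping interval lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def core_blocks(aligned: Dict[str, List[Interval]], min_len: int = 500) -> List[Interval]:
    """Return blocks aligned in every isolate that are longer than min_len bp."""
    lists = list(aligned.values())
    if not lists:
        return []
    common = lists[0]
    for other in lists[1:]:
        common = intersect(common, other)
    return [(s, e) for s, e in common if e - s > min_len]


# Hypothetical aligned regions for three isolates (not study data).
aligned = {
    "isolateA": [(0, 2000), (2500, 4000)],
    "isolateB": [(100, 1900), (2600, 3000)],
    "isolateC": [(0, 4000)],
}
blocks = core_blocks(aligned)
print(blocks)
```

In this toy input, the 400-bp segment shared by all three isolates is discarded because it does not exceed the 500-bp threshold, which mirrors the stated rationale that short blocks are too small to be identified reliably.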

Comparative Genomic Analysis
To determine the presence of genes and their alleles, we used SRST2 with trimmed reads or BLAST+ (executables [http://blast.ncbi.nlm.nih.gov/]) with assembled draft genomes and the following databases or typing schemes: ResFinder antimicrobial resistance gene database, VFDB and VirulenceFinder virulence gene databases, SerotypeFinder O:H typing database, PlasmidFinder plasmid replicon database, MLST (http://mlst.ucc.ie/mlst/dbs/Ecoli), plasmid MLST, fimH typing, gyrA/parC typing, ST131 virotyping, and detection of H30Rx-specific ybbW SNPs, plasmid addiction systems, and blaCTX-M genetic environment (online Technical Appendix). We used pangenome analysis to identify clade-specific segments among draft or complete genomes. BRIG was used to visualize similarity of genomes to ST131 genomic islands (16) and to the ST131 reference plasmid pEC958 (21).

Statistical Analysis and Sequence Data Accession Numbers
We compared categorical variables using the Fisher exact test. A p value <0.05 was considered statistically significant. We conducted our statistical analysis using Stata, version 13.1 (StataCorp, College Station, TX, USA). The sequences were deposited in the DDBJ Sequence Read Archive database (accession nos. DRA004266 and DRA004267).
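For readers unfamiliar with the Fisher exact test named above, the following is a minimal standard-library sketch of a two-sided test on a 2 × 2 table. The study itself used Stata; the function name and the example counts here are hypothetical and are not taken from the study data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: gene present/absent in clade A (8/2) vs. clade B (1/5).
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}; significant at 0.05: {p_value < 0.05}")
```

The small relative tolerance in the comparison guards against floating-point ties when two tables have the same probability.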

Core Genome SNP-based Phylogenetic Tree
We identified a 4,086,650-bp core genome that included 5,280 SNPs by mapping and alignment of the 61 study isolates to EC958 (Figure 1). The ciprofloxacin-resistant isolates with gyrA 1AB and parC 1aAB alleles formed the C/H30R cluster that comprised the C2/H30Rx and C1/H30R clades. Analysis of the core genome showed that 79 segments (i.e., 304,782 bp, including 3,453 SNPs) were associated with recombination sites (online Technical Appendix Figure 1). Thus, recombinant segments contained 65% of the SNPs and had a higher SNP frequency than nonrecombinant regions (average 11 vs. 0.48 SNPs/kb, respectively). The phylogenetic tree created without recombination sites showed the same results as the phylogenetic tree obtained with recombination sites (online Technical Appendix Figure 2). In addition to the core genome-based phylogeny with or without recombination sites, the C1-M27 clade was defined by a unique accessory genome region, M27PP1.
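The proportion and densities quoted in this paragraph follow directly from the reported counts. A short arithmetic check is shown here; the variable names are chosen for illustration, and the figures are those given in the text.

```python
# Counts reported in the text.
core_bp = 4_086_650      # core genome length (bp)
core_snps = 5_280        # SNPs in the core genome
recomb_bp = 304_782      # total length of the 79 recombinant segments (bp)
recomb_snps = 3_453      # SNPs located in recombinant segments

# Share of all core SNPs that fall in recombinant segments.
recomb_share = recomb_snps / core_snps

# SNP density (SNPs per kb) inside and outside recombinant segments.
recomb_density = recomb_snps / (recomb_bp / 1000)
nonrecomb_density = (core_snps - recomb_snps) / ((core_bp - recomb_bp) / 1000)

print(f"{recomb_share:.1%} of SNPs in recombinant segments")
print(f"{recomb_density:.1f} vs. {nonrecomb_density:.2f} SNPs/kb")
```

Recombinant segments make up about 7.5% of the core genome by length yet carry roughly two thirds of the SNPs, which is why a recombination-free tree is a useful check on the phylogeny.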

The C1-M27 Clade-Specific Region
The pangenome analysis of genomes from all the isolates identified an 11,894-bp region named M27PP1 that was specific to all the isolates from the C1-M27 clade. Further analysis using the BLAST database and Sanger sequencing for gap filling showed that this region was identical to a prophage-like genomic island (GenBank accession no. CP006632) in E. coli PCN033 that belonged to phylogenetic group D and was isolated from a pig in China. The BLAST database also identified 2 similar sequences (i.e., 99.9% homology): a CMY-2-containing plasmid pEQ011 (GenBank accession no. NC_023315) in an E. coli isolate from a horse in Ireland (22) and a multidrug-resistant plasmid pSD853_88 (GenBank accession no. JF267652) found in a bovine Salmonella enterica isolate in the United States. M27PP1 was inserted into the chromosome, creating a 7-bp direct repeat region (Figure 2). PCN033 had the same flanking structure as M27PP1, whereas the 2 plasmids (pEQ011 and pSD853_88) contained only a 44-bp similar segment at the 5′ side, and other parts of these plasmids were not found in the C1-M27 clade isolates.
Two E. coli ST131 C1-M27 isolates (i.e., KUN5781 and Ec 24) had an additional insertion region of 19,352 bp, named M27PP2, situated upstream of M27PP1. M27PP2 was accommodated within the same 7-bp direct repeat region (Figure 2). M27PP2 included a 15,555-bp region that showed 88.9% homology to a prophage-like sequence in the chromosome of the γ proteobacterium HdN1 (GenBank accession no. FP929140) and 99.8% homology to the insertion element ISSen4.
Figure 1. This maximum-likelihood phylogram is based on a 4,086,650-bp core genome and a total of 5,280 SNPs. The tree is rooted by using the outgroup H22 isolates, and asterisks indicate bootstrap support >90% from 100 replicates. Strains that had previously been sequenced are in italics. The Country columns indicate places of isolation: Ja to Jw, Japan (a to w indicates hospitals); AU, Australia; CA, Canada; FR, France; NZ, New Zealand; SA, South Africa; TH, Thailand; UK, United Kingdom; US, United States; VI, Vietnam. The Environment column shows the type of genetic environment of ESBL genes (online Technical Appendix Table 2, http://wwwnc.cdc.gov/EID/article/22/11/16-0519-Techapp1.pdf). FQ columns indicate ciprofloxacin susceptibilities (S, susceptible; R, resistant). KSEC7 had a parC 1aE allele including a G250A (S80K) mutation in addition to a 1a allele. The ciprofloxacin-resistant C/H30R cluster comprised the C2/H30Rx and C1/H30R clades. All of the H30Rx isolates belonged to the C2/H30Rx clade. The C1/H30R clade included CTX-M-14-producing H30R, non-ESBL-producing H30R, and CTX-M-27-producing H30R isolates. CTX-M-27-producing isolates belonged to the C1-M27 clade within the C1/H30R clade except 2 isolates (S100EC and EC# 584). The bootstrap value for the root of the C1-M27 clade was 64%. An average of 68 SNPs was found among the C1-M27 clade, whereas an average of 158 SNPs was found between the C1-M27 clade and 2 non-C1-M27 clade isolates with blaCTX-M-27. Scale bar indicates 100 SNPs.
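The 7-bp direct repeat that brackets the inserted region can be checked programmatically once the insertion boundaries are known. The sketch below is an illustration under stated assumptions, not the method used in the study: the function name is hypothetical, coordinates are 0-based and half-open, the repeat copies are assumed to lie immediately outside the given boundaries, and the toy sequence is invented (only the repeat motif CCGTTCT is taken from the Figure 2 legend).

```python
from typing import Optional


def flanking_direct_repeat(seq: str, ins_start: int, ins_end: int, k: int = 7) -> Optional[str]:
    """Return the k-bp direct repeat flanking seq[ins_start:ins_end], or None.

    Assumes one repeat copy ends at ins_start and the other begins at ins_end.
    """
    if ins_start < k or ins_end + k > len(seq):
        return None  # not enough flanking sequence to compare
    left = seq[ins_start - k:ins_start]
    right = seq[ins_end:ins_end + k]
    return left if left == right else None


# Toy chromosome: flank + repeat + inserted element + repeat + flank.
repeat = "CCGTTCT"
inserted = "GGGGGGGGGG"  # stands in for the inserted region
toy = "AAAA" + repeat + inserted + repeat + "TTTT"
start = len("AAAA") + len(repeat)
end = start + len(inserted)

found = flanking_direct_repeat(toy, start, end)
print(found)
```

An exact-match comparison is the simplest choice; real draft assemblies may need tolerance for mismatches or for boundaries that are off by a few bases.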

Genomic Comparison of the ST131 Genomic Islands and Virulence Genes
The sequences of the study isolates were similar to the ST131 genomic islands in EC958 and JJ1886 (a CTX-M-15-producing C2/H30Rx strain obtained in the United States from a patient with fatal urosepsis) (Figure 3) (17). The C1-M27 clade isolates lacked the prophage 1 region present in EC958 (Figure 3). This prophage 1 region, specific for ST131, was present among the non-C1-M27 ST131 isolates in this study, except for BRG23 and EcSA01. The presence of ExPEC-associated virulence genes is shown in online Technical Appendix Figure 3. The senB enterotoxin gene was more common in C1/H30R than in C2/H30Rx. No significant differences existed in the distribution of virulence genes between E. coli ST131 C1-M27 and other isolates.

Discussion
A previously unreported clade named C1-M27 within the C1/H30R clade is responsible for the epidemic of ESBL-producing ExPEC in Japan and has already been disseminated to 5 countries on 3 continents. ST131 containing blaCTX-M-27 responsible for human infections has been reported from various continents (2) and is especially common among ESBL-producing ExPEC in certain countries, such as Israel, the Czech Republic, and Switzerland (2,13,14). CTX-M-27-producing ST131 also is present among nonclinical and nonhuman E. coli isolates, including fecal specimens of healthy children attending day care centers in France; fecal specimens of healthy adults in China, Portugal, and the Netherlands; samples from sick dogs and cats in Japan; samples from water birds from central Europe and Swiss rivers and lakes; and samples of well water from China (2,10,11,15,23–25). The most common ESBL among E. coli ST131 in nonhuman samples is CTX-M-27 (2,23–25). ST131 with blaCTX-M-15 is rare among animal and environmental E. coli isolates (26). Our analysis of IEH71520, an E. coli isolate from vacuum cleaner dust in the United States (15), showed that this ST131 isolate belongs to the C1-M27 clade. The C1-M27 clade is likely to be present among animal and environmental ST131, and such isolates might act as a hidden reservoir for the introduction of ST131 containing blaCTX-M-27 into human medicine.
Figure 2. Genetic environments of the C1-M27 clade-specific region of Escherichia coli. All isolates other than the C1-M27 clade isolates had the type A structure in their chromosome (red arrows; gene locus tags shown at the bottom are annotated according to EC958). The C1-M27 clade isolates except 2 isolates (KUN5781 and Ec 24) had the type B structure. An 11,894-bp region (M27PP1; predicted genes shown in light blue arrows) is inserted into the type A structure, creating the 7-bp direct repeat (CCGTTCT; yellow triangle). The inserted sequence M27PP1 is identical to a prophage-like genomic island in the E. coli PCN033 chromosome (GenBank accession no. CP006632), which had a similar flanking structure (structure C, 98.8% similarity). M27PP1 included phage-like integrase and recombinase. The identical M27PP1 sequence was found in all of the C1-M27 isolates with the use of additional Sanger sequencing. Only the draft genome of IEH71520 had 98.7% coverage to the M27PP1 sequence because of contig discontinuity. KUN5781 and Ec 24 had the type D structure, in which an additional 19,352-bp region (M27PP2) is inserted into the type B structure by creating the same 7-bp direct repeat (yellow triangle). M27PP2 includes a 15,555-bp region (genes shown in orange arrows), which was 88.9% similar to a prophage-like region in the γ proteobacterium HdN1 chromosome (GenBank accession no. FP929140), and a following 1,221-bp region that is 99.8% similar to ISSen4 (purple box). Code to gene locus tags: 1, 958RS23365; 2, 958RS23370; 3, HDN1F03950; 4, HDN1F03970; 5, HDN1F04000; 6, HDN1F04010; 7, HDN1F04020; 8, HDN1F04030; 9, HDN1F04040; 10, ISSen4; 11, 033RS22420; 12, 033RS22425; 13, 033RS22430; 14, 033RS22440; 15, 033RS22450; 16, 958RS23380.
E. coli ST131 C1-M27 had an additional, unique prophage-like region (M27PP1) within its chromosome, lacked the prophage 1 genomic island previously identified in ST131 C2/H30Rx, and was negative for the transposon Tn2 containing blaTEM-1 (Figure 3; online Technical Appendix Figure 5). The direct flanking repeat sequences surrounding M27PP1 suggest that this region was introduced into E. coli ST131 C1/H30R with blaCTX-M-27 by a recombination event that was then followed by the clonal expansion of the C1-M27 clade.
Recent studies focusing on the evolutionary history of ST131 suggested that the C1/H30R and C2/H30Rx clades emerged ≈30 years ago, after their acquisition of gyrA-1AB and parC-1aAB alleles from the C0/H30 (non-R) clade (27,28). The phylogeny and smaller numbers of SNPs in the C1-M27 clade (Figure 1) suggest that this clade diverged recently from C1/H30R. In the time-scaled phylogeny presented by Stoesser et al. (27), a cluster that included 6 CTX-M-27-producing isolates from Cambodia, Thailand, and Laos in 2007-2011 was present within the C1/H30R clade. This cluster, presumed to be the C1-M27 clade, diverged in the early 2000s, supporting our hypothesis.
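The argument from smaller SNP numbers rests on comparing pairwise SNP distances within a clade with distances between that clade and other isolates. A minimal sketch of that comparison follows; the function names and the toy SNP strings are hypothetical, and this is not the procedure used to derive the values reported for Figure 1.

```python
from itertools import combinations, product
from statistics import mean
from typing import List


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length SNP strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_within(seqs: List[str]) -> float:
    """Mean pairwise SNP distance among members of one group."""
    return mean(snp_distance(a, b) for a, b in combinations(seqs, 2))


def mean_between(group1: List[str], group2: List[str]) -> float:
    """Mean SNP distance over all pairs drawn from two different groups."""
    return mean(snp_distance(a, b) for a, b in product(group1, group2))


# Toy SNP-only alignments (one character per variable site; not study data).
clade = ["ACGTAC", "ACGTAT", "ACGTAC"]
others = ["TCGAAC", "TCGAGC"]

within = mean_within(clade)
between = mean_between(clade, others)
print(f"within-clade mean: {within:.2f}; between-group mean: {between:.2f}")
```

A within-clade mean that is much smaller than the between-group mean is consistent with recent divergence from a common ancestor, although dating the divergence requires a time-scaled analysis such as the one cited above.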
CTX-M-27-producing ST131 that belongs to the H41 lineage previously had been described from Japan (6) and China (15). The characterization of the Japanese ST131 H41 isolates showed different genetic structures flanking blaCTX-M-27 from those structures present in E. coli ST131 H30R (6). The flanking regions previously characterized in ST131 H41 were identical to the flanking regions in ST131 non-C1-M27 from this study. There appear to be 2 types of structures flanking blaCTX-M-27 among E. coli ST131: 1 type is confined to clade C1-M27 (i.e., 208 bp of ΔISEcp1 upstream and ΔIS903D downstream), whereas another structure (i.e., 388 bp of ΔISEcp1 upstream and full IS903D downstream) is distributed among non-C1-M27 isolates, including ST131 H41 (6). Therefore, ST131 H41, through horizontal transfer of blaCTX-M-27, is unlikely to have played a substantial role in the emergence of the C1-M27 clade.
Figure 3. Rings drawn by BRIG show the presence of the regions of interest in each genome. Colored segments indicate >90% similarity and gray segments indicate >70% similarity by BLAST comparison between the regions of interest and each genome. Extended-spectrum β-lactamase types are indicated in parentheses in the Type column. The regions from Flag2 to GI-leuX were found in EC958, the prophage 8 region was found in JJ1886, and M27PP1 and M27PP2 were found as the C1-M27 clade-specific regions in this study. Prophage 6, capsule, GI-selC, and prophage 8 regions were present in some C2/H30Rx isolates but were absent in C1/H30R isolates.
Our study has several limitations. Most isolates originated from Japan. However, we were able to include ST131 C1-M27 isolates from 5 countries on 3 continents and C1/H30R isolates producing CTX-M-14 or CTX-M-15 from 6 countries on 4 continents. Another limitation was that we were able to obtain only 1 environmental ST131 isolate with blaCTX-M-27 (IEH71520). Future studies that include environmental isolates will provide additional insights into the molecular epidemiology and evolutionary history of the C1-M27 clade. We could not analyze the plasmid contents of blaCTX-M-27 because blaCTX-M-27-containing contigs were too short. The sequencing of plasmids that contain blaCTX-M-27 obtained from various ST131 clades (including the C1-M27 clade) should also be undertaken.
In conclusion, we showed that the recent increase in ESBL-producing E. coli from Japan resulted from emergence of an ST131 C1/H30R subclade with blaCTX-M-27. This clade, named C1-M27, had unique genomic characteristics and was present in ST131 from Thailand, Australia, Canada, and the United States. Our findings suggest that the C1-M27 clade is contributing to the global success of ST131. E. coli ST131 C1-M27 poses a major new public health threat because of its global distribution and association with the dominant C/H30 lineage. We urgently need rapid, cost-effective detection methods for E. coli ST131 C1-M27 and well-designed epidemiologic and molecular studies to understand the dynamics of transmission, risk factors, and reservoirs for ST131 C1-M27. These efforts will provide insight into the emergence and spread of this multidrug-resistant clade and yield information essential for preventing the spread of ST131.