Clonal Dissemination of Antifungal-Resistant Candida haemulonii, China

Candida haemulonii, a relative of C. auris, frequently shows antifungal resistance and is transmissible. However, molecular tools for genotyping and investigating outbreaks are not yet established. We performed genome-based population analysis on 94 C. haemulonii strains, including 58 isolates from China and 36 other published strains. Phylogenetic analysis revealed that C. haemulonii can be divided into 4 clades. Clade 1 comprised strains from China and other global strains; clades 2–4 contained only isolates from China, were more recently evolved, and showed higher antifungal resistance. Four regional epidemic clusters (A, B, C, and D) were identified in China, each comprising ≥5 cases (largest intracluster pairwise single-nucleotide polymorphism differences <50 bp). Cluster A was identified in 2 hospitals located in the same city, suggesting potential intracity transmissions. Cluster D was resistant to 3 classes of antifungals. The emergence of more resistant phylogenetic clades and regional dissemination of antifungal-resistant C. haemulonii warrants further monitoring.

protocols. Library construction and genomic sequencing were performed by Novogene Co.
(Beijing, China) according to the manufacturer's protocols. Sequencing libraries were generated using the NEBNext Ultra DNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA, USA). Each sample was then sequenced using the Illumina NovaSeq6000 platform with 2 × 150-bp reads and 100 × minimum coverage. Raw genome reads are available at the National Centre for Biotechnology Information (NCBI) under the BioProject accession number PRJNA827237.

Accessing publicly available C. haemulonii genomes
We searched the NCBI Sequence Read Archive (SRA) database and acquired all C.
haemulonii genomes available for download till June 15, 2022. All these genomes were in raw reads data format, with average sequencing depth of >50×. Detailed information for these strains could be found in Appendix 2 Table 4.

SNP analysis
Variant calling was performed for all strains of C. haemulonii (3). A threshold of 0.01 with (Phred score of 20) was used for trimming the raw Illumina sequencing reads. Quality control was performed for sequencing data using the fastx_toolkit 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/download.html). Subsequently, paired-end reads were mapped to the C. haemulonii BMU5228 reference genome (GenBank assembly accession number: GCA_011426285.1) using BWA 0.7.17. SNPs were analyzed using SAMtools 0.1.18 (4)(5)(6). The filtering criteria for SNPs were the same as that reported previously (7). Gene prediction and annotation of the reference genome were performed using Augustus 3.3.2 (8) and eggNOG 5.0 (9), respectively.

Phylogenetic and population structure genetic analyses
A phylogenetic tree of whole-genome SNPs was constructed using the SNPs of all strains studied, based on the maximum likelihood method with 1000 ultrafast bootstrap approximation Page 3 of 7 replicates on the IQ-TREE web server (10). We placed the root of the tree to strain B10441 (CBS5149), which was the most ancient C. haemulonii strains identified to date in 1962 (all remaining strains were isolated after 2010). An interactive phylogenetic tree was generated using iTOL v5 (11). Fastbaps v1.0.1 (12) and YMAP V1.0 (13) were used for clade typing and copy number variation analysis. The aligned fasta file was used as input for a principal component analysis (PCA) of genetic covariance using the function fasta2genlight in the R package adegenet 1.3-1 (14). Moreover, the mating types (MTs) of the strains were further determined as previously described (3). In general, for each strain, the coverage depth of their whole genome was calculated using SAMtools from aligned binary alignment map (BAM) file (6).
Subsequently, by mapping to the corresponding reference genomes, normalized average read depth at MT locus was obtained, and the MT of the strains was further determined.

Karyotype analysis
Karyotypes of 13 isolates with notable aneuploidy or gene CNVs were visualized using the YMAP V1.0 software. The scaffold copy number is shown in a log2 ratio relative to that of the haploid B11899 reference strain (GenBank assembly accession: GCA_002926055.1) on the y-axis, with 1 copy at the midline clipped to show a maximum of 2 copies. The x-axis indicates the positions of reads on each scaffold mapped to the genome of strain B11899. A vertical red line is used to show the position of the ERG11 gene (Appendix 1 Figure 2).