Rapid Epidemic Expansion of Chikungunya Virus East/Central/South African Lineage, Paraguay

The spread of chikungunya virus is a major public health concern in the Americas. There were >120,000 cases and 51 deaths in 2023, of which 46 deaths occurred in Paraguay. Using a suite of genomic, phylodynamic, and epidemiologic techniques, we characterized the ongoing large chikungunya epidemic in Paraguay.

Sequences were aligned using MAFFT (6) and edited using AliView (7). These datasets were assessed for the presence of phylogenetic signal by applying the likelihood mapping analysis implemented in the IQ-TREE2 software (8). A maximum likelihood phylogeny was reconstructed using IQ-TREE2 under the HKY + G4 substitution model (8). We inferred time-scaled trees using TreeTime (8).
The presence of a temporal signal was evaluated in TempEst (9), and time-scaled phylogenetic trees were inferred using the BEAST package (10). We used a stringent model selection analysis with path-sampling (PS) and stepping-stone (SS) procedures to identify the most appropriate molecular clock model for the Bayesian phylogenetic analysis (11). Based on the estimated marginal likelihoods, we chose the uncorrelated relaxed molecular clock model for all datasets, together with the codon-based SRD06 model of nucleotide substitution and the nonparametric Bayesian skyline coalescent model.
To model the phylogenetic diffusion of the detected 2022-2023 transmission clade, we used a flexible relaxed random walk diffusion model (12,13) that accommodates branch-specific variation in rates of dispersal, with a Cauchy distribution and a jitter window size of 0.01 (14,15). Latitude and longitude coordinates were assigned to each sequence. MCMC analyses were performed in BEAST v1.10.4, running in duplicate for 50 million iterations and sampling every 10,000 steps in the chain. Convergence for each run was assessed in Tracer (effective sample size for all relevant model parameters >200). Maximum clade credibility (MCC) trees for each run were summarized using TreeAnnotator after discarding the initial 10% as burn-in. Finally, we used the R package 'seraphim' version 1.0 (15) to extract and map spatiotemporal information embedded in the posterior trees.
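For a sense of scale, the chain settings described above determine how many posterior samples each run contributes. The following minimal Python sketch does that arithmetic. The function name `retained_samples` is hypothetical, and the count ignores the initial state that a sampler may also log.

```python
def retained_samples(chain_length, sample_every, burnin_percent):
    """Return (logged, discarded, retained) sample counts for one MCMC run.

    Assumes one logged state per `sample_every` steps and a burn-in
    expressed as a whole-number percentage of the logged states.
    """
    logged = chain_length // sample_every
    discarded = logged * burnin_percent // 100
    return logged, discarded, logged - discarded


# Settings reported in the text: 50 million steps, sampled every 10,000,
# with the initial 10% discarded as burn-in.
print(retained_samples(50_000_000, 10_000, 10))  # (5000, 500, 4500)
```

Under these assumptions each run logs 5,000 states, of which 4,500 remain after burn-in.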

Epidemiologic Data
Epidemiologic data on weekly fatal, notified, and laboratory-confirmed CHIKV cases in Paraguay from 2013 to 2023 (Figure 1) were obtained and curated from the PAHO data repository for chikungunya (16). Confirmed infections are defined as suspected or probable chikungunya cases with a positive chikungunya test result (as stated on the PAHO platform).
Epidemiologic data (Appendix 1 Figure 1) were provided by Dirección General de Vigilancia de la Salud del Ministerio de Paraguay (DGVS), including suspected, probable, and confirmed CHIKV infections between 2015 and 2023 (17). Suspected infections are defined as any person with sudden onset of fever and arthralgia, or disabling arthritis of sudden onset, not explained by another medical condition. Probable infections are defined as any suspected case with a positive laboratory result for CHIKV (IgM ELISA) or any suspected case of CHIKV with an epidemiologic link to a confirmed case. Confirmed infections are any suspected or probable case of CHIKV confirmed by real-time RT-PCR or viral isolation. When epidemiologic data are presented, we aggregate suspected and probable infections into a single suspected category.
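The DGVS case definitions above form a simple hierarchy, which can be sketched in Python as follows. The `CaseRecord` fields and the function names are hypothetical and are not taken from the DGVS or PAHO systems. The sketch also assumes that confirmation requires a positive RT-PCR result or a successful viral isolation.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    # Hypothetical fields, one per criterion named in the case definitions.
    meets_clinical_definition: bool      # sudden fever with arthralgia or disabling arthritis
    igm_elisa_positive: bool = False
    epi_link_to_confirmed: bool = False
    rt_pcr_positive: bool = False
    virus_isolated: bool = False


def classify(case):
    """Return the most specific case class that the record satisfies."""
    if not case.meets_clinical_definition:
        return "not a case"
    if case.rt_pcr_positive or case.virus_isolated:
        return "confirmed"
    if case.igm_elisa_positive or case.epi_link_to_confirmed:
        return "probable"
    return "suspected"


def reporting_category(case):
    """Merge probable into suspected, as done when the data are presented."""
    label = classify(case)
    return "suspected" if label == "probable" else label
```

For example, a clinically compatible case with only a positive IgM ELISA is classified as probable by `classify` and reported as suspected by `reporting_category`.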

Sample Metadata
Samples were selected for sequencing based on cycle threshold (Ct) value (≤35) and availability of epidemiologic metadata, such as date of symptom onset, date of sample collection, sex, age, municipality of residence, symptoms, and disease classification (Appendix 2). Patients were classified based on their clinical outcomes: outpatient, inpatient, intensive care unit (ICU), and fatal cases.
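The selection rule above can be sketched as a simple filter. This is an illustration only: the dictionary keys, the `REQUIRED_FIELDS` tuple, and the function name `eligible` are hypothetical, and the actual list of required metadata fields may differ from the subset assumed here.

```python
# Hypothetical subset of the metadata fields named in the text.
REQUIRED_FIELDS = (
    "symptom_onset_date",
    "collection_date",
    "sex",
    "age",
    "municipality",
)


def eligible(sample, ct_max=35.0):
    """Return True if a sample has Ct <= ct_max and all required metadata."""
    ct = sample.get("ct")
    if ct is None or ct > ct_max:
        return False
    # A field counts as available if it is present and not empty.
    return all(sample.get(field) not in (None, "") for field in REQUIRED_FIELDS)
```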

Temperature Data
Monthly temperature data for Paraguay were extracted from Copernicus.eu satellite climate data (18). We summarized the temperature data by calculating the minimum, mean, and maximum per year.
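A minimal Python sketch of this yearly summary follows. It assumes the monthly series is held as a mapping from (year, month) to temperature and that the summary statistics are taken over the monthly values within each year. The function name `summarize_by_year` and the example values are hypothetical and are not Copernicus data.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_year(monthly):
    """Map each year to the min, mean, and max of its monthly temperatures."""
    by_year = defaultdict(list)
    for (year, _month), temperature in monthly.items():
        by_year[year].append(temperature)
    return {
        year: {"min": min(values), "mean": mean(values), "max": max(values)}
        for year, values in sorted(by_year.items())
    }


# Made-up monthly values in degrees Celsius, for illustration only.
example = {
    (2022, 1): 28.0,
    (2022, 2): 26.0,
    (2022, 7): 18.0,
    (2023, 1): 29.0,
    (2023, 7): 19.0,
}
summary = summarize_by_year(example)
```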

Generalized Additive Model of Sample Sequence Coverage versus Ct
We consider the sequencing coverage of each sample (between 0 and 1) as the probability that a genome site is successfully sequenced. For this, we augmented the dataset by counting the number of successful and unsuccessful events (sequencing of sites) per sample, to which we fit a binomial-based Generalized Additive Model (GAM). The GAM was implemented using R v3.6.3 and the package mgcv v1.38.1 (19,20). We included random effects for the clinical/infection outcome associated with each sample (outcomes) and for each sample independently (ID). The following code snippet summarizes this approach: