Volume 9, Number 11—November 2003
Genetic Variation among Temporally and Geographically Distinct West Nile Virus Isolates, United States, 2001, 2002
Analysis of partial nucleotide sequences of 22 West Nile virus (WNV) isolates collected during the summer and fall of 2001 and 2002 indicated genetic variation among strains circulating in geographically distinct regions of the United States and continued divergence from isolates collected in the northeastern United States during 1999 and 2000. Sequence analysis of a 2,004-nucleotide region showed that 14 isolates shared two nucleotide mutations and one amino acid substitution when they were compared with the prototype WN-NY99 strain, with 10 of these isolates sharing an additional nucleotide mutation. In comparison, isolates collected from coastal regions of southeast Texas shared the following differences from WN-NY99: five nucleotide mutations and one amino acid substitution. The maximum nucleotide divergence of the 22 isolates from WN-NY99 was 0.35% (mean = 0.18%). These results show the geographic clustering of genetically similar WNV isolates and the possible emergence of a dominant variant circulating across much of the United States during 2002.
West Nile virus (WNV) is a member of the genus Flavivirus (family Flaviviridae) and belongs to the Japanese encephalitis virus serocomplex. Until 1999, the geographic distribution of the virus was limited to Africa, the Middle East, India, and western and central Asia with occasional epidemics in Europe (1,2). By December 2002, however, the distribution of the virus had expanded to include 44 states of the continental United States and southern regions of 5 Canadian provinces from Saskatchewan to Nova Scotia (3). Over the course of 3 years, the virus has traversed North America, presumably from New York City, where it was first isolated during the summer of 1999 (4–7). Partial nucleotide and complete genome sequence analysis of several WNV strains isolated in the northeastern United States during 1999 and 2000 showed that these isolates were most closely related to a WNV strain isolated from the brain of a dead goose in Israel in 1998 (6,8,9). The subsequent establishment of WNV across the eastern and midwestern regions of North America from 1999 through 2001 set the stage for the rapid and widespread movement of the virus across the remainder of the continent during the summer of 2002, resulting in the highest number of annual case reports and deaths attributed to WNV in humans, equines, and birds documented since the discovery of the virus in North America. Surveillance programs initiated by public health agencies, research institutions, and diagnostic laboratories have resulted in the collection of hundreds of WNV isolates across the United States and Canada from various sources, including mosquitoes, humans, equines, birds, and a number of other vertebrate species (3).
Phylogenetic comparisons of partial and complete nucleotide sequences from isolates collected in the northeastern United States during 1999 and 2000 demonstrated a high degree of genetic similarity to the prototype New York strain, WN-NY99 (GenBank accession no. AF196835), with nucleotide identities of >99.8% and amino acid identities of >99.9% (9–12). Although these studies have confirmed that northeastern isolates collected in 1999 and 2000 showed limited genetic divergence from WN-NY99, to date little published information has described the continuing divergence of WNV as its temporal and spatial distribution have expanded (13). To assess the extent to which WNV has evolved since its introduction in North America, we analyzed the partial nucleotide and deduced amino acid sequences of WNV isolates collected during the summer and fall of 2001 and 2002 and compared them to a homologous sequence region of WN-NY99. Collaborations between the University of Texas Medical Branch (UTMB) and a number of U.S. public health agencies have allowed 22 isolates of WNV to be collected, representing several geographically distinct U.S. regions. Phylogenetic comparisons of a 2,004-nucleotide region encoding the entire premembrane and envelope proteins (prM-E) of each isolate have shown the most divergent variants of WNV in North America to date and provide evidence of the possible emergence of a dominant variant circulating in many regions of the United States. Furthermore, our results indicate geographic clustering of distinct variants within and between states and reinforce previous evidence supporting the likelihood of multiple introductions of virus into the state of Texas (13).
Collection and Virus Isolation
Isolates were collected from five states: Illinois, Alabama, Louisiana, Colorado, and Texas. Isolates from Texas were collected from nine counties representing regions across the entire state (Figure 1). All isolates were collected from September 2001 to October 2002. After being confirmed WNV-positive by state public health laboratories, virus or tissues were sent to UTMB for submission into the World Arbovirus Reference Collection. Each sample was given one passage in Vero cells to derive viruses for use in these studies. Virus samples represented a variety of sources, including mosquito pools, bird brain, human cerebrospinal fluid (CSF), and a dog kidney. Of the 18 isolates sequenced in this study (Tables 1 and 2), 11 were isolated from mosquito pools by the Texas Department of Health (TDH); 2 from a mosquito pool and dog kidney homogenate by the Illinois Natural History Survey (INHS); 2 from passerine brain homogenates from the University of Alabama at Birmingham; 1 from a red-tailed hawk brain homogenate by the Centers for Disease Control and Prevention, Division of Vector-Borne Infectious Diseases (CDC-DVBID), Fort Collins, Colorado; 1 from a mosquito pool in Louisiana, courtesy of CDC-DVBID; and 1 from the CSF of a patient who died of West Nile encephalitis at UTMB.
RNA Extraction, Reverse Transcription, and Polymerase Chain Reaction
Viral RNA was extracted directly from 140 µL of infected Vero or BHK cell culture supernatants by using the QIAamp viral RNA extraction kit (Qiagen, Valencia, CA). Reverse transcription (RT) was performed in a 50-µL volume containing 5 µL of viral RNA, 1 µL of random hexamer primer, 10 µL of 5X RT buffer, 4 µL of 10 mM dNTPs, 0.4 µL of cloned RNase inhibitor, 0.5 µL of Moloney murine leukemia virus (MMLV) reverse transcriptase, and 29.1 µL of high-performance liquid chromatography (HPLC) water. Polymerase chain reaction (PCR) was performed in a 25-µL volume containing 2.0 µL cDNA template from RT, 1.0 µL forward primer, 1.0 µL reverse primer, 2.5 µL 10X PCR buffer, 0.5 µL 10 mM dNTPs, 0.5 µL of 1 U/µL Taq polymerase, and 17.5 µL of HPLC water. Three previously described primer pairs were used to amplify the entire prM-E genes of WNV (13). PCR products were gel-purified by using the QIAquick kit (Qiagen), according to the manufacturer’s protocol, and the resulting template was directly sequenced by using the amplifying primers. The WN1751/WN2504A PCR product derived from WNV isolate Galveston County, TX-3 was cloned into pGEM-T Easy (Promega Corporation, Madison, WI), and 10 clones were sequenced to determine the degree of nucleotide sequence divergence within a single isolate collected from the southeast coast of Texas. Sequencing reactions were performed in the UTMB Biomolecular Resource Facility’s DNA sequencing laboratory by previously described methods (13). Analysis and assembly of sequencing data were performed by using the Vector NTI Suite software package (Informax, Frederick, MD). Nucleotide and deduced amino acid sequences of the entire prM-E genes from each isolate were aligned by using the AlignX program in the Vector NTI Suite and compared with previously published sequences of isolates from southeast Texas collected from June to August of 2002 (13).
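As a quick consistency check on the reaction mixes described above, the component volumes can be summed and compared with the stated final volumes. The following is a minimal sketch, not part of the original protocol; the dictionary names and helper functions are hypothetical, and volumes are entered as plain numbers in whatever unit the protocol uses.

```python
import math

# Component volumes copied from the RT and PCR descriptions in the text.
RT_MIX = {
    "viral RNA": 5.0,
    "random hexamer primer": 1.0,
    "5X RT buffer": 10.0,
    "10 mM dNTPs": 4.0,
    "RNase inhibitor": 0.4,
    "MMLV reverse transcriptase": 0.5,
    "HPLC water": 29.1,
}
PCR_MIX = {
    "cDNA template": 2.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "10X PCR buffer": 2.5,
    "10 mM dNTPs": 0.5,
    "Taq": 0.5,
    "HPLC water": 17.5,
}


def total_volume(mix):
    """Sum all component volumes of a reaction mix."""
    return math.fsum(mix.values())


def water_to_add(mix, final_volume):
    """Volume of water needed to bring the non-water components to final_volume."""
    reagents = math.fsum(v for name, v in mix.items() if name != "HPLC water")
    return final_volume - reagents


if __name__ == "__main__":
    print("RT total:", round(total_volume(RT_MIX), 2))
    print("PCR total:", round(total_volume(PCR_MIX), 2))
```

Both mixes add up to the stated reaction volumes (50 and 25, respectively), so the water volumes given in the text are consistent with the other components.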
All isolates were then compared with isolates collected in the northeastern United States during 1999, 2000, and 2001, and a phylogenetic tree was constructed by maximum parsimony algorithm by using PAUP (Version 4.0b10) (Sinauer Associates, Sunderland, MA) to show genetic relationships of these isolates with other North American WNV isolates found in GenBank, in which the homologous 2,004-nucleotide region had been sequenced.
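To illustrate what a maximum parsimony criterion measures, the sketch below scores candidate trees with Fitch's small-parsimony algorithm: for each alignment column it counts the minimum number of nucleotide changes a given tree requires, and the tree with the lowest total is preferred. This is only an illustration of the scoring principle; it does not reproduce PAUP's tree search, and the four taxa, their five-nucleotide sequences, and both tree topologies are invented toy data.

```python
def fitch_site_changes(tree, states):
    """Minimum number of changes for one alignment column on a rooted binary tree.

    tree   -- nested 2-tuples with taxon names (str) at the leaves
    states -- dict mapping taxon name to its nucleotide at this column
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: state set is the observed base
            return {states[node]}
        left, right = node
        a, b = visit(left), visit(right)
        common = a & b
        if common:                         # children agree on at least one base
            return common
        changes += 1                       # disagreement costs one change
        return a | b

    visit(tree)
    return changes


def parsimony_length(tree, alignment):
    """Total tree length: Fitch changes summed over every alignment column."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site_changes(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Toy data (hypothetical, not sequences from this study).
alignment = {"ref": "ACGUA", "iso1": "ACGCA", "iso2": "ACGCG", "iso3": "GCGUA"}
tree_a = (("ref", "iso3"), ("iso1", "iso2"))
tree_b = (("ref", "iso1"), ("iso3", "iso2"))

if __name__ == "__main__":
    for label, tree in (("tree_a", tree_a), ("tree_b", tree_b)):
        print(label, parsimony_length(tree, alignment))
```

On this toy alignment, tree_a requires 3 changes and tree_b requires 4, so tree_a would be the more parsimonious grouping.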
Nucleotide sequences representing a 2,004-nucleotide region of the complete prM-E genes of WNV (nucleotides 466–2,469) of the 18 isolates collected in 2001 and 2002 (GenBank accession nos. AY428514–AY428531), plus 4 southeast Texas strains (13), were compared with a homologous sequence region of the prototype WNV, WN-NY99 (Table 1). Of the 22 isolates analyzed, 16 were collected from 10 different Texas counties, and 2 each from Illinois and Alabama, plus 1 each from Colorado and Louisiana. All isolates were from 2002, except 2 that came from Alabama in 2001 (Figure 1). Sequence alignments comparing WN-NY99 with individual 2001 and 2002 isolates showed up to seven nucleotide mutations and three amino acid substitutions among the 22 isolates analyzed (Table 1). Nucleotide mutations occurred at 33 positions (9 in prM, 24 in E) with a total of 7 amino acid substitutions (2 in prM, 5 in E). The maximum nucleotide divergence of the 22 isolates from WN-NY99 was 0.35%, with an average nucleotide divergence of 0.18%.
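The divergence figures follow directly from the number of mismatches over the aligned region. The sketch below, with hypothetical function names, shows the arithmetic: the region spanning nucleotides 466 to 2,469 is 2,004 nucleotides long, and an isolate with seven nucleotide differences diverges by 7/2,004, or about 0.35%.

```python
REGION_START = 466
REGION_END = 2469
REGION_LENGTH = REGION_END - REGION_START + 1  # 2,004 nucleotides


def count_differences(reference, query):
    """Number of mismatched positions between two aligned, equal-length sequences."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(reference, query) if a != b)


def percent_divergence(n_differences, region_length=REGION_LENGTH):
    """Nucleotide divergence as a percentage of the compared region."""
    return 100.0 * n_differences / region_length


if __name__ == "__main__":
    # Seven mutations is the maximum reported for a single isolate.
    print(round(percent_divergence(7), 2))
```

The same function applied to each isolate's mismatch count, then averaged, would give the mean divergence; the per-isolate counts are in Table 1 and are not repeated here.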
Several of the nucleotide mutations identified in this study were shared by many isolates (Tables 1 and 2; Figure 2). Two nucleotide mutations at residues 1,442 (conservative amino acid substitution of Val to Ala at position E159) and 2,466 were shared by 14 of the 22 isolates, with 10 of these 14 isolates sharing an additional synonymous nucleotide mutation at residue 660. Five different nucleotide mutations (at residues 969, 1,192 [amino acid substitution of Thr to Ala at position E76], 1,356, 2,154, and 2,400) were shared by seven isolates, all of which were collected from coastal regions of southeast Texas. The isolate from Louisiana differed from WN-NY99 at only one nucleotide (residue 807) over the region studied and did not share any nucleotide mutations with other isolates from this study. In comparison, all other nucleotide mutations identified in this study were not shared by nucleotide sequences reported previously from isolates collected in the northeastern United States during 1999, 2000, or 2001 (9–12). Because these mutations were unique to isolates sequenced during this study, our results did not show a closer genetic relationship to isolates from 2001, 2000, or 1999. However, the two isolates in this study that were collected in 2001 (Alabama-1; Alabama-2) did share two nucleotide mutations (residues 1,442 and 2,466) with 12 of the other isolates collected in 2002. Construction of a phylogenetic tree by maximum parsimony analysis (Figure 3) illustrates the genetic proximity of isolates from this study to those collected from the northeastern United States in 1999, 2000, and 2001. Branch groupings showed both temporal and geographic separation of isolates, with those collected in the northeastern United States in 1999, 2000, and 2001 representing a distinct clade in relation to isolates collected in 2002.
An exception to this grouping is an isolate from Louisiana collected in 2002, which was grouped with northeastern United States isolates from 1999 to 2001. Notably, WNV isolates from the southeastern coast of Texas also comprise a clade of their own, separating these isolates from other 2001 and 2002 isolates collected from various regions within the United States. A recently reported WNV isolate collected from a Missouri dog in 2002 (GenBank accession no. AY160126) also shared a nucleotide mutation (residue 2,466 C to U) with the 2002 isolates from this study. Although the entire prM-E gene of this isolate was not reported, this isolate likely represents an additional member of the large 2002 clade.
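The variant groupings described above amount to asking whether an isolate carries a defining set of mutations. A minimal sketch of that logic follows; the signature positions are taken from the text, but the function, the labels it returns, and the example isolate names are hypothetical, and the example mutation sets are illustrative rather than copied from Tables 1 and 2.

```python
# Signature positions taken from the text (genome nucleotide numbering).
DOMINANT_SIGNATURE = frozenset({1442, 2466})
COASTAL_TEXAS_SIGNATURE = frozenset({969, 1192, 1356, 2154, 2400})


def classify(mutations):
    """Assign an isolate to a variant group based on the mutations it carries."""
    mutations = set(mutations)
    if COASTAL_TEXAS_SIGNATURE <= mutations:
        return "southeast coastal Texas variant"
    if DOMINANT_SIGNATURE <= mutations:
        return "dominant variant"
    return "unassigned"


# Hypothetical example isolates (names and extra positions are illustrative).
examples = {
    "isolate_A": {660, 1442, 2466},
    "isolate_B": {969, 1192, 1356, 2154, 2400},
    "isolate_C": {807},
}

if __name__ == "__main__":
    for name, muts in examples.items():
        print(name, "->", classify(muts))
```

An isolate carrying only the change at residue 807, as reported for the Louisiana isolate, falls into neither group, which matches its placement with the northeastern isolates in the tree.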
In a previous report concerning the genetic divergence of WNV since its introduction into the United States, Beasley et al. (13) described a quasispecies population within a single WNV isolate from Harris County, Texas. To determine whether nucleotide mutations that define the southeast coastal Texas variant were uniform throughout the quasispecies population of a select isolate, the WN1751/WN2504A PCR product derived from WNV isolate Galveston Co., TX-3, was cloned into pGEM-T Easy. Ten clones were sequenced to obtain homologous regions of 700 nucleotides, which were then compared with the Galveston Co., TX-3, consensus sequence. This region contained the U to C mutation at nucleotide 2,154 and the U to C mutation at nucleotide 2,400. Five of the 10 clones were identical to the consensus sequence, while the other five clones each had one or two nucleotide changes from the consensus sequence for a total of eight nucleotide changes (Table 3). None of the mutations identified represented amino acid substitutions and, unlike the 2001–2002 variant population (13), none of the mutations encoded a stop codon. The maximum nucleotide divergence of individual clones was 0.28% (mean = 0.11%). Furthermore, none of the nucleotide changes identified in the five clones was shared with WNV strains representing the 2001–2002 variant, nor were any nucleotide changes identified at two of the nucleotide positions that defined the southeastern coastal Texas variant. These results suggest that none of the virus genomes existing in a quasispecies population from WNV isolate Galveston Co., TX-3, contained nucleotide mutations characteristic of the 2001–2002 variant identified in this study.
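The clone divergence values can be approximately reconstructed from the counts given above. If five clones have no changes and the other five have one or two changes each for a total of eight, then three clones must carry two changes and two clones must carry one. The sketch below works through that arithmetic; the variable names are hypothetical, and the 700-nucleotide length is taken as exact, which is an assumption.

```python
CLONE_REGION_LENGTH = 700  # assumed exact; the text gives 700 nucleotides

# Per-clone change counts inferred from the text: five identical clones,
# and five clones with one or two changes summing to eight (3 x 2 + 2 x 1).
changes_per_clone = [0, 0, 0, 0, 0, 2, 2, 2, 1, 1]


def percent(n_changes, length=CLONE_REGION_LENGTH):
    """Divergence from the consensus as a percentage of the region length."""
    return 100.0 * n_changes / length


max_divergence = percent(max(changes_per_clone))
mean_divergence = sum(percent(n) for n in changes_per_clone) / len(changes_per_clone)

if __name__ == "__main__":
    print(f"max  = {max_divergence:.3f}%")
    print(f"mean = {mean_divergence:.3f}%")
```

Under these assumptions the maximum is 2/700, roughly 0.286%, and the mean is 8/7,000, roughly 0.114%. These agree with the reported 0.28% and 0.11% if the maximum was truncated rather than rounded, or if the sequenced region was slightly longer than 700 nucleotides.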
Sequence comparisons of a 2,004-nucleotide region of 22 WNV isolates collected during the summer and fall of 2001 and 2002 showed the highest degree of nucleotide divergence from WN-NY99 to date. Studies by Lanciotti et al. (9) and Huang et al. (12) have shown that the complete genomes of several WNV isolates collected in 1999, 2000, and 2001 share >99.8% nucleotide identity with WN-NY99, with three or fewer amino acid substitutions in the entire polyprotein. Similar studies of partial nucleotide sequences conducted by Anderson et al. (10) and Ebel et al. (11) reported up to three nucleotide mutations encompassing a region of 921 nucleotides and 1,503 nucleotides from isolates collected in Connecticut in 1999 and 2000 and New York in 2000, respectively. Although our studies have compared a larger portion of the genome than earlier studies of partial nucleotide sequences, we have identified individual isolates with as many as seven nucleotide mutations and three amino acid substitutions, with a maximum divergence of 0.35% from the homologous region of the prototype North American WNV, WN-NY99. The nucleotide mutations identified in this study were not shared by previously sequenced isolates from 1999, 2000, or 2001 (9–12) and represent new nucleotide changes in the North American WNV population. Since these changes were not shared with other previously reported WNV sequences, the isolates analyzed in this study did not show a greater genetic similarity with northeastern isolates from 1999, 2000, or 2001. However, several of these nucleotide changes (660, 969, 1,356, 2,154, 2,400, and 2,466) are observed in other Old World WNV strains from both lineage I and lineage II (Table 4). Each of these changes represents a synonymous mutation from either a C to U or U to C in the third codon position of the open reading frame; nucleotides at these positions may revert to nucleotides observed in the more ancestral Old World strains.
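Whether a given genome position falls at the first, second, or third position of a codon can be checked from the reading frame. The sketch below assumes that the sequenced region begins in frame at nucleotide 466 (it spans 2,004 nucleotides, a multiple of three, and encodes the entire prM-E region). The E gene start at nucleotide 967 is an inference from the stated correspondences (residue 1,192 with E76 and residue 1,442 with E159), not a value given in the text, and the function names are hypothetical.

```python
PRM_START = 466  # first nucleotide of the sequenced prM-E region (from the text)
E_START = 967    # assumption: inferred from 1,192 <-> E76 and 1,442 <-> E159


def codon_position(nt, frame_start=PRM_START):
    """Return 1, 2, or 3: the position of genome nucleotide nt within its codon."""
    if nt < frame_start:
        raise ValueError("position lies upstream of the reading-frame anchor")
    return (nt - frame_start) % 3 + 1


def e_residue(nt):
    """Return the E protein residue number encoded by the codon containing nt."""
    if nt < E_START:
        raise ValueError("position lies upstream of the assumed E gene start")
    return (nt - E_START) // 3 + 1


if __name__ == "__main__":
    for nt in (660, 969, 1356, 2154, 2400, 2466):
        print(nt, "codon position", codon_position(nt))
    for nt in (1192, 1442):
        print(nt, "codon position", codon_position(nt), "residue E", e_residue(nt))
```

Under these assumptions, the six shared changes listed above all fall at third codon positions, while the two amino acid substitutions fall at the first (1,192) and second (1,442) positions of their codons, consistent with Thr to Ala and Val to Ala changes.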
Our results also suggest the geographic clustering of genetically distinct variants. Seven of the 22 isolates, all of which were collected from coastal regions of southeast Texas, share five nucleotide mutations unique to only these isolates. Fourteen of the other isolates, which represent the CDC-defined East South Central (AL), West South Central (LA and TX), East North Central (IL), and Mountain (CO) regions (3), all share two unique nucleotide mutations not identified in other isolates (Figure 2). The results of this study support the findings of Beasley et al. (13), which suggest that during the summer of 2002 WNV was introduced into Texas on at least two separate occasions. These results might reflect the unique migratory patterns of North American birds, which act as reservoir hosts for WNV. As Rappole et al. (14) have illustrated, many North American birds follow well-documented migration routes from summer grounds in the northeastern United States to southern areas that are classified as the southeastern United States, circum-Gulf, trans-Gulf, and Caribbean/western North Atlantic routes. For example, the Laughing Gull (Larus atricilla) has been known to follow a circum-Gulf route as it travels from the northeastern United States to stopover sites along the northern and western Gulf Coast on its way to Mexico or Central America. Because certain species of birds have a more limited geographic range than others, geographically clustered populations of distinct genetic variants, for example, isolates collected from coastal regions of southeast Texas, might arise as a result of restricted migratory routes. This hypothesis is supported by a number of studies. Peiris and Amerasinghe (15) have identified a group of geographically restricted antigenic variants of WNV confined to southern India. Because of the lack of bird migratory routes linking southern India with the Middle East and Africa, a distinct antigenic group exists exclusively in southern India. 
Furthermore, numerous studies have shown antigenic variation among WNV strains that correlate with geographically distinct regions and restricted bird migratory patterns (16,17). Phylogenetic comparisons of Indian viruses with other WNV strains show similar findings, which place Indian WNV strains in a unique clade of lineage I (9,18). Recent studies in Israel by Malkinson et al. (19) also support the role of migratory birds in the dispersion of unique WNV variants in geographically distinct regions. The results of our study support an alternative hypothesis that explains the continental spread of WNV as a consequence of transmission between local bird and mosquito populations in a given region. This mechanism allows for spread of the virus from region to region over shorter distances, in contrast to the long distances traveled by migratory birds (20). Our finding of a dominant variant that exists over a large part of the United States, together with evidence of a geographically distinct southeast coastal Texas variant, suggests that both mechanisms of spread have influenced the genetic distribution and spread of WNV in the United States.
To date, little genetic evidence supports or refutes the hypothesis that WNV becomes established in an enzootic transmission cycle in a particular geographic area rather than being reintroduced into a particular area each year when the transmission season begins. Similarly, because of the limited published data detailing the year-to-year genetic changes observed in WNV, whether the virus is becoming endemic in particular regions of the United States remains to be established. This question will be answered in part by determining baseline phylogenetic results of specific variants in a geographic area and by analyzing isolates collected in sequential transmission seasons.
Although the isolates analyzed in this study do not represent the entire temporal and geographic distribution of WNV in North America, at least some nucleotide mutations have been conserved among WNV strains circulating across the continent. If indeed the conservation of these mutations is the result of selective pressure, such as the continued capacity to replicate in both arthropod and vertebrate hosts, rather than random mutations occurring as a consequence of genetic drift, one would expect these mutations to be conserved in virus isolates collected in other regions of North America. Further investigation concerning the genetic composition of viruses from additional regions of North America will define the extent to which dominant variants have emerged. If dominant variants do continue to emerge across the United States, phylogenetic analyses will help researchers monitor the spread of WNV in North America and may provide explanations for the rapid and widespread movement of this newly emerging virus in North America. Similarly, identifying the genetic composition of WNV isolates from other regions of the United States and Canada, as well as comparing these isolates with isolates collected in 2003, will continue to define evolutionary relationships of WNV circulating in North America and facilitate predictions concerning the primary mechanisms of transmission and spread of the virus.
Mr. Davis is a Ph.D. candidate at the University of Texas Medical Branch, Galveston. His research interests include the molecular epidemiology and pathogenesis of flaviviruses.
We thank Juliet Bryant for assistance in generating phylogenetic trees and the Centers for Disease Control and Prevention (CDC) for providing virus strains. This work was supported in part by the State of Texas Advanced Research Program, the Alabama Department of Public Health, NIH grants AI 10984 and AI 49724, NIH contract NO1-AI25489, and CDC cooperative grant U90/CCU620916.
1. Hayes CG. West Nile fever. In: Monath TP, editor. The arboviruses: epidemiology and ecology. Boca Raton (FL): CRC Press; 1989. p. 59–88.
2. Murgue B, Zeller H, Deubel V. The ecology and epidemiology of West Nile virus in Africa, Europe, and Asia. Curr Top Microbiol Immunol. 2002;267:195–221.
3. Centers for Disease Control and Prevention. Provisional surveillance summary of the West Nile virus epidemic—United States, January–November 2002. MMWR Morb Mortal Wkly Rep. 2002;51:1129–33.
4. Anderson JF, Andreadis TG, Vossbrinck CR, Tirrell S, Wakem EM, French RA, et al. Isolation of West Nile virus from mosquitoes, crows, and a Cooper’s hawk in Connecticut. Science. 1999;286:2331–3.
5. Briese T, Jia XY, Huang C, Grady LJ, Lipkin WI. Identification of a Kunjin/West Nile-like flavivirus in brains of patients with New York encephalitis. Lancet. 1999;354:1261–2.
6. Lanciotti RS, Roehrig JT, Deubel V, Smith J, Parker M, Steele K, et al. Origin of the West Nile virus responsible for an outbreak of encephalitis in the northeastern U.S. Science. 1999;286:2333–7.
7. Steele KE, Linn MF, Schoepp RJ, Komar N, Geisbert TW, Manduca RM, et al. Pathology of fatal West Nile virus infections in native and exotic birds during the 1999 outbreak in New York City, New York. Vet Pathol. 2000;37:208–24.
8. Jia XY, Briese T, Jordan I, Rambaut A, Chi HC, Mackenzie JS, et al. Genetic analysis of West Nile New York 1999 encephalitis virus. Lancet. 1999;354:1971–2.
9. Lanciotti RS, Ebel GD, Deubel V, Kerst AJ, Murri S, Meyer R, et al. Complete genome sequences and phylogenetic analysis of West Nile virus strains isolated from the United States, Europe, and the Middle East. Virology. 2002;298:96–105.
10. Anderson JF, Vossbrinck CR, Andreadis TG, Beckwith WH, Mayo DR. A phylogenetic approach to following West Nile virus in Connecticut. Proc Natl Acad Sci U S A. 2001;98:12885–9.
11. Ebel GD, Dupuis AP, Ngo K, Nicholas D, Kauffman E, Jones SA, et al. Partial genetic characterization of West Nile virus strains, New York State, 2000. Emerg Infect Dis. 2001;7:650–3.
12. Huang C, Slater B, Rudd R, Parchuri N, Hull R, Dupuis M, et al. First isolation of West Nile virus from a patient with encephalitis in the United States. Emerg Infect Dis. 2002;8:1367–71.
13. Beasley DW, Davis CT, Guzman H, Vanlandingham DL, Travassos da Rosa AP, Parsons RE, et al. Limited evolution of West Nile virus during its southwesterly spread in the United States. Virology. 2003;309:190–5.
14. Rappole JH, Derrickson SR, Hubalek Z. Migratory birds and spread of West Nile virus in the Western Hemisphere. Emerg Infect Dis. 2000;6:319–28.
15. Peiris JSM, Amerasinghe FP. West Nile fever. In: Beran GW, Steele JH, editors. Handbook of zoonoses. 2nd ed. Section B: Viral. Boca Raton (FL): CRC Press; 1994. p. 139–48.
16. Hammam HM, Clarke DH, Price WH. Antigenic variation of West Nile virus in relation to geography. Am J Epidemiol. 1965;82:40–55.
17. Price WH, O’Leary W. Geographic variation in the antigenic character of West Nile virus. Am J Epidemiol. 1967;85:84–6.
18. Burt FJ, Grobbelaar AA, Leman PA, Anthony FS, Gibson GV, Swanepoel R. Phylogenetic relationships of southern African West Nile virus isolates. Emerg Infect Dis. 2002;8:820–6.
19. Malkinson M, Banet C, Weisman Y, Pokamunski S, King R, Drouet MT, et al. Introduction of West Nile virus in the Middle East by migrating white storks. Emerg Infect Dis. 2002;8:392–7.
20. Rappole JH, Hubalek Z. Migratory birds and West Nile virus. J Appl Microbiol. 2003;94:47–58.