Genetic Variation of SARS Coronavirus in Beijing Hospital

To characterize genetic variation of severe acute respiratory syndrome–associated coronavirus (SARS-CoV) transmitted in the Beijing area during the epidemic outbreak of 2003, we sequenced 29 full-length S genes of SARS-CoV from 20 hospitalized SARS patients on our unit at the Beijing 302 Hospital. Viral RNA templates for S-gene amplification were extracted directly from raw clinical samples, including plasma, throat swab, sputum, and stool, collected during the course of the epidemic in the Beijing area. We used a TA-cloning assay together with direct sequence analysis of nested reverse transcription–polymerase chain reaction products. One hundred thirteen sequence variations at nine recurrent variant sites were identified in the analyzed S-gene sequences compared with the BJ01 strain of SARS-CoV. Of these, eight variant sites were, to our knowledge, documented for the first time. Our findings demonstrate the coexistence of S-gene sequences with and without substitutions (relative to BJ01) in samples from some patients.

A novel severe acute respiratory syndrome-associated coronavirus (SARS-CoV) has been implicated as the causative agent of a worldwide outbreak of SARS during the first 6 months of 2003 (1)(2)(3). From March 4 to June 18, Beijing had 2,521 cases and 192 deaths from SARS (4). Because of the poor fidelity of RNA-dependent RNA polymerase, genetic variation typically produces a heterogeneous virus pool in RNA virus populations, including coronaviruses such as mouse hepatitis virus (MHV) (5,6). This feature makes viruses highly adaptable and contributes to difficulties in preventing and controlling viral disease. SARS-CoV, a single-stranded RNA virus, has been reported to show relatively little variability in analyses of a limited number of viral isolate collections (7)(8)(9)(10). Furthermore, no SARS-CoV quasispecies have been documented, although quasispecies have been described for many other RNA viruses, including hepatitis C virus (HCV) (11), HIV (12), and MHV (6).
During the SARS outbreak in Beijing, 132 SARS patients were hospitalized and treated on our unit at Beijing 302 Hospital, including the first cluster of case-patients in the area (13). To characterize genetic variation among SARS-CoV transmitted in the Beijing area, we sequenced 29 full-length S genes of SARS-CoV from 20 hospitalized SARS patients, since the S glycoprotein plays a key role in virus-host interaction and is predicted to be the main target of the immune response (14). The samples analyzed spanned the course of the epidemic. To exclude culture-derived artifacts and estimate mutational heterogeneity, viral RNA was extracted directly from raw clinical samples, and a TA-cloning assay was used together with direct analysis of reverse transcriptase-polymerase chain reaction (RT-PCR) products. We compared these sequences with all previously documented S-gene sequences of SARS-CoV.

Patients and Samples
All patients in the study were hospitalized on our unit with a confirmed diagnosis of SARS. Samples from patients included plasma, throat swab, sputum, and stool; these were stored at -70°C for extraction of viral RNA. A total of 64 RNA samples from 28 SARS-CoV-positive patients (detected by using BNI primers recommended by the World Health Organization [15]) were initially used in S-gene amplification, but only those that generated all six overlapping fragments covering the full-length S-gene sequence (see Nested RT-PCR below and Figure 1) were included in the sequence analysis. As a result, 29 RNA samples from 20 patients were included in the study (Table 1). All patients had received ribavirin and steroid combination therapy.
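The inclusion rule described above (keep a sample only if all six overlapping fragments were amplified) can be illustrated with a minimal sketch. The fragment labels `S1` to `S6`, the sample identifiers, and the function name are hypothetical and chosen only for illustration; they are not taken from the study's records.

```python
# Hypothetical labels for the six overlapping S-gene fragments.
REQUIRED_FRAGMENTS = frozenset(f"S{i}" for i in range(1, 7))


def samples_with_full_coverage(amplified):
    """Return sorted sample IDs for which all six fragments were amplified.

    `amplified` maps a sample ID to the fragment labels obtained from it.
    """
    return sorted(
        sample_id
        for sample_id, fragments in amplified.items()
        if REQUIRED_FRAGMENTS <= set(fragments)
    )


# Illustrative data only, not results from the study.
example = {
    "patient01_plasma": ["S1", "S2", "S3", "S4", "S5", "S6"],
    "patient02_stool": ["S1", "S2", "S4", "S6"],
    "patient03_sputum": ["S6", "S5", "S4", "S3", "S2", "S1"],
}
included = samples_with_full_coverage(example)
print(included)
```

In this toy input, the stool sample is dropped because fragments S3 and S5 are missing, mirroring how samples positive with BNI primers could still be excluded from sequence analysis.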

RNA Extraction
RNA extraction was performed in a biosafety level 3 (P3) laboratory. RNA was extracted directly from plasma samples. Sputum samples were shaken for 30 min with an equal volume of 1.0% acetylcysteine and 0.9% sodium chloride, and the supernatant was then isolated by centrifugation (10,000 × g for 3 min). Throat swab and stool samples were suspended in phosphate-buffered saline (PBS) containing 10 U/mL RNasin (Promega, Madison, WI) and shaken for 10 min, and the supernatant was then isolated by centrifugation as described above. RNA was extracted according to the manufacturer's instructions by using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany).

TA-Cloning
RT-PCR products were purified with the QIAquick PCR Purification Kit (Qiagen) or the QIAquick Gel Extraction Kit (Qiagen), with a final elution volume of 30 µL. Ligation and transformation were performed according to the manufacturer's instructions by using the pGEM-T Vector System II (Promega). Transformants were selected on LB agar plates containing 100 µg of ampicillin, 100 µg of 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-gal), and 200 µg of isopropylthiogalactoside. Escherichia coli from white colonies was added to 5 mL of LB culture for overnight growth at 37°C with vigorous shaking. Plasmid was purified with the QIAprep Spin Miniprep Kit (Qiagen). The recombinant plasmids to be sampled for sequence analysis were screened by electrophoresis in 1% agarose containing 0.5 µg/mL of ethidium bromide.

Sequencing and DNA Analysis
For each S-gene fragment, four to six clones were screened. To verify variations, 5-50 additional clones generated from independently prepared, RNA-derived RT-PCR products were sequenced in two to four independent experiments. The cloned plasmids were prepared from different RT-PCR products and were directly sequenced for confirmation. DNA sequences were obtained with the use of an automated ABI 377 sequencer (Applied Biosystems Inc., Foster City, CA). For cloned plasmids, SP6 and T7 primers were used for two-directional sequencing reactions. For PCR products, specific primers (sense: S1cF-S6cF; antisense: S1cB-S6cB) were used for two-directional sequencing reactions. Analysis and comparison of nucleotide and amino acid sequences were carried out with the DNASTAR computer software (DNASTAR Inc., Madison, WI). The S-gene sequence of the BJ01 strain was taken as the reference for variation analysis.
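The comparison against a reference sequence can be sketched in a few lines. This is not the DNASTAR procedure used in the study; it is a minimal illustration that assumes the two sequences are already aligned and of equal length, with no insertions or deletions. The function name, the `offset` parameter (for reporting positions in another coordinate system), and the toy sequences are assumptions made for this example.

```python
def find_variations(reference, sample, offset=1):
    """List (position, reference_base, sample_base) for each mismatch.

    Assumes `reference` and `sample` are pre-aligned, equal-length strings.
    `offset` is the coordinate assigned to the first base (1 by default).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (offset + i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Toy sequences for illustration only, not SARS-CoV data.
reference = "ATGTTTATT"
clone = "ATGTCTATT"
variations = find_variations(reference, clone)
print(variations)
```

Passing a different `offset` would report the same mismatch in genome-wide coordinates rather than gene-relative ones, provided the correct start coordinate is known.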

Results
With the six designed primer pairs, all six overlapping S-gene fragments were amplified by nested RT-PCR from 29 RNA samples. However, most RNA samples initially included in the study, though positive for SARS-CoV with BNI primers, failed to simultaneously generate all six overlapping S-gene fragments and were excluded from further sequence analysis. Disintegration of the virus and low viral load in the raw samples likely accounted for these failures.
One hundred and thirteen sequence variations distributed among nine variant sites were identified in the analyzed sequences when compared with the reference BJ01 strain of SARS-CoV. BJ01 is an isolate from a tissue culture-propagated sample (16) and is used as the reference strain in other studies (9,10). With the exception of one site (position 21702), the variant sites have not, to our knowledge, been documented in humans. Seven of the nine variant sites were nonsynonymous. Figure 2 shows the identified variant sites compared with the reference sequence.
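The distinction between synonymous and nonsynonymous sites can be made concrete with a short sketch based on the standard genetic code. It assumes an in-frame coding sequence given as DNA starting at the first base of a codon and a 0-based index into that sequence; the function name and the toy sequence are hypothetical and do not reproduce the sites reported here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(cds, index, new_base):
    """Classify a single-base substitution in an in-frame coding sequence.

    `index` is 0-based within `cds`. Returns (kind, ref_aa, alt_aa).
    """
    codon_start = index - index % 3
    within = index % 3
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:within] + new_base + ref_codon[within + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return kind, ref_aa, alt_aa


# Toy coding sequence (Met-Phe-Ile), for illustration only.
cds = "ATGTTTATT"
second_position = classify_substitution(cds, 4, "C")  # TTT -> TCT
third_position = classify_substitution(cds, 5, "C")   # TTT -> TTC
print(second_position, third_position)
```

The example shows the usual pattern: a change at the second codon position alters the amino acid, whereas the third-position change here leaves phenylalanine unchanged.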

Discussion
We identified novel variant sites and the coexistence of sequences with and without S-gene substitutions in SARS-CoV. Theoretically, a replicating RNA virus expresses a range of genetic and phenotypic variants and has the potential to generate novel virions, which may be selected in response to environmental pressures. RNA viruses generally tolerate high levels of mutagenesis because of their limited genetic complexity (17). Mutations have the potential to be pathogenic (e.g., by conferring resistance to neutralizing antibodies, cytotoxic T cells, or antiviral drugs [18][19][20]). The dynamics of error copying and sequence decomposition are time-dependent. In HIV infection, for example, one adaptive substitution in the env gene occurred every 3.3 months or 25 viral generations, averaging across patients (21). In our study, a higher variation frequency in the S gene was identified for SARS-CoV than in previous reports (7)(8)(9)(10). This difference may be due to a broader sample collection covering a longer timespan of infection. In addition, because the virus was not passaged in culture, the whole mutant repertoire is more likely to be detected, since reverse mutation in cell culture cannot occur. Our observations most likely reflect the real situation in vivo. The variations were unlikely to result from Taq polymerase errors, since we repeated the experiments for all variations, starting from independently prepared RNA and RT-PCR products, and in some cases confirmed the results with Platinum Pfx DNA polymerase, which has high fidelity. We could not exclude the possibility that some variations were from defective genomes. However, the fact that the variations remained detectable in sequences from two or three specimens of the same patient, obtained at different times, suggests that these variations might be active and extensible in vivo.
Sequences with and without substitutions (compared to BJ01) were simultaneously detected in the sequences from seven samples, which suggests the existence of SARS-CoV quasispecies. Furthermore, S-gene sequences from different samples collected at different times from the same patient showed similar, but not exactly identical, variation profiles in four participants (patients 4, 5, 6, and 19 in Table 1); this implies that a dynamic mutational process may exist in vivo. Table 2 summarizes the variations occurring in 29 analyzed S-gene sequences from 20 individual SARS patients.
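The notion of coexistence used above (reference and substituted bases both present among clones from one sample) can be expressed as a simple tally. The sketch below assumes aligned clone sequences and a 0-based site index; the function names and the toy clones are hypothetical, and the check ignores sequencing error, which the study addressed by repeating experiments from independent RT-PCR products.

```python
from collections import Counter


def allele_counts(clones, index):
    """Count the bases observed at one aligned site across clone sequences."""
    return Counter(clone[index] for clone in clones)


def has_coexistence(clones, index, reference_base):
    """True if both the reference base and another base occur at the site."""
    counts = allele_counts(clones, index)
    has_reference = counts[reference_base] > 0
    has_variant = any(base != reference_base for base in counts)
    return has_reference and has_variant


# Toy clone sequences from one hypothetical sample, for illustration only.
clones = ["ATGTTT", "ATGTCT", "ATGTTT"]
mixed_site = has_coexistence(clones, 4, "T")
uniform_site = has_coexistence(clones, 0, "A")
print(allele_counts(clones, 4), mixed_site, uniform_site)
```

A sample in which every clone carries the substitution would return `False`, since only the variant is present and there is no evidence of a mixed population at that site.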
The genetic variation of SARS-CoV remains limited in relation to many other RNA viruses such as HIV-1, HCV, and MHV. The probable reason is that SARS-CoV causes only an acute, self-limited infection, which may prevent the persistent, long-term mutant development in vivo that occurs in chronic RNA viral infections. Notably, some modules in the S protein remain conserved, e.g., the fusion-important HR domains. Although some variations may predict changes in protein functional features, the limited data reported here show no obvious correlation between mutation and clinical disease manifestation. Instead, the variation profile was closely correlated with epidemiology; e.g., patients 3-8 were infected in one hospital.
In conclusion, we report here some new variant sites in the S gene of SARS coronavirus and possible existence of SARS-CoV quasispecies in some patients, though in limited numbers. This knowledge furthers our understanding of this emerging virus.