Genetic Variation of SARS Coronavirus in Beijing Hospital

Dongping Xu,* Zheng Zhang,* Fuliang Chu,* Yonggang Li,* Lei Jin,* Lingxia Zhang,* George F. Gao, and Fu-Sheng Wang*

*Beijing 302 Hospital, Beijing, China; University of Oxford, Headington, Oxford, United Kingdom

Research. Emerging Infectious Diseases (Centers for Disease Control and Prevention). 2004 May;10(5):789–794. doi:10.3201/eid1005.030875

Address for correspondence: Fu-Sheng Wang, Beijing Institute of Infectious Diseases, 302 Hospital, 100 Xi Si Huan Middle Road, Beijing 100039, China; fax: 86-10-63831870; email: fswang@public.bta.net.cn

To characterize genetic variation of severe acute respiratory syndrome–associated coronavirus (SARS-CoV) transmitted in the Beijing area during the epidemic outbreak of 2003, we sequenced 29 full-length S genes of SARS-CoV from 20 hospitalized SARS patients on our unit, the Beijing 302 Hospital. Viral RNA templates for the S-gene amplification were directly extracted from raw clinical samples, including plasma, throat swab, sputum, and stool, during the course of the epidemic in the Beijing area. We used a TA-cloning assay together with direct sequencing of nested reverse transcription–polymerase chain reaction products. One hundred thirteen sequence variations with nine recurrent variant sites were identified in analyzed S-gene sequences compared with the BJ01 strain of SARS-CoV. Eight of these variant sites have not, to our knowledge, been documented previously. Our findings demonstrate the coexistence of S-gene sequences with and without substitutions (relative to BJ01) in samples analyzed from some patients.

Keywords: SARS coronavirus, genetic variation, quasispecies, spike glycoprotein gene

A novel severe acute respiratory syndrome–associated coronavirus (SARS-CoV) has been implicated as the causative agent of a worldwide outbreak of SARS during the first 6 months of 2003 (1–3). From March 4 to June 18, Beijing had 2,521 cases and 191 deaths from SARS (4). Because of the poor fidelity of RNA-dependent RNA polymerase, genetic variation typically forms a heterogeneous virus pool in RNA virus populations, including coronaviruses such as mouse hepatitis virus (MHV) (5,6). This feature makes viruses highly adaptable and contributes to difficulties in preventing and controlling viral disease. SARS-CoV, a single-stranded RNA virus, has been reported to show relatively little variability in analyses of a limited number of viral isolates (7–10). Furthermore, no SARS-CoV quasispecies have been documented, although quasispecies have been described for many other RNA viruses, including hepatitis C virus (HCV) (11), HIV (12), and MHV (6).

During the SARS outbreak in Beijing, 132 SARS patients were hospitalized and treated on our unit at Beijing 302 Hospital, including the first cluster of case-patients in the area (13). To characterize genetic variation among SARS-CoV transmitted in the Beijing area, we sequenced 29 full-length S genes of SARS-CoV from 20 hospitalized SARS patients, since S glycoprotein plays a key role in virus-host interaction and is predicted to be the main target of immune response (14). The samples analyzed spanned the course of the epidemic. To exclude culture-derived artifacts and estimate mutational heterogeneity, viral RNA was directly extracted from raw clinical samples, and a TA-cloning assay was used together with direct sequencing of reverse transcriptase–polymerase chain reaction (RT-PCR) products. We compared these sequences with all previously documented S-gene sequences of SARS-CoV.

Materials and Methods

Patients and Samples

All patients in the study were hospitalized on our unit with a confirmed diagnosis of SARS. Samples from patients included plasma, throat swab, sputum, and stool; these were stored at –70°C for extraction of viral RNA. A total of 64 RNA samples from 28 SARS-CoV–positive patients (detected by using BNI primers recommended by the World Health Organization [15]) were initially used in S-gene amplification, but only those that generated all six overlapping fragments covering the full-length S-gene sequence (see Nested RT-PCR below and Figure 1) were included in the sequence analysis. As a result, 29 RNA samples from 20 patients were included in the study (Table 1). All patients had received ribavirin and steroid combination therapy.

Figure 1. Diagram showing amplification of six overlapping fragments covering full-length spike gene sequence of severe acute respiratory syndrome–associated coronavirus by nested reverse transcriptase–polymerase chain reaction.

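Table 1. Clinical backgrounds of patients and sample collection

| Patient no. | Age (y) | Sex<sup>a</sup> | Onset date | Hospitalized date | Specimen no.<sup>b</sup> | Sampling date |
|---|---|---|---|---|---|---|
| 1 | 53 | M | 2/28 | 3/05 | SW6 | 3/06 |
| 2 | 32 | M | 3/08 | 3/08 | SW17 | 3/09 |
| 3 | 32 | F | 3/20 | 4/04 | PL1 | 4/07 |
| 4 | 20 | M | 3/21 | 4/06 | PL10 | 4/07 |
| | | | | | PL17 | 4/22 |
| | | | | | SP4 | 5/03 |
| 5 | 33 | M | 3/28 | 4/04 | PL9 | 4/07 |
| | | | | | SP1 | 5/03 |
| 6 | 59 | M | 3/30 | 4/06 | PL5 | 4/07 |
| | | | | | SP9 | 5/12 |
| 7 | 52 | M | 3/30 | 4/04 | PL7 | 4/07 |
| 8 | 59 | M | 3/30 | 4/06 | PL8 | 4/07 |
| 9 | 19 | F | 4/01 | 4/12 | PL15 | 4/22 |
| | | | | | SP32 | 4/26 |
| 10 | 73 | M | 4/02 | 4/03 | PL6 | 4/07 |
| | | | | | SP62 | 4/18 |
| | | | | | SW73 | 4/21 |
| 11 | 45 | F | 4/04 | 4/04 | SP67 | 4/18 |
| 12 | 26 | M | 4/08 | 4/18 | SW76 | 4/21 |
| 13 | 31 | M | 4/08 | 4/14 | ST123 | 4/26 |
| 14 | 32 | M | 4/09 | 4/18 | PL57 | 4/21 |
| | | | | | SW77 | 4/22 |
| 15 | 39 | M | 4/10 | 4/10 | SP61 | 4/18 |
| 16 | 31 | F | 4/10 | 4/12 | PL59 | 4/30 |
| 17 | 46 | F | 4/20 | 4/21 | SP28 | 4/26 |
| 18 | 48 | M | 4/20 | 4/22 | SP43 | 4/24 |
| 19 | 38 | M | 4/22 | 4/26 | SP13 | 5/03 |
| | | | | | ST158 | 4/30 |
| 20 | 27 | M | 5/10 | 5/11 | SP8 | 5/12 |

<sup>a</sup>M, male; F, female.
<sup>b</sup>First two letters indicate source of sample: SW, throat swab; PL, plasma; SP, sputum; ST, stool.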

RNA Extraction

RNA extraction was performed in a biosafety level 3 (P3) laboratory. RNA was extracted directly from plasma samples. Sputum samples were shaken for 30 min with an equal volume of 1.0% acetylcysteine and 0.9% sodium chloride, and the supernatant was then isolated by centrifugation (10,000 × g for 3 min). Throat swab and stool samples were suspended in phosphate-buffered saline (PBS) containing 10 U/mL RNasin (Promega, Madison, WI) and shaken for 10 min, and the supernatant was isolated by centrifugation as described above. RNA was extracted according to the manufacturer’s instructions by using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany).

Nested RT-PCR

Screening RNA for SARS-CoV was based on the method by Drosten et al. (1). For the S-gene amplification, 18 pairs of primers were designed by using MacVector computer software (Accelrys Inc, San Diego, CA) based on the BJ01 strain of SARS-CoV (GenBank accession no. AY278488) (16). Among them, six pairs (sense/antisense: S1aF/S1aB, S2aF/S2aB, S3aF/S3aB, S4aF/S4aB, S5aF/S5aB, S6aF/S6aB) were used as outer primers, six pairs (sense/antisense: S1bF/S1bB, S2bF/S2bB, S3bF/S3bB, S4bF/S4bB, S5bF/S5bB, S6bF/S6bB) were used as inner primers, and six pairs (sense/antisense: S1cF/S1cB, S2cF/S2cB, S3cF/S3cB, S4cF/S4cB, S5cF/S5cB, S6cF/S6cB) were designed for direct sequencing of RT-PCR products. The sequences covering the full-length S gene were amplified separately as six overlapping fragments (F1b, F2b, F3b, F4b, F5b, and F6b) (Figure 1). The one-step RT-PCR Kit (Qiagen) was used for reverse transcription and the first round of PCR amplification with outer primers. Thermal cycling consisted of 50°C for 30 min; 95°C for 15 min; 10 cycles of 95°C for 30 s, 57.5°C for 30 s (decreasing by 1.5°C every other cycle), 72°C for 1 min; 40 cycles of 95°C for 30 s, 54°C for 30 s, 72°C for 1 min. Afterwards, 2 μL of the product was used as a template for the second round of PCR amplification in a 100-μL volume with inner primers and Taq DNA polymerase (MBI Fermentas, Hanover, MD). Thermal cycling consisted of 30 cycles of 95°C for 25 s, 54°C for 25 s, 72°C for 50 s. In some cases, SuperScript III RNase H– Reverse Transcriptase (Invitrogen, Carlsbad, CA) was used for reverse transcription, according to the manufacturer’s instructions. The next two rounds of PCR amplification were then performed by using Platinum Pfx DNA Polymerase (Invitrogen), which has a higher fidelity. The reaction conditions were set as above, except that elongation was performed at 68°C instead of 72°C and for twice as long. All reactions were carefully carried out to avoid contamination.

TA-Cloning

RT-PCR products were purified by QIAquick PCR Purification Kit (Qiagen) or QIAquick Gel Extraction Kit (Qiagen), with a final elution volume of 30 μL. The ligation and transformation were performed according to the manufacturer’s instructions by using pGEM-T Vector System II (Promega). Transformants were selected on LB-agar plates containing 100 μg of ampicillin, 100 μg of 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-gal), and 200 μg of isopropylthiogalactoside. Escherichia coli from white colonies was inoculated into 5 mL of LB medium and grown overnight at 37°C with vigorous shaking. Plasmid was purified by QIAprep Spin Miniprep Kit (Qiagen). The recombinant plasmids for sequence analysis were screened by electrophoresis in 1% agarose containing 0.5 μg/mL of ethidium bromide.

Sequencing and DNA Analysis

For each S-gene fragment, four to six clones were screened. To verify variations, 5–50 additional clones generated from independently prepared, RNA-derived RT-PCR products were sequenced in two to four independent experiments. The cloned plasmids were prepared from different RT-PCR products and were directly sequenced for confirmation. DNA sequences were obtained with the use of an automated ABI 377 sequencer (Applied Biosystems Inc., Foster City, CA). For cloned plasmids, SP6 and T7 primers were used for two-directional sequencing reactions. For PCR products, specific primers (sense: S1cF–S6cF; antisense: S1cB–S6cB) were used for two-directional sequencing reactions. Analysis and comparison of nucleotide and amino acid sequences were carried out with the DNASTAR computer software (DNASTAR Inc., Madison, WI). The S gene sequence of BJ01 strain was taken as the reference for variation analysis.
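The comparison against the BJ01 reference can be illustrated with a minimal sketch. It is not the DNASTAR workflow used in the study: it assumes the clone sequences are already aligned to the reference without insertions or deletions, and the function names, the toy 12-nt sequences, and the coordinate offset are all hypothetical.

```python
from collections import Counter

def call_variants(reference, clone, offset=1):
    """Return {position: (ref_base, clone_base)} for every mismatch.

    Assumes `clone` is already aligned to `reference` with no indels;
    `offset` is the coordinate assigned to the first base.
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to equal length")
    return {
        offset + i: (r, c)
        for i, (r, c) in enumerate(zip(reference, clone))
        if r != c
    }

def tally_site(reference, clones, position, offset=1):
    """Count clones carrying the reference vs. a variant base at one site."""
    idx = position - offset
    ref_base = reference[idx]
    counts = Counter("ref" if c[idx] == ref_base else "var" for c in clones)
    return counts["ref"], counts["var"]

# Toy 12-nt fragment (hypothetical, not SARS-CoV sequence).
ref = "ATGCCATTGACA"
clones = ["ATGCCATTGACA", "ATGTCATTGACA", "ATGTCATTGACA"]

variants = call_variants(ref, clones[1], offset=100)
ratio = tally_site(ref, clones, position=103, offset=100)
print(variants)  # {103: ('C', 'T')}
print(ratio)     # (1, 2): one reference clone, two variant clones
```

Tallying reference and variant clones per site in this way corresponds to the reference-to-variant ratios reported for mixed sites.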

Results

With the designed six pairs of primers, all six overlapping S-gene fragments were amplified by nested RT-PCR from 29 RNA samples. However, most RNA samples initially included in the study, though positive for SARS-CoV with BNI primers, failed to simultaneously generate all six overlapping S-gene fragments and were excluded from further sequence analysis. Disintegration of the virus and low viral load in the raw samples likely accounted for these failures.
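The inclusion rule (a sample is analyzed only if all six overlapping fragments were amplified) can be sketched as a simple set test. The fragment names F1b to F6b come from the Methods; the sample names and amplification results below are hypothetical placeholders.

```python
REQUIRED_FRAGMENTS = {"F1b", "F2b", "F3b", "F4b", "F5b", "F6b"}

def samples_with_full_coverage(amplified):
    """Return the samples for which every required fragment was amplified."""
    return sorted(
        sample
        for sample, fragments in amplified.items()
        if REQUIRED_FRAGMENTS <= set(fragments)
    )

# Hypothetical amplification results, for illustration only.
amplified = {
    "sample_A": ["F1b", "F2b", "F3b", "F4b", "F5b", "F6b"],
    "sample_B": ["F1b", "F3b", "F6b"],
    "sample_C": ["F6b", "F5b", "F4b", "F3b", "F2b", "F1b"],
}

included = samples_with_full_coverage(amplified)
print(included)  # ['sample_A', 'sample_C']
```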

One hundred and thirteen sequence variations distributed in nine variant sites were identified in analyzed sequences that were compared to the reference BJ01 strain of SARS-CoV. BJ01 is an isolate from a tissue-culture propagated sample (16) and is used as reference strain in other studies (9,10). With the exception of one site (position 21702), other variant sites have not, to our knowledge, been documented in humans. Seven of nine variant sites were nonsynonymous. Figure 2 shows the identified variant sites compared to the reference sequence.

Figure 2. Variants identified from 29 full-length S genes of severe acute respiratory syndrome–associated coronavirus from 20 SARS patients in comparison with BJ01 strain (GenBank accession no. AY278488). The nucleotide positions are numbered according to the sequence of BJ01 strain. Nucleotide numbers start from the beginning of the genome, but the amino acid numbers start from the S protein. The filled arrows represent nonsynonymous mutations, and the hollow arrows represent synonymous ones. The occurrence indicates the frequency of the variant nucleotide at the given site among the 29 entire S genes analyzed.
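The distinction between synonymous and nonsynonymous substitutions used in the Results can be made concrete with a short sketch based on the standard genetic code. The codons in the example are hypothetical; they are not the codons at the variant sites reported here, and the function name is ours.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def classify_substitution(codon, pos_in_codon, new_base):
    """Label a single-base change within a codon.

    `pos_in_codon` is 0, 1, or 2. Returns (kind, amino acid before,
    amino acid after).
    """
    mutated = codon[:pos_in_codon] + new_base + codon[pos_in_codon + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "nonsynonymous"
    return kind, before, after

# Hypothetical codons, for illustration only.
silent = classify_substitution("GCA", 2, "G")    # GCA -> GCG, Ala -> Ala
changing = classify_substitution("AAA", 1, "G")  # AAA -> AGA, Lys -> Arg
print(silent)    # ('synonymous', 'A', 'A')
print(changing)  # ('nonsynonymous', 'K', 'R')
```

Applying this to real data would additionally require the reading frame of the S gene in BJ01 genome coordinates, which is not restated in this text.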

Discussion

We identified novel variant sites and the coexistence of sequences with and without S-gene substitutions in SARS-CoV. Theoretically, a replicating RNA virus expresses a range of genetic and phenotypic variants and has the potential to generate novel virions, which may be selected in response to environmental pressures. RNA viruses generally tolerate high levels of mutagenesis because of their limited genetic complexity (17). Mutations have the potential to be pathogenic (e.g., giving the virus resistance to neutralizing antibodies, cytotoxic T cells, or antiviral drugs [18–20]). The dynamics of error copying and sequence decomposition are time-dependent. In HIV infection, for example, one adaptive substitution in the env gene occurred every 3.3 months or 25 viral generations, averaging across patients (21).

In our study, a higher variation frequency in the S gene was identified for SARS-CoV compared to previous reports (7–10). This difference may be due to a broader sample collection covering a longer timespan of infection. In addition, because the virus was not passaged in cell culture, where reverse mutation can occur, the whole mutant repertoire is more likely to be detected. Our observation most likely reflects the real situation in vivo. Variations were unlikely to result from Taq polymerase errors, since we repeated the experiments for all variations, starting from independently prepared RNA and RT-PCR products, and in some cases used Platinum Pfx DNA polymerase, which has a high fidelity, to confirm the results. We could not exclude the possibility that some variations were from defective genomes. However, the fact that the variations remained detectable in the sequences from two or three specimens of the same patient, obtained at different times, suggested that these variations might be active and extensible in vivo.

Sequences with and without substitutions (relative to BJ01) were simultaneously detected in the sequences from seven samples, which suggests the existence of SARS-CoV quasispecies. Furthermore, S-gene sequences from different samples collected at different times from the same patient showed similar, but not exactly identical, variation profiles in four participants (patients 4, 5, 6, and 19 in Table 1); this implies that a dynamic mutational process may exist in vivo. Table 2 summarizes the variations occurring in 29 analyzed S-gene sequences from 20 individual SARS patients.

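Table 2. Variation in S-gene sequences from 20 individual SARS patients<sup>a,b</sup>

| Pt. no. | Samp. no. | 21494 C→T | 21702 A→G | 21858 A→T | 22908 A→G | 23198 T→C | 24018 A→T | 24247 T→C | 24469 A→G | 24540 A→G |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SW6 | – | – | – | – | – | – | – | – | – |
| 2 | SW17 | 9/2<sup>c</sup> | + | – | – | – | – | + | – | – |
| 3 | PL1 | 8/39 | 8/43 | 48/2 | + | + | + | + | + | + |
| 4 | PL10 | 14/7 | + | – | + | + | 2/8 | + | + | + |
| | PL17 | + | + | – | – | – | + | + | + | + |
| | SP4 | – | + | – | + | – | + | + | + | + |
| 5 | PL9 | + | + | – | + | + | + | + | + | + |
| | SP1 | – | + | – | + | + | – | + | + | + |
| 6 | PL5 | – | + | – | + | + | 8/4 | + | + | + |
| | SP9 | – | + | – | + | + | + | + | + | + |
| 7 | PL7 | – | + | – | + | + | 4/6 | + | + | + |
| 8 | PL8 | 7/28 | + | 33/2 | + | + | + | + | + | + |
| 9 | SP15 | – | + | – | – | – | – | – | – | – |
| | SP32 | – | + | – | – | – | – | – | – | – |
| 10 | SP6 | – | – | – | – | – | – | – | – | – |
| | SP62 | – | – | – | – | – | – | – | – | – |
| | SW73 | – | – | – | – | – | – | – | – | – |
| 11 | SP67 | – | + | – | – | – | – | – | – | – |
| 12 | SW76 | – | + | – | – | – | – | – | – | – |
| 13 | ST123 | – | + | – | – | – | – | – | – | – |
| 14 | PL57 | – | + | – | – | + | + | + | + | + |
| | SW77 | – | + | – | – | + | + | + | + | + |
| 15 | SP61 | – | + | – | – | – | – | – | – | – |
| 16 | PL59 | – | – | – | – | – | – | – | – | – |
| 17 | SP28 | – | + | – | – | – | – | – | – | – |
| 18 | SP43 | – | + | – | – | – | – | – | – | – |
| 19 | ST158 | – | + | – | + | + | + | + | + | + |
| | SP13 | 19/4 | 14/10 | 10/13 | + | + | + | 6/16 | + | 14/8 |
| 20 | SP8 | – | + | – | – | – | – | – | – | – |

<sup>a</sup>The results were determined by analysis of cloned sequences; + indicates that a nucleotide substitution was detected at the variant site, and – indicates that the nucleotide at the site was identical to that of the BJ01 reference sequence in all analyzed sequences.
<sup>b</sup>SARS, severe acute respiratory syndrome; SW, throat swab; PL, plasma; SP, sputum; ST, stool.
<sup>c</sup>The numbers represent the ratio of reference to variant nucleotide detected at the site from the analyzed cloned sequences.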
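As footnote c of Table 2 explains, entries such as 9/2 give the number of clones carrying the reference nucleotide followed by the number carrying the variant. A minimal sketch of converting such an entry into the fraction of variant clones follows; the function name is ours, and the two example entries are the values listed at position 21494 for specimens SW17 and SP13.

```python
from fractions import Fraction

def variant_fraction(cell):
    """Fraction of clones carrying the variant for a 'reference/variant' entry."""
    ref_count, var_count = (int(x) for x in cell.split("/"))
    return Fraction(var_count, ref_count + var_count)

sw17 = variant_fraction("9/2")   # specimen SW17, position 21494
sp13 = variant_fraction("19/4")  # specimen SP13, position 21494
print(sw17)  # 2/11
print(sp13)  # 4/23
```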

One nonsynonymous change, observed at position A1023G, lies within the heptad repeat (HR) domains, which are thought to be important for virus entry; a previous study of MHV suggested that such a change could have some effect on virus infection (22). At this stage, we cannot rule out the possibility that this change affects the biological behavior of the virus; further experiments are needed to address this question.

We observed the coexistence of S-gene sequences with and without substitutions and a time-dependent variation profile in some patients. These observations suggest the possible existence of SARS-CoV quasispecies in an acute infection. In this study, however, the limited collection of clinical samples and the difficulty of directly amplifying the full-length S gene from raw clinical samples prevented a more extensive study of the dynamic mutant distributions of the virus. In addition, the number of clones sequenced was constrained by the scale of the project, and this may have led to some minor variant sequences escaping analysis. Another factor possibly affecting the stability of the viral genome is the administration of the antiviral drug ribavirin. Ribavirin has been shown to enhance mutagenesis of RNA viruses (23). Therefore, the artificial effect of ribavirin on the SARS-CoV mutant spectrum remains to be clarified.

The genetic variation of SARS-CoV remains limited in relation to many other RNA viruses such as HIV-1, HCV, and MHV. The probable reason is that SARS-CoV only causes an acute, self-limited infection, which may prevent persistent long-term mutant development in vivo as occurs in chronic RNA viral infections. Notably, some modules in the S protein remain conserved, e.g., the HR domains, which are important for fusion. Although some variations may predict changes in functional features of the protein, no obvious correlation exists between mutation and clinical disease manifestation in the limited data reported here. Instead, the variation profile was closely correlated with epidemiologic history; e.g., patients 3–8 were infected in one hospital.

In conclusion, we report here some new variant sites in the S gene of SARS-CoV and the possible existence of SARS-CoV quasispecies in some patients, though in limited numbers. This knowledge furthers our understanding of this emerging virus.

Suggested citation for this article: Xu D, Zhang Z, Chu F, Li Y, Jin L, Zhang L, et al. Genetic variation of SARS coronavirus in Beijing hospital. Emerg Infect Dis [serial on the Internet]. 2004 May [date cited]. Available from: http://www.cdc.gov/ncidod/EID/vol10no5/03-0875.htm

Acknowledgments

We thank K.Y. Yuen for his valuable suggestions for this project; and Panyong Mao and Yuanli Mao for providing some of the samples.

The work was supported by Beijing Natural Science Foundation (Number: 7034051), Emergent Foundation for SARS Treatment and Prevention (Number: 03F017), and in part by Sino-UK Collaboration Foundation for SARS Immunopathogenesis Study (Number: H030230100130).

Dr. Xu is an associate professor of medicine at the Beijing Institute of Infectious Diseases. His work focuses on cancer gene therapy, medical viral molecular biology, and immunology.

References

1. Drosten C, Gunther S, Preiser W, van der Werf S, Brodt HR, Becker S, et al. Identification of a novel coronavirus in patients with severe acute respiratory syndrome. N Engl J Med. 2003;348:1967–76. doi:10.1056/NEJMoa030747. PMID: 12690091
2. Ksiazek TG, Erdman D, Goldsmith CS, Zaki SR, Peret T, Emery S, et al. A novel coronavirus associated with severe acute respiratory syndrome. N Engl J Med. 2003;348:1953–66. doi:10.1056/NEJMoa030781. PMID: 12690092
3. Marra MA, Jones SJ, Astell CR, Holt RA, Brooks-Wilson A, Butterfield YS, et al. The Genome sequence of the SARS-associated coronavirus. Science. 2003;300:1399–404. doi:10.1126/science.1085953. PMID: 12730501
4. Liang WN, Mi J, Information Branch Joint Leadership Group of SARS Prevention and Control in Beijing. Epidemiological features of severe acute respiratory syndrome in Beijing. Zhonghua Liu Xing Bing Xue Za Zhi. 2003;24:1096–9. PMID: 14761623
5. Domingo E. Quasispecies and the development of new antiviral strategies. Prog Drug Res. 2003;60:133–58. PMID: 12790341
6. Adami C, Pooley J, Glomb J, Stecker E, Fazal F, Fleming JO, et al. Evolution of mouse hepatitis virus (MHV) during chronic infection: quasispecies nature of the persisting MHV RNA. Virology. 1995;209:337–46. doi:10.1006/viro.1995.1265. PMID: 7778268
7. Ruan YJ, Wei CL, Ee AL, Vega VB, Thoreau H, Su ST, et al. Comparative full-length genome sequence analysis of 14 SARS coronavirus isolates and common mutations associated with putative origins of infection. Lancet. 2003;361:1779–85. doi:10.1016/S0140-6736(03)13414-9. PMID: 12781537
8. Tsui SK, Chim SS, Lo YM. Coronavirus genomic-sequence variations and the epidemiology of the severe acute respiratory syndrome. N Engl J Med. 2003;349:187–8. doi:10.1056/NEJM200307103490216. PMID: 12853594
9. Hu LD, Zheng GY, Jiang HS, Xia Y, Zhang Y, Kong XY. Mutation analysis of 20 SARS virus genome sequences: evidence for negative selection in replicase ORF1b and spike gene. Acta Pharmacol Sin. 2003;24:741–5. PMID: 12904271
10. Li L, Wang Z, Lu Y, Bao Q, Chen S, Wu N, et al. Severe acute respiratory syndrome–associated coronavirus genotype and its characterization. Chin Med J (Engl). 2003;116:1288–92. PMID: 14527350
11. Bukh J, Miller RH, Purcell RH. Genetic heterogeneity of hepatitis C virus: quasispecies and genotypes. Semin Liver Dis. 1995;15:41–63. doi:10.1055/s-2007-1007262. PMID: 7597443
12. Wain-Hobson S. Human immunodeficiency virus type 1 quasispecies in vivo and ex vivo. Curr Top Microbiol Immunol. 1992;176:181–93. PMID: 1600752
13. Zhou XZ, Zhao M, Wang FS, Jiang TJ, Li YG, Nie WM, et al. Epidemiologic features, clinical diagnosis and therapy of first cluster of patients with severe acute respiratory syndrome in Beijing area. Zhonghua Yi Xue Za Zhi. 2003;83:1018–22. PMID: 12899773
14. Wang FS, Xu D. Study of characteristics and pathogenic mechanism of SARS coronavirus. Infect Dis Inform. 2003;16:67–8.
15. PCR primers for SARS developed by WHO network laboratories [monograph on the Internet]. Geneva: World Health Organization. 2003 Apr 17. Available from: http://www.who.int/csr/sars/primers/en/
16. Qin ED, Zhu QY, Yu M, Fan B, Chang G, Si B, et al. A complete sequence and comparative analysis of a SARS-associated virus (isolate BJ01). Chin Sci Bull. 2003;48:941–8.
17. Domingo E, Baranowski E, Ruiz-Jarabo CM, Martin-Hernandez AM, Saiz JC, Escarmis C. Quasispecies structure and persistence of RNA viruses. Emerg Infect Dis. 1998;4:521–7. doi:10.3201/eid0404.980402. PMID: 9866728
18. Maekawa S, Enomoto N, Kurosaki M, Nagayama K, Marumo F, Sato C. Genetic changes in the interferon sensitivity determining region of hepatitis C virus during the natural course of chronic hepatitis C. J Med Virol. 2000;61:303–10. doi:10.1002/1096-9071(200007)61:3<303::AID-JMV4>3.0.CO;2-F. PMID: 10861637
19. Yang OO, Sarkis PT, Ali A, Harlow JD, Brander C, Kalams SA, et al. Determinant of HIV-1 mutational escape from cytotoxic T lymphocytes. J Exp Med. 2003;197:1365–75. doi:10.1084/jem.20022138. PMID: 12743169
20. Pewe L, Wu GF, Barnett EM, Castro RF, Perlman S. Cytotoxic T cell-resistant variants are selected in a virus-induced demyelinating disease. Immunity. 1996;5:253–62. doi:10.1016/S1074-7613(00)80320-9. PMID: 8808680
21. Williamson S. Adaptation in the env gene of HIV-1 and evolutionary theories of disease progression. Mol Biol Evol. 2003;20:1318–25. doi:10.1093/molbev/msg144. PMID: 12777505
22. Luo Z, Weiss SR. Roles in cell-to-cell fusion of two conserved hydrophobic regions in the murine coronavirus spike protein. Virology. 1998;244:483–94. doi:10.1006/viro.1998.9121. PMID: 9601516
23. Crotty S, Maag D, Arnold JJ, Zhong W, Lau JY, Hong Z, et al. The broad-spectrum antiviral ribonucleoside ribavirin is an RNA virus mutagen. Nat Med. 2000;6:1375–9. doi:10.1038/82191. PMID: 11100123