Full Genome of Influenza A (H7N9) Virus Derived by Direct Sequencing without Culture

Xianwen Ren,1 Fan Yang,1 Yongfeng Hu,1 Ting Zhang,1 Liguo Liu,1 Jie Dong, Lilian Sun, Yafang Zhu, Yan Xiao, Li Li, Jian Yang, Jianwei Wang, Qi Jin

MOH Key Laboratory of Systems Biology of Pathogens, Beijing, China

Address for correspondence: Qi Jin, No. 6 St Rongjing East, BDA, Beijing, 100176, People’s Republic of China; email: zdsys@vip.sina.com

Dispatch. Emerging Infectious Diseases (Emerg Infect Dis). Centers for Disease Control and Prevention. 2013 Nov;19(11):1881–1884. ISSN 1080-6040, 1080-6059. doi:10.3201/eid1911.130664. PMID 24206919; PMCID PMC3837655.

An epidemic caused by influenza A (H7N9) virus was recently reported in China. Deep sequencing revealed the full genome of the virus obtained directly from a patient’s sputum without virus culture. The full genome showed substantial sequence heterogeneity and large differences compared with that from embryonated chicken eggs.

Keywords: H7N9, influenza A virus, deep sequencing, direct sequencing, culture-free, avian-origin, influenza, viruses

Recently, a novel influenza A (H7N9) virus infected humans in China (1,2), leading to great concerns about its threat to public health (3). However, almost all the current genomes of the novel subtype H7N9 virus have been sequenced after culture in embryonated chicken eggs or mammalian cells. Switching the evolutionary selection pressure from in vivo human respiratory tract to embryonated chicken eggs might introduce mutations into the final genome sequences during culture (4). We report determination of the full genome of the influenza A (H7N9) virus derived directly by deep sequencing, without virus culture, from a sputum specimen of an infected human. Deep sequencing provides a direct way to evaluate the genome characteristics and potential virulence and transmissibility of the novel influenza A (H7N9) virus.

The Study

We collected a sputum specimen from a 54-year-old woman with fever, cough, sputum production, and pneumonia. Influenza A (H7N9) virus was detected in the specimen by specific real-time reverse transcription PCR (RT-PCR). The specimen was then processed with a viral particle–protected nucleic acid purification method (5). Total RNA was extracted, amplified by sequence-independent PCR (5), and sequenced with an Illumina/Solexa GAII sequencer (Illumina, San Diego, CA, USA). After the sequence adapters and RT-PCR primers were filtered out, the 80-base reads were aligned directly, without prior assembly, to the influenza A virus nucleotide sequences in the National Center for Biotechnology Information nonredundant nucleotide database by using the blastn program in the BLAST (6) software package, version 2.2.22 (www.ncbi.nlm.nih.gov/blast), with parameters -e 1e-5 -F T (-e 1e-5 to select highly similar reads and -F T to mask low-complexity sequence). In total, 19,177 reads aligned to influenza A viruses.

We then conducted a reference-guided assembly of the 19,177 reads by using the SeqMan program in the DNAStar software package, version 7.1 (http://www.dnastar.com). The novel influenza A (H7N9) virus A/Anhui/1/2013 was selected as the reference. With 80% minimum sequence similarity and a 12-bp minimum match size, the 19,177 reads were assembled into 439 contigs. The 8 contigs covered by the most reads corresponded to the 8 genome segments of the novel influenza A (H7N9) virus. The other contigs did not align to the reference virus and might have resulted from sequencing or assembly errors. By calculating the consensus sequence, we obtained the genome of the influenza A (H7N9) virus directly from the sputum specimen of this patient. Further RT-PCR and Sanger sequencing confirmed the quality of the assembled subtype H7N9 virus genome. Sequences were deposited in GenBank under accession nos. KF226105–KF226120 and KF278742–KF278749.
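The consensus step can be illustrated with a minimal Python sketch. It assumes ungapped reads already placed on the reference and a simple majority vote per position; the function name `consensus` and the toy reads are hypothetical, and the sketch does not reproduce the consensus algorithm used by the assembly software.

```python
from collections import Counter

def consensus(reference_length, aligned_reads):
    """Majority-vote consensus from ungapped reads.

    aligned_reads: iterable of (start, sequence) pairs, where start is the
    0-based reference position of the first base of the read (an assumption
    made for this sketch).
    """
    columns = [Counter() for _ in range(reference_length)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < reference_length:
                columns[pos][base] += 1
    # Positions without any read are reported as N.
    return "".join(
        col.most_common(1)[0][0] if col else "N" for col in columns
    )

# Toy data, not taken from the specimen.
toy_reads = [(0, "ACGT"), (2, "GTTA"), (2, "GATA"), (3, "TTAC")]
print(consensus(8, toy_reads))
```

A production pipeline would also handle gaps, base qualities, and ties between equally frequent bases, which this sketch ignores.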

The influenza A (H7N9) genome that we report varies from that obtained by Sanger sequencing after passage in the allantoic sac and amniotic cavity of 9–11-day-old specific pathogen–free embryonated chicken eggs for 48–72 hours at 35°C (Table 1). In the nucleocapsid protein (NP) segment, 15 point mutations were found; 13 were synonymous and 2 induced amino acid changes S321N and M371I. In the nonstructural (NS) protein segment, 5 point mutations were found; all caused amino acid changes R59H, P107L, and V111Q. In the polymerase acidic (PA) protein segment, 3 point mutations were found, 1 of which caused amino acid change V707F. In the polymerase basic 1 (PB1) protein segment, 2 point mutations were found, both of which were synonymous. In the PB2 segment, 2 point mutations were found, 1 of which caused amino acid change S534F.

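Table 1. Mutations of directly sequenced influenza A (H7N9) virus compared with virus obtained from chicken egg culture*

| Gene | Position | Direct sequencing | Chicken egg culture | Amino acid change |
|---|---|---|---|---|
| PB2 | 18 | A | G | Synonymous |
| PB2 | 1601 | C | T | S534F |
| PB1 | 303 | G | A | Synonymous |
| PB1 | 825 | G | A | Synonymous |
| PB1 | 2274 | A | G | Synonymous |
| PA | 2115 | G | T | Synonymous |
| PA | 2119 | G | T | V707F |
| PA | 2127 | A | C | Synonymous |
| NS | 176 | G | A | R59H in NS1 |
| NS | 792 | C | T | P107L in NS2 |
| NS | 803 | G | C | Synonymous |
| NS | 804 | T | A | Synonymous |
| NS | 805 | C | A | V111Q in NS2 |
| NP | 387 | A | T | Synonymous |
| NP | 438 | T | C | Synonymous |
| NP | 480 | T | C | Synonymous |
| NP | 648 | A | G | Synonymous |
| NP | 657 | T | C | Synonymous |
| NP | 663 | A | G | Synonymous |
| NP | 892 | T | C | Synonymous |
| NP | 962 | G | A | S321N |
| NP | 982 | C | T | Synonymous |
| NP | 1086 | G | A | Synonymous |
| NP | 1113 | G | A | M371I |
| NP | 1200 | G | A | Synonymous |
| NP | 1251 | C | T | Synonymous |
| NP | 1257 | T | C | Synonymous |
| NP | 1440 | T | C | Synonymous |

*PB, polymerase basic; PA, polymerase acidic; NS, nonstructural; NP, nucleocapsid protein.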
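The amino acid labels in Table 1 follow from translating the affected codon before and after the substitution. The Python sketch below shows one way to do this with the standard genetic code. It assumes positions are counted from the first nucleotide of an unspliced coding sequence, so it does not apply to the spliced NS2 product. The codons in the examples are hypothetical: only the substituted base and the resulting amino acids come from the table, and the remaining bases were chosen to be consistent with them.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def codon_number(nt_position):
    """1-based codon index for a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1

def classify(nt_position, codon_a, codon_b):
    """Label a substitution as 'Synonymous' or as a change such as 'S321N'."""
    aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
    if aa_a == aa_b:
        return "Synonymous"
    return f"{aa_a}{codon_number(nt_position)}{aa_b}"

# Hypothetical codons consistent with Table 1 entries.
print(classify(962, "AGC", "AAC"))   # NP, G to A at the second codon position
print(classify(1113, "ATG", "ATA"))  # NP, G to A at the third codon position
print(classify(18, "GCA", "GCG"))    # PB2, A to G at the third codon position
```

The position arithmetic agrees with the table: nucleotide 1601 falls in codon 534 (PB2), 2119 in codon 707 (PA), and 176 in codon 59 (NS1).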

The influenza A (H7N9) genome also showed substantial intraspecimen heterogeneity. Deep sequencing revealed that the average coverage (ratio of the total number of nucleotides of all reads to the length of the reference gene) varied widely among the 8 genes. Average coverage (± SD) was highest for neuraminidase (NA) (131.94 ± 30.25) and second highest for NP (130.41 ± 27.01). The average coverages of PB2, PB1, PA, matrix protein, and hemagglutinin were 99.89 (± 22.49), 95.35 (± 21.34), 43.35 (± 14.13), 53.73 (± 17.67), and 69.82 (± 19.02), respectively. Average coverage was lowest for NS (27.73 ± 11.31).
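The coverage definition given above can be sketched in Python as follows. When every read lies entirely within the gene, the mean per-position depth equals the ratio of total read nucleotides to gene length. How the SD was computed is not stated in the text; this sketch assumes it is the population SD of per-position depth. The function names and the toy read positions are hypothetical.

```python
from statistics import mean, pstdev

def per_base_depth(gene_length, read_starts, read_length=80):
    """Number of reads covering each position (0-based read starts)."""
    depth = [0] * gene_length
    for start in read_starts:
        for pos in range(max(start, 0), min(start + read_length, gene_length)):
            depth[pos] += 1
    return depth

def average_coverage(gene_length, read_starts, read_length=80):
    """Return (mean depth, SD of depth) across the gene."""
    depth = per_base_depth(gene_length, read_starts, read_length)
    return mean(depth), pstdev(depth)

# Toy example: three 80-base reads on a 200-nt gene.
avg, sd = average_coverage(200, [0, 40, 120])
print(avg, sd)
```

In the toy example the three reads contribute 240 nucleotides over 200 positions, so the mean depth matches the ratio 240/200.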

In addition to uneven gene abundance, the genome sequence of influenza A (H7N9) virus itself was heterogeneous (heterozygous peak threshold 80%). In total, 22 positions were confirmed by PCR and Sanger sequencing to be heterogeneous (Table 2). In the NP segment, 4 positions demonstrated heterogeneity; 3 were synonymous and 1 induced amino acid change E421K. In the NS segment, 3 positions demonstrated heterogeneity; 2 were synonymous and 1 induced amino acid change R140W. In the hemagglutinin segment, 7 positions demonstrated heterogeneity; 6 were synonymous and 1 induced amino acid change H242Y. In NA, 3 positions demonstrated heterogeneity; 2 induced amino acid changes (S92L and S108L) and 1 was synonymous. In the PA segment, 2 positions demonstrated heterogeneity; both were synonymous. In the PB2 segment, 3 positions demonstrated heterogeneity; all were nonsynonymous (S532L, S533L, and S534F). Only 1 of these heterogeneous sites overlapped with the mutation sites found after passage in embryonated chicken eggs.
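One way to screen deep-sequencing data for candidate heterogeneous positions is sketched below in Python. The text gives an 80% threshold without defining how it was applied, so this sketch makes an explicit assumption: a position is flagged when the most frequent base accounts for less than 80% of the reads covering it. The function name `heterogeneous_sites` and the toy columns are hypothetical, and flagged sites would still need confirmation, as was done here by PCR and Sanger sequencing.

```python
from collections import Counter

def heterogeneous_sites(columns, threshold=0.80):
    """Flag positions whose major base frequency is below the threshold.

    columns: one string per position holding the bases observed in the
    reads covering that position (an assumed input format).
    Returns a list of (1-based position, {base: count}) pairs.
    """
    sites = []
    for position, bases in enumerate(columns, start=1):
        counts = Counter(bases)
        total = sum(counts.values())
        if total == 0:
            continue  # no coverage, nothing to call
        major_count = counts.most_common(1)[0][1]
        if major_count / total < threshold:
            sites.append((position, dict(counts)))
    return sites

# Toy pileup, not taken from the specimen.
toy_columns = ["CCCCCCCCCC", "CCCCCCTTTT", "GGGGGGGGGA"]
print(heterogeneous_sites(toy_columns))
```

Under this assumption, the second toy position (6 C and 4 T) is flagged, whereas the third (9 G and 1 A) is treated as a likely sequencing error.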

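Table 2. Heterogeneous genomic positions of directly sequenced influenza A (H7N9) virus and its protein differences from other viruses*

| Protein | Heterogeneity (nucleotide position in gene sequence: nucleotides)† | Amino acid position in protein sequence‡ | Direct sequencing§ | Culture§# | A/Anhui/1/2013§ | Consensus of isolates from humans§** |
|---|---|---|---|---|---|---|
| HA | 330: C>T | 110 | F | F | F | F |
| HA | 360: C>T | 120 | L | L | L | L |
| HA | 696: T>A | 232 | V | V | V | V |
| HA | 724: C>T | 242 | H242Y | H | H | H |
| HA | 762: C>T | 254 | F | F | F | F |
| HA | 780: C>T | 260 | F | F | F | F |
| HA | 1441: C>T | 481 | H | H | H | H |
| HA | | 65 | M | M | R | R |
| M2 | | 10 | L | L | P | P |
| M2 | | 24 | D | D | E | E |
| NA | 275: C>T | 92 | S92L | S | S | S |
| NA | 323: C>T | 108 | S108L | S | S | S |
| NA | 408: C>T | 136 | I | I | I | I |
| NA | | 40 | S | S | G | G |
| NA | | 300 | V | V | I | I |
| NA | | 340 | I | I | N | N |
| NP | | 321 | S | N | N | N |
| NP | 858: C>T | 286 | A | A | A | A |
| NP | 583: G>A | 195 | R | R | R | R |
| NP | 1261: G = A | 421 | E421K | E | E | E |
| NP | 1260: C = T | 420 | F | F | F | F |
| NP | | 371 | M | I | M | M |
| NS2 | 546: C>>T | 55 | L | L | L | L |
| NS2 | 543: C>>T | 54 | D | D | D | D |
| NS1 | 418: C>>T | 140 | R140W | R | R | R |
| NS1 | | 59 | R | H | R | R |
| NS2 | | 107 | P | L | L | L |
| NS2 | | 111 | V | Q | Q | Q |
| PA | 174: C>A | 58 | G | G | G | G |
| PA | 1305: C>T | 435 | I | I | I | I |
| PA | | 618 | K | K | T | T |
| PA | | 707 | V | F | F | F |
| PB1 | | 200 | I | I | V | V |
| PB1 | | 368 | I | I | V | V |
| PB1 | | 454 | L | L | P | P |
| PB1 | | 637 | V | V | I | I |
| PB1-F2 | | 42 | C | C | Y | Y |
| PB1-F2 | | 51 | T | T | M | M |
| PB1-F2 | | 70 | G | E | G | G |
| PB1-F2 | | 77 | L | L | S | S |
| PB2 | | 534 | S | F | S | S |
| PB2 | | 591 | K | K | Q | Q |
| PB2 | | 627 | E | E | K | K |

*HA, hemagglutinin; M, matrix; NA, neuraminidase; NP, nucleocapsid protein; NS, nonstructural; PA, polymerase acidic; PB, polymerase basic.
†Position of first nucleotide = 1.
‡Position of first amino acid = 1.
§Types of amino acids.
#Virus cultured in chicken eggs.
**Thirteen influenza A (H7N9) viruses isolated from humans; data from Global Initiative on Sharing All Influenza Data.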

Compared with the reference influenza A (H7N9) virus strain A/Anhui/1/2013, the directly sequenced virus showed prominent sequence differences (Table 2). In particular, the amino acid at position 627 of PB2 is K in A/Anhui/1/2013 but E in the directly sequenced genome, and the amino acid at position 368 of PB1 is V in A/Anhui/1/2013 but I in the directly sequenced genome. The E627K mutation in PB2 and the I368V mutation in PB1 are closely associated with the virulence and transmissibility of avian influenza A virus in mammals (1). E627K in PB2 was observed in A/Shanghai/1/2013, A/Shanghai/2/2013, and A/Anhui/1/2013 viruses (1). A/Zhejiang/DTID-ZJU01/2013 virus does not have this mutation but has a complementary mutation, D701N, in PB2 (2). I368V in PB1 was observed in A/Shanghai/2/2013 and A/Anhui/1/2013 viruses, but A/Shanghai/1/2013 virus does not have this mutation (1).

MEGA5.0 (www.megasoftware.net) was used to construct the phylogenetic trees on the basis of the nucleotide sequences of all influenza A (H7N9) viruses in the Global Initiative on Sharing All Influenza Data (GISAID) database (7). We conducted 2 rounds of phylogenetic analysis. First, to examine whether this subtype H7N9 virus is clustered with the available subtype H7N9 strains, we included all influenza A (H7N9) viruses in the GISAID database. To construct the multiple sequence alignment, we used the MUSCLE package with default parameters (www.megasoftware.net/); then, to construct the phylogenetic trees with 1,000 bootstrap replicates, we used the minimum-evolution method. Results suggested that all 8 genome segments are closely related to the available influenza A (H7N9) virus strains.

We next restricted the analysis to influenza A (H7N9) viruses isolated in China in 2013 to investigate more closely the relationships between this virus and the available subtype H7N9 genomes isolated during the epidemic. However, the phylogenetic topologies based on different gene segments were not consistent (Figures 1, 2; Technical Appendix Figures 1–6), suggesting that the influenza A (H7N9) virus may have been evolving for some time (8).

Figure 1. Phylogenetic tree of the influenza A (H7N9) viruses isolated in China in 2013, based on the hemagglutinin gene segment. Scale bar indicates nucleotide differences per unit length.

Figure 2. Phylogenetic tree of the influenza A (H7N9) viruses isolated in China in 2013, based on the neuraminidase gene segment. Scale bar indicates nucleotide differences per unit length.

Conclusion

Using deep sequencing technologies, we derived the full-length genome of the novel influenza A (H7N9) virus directly from the sputum specimen of a patient, without conducting virus culture. The full genome revealed substantial sequence heterogeneity within the specimen, obvious sequence variations from that obtained from embryonated chicken eggs, and prominent differences from the available influenza A (H7N9) strains, most of which were sequenced after culture.

Technical Appendix

Phylogenetic trees of the influenza A (H7N9) viruses isolated in China in 2013, based on gene segments.

Suggested citation for this article: Ren X, Yang F, Hu Y, Zhang T, Liu L, Dong J, et al. Full genome of influenza A (H7N9) virus derived by direct sequencing without culture. Emerg Infect Dis [Internet]. 2013 Nov [date cited]. http://dx.doi.org/10.3201/eid1911.130664

1These authors contributed equally to this article.

Acknowledgment

We acknowledge those who contributed to the generation of the genome sequences of influenza A (H7N9) viruses in GISAID, on which this research is based.

This work was supported by the National S&T Major Project, “China Mega-Project for Infectious Disease” (grant No. 2013ZX10004101).

Dr Ren is an assistant professor at the Institute of Pathogen Biology, Chinese Academy of Medical Sciences and Peking Union Medical College. His research focuses on the bioinformatics and computational biological questions of pathogens.

References

1. Gao R, Cao B, Hu Y, Feng Z, Wang D, Hu W, et al. Human infection with a novel avian-origin influenza A (H7N9) virus. N Engl J Med. 2013;368:1888–9. doi:10.1056/NEJMoa1304459. PMID 23577628
2. Chen Y, Liang W, Yang S, Wu N, Gao H, Sheng J, et al. Human infections with the emerging avian influenza A H7N9 virus from wet market poultry: clinical analysis and characterisation of viral genome. Lancet. 2013;381:1916–25. doi:10.1016/S0140-6736(13)60903-4. PMID 23623390
3. Uyeki TM, Cox NJ. Global concerns regarding novel influenza A (H7N9) virus infections. N Engl J Med. 2013;368:1862–4. doi:10.1056/NEJMp1304661. PMID 23577629
4. de Jong MD, Tran TT, Truong HK, Vo MH, Smith GJ, Nguyen VC, et al. Oseltamivir resistance during treatment of influenza A (H5N1) infection. N Engl J Med. 2005;353:2667–72. doi:10.1056/NEJMoa054512. PMID 16371632
5. Wu Z, Ren X, Yang L, Hu Y, Yang J, He G, et al. Virome analysis for identification of novel mammalian viruses in bat species from Chinese provinces. J Virol. 2012;86:10999–1012. doi:10.1128/JVI.01394-12. PMID 22855479
6. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402. doi:10.1093/nar/25.17.3389. PMID 9254694
7. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9. doi:10.1093/molbev/msr121. PMID 21546353
8. Koopmans M, de Jong MD. Avian influenza A H7N9 in Zhejiang, China. Lancet. 2013;381:1882–3. doi:10.1016/S0140-6736(13)60936-8. PMID 23628442