Computational Analysis Suggests That Lyssavirus Glycoprotein Gene Plays a Minor Role in Viral Adaptation

Research Article

Kevin Tang (1) and Xianfu Wu (2,*)

(1) BCFB, DSR, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
(2) Rabies, PRB, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
* Xianfu Wu: xwu@cdc.gov

International Journal of Evolutionary Biology (Int J Evol Biol), ISSN 2090-052X, SAGE-Hindawi Access to Research. PMID: 21350634. PMCID: 3039477. DOI: 10.4061/2011/143498.

Academic Editor: Hiromi Nishida

Received 15 October 2010; Revised 15 December 2010; Accepted 3 January 2011; Published online 6 February 2011. Volume 2011, Article ID 143498.

Copyright © 2011 K. Tang and X. Wu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The Lyssavirus glycoprotein (G) is a membrane protein responsible for virus entry and protective immune responses. To explore possible roles of the glycoprotein in host shift or adaptation of Lyssavirus, we retrieved 53 full-length glycoprotein gene sequences from NCBI GenBank. The sequences came from isolates from different hosts, collected over a period of 70 years in 21 countries. Computational analyses detected 1 recombinant (AY987478, a dog isolate of CHAND03, genotype 1 in India) with incongruent phylogenetic support. No recombination was detected when AY987478 was excluded from the analyses. We applied different selection models to identify selection pressure on the glycoprotein gene. One codon, at amino acid residue 483, was found to be under weak positive selection with a marginal posterior probability of 95% by the maximum likelihood method. We found no significant evidence of positive selection on any site of the glycoprotein gene when the putative recombinant AY987478 was excluded. The computational analyses suggest that the G gene has been under purifying selection and that the evolution of the G gene may not play a significant role in Lyssavirus adaptation.

1. Introduction

Positive selection and recombination are important mechanisms in microbial pathogen adaptation to new hosts, resistance to antibiotics, and evasion of immune responses [1]. RNA viruses have high mutation rates because RNA replicases and reverse transcriptases lack both proofreading and postreplicative repair activities [2], which helps RNA viruses adapt to a changing environment. Recombination is a general phenomenon in evolution and plays a significant role in viral fitness [3, 4]. Rabies virus is a single-stranded, negative-sense RNA virus belonging to the order Mononegavirales, family Rhabdoviridae, genus Lyssavirus, which causes rabies in all warm-blooded mammals. Host shift and spillover events are frequently reported in rabies [5–9]. The nucleotide substitution rate of lyssaviruses is estimated to be around 10^-4 per site per year [7]. The RNA-dependent RNA polymerase (RdRp or L), together with the phosphoprotein (P), functions as the transcriptase and replicase complex. The glycoprotein (G) is the only outer membrane protein responsible for virus entry and for inducing protective immune responses [10, 11]. The role of the G gene in rabies spillover, host shift, and adaptation has not been analyzed thoroughly. Such information could help in understanding viral pathogenesis and in developing a vaccine against a broad spectrum of lyssavirus infections.

Here, we used newly developed computational algorithms as well as traditional methods to investigate potential recombination events and selection pressures in the G gene of lyssaviruses. The dataset for the study comprised 53 full-length glycoprotein gene sequences isolated from different hosts in 21 countries over a period of 70 years. We hypothesized that if decades of rabies infection in different hosts did not lead to positive selection or recombination events in the G gene, then the gene does not play a significant role in lyssavirus adaptation.

2. Methods

2.1. Dataset

We chose a dataset that covers lyssavirus isolates from a wide geographic range, a long period of time, and various animal hosts. Fifty-three full-length G sequences from 21 countries, isolated over a period of 70 years, were retrieved from NCBI GenBank. The sequences were aligned using fast statistical alignment (FSA, [12]). Briefly, FSA is a probabilistic multiple-sequence alignment algorithm that uses a “distance-based” approach to aligning homologous protein, RNA, or DNA sequences. It produces superior alignments of homologous sequences that are subject to very different evolutionary constraints. The nucleotide (nt) sequence alignment of the lyssavirus G genes was corrected manually by visual inspection against the amino acid sequence alignment. Gaps were removed if they existed in the majority of the sequences.
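The gap-removal rule can be illustrated with a minimal Python sketch. It assumes that "majority" means a gap in more than half of the sequences at a given alignment column; the function name `strip_gap_columns`, the 0.5 threshold, and the toy sequences are hypothetical and are not taken from the study.

```python
def strip_gap_columns(alignment, gap="-", threshold=0.5):
    """Drop alignment columns where more than `threshold` of sequences have a gap.

    `alignment` maps sequence names to aligned strings of equal length.
    The 0.5 threshold is an assumed reading of "majority of the sequences".
    """
    if not alignment:
        return {}
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column only if the gap fraction does not exceed the threshold.
    keep = [
        i for i in range(length)
        if sum(alignment[n][i] == gap for n in names) / len(names) <= threshold
    ]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


# Toy alignment (hypothetical): column 4 is a gap in 2 of 3 sequences.
toy = {
    "seqA": "ATG-CA",
    "seqB": "ATG-CA",
    "seqC": "ATGTC-",
}
stripped = strip_gap_columns(toy)
```

In this toy case the fourth column is dropped because two of three sequences carry a gap there, while the last column is kept because only one sequence does.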

2.2. Phylogenetic Analyses

A phylogenetic tree was reconstructed by using the neighbor joining algorithm in the MEGA 4 package [13]. The maximum composite likelihood model was used as well as the pairwise deletion option for gaps. The statistical significance of the phylogeny was measured by bootstrap with 1,000 replicates.

2.3. Recombination Detection

We first applied the PHI [14], NSS [15], and Max χ2 [16] tests (implemented in PhiPack [14]) with 1,000 permutations to detect recombination. Sequences involved in the recombination and the breakpoints were determined by using 3SEQ [17] and GARD as implemented in the Datamonkey web interface [18, 19]. The recombination was further verified by bootscanning and phylogenetic incongruence analysis. Bootscanning was performed using SimPlot software version 3.5.1 [20]. The bootscanning parameters were as follows: window size, 200 bp; step, 10 bp; GapStrip, on; bootstrap replicates, 1,000; distance model, Kimura (2-parameter); tree algorithm, neighbor-joining.
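To make the window and step settings concrete, the following Python sketch enumerates the sliding windows that a 200 bp window with a 10 bp step would produce. It is only an illustration of the windowing arithmetic, not SimPlot's implementation; the function name `sliding_windows` is hypothetical, and the alignment length of 1572 nt is an assumption taken from the coordinate range given in the tree figure caption.

```python
def sliding_windows(alignment_length, window=200, step=10):
    """Return 1-based inclusive (start, end) coordinates of full-size windows."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if window > alignment_length:
        return []
    return [
        (start + 1, start + window)
        for start in range(0, alignment_length - window + 1, step)
    ]


# Assumed alignment length of 1572 nt (hypothetical for this sketch).
windows = sliding_windows(1572)
first_window = windows[0]
last_window = windows[-1]
n_windows = len(windows)
```

Under these assumptions, each window would be used to build one bootstrapped neighbor-joining tree, so the step size controls how finely a switch in clustering (a candidate breakpoint) can be localized.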

2.4. Selection Analyses

To test for positive selection on sites of the G gene in lyssaviruses, the Codeml program in the PAML software package version 4.4 was employed [21]. Codeml implements the maximum likelihood method to test whether positive selection has taken place at sites within a gene. This method uses different codon substitution models to estimate the number of nonsynonymous (dN) and synonymous (dS) substitutions per site among codons, since different amino acids in a protein could be under different selective pressures and thus have different ω (dN/dS) ratios. The models used in our dataset analyses were M0 (one-ratio), M1 (nearly neutral), M2 (positive selection), M7 (β distribution), and M8 (β + ω > 1) [22]. The M0 model estimates an overall ω for the data. The M1 model estimates the proportion p0 of codon sites with ω0 < 1 and the proportion p1 (p1 = 1 − p0) with ω1 = 1. The M2 model allows an additional class of positively selected sites with proportion p2 (p2 = 1 − p0 − p1) and with ω2 estimated from the data. The M7 model specifies that ω follows a beta distribution, with the value of ω allowed to vary between 0 and 1. Parameters p and q of the beta distribution are estimated from the data in the M7 model. In the M8 model, a proportion p0 of sites has ω drawn from the beta distribution, and the remaining proportion p1 (p1 = 1 − p0) of sites is assumed to be positively selected. Two sets of comparisons (M2 versus M1, M8 versus M7) were made to test the hypothesis of selection. Within each comparison, the likelihood ratio test statistic used to determine the level of significance was calculated as twice the difference of the log-likelihood scores (2Δl) estimated by the two models. The significance was determined under the χ2 distribution. The degrees of freedom for the M1 versus M2 and M7 versus M8 tests are 2 [22]. If M8 or M2 is significantly favored and it contains codons with ω > 1, positive selection is significantly evident. Posterior probabilities of the inferred positively selected sites were estimated by the Bayes empirical Bayes (BEB) approach [23].
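The likelihood ratio test described above can be sketched in a few lines of Python. For 2 degrees of freedom, the χ2 survival function has the closed form exp(−x/2), so no statistics library is needed. The function names are hypothetical, and the two statistics used as inputs (18.18 for M8 versus M7 and 0 for M2 versus M1) are the values reported in the Results; this is an illustration of the arithmetic, not the Codeml workflow itself.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    For df = 2 the survival function is exactly exp(-x / 2).
    """
    if statistic < 0:
        raise ValueError("the test statistic must be non-negative")
    return math.exp(-statistic / 2.0)


# Statistics reported in the Results section of this paper.
p_m8_vs_m7 = chi2_sf_df2(18.18)  # about 1.1e-4, consistent with the reported .0001
p_m2_vs_m1 = chi2_sf_df2(0.0)    # exactly 1.0 for a statistic of exactly 0
```

A statistic of exactly 0 gives P = 1; the reported value of .99 for M2 versus M1 is consistent with a small positive statistic that rounds to 0.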

We also applied single-likelihood ancestor counting (SLAC), fixed-effects likelihood (FEL), and random-effects likelihood (REL) [18] to identify selection pressure on individual codons of the G gene in lyssaviruses.

3. Results

3.1. Recombination Analyses

Our dataset covered lyssaviruses isolated over a period of 70 years from 21 countries (Table 1), spanning both the New World and the Old World. The hosts included bats, cows, dogs, foxes, humans, raccoons, sheep, and skunks.

The PHI and Max χ2 tests suggested significant evidence of recombination in the G gene. By 1000 permutations, the P-values of PHI and Max χ2 test were .006 and 0, respectively. However, no significant evidence (P = .796) of recombination was detected by using the NSS test.
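For readers unfamiliar with permutation tests, the sketch below shows one common way a permutation P-value is obtained: as the fraction of permuted statistics at least as extreme as the observed one. This is a generic illustration and not PhiPack's code; the function name, the `larger_is_extreme` flag, and the toy numbers are hypothetical. With 1,000 permutations the smallest nonzero value is 0.001, so a reported P-value of 0 means that no permuted statistic was as extreme as the observed one.

```python
def permutation_p_value(observed, permuted, larger_is_extreme=True):
    """Fraction of permuted statistics at least as extreme as the observed one."""
    if not permuted:
        raise ValueError("at least one permuted statistic is required")
    if larger_is_extreme:
        hits = sum(value >= observed for value in permuted)
    else:
        hits = sum(value <= observed for value in permuted)
    return hits / len(permuted)


# Toy inputs (hypothetical): 6 of 1,000 permuted statistics reach the observed value.
toy_permuted = [6.0] * 6 + [1.0] * 994
p_some = permutation_p_value(5.0, toy_permuted)

# No permuted statistic reaches the observed value, so the estimate is 0.
p_none = permutation_p_value(10.0, toy_permuted)
```

Some implementations add 1 to both the numerator and the denominator so that the estimate is never exactly 0; whether a given tool does so is not assumed here.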

By using 3SEQ, 6 long recombinant sequences (>100 bp) were detected: AF233275, AY237121, AY987478, DQ074978, DQ849071, and L04523 (Table 2). Two breakpoints were identified in all recombinants. The first breakpoint was at nucleotide position between 400 and 800. The second breakpoint was around nucleotide position of 1080. However, the two breakpoints for DQ074978 and L04523 were at the very beginning and around nucleotide position of 109, respectively.

The analysis by using GARD also suggested evidence of recombination with significant topological incongruence at the 2 breakpoints (Table 3). The first breakpoint was at nucleotide position of 441 and the second was at nucleotide position of 1089. The significance value for the 2 breakpoints was 0.01. The left hand side (LHS) and the right hand side (RHS) P-values for the 2 breakpoints were .0004.

We analyzed the recombination events by using bootscanning as implemented in SimPlot. Sequence AY987478 was used as the query sequence in all four cases (Figures 1(a)–1(d)). The analysis confirmed the recombination event in the G gene of lyssavirus. High bootstrap values support clustering of sequence AY987478 with AF325489 (Figures 1(a) and 1(b)) and with AY237121 (Figures 1(c) and 1(d)) at positions from 1 to around 440 and from around 1130 to the end of the sequences. The bootstrap values are also high for clustering of AY987478 with AF233275 (Figures 1(a) and 1(c)) and with DQ074978 (Figures 1(b) and 1(d)) at positions from around 540 to 1000. The switches of the high bootstrap values at nucleotide positions from around 440 to 540 and from 1000 to 1130 indicate two possible breakpoints for the recombination.

Since recombination with 2 breakpoints was predicted by 3SEQ, GARD, and bootscanning, we constructed one phylogenetic tree from the concatenated regions outside the breakpoints, that is, from the beginning to the first breakpoint and from the second breakpoint to the end (Figure 2(a)), and another phylogenetic tree from the region between the two breakpoints (Figure 2(b)). The reconstructed trees presented conflicting topological positions for the putative recombinant AY987478. The putative recombinant clustered with AY237121 and AF325489 in Figure 2(a), but with DQ074978 and AF233275 in Figure 2(b). The other 5 putative recombinants did not present phylogenetic incongruence. The same result was also verified by GARD (data not shown). When AY987478 was excluded from the dataset, the P-values of the PHI, Max χ2, and NSS tests were .121, .209, and .791, respectively, suggesting no evidence of recombination. The GARD analysis did not indicate evidence of recombination either.
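The partitioning of the alignment into the two regions used for tree building can be sketched as follows. The coordinates (positions 1 to 441 plus 1090 to 1572 for the outer regions, and 441 to 1089 for the inner region) follow the tree figure caption; the function name `split_by_breakpoints` and the placeholder sequence are hypothetical, and the sketch is not the authors' actual procedure.

```python
def split_by_breakpoints(sequence, bp1=441, bp2=1089):
    """Split one aligned sequence at two breakpoints (1-based coordinates).

    Returns (outer, inner):
      outer = positions 1..bp1 concatenated with positions bp2+1..end
      inner = positions bp1..bp2, inclusive
    Position bp1 appears in both parts, mirroring the figure caption ranges.
    """
    if not (1 <= bp1 <= bp2 <= len(sequence)):
        raise ValueError("breakpoints must satisfy 1 <= bp1 <= bp2 <= length")
    outer = sequence[:bp1] + sequence[bp2:]
    inner = sequence[bp1 - 1:bp2]
    return outer, inner


# Placeholder sequence of the assumed alignment length (1572 nt).
placeholder = "A" * 1572
outer_region, inner_region = split_by_breakpoints(placeholder)
```

Applying the same split to every sequence in the alignment yields two sub-alignments, from which separate trees can be built and compared for incongruent placement of the query sequence.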

3.2. Selection Pressure Analyses

The selection pressure analysis of the glycoprotein gene by using PAML is presented in Table 4. The likelihood ratio test statistic (2Δl) for the comparison of M2 with M1 was 0. The corresponding P-value was .99, so the null hypothesis of the nearly neutral model (M1) could not be rejected. In the comparison between the null neutral site model (M7) and the selection model (M8), the 2Δl was 18.18 and the corresponding P-value was .0001, indicating that the positive selection model was significantly favored over the null neutral site model. Posterior probabilities of the inferred positively selected sites estimated by the BEB approach are shown in Table 5. Four amino acid sites, at positions 466, 483, 486, and 490, were identified as being under positive selection. However, only the site at position 483 had marginally significant support, with a posterior probability of 95% and weak positive selection pressure (ω of 1.466). The corresponding posterior probabilities for the sites at 466, 486, and 490 were 68%, 56%, and 82%, respectively.

To test the effect of recombination on the positive selection analysis, we excluded the putative recombinant AY987478 from the dataset. Similar results were observed, but the BEB posterior probabilities for amino acid sites under positive selection were nonsignificant (Table 5). When all six putative recombinants were excluded from our analysis, no evidence was found to support positive selection in either comparison (M2 versus M1 or M8 versus M7; data not shown). In all cases, the ω in M0 was either 0.07 or 0.08. Overall, 87% of the sites in the G gene had a very low ω value of 0.05 in M2 and M7, indicating strong selective constraints on those sites.

To study the effect of viral passages and possible genetic bottlenecks on the results, we repeated the analysis with a dataset that excluded sequences from lyssaviruses subjected to intensive cell culture: the six vaccine sequences and the sequence AF233275 (PV11). We found no significant evidence for positive selection pressure on any site of the G gene.

Analyses using SLAC, REL, and FEL found no evidence of any amino acid in the G gene being under positive selection; instead, most of the amino acids were found to be under negative selection (Table 6). One site, at position 416, was under marginal positive selection by FEL with a P-value of .0999, narrowly passing the significance level of 0.1. However, this result was not supported by SLAC or REL.

4. Discussion

Lyssaviruses can infect all warm-blooded mammals, and spillover events and host shift have been well documented [5–9]. The molecular mechanism of rabies infection and transmission is still not completely understood, and attention usually turns to the rabies virus G protein, since G is the only membrane protein responsible for virus entry both in vitro and in vivo. Therefore, it is a reasonable assumption that rabies virus adaptation is due to the G gene. Positive selection is an important evolutionary force that drives adaptation. It is not surprising that evolutionary scientists first applied selection analysis to the G gene of lyssaviruses [39]. One notable difference between the previous investigations and our study was the dataset. The previous dataset of 55 complete G gene sequences came from isolates of natural rabies infections, excluding passaged and vaccine strains. Our dataset of 53 sequences included street rabies isolates and vaccine strains collected over a period of 70 years from 21 countries. The neutrality tests on the G gene in lyssavirus indicated that the protein was under negative selection. Analysis of heterogeneous selective pressures on the amino acid sites across the gene found no evidence for positive selection on any site when the putative recombinant AY987478 was excluded. Instead, most of the sites were under strong negative selection, which was consistent with previous investigations using only street rabies isolates [39, 40]. The only weak positive selection identified by our analyses was at amino acid residue 483 (not in the ectodomain). No positive selection was detected in the main epitopes II or III, the sites of virus escape identified by monoclonal antibody binding selections in vitro. It is possible that the results were confounded by the sequences from isolates under intensive cell culture. Repeated passages of an RNA virus result in loss of fitness due to Muller's ratchet [41]. Serial virus passages severely reduce population size when a small founder population is reintroduced into an identical unpopulated environment, which may lead to the stochastic loss of certain genotypes, especially the rare genotypes [42, 43]. However, exclusion of sequences of passaged lyssaviruses from the dataset in this study did not affect the outcome of the analyses. It appears that rabies spillover, host shift (occurring naturally), virus escape by monoclonal antibody selection, and vaccine strains (under various in vitro and in vivo conditions) are not the result of positive selection in the G gene.

Recombination is another important evolutionary driving force in adaptation, and it is a mechanism that prevents the accumulation of deleterious substitutions [44]. It allows the acquisition of multiple genetic changes in a single step and can combine genetic information to produce advantageous genotypes. It may be important for incremental host adaptation after switching to a new host has occurred [45]. Recombination in rabies viruses has been proposed, but it has not been thoroughly examined [46, 47]. Our study suggested one recombination event. The recombinant sequence AY987478 was from a dog isolate (CHAND03, genotype 1), and the possible parental sequences were isolated from dogs and sheep from the same geographic area (India and Nepal). However, the putative recombinant AY987478 could be an artifact of sequencing or sample contamination. Generation of recombinants in the course of reverse transcription of RNA and subsequent PCR is a well-known phenomenon [48–50]. In the bootscanning analysis in this study, the 3′ and 5′ regions of AY987478 clustered with the putative parents with a bootstrap value of 100%, indicating little difference between the two sequences in those regions. Inspection of the sequences shows regions about 450 bases long that are identical between the recombinant and the corresponding parent, which is rare considering the high mutation rate in RNA viruses. The homologous recombination rate in negative-sense RNA viruses was found to be low [46], which is supported by a recent report that homologous recombination is very rare or absent in influenza A virus [17]. Further experimentation is needed to establish that the recombinant AY987478 is not an artifact.

In summary, we did not find significant support for positive selection pressure on the G gene in lyssavirus isolates from different rabies hosts and vaccine strains that cover 70 years of evolution in 21 countries. The recombination analysis suggested an orphan event that needs further investigation. It appears that evolution of the G gene may not play a major role in lyssavirus adaptation. This is surprising considering the functions of the glycoprotein in lyssavirus infection. It has been reported that host switching from chiropters to carnivores has occurred in lyssavirus evolutionary history [7, 9]. Spillovers of lyssaviruses from chiropters to other animals may have happened repeatedly and still occur [8]. Transmission of European bat lyssavirus 1 (EBLV-1) was reported in sheep [51], stone marten [52], and cats [53]. For a successful spillover and subsequent adaptation, there must be effective cross-species viral exposure and compatibility between the virus and the new host to allow replication and transmission. Lyssavirus infections are typically transmitted by the virus-laden saliva of a rabid animal via a bite or scratch, which can facilitate cross-species viral exposures. The initial viral interaction with cells of a new host plays a critical role in determining host specificity and host shift [45]. For example, a feline virus acquired the ability to infect dogs through changes in its capsid protein, which binds to the canine transferrin receptor on canine cells [54]. Lyssavirus G is a surface glycoprotein responsible for receptor recognition and membrane fusion [7–9, 55]. It is reasonable to expect that the protein is under positive selection pressure during viral adaptation to a new host. The lack of positive selection in the G glycoprotein suggests that the virus is not subject to strong immune selection [25]. The G gene may escape the immunity of the host since lyssaviruses migrate from the peripheral to the central nervous system [7]. A recent investigation demonstrated that the frequencies of both cross-species transmission and host shifts diminished with increasing phylogenetic distance between bat species [9], indicating that the virus, and thus the G gene, is subject to less selection pressure in a similar host and cellular environment [7, 25]. However, the G gene might have been under relatively weak positive selection that was not detected by current computational methods. More sensitive methods, or appropriately relaxed statistical significance thresholds combined with experimental verification, may help identify the role of the G gene in lyssavirus adaptation.

Acknowledgments

The authors thank Jan Pohl, Elizabeth Neuhaus, and Charles E. Rupprecht for support in this investigation. They also thank Kathryn Kellar and Scott Sammons for helpful suggestions on the paper. Use of trade names and commercial sources is for identification only and does not imply endorsement by the U.S. Department of Health and Human Services. The findings and conclusions in this paper are those of the authors and do not necessarily represent the views of the funding agency.

References

[1] Lefébure T, Stanhope MJ. Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition. Genome Biology. 2007;8(5):R71.1–R71.16. PMID: 17475002.
[2] Steinhauer DA, Domingo E, Holland JJ. Lack of evidence for proofreading mechanisms associated with an RNA virus polymerase. Gene. 1992;122(2):281–288. PMID: 1336756.
[3] Kirkegaard K, Baltimore D. The mechanism of RNA recombination in poliovirus. Cell. 1986;47(3):433–443. PMID: 3021340.
[4] Worobey M, Holmes EC. Evolutionary aspects of recombination in RNA viruses. Journal of General Virology. 1999;80(10):2535–2543. PMID: 10573145.
[5] Pfukenyi DM, Pawandiwa D, Makaya PV, Ushewokunze-Obatolu U. A retrospective study of wildlife rabies in Zimbabwe. Tropical Animal Health and Production. 2009;41(4):565–572. PMID: 18758985.
[6] Wandeler AI, Nadin-Davis SA, Tinline RR, Rupprecht CE. Rabies epidemiology: some ecological and evolutionary perspectives. In: Rupprecht CE, Dietzschold B, Koprowski H, editors. Lyssaviruses. Berlin, Germany: Springer; 1994. pp. 297–324.
[7] Badrane H, Tordo N. Host switching in Lyssavirus history from the chiroptera to the carnivora orders. Journal of Virology. 2001;75(17):8096–8104. PMID: 11483755.
[8] Crawford-Miksza LK, Wadford DA, Schnurr DP. Molecular epidemiology of enzootic rabies in California. Journal of Clinical Virology. 1999;14(3):207–219. PMID: 10614858.
[9] Streicker DG, Turmelle AS, Vonhof MJ, Kuzmin IV, McCracken GF, Rupprecht CE. Host phylogeny constrains cross-species emergence and establishment of rabies virus in bats. Science. 2010;329(5992):676–679. PMID: 20689015.
[10] Dietzschold B, Wunner WH, Wiktor TJ. Characterization of an antigenic determinant of the glycoprotein that correlates with pathogenicity of rabies virus. Proceedings of the National Academy of Sciences of the United States of America. 1983;80(1):70–74. PMID: 6185960.
[11] Yan X, Mohankumar PS, Dietzschold B, Schnell MJ, Fu ZF. The rabies virus glycoprotein determines the distribution of different rabies virus strains in the brain. Journal of Neurovirology. 2002;8(4):345–352. PMID: 12161819.
[12] Bradley RK, Roberts A, Smoot M. Fast statistical alignment. PLoS Computational Biology. 2009;5(5): Article ID e1000392.
[13] Tamura K, Dudley J, Nei M, Kumar S. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Molecular Biology and Evolution. 2007;24(8):1596–1599. PMID: 17488738.
[14] Bruen TC, Philippe H, Bryant D. A simple and robust statistical test for detecting the presence of recombination. Genetics. 2006;172(4):2665–2681. PMID: 16489234.
[15] Jakobsen IB, Easteal S. A program for calculating and displaying compatibility matrices as an aid in determining reticulate evolution in molecular sequences. Computer Applications in the Biosciences. 1996;12(4):291–295. PMID: 8902355.
[16] Smith JM. Analyzing the mosaic structure of genes. Journal of Molecular Evolution. 1992;34(2):126–129. PMID: 1556748.
[17] Boni MF, Posada D, Feldman MW. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics. 2007;176(2):1035–1047. PMID: 17409078.
[18] Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. GARD: a genetic algorithm for recombination detection. Bioinformatics. 2006;22(24):3096–3098. PMID: 17110367.
[19] Kosakovsky Pond SL, Frost SDW. Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics. 2005;21(10):2531–2533. PMID: 15713735.
[20] Lole KS, Bollinger RC, Paranjape RS. Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. Journal of Virology. 1999;73(1):152–160. PMID: 9847317.
[21] Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution. 2007;24(8):1586–1591. PMID: 17483113.
[22] Yang Z, Nielsen R, Goldman N, Pedersen AMK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000;155(1):431–449. PMID: 10790415.
[23] Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Molecular Biology and Evolution. 2005;22(4):1107–1118. PMID: 15689528.
[24] Badrane H, Bahloul C, Perrin P, Tordo N. Evidence of two Lyssavirus phylogroups with distinct pathogenicity and immunogenicity. Journal of Virology. 2001;75(17):3268–3276. PMID: 11238853.
[25] Guyatt KJ, Twin J, Davis P. A molecular epidemiology study of Australian bat lyssavirus. Journal of General Virology. 2003;84(2):485–496. PMID: 12560583.
[26] Tang Q, Orciari LA, Rupprecht CE, Zhao X. Sequencing and positional analysis of the glycoprotein gene of four Chinese rabies viruses. Zhongguo Bingduxue. 2000;15(1):22–33.
[27] Hemachudha T, Wacharapluesadee S, Lumlertdaecha B. Sequence analysis of rabies virus in humans exhibiting encephalitic or paralytic rabies. Journal of Infectious Diseases. 2003;188(7):960–966. PMID: 14513414.
[28] Ito H, Minamoto N, Watanabe T. A unique mutation of glycoprotein gene of the attenuated RC-HL strain of rabies virus, a seed virus used for production of animal vaccine in Japan. Microbiology and Immunology. 1994;38(6):479–482. PMID: 7968680.
[29] Agrawal S, Shasany AK, Khanuja SPS. Plant transformation vectors having street rabies virus (Indian strain) glycoprotein gene. Journal of Plant Biochemistry and Biotechnology. 2005;14(2):81–87.
[30] Hyun BH, Lee KK, Kim INJ. Molecular epidemiology of rabies virus isolates from South Korea. Virus Research. 2005;114(1-2):113–125. PMID: 16051390.
[31] Meng SL, Yan JX, Xu GEL. A molecular epidemiological study targeting the glycoprotein gene of rabies virus isolates from China. Virus Research. 2007;124(1-2):125–138. PMID: 17129631.
[32] Bai X, Warner CK, Fekadu M. Comparisons of nucleotide and deduced amino acid sequences of the glycoprotein genes of a Chinese street strain (CGX89-1) and a Chinese vaccine strain (3aG) of rabies virus. Virus Research. 1993;27(2):101–112. PMID: 8460524.
[33] Yelverton E, Norton S, Obijeski JF, Goeddel DV. Rabies virus glycoprotein analogs: biosynthesis in Escherichia coli. Science. 1983;219(4585):614–619. PMID: 6297004.
[34] Benmansour A, Brahimi M, Tuffereau C, Coulon P, Lafay F, Flamand A. Rapid sequence evolution of street rabies glycoprotein is related to the highly heterogeneous nature of the viral population. Virology. 1992;187(1):33–45. PMID: 1736537.
[35] Nadin-Davis SA, Allen Casey G, Wandeler AI. A molecular epidemiological study of rabies virus in central Ontario and western Quebec. Journal of General Virology. 1994;75(10):2575–2583. PMID: 7931145.
[36] Nadin-Davis SA, Sampath MI, Casey GA, Tinline RR, Wandeler AI. Phylogeographic patterns exhibited by Ontario rabies virus variants. Epidemiology and Infection. 1999;123(2):325–336. PMID: 10579454.
[37] Nadin-Davis SA, Huang W, Wandeler AI. The design of strain-specific polymerase chain reactions for discrimination of the racoon rabies virus strain from indigenous rabies viruses of Ontario. Journal of Virological Methods. 1996;57(1):1–14. PMID: 8919819.
[38] Morimoto K, Patel M, Corisdeo S. Characterization of a unique variant of bat rabies virus responsible for newly emerging human cases in North America. Proceedings of the National Academy of Sciences of the United States of America. 1996;93(11):5653–5658. PMID: 8643632.
[39] Holmes EC, Woelk CH, Kassis R, Bourhy H. Genetic constraints and the adaptive evolution of rabies virus in nature. Virology. 2002;292(2):247–257. PMID: 11878928.
[40] Bourhy H, Reynes JM, Dunham EJ. The origin and phylogeography of dog rabies virus. Journal of General Virology. 2008;89(11):2673–2681. PMID: 18931062.
[41] Clarke DK, Duarte EA, Moya A, Elena SF, Domingo E, Holland J. Genetic bottlenecks and population passages cause profound fitness differences in RNA viruses. Journal of Virology. 1993;67(1):222–228. PMID: 8380072.
[42] Oberle M, Balmer O, Brun R, Roditi I. Bottlenecks and the maintenance of minor genotypes during the life cycle of Trypanosoma brucei. PLoS Pathogens. 2010;6(7): Article ID e1001023.
[43] Wahl LM, Gerrish PJ, Saika-Voivod I. Evaluating the impact of population bottlenecks in experimental evolution. Genetics. 2002;162(2):961–971. PMID: 12399403.
[44] Poss M, Idoine A, Ross HA, Terwee JA, VandeWoude S, Rodrigo A. Recombination in feline lentiviral genomes during experimental cross-species infection. Virology. 2007;359(1):146–151. PMID: 17046045.
[45] Parrish CR, Holmes EC, Morens DM. Cross-species virus transmission and the emergence of new epidemic diseases. Microbiology and Molecular Biology Reviews. 2008;72(3):457–470. PMID: 18772285.
[46] Chare ER, Gould EA, Holmes EC. Phylogenetic analysis reveals a low rate of homologous recombination in negative-sense RNA viruses. Journal of General Virology. 2003;84(10):2691–2703. PMID: 13679603.
[47] Geue L, Schares S, Schnick C. Genetic characterisation of attenuated SAD rabies virus strains used for oral vaccination of wildlife. Vaccine. 2008;26(26):3227–3235. PMID: 18485548.
[48] Flockerzi A, Maydt J, Frank O. Expression pattern analysis of transcribed HERV sequences is complicated by ex vivo recombination. Retrovirology. 2007;4: article 39, pp. 1–12. PMID: 17212810.
[49] Luo G, Taylor J. Template switching by reverse transcriptase during DNA synthesis. Journal of Virology. 1990;64(9):4321–4328. PMID: 1696639.
[50] Meyerhans A, Vartanian JP, Wain-Hobson S. DNA recombination during PCR. Nucleic Acids Research. 1990;18(7):1687–1691. PMID: 2186361.
[51] Bourhy H, Dacheux L, Strady C, Mailles A. Rabies in Europe in 2005. Euro Surveillance. 2005;10(11):213–216.
[52] Tjørnehøj K, Fooks AR, Agerholm JS, Rønsholt L. Natural and experimental infection of sheep with European bat lyssavirus type-1 of Danish bat origin. Journal of Comparative Pathology. 2006;134(2-3):190–201. PMID: 16545840.
[53] Dacheux L, Larrous F, Mailles A. European bat lyssavirus transmission among cats, Europe. Emerging Infectious Diseases. 2009;15(2):280–284. PMID: 19193273.
[54] Hueffer K, Parker JSL, Weichert WS, Geisel RE, Sgro JY, Parrish CR. The natural host range shift and subsequent evolution of canine parvovirus resulted from virus-specific binding to the canine transferrin receptor. Journal of Virology. 2003;77(3):1718–1726. PMID: 12525605.
[55] Durrer P, Gaudin Y, Ruigrok RWH, Graf R, Brunner J. Photolabeling identifies a putative fusion domain in the envelope glycoprotein of rabies and vesicular stomatitis viruses. Journal of Biological Chemistry. 1995;270(29):17575–17581. PMID: 7615563.

Figure 1: Bootscanning analysis of recombination in the glycoprotein gene of lyssavirus by using the SimPlot program with a window size of 200 nucleotides and a step size of 10 nucleotides.

Figure 2: (a) NJ phylogenetic tree of 53 glycoprotein gene sequences built from the concatenated regions from positions 1 to 441 and 1090 to 1572. Bootstrap values of 1000 replicates are shown above the branches. The red marker represents the putative recombinant. (b) NJ phylogenetic tree of 53 glycoprotein gene sequences built from the region from positions 441 to 1089. Bootstrap values of 1000 replicates are shown above the branches. The red marker represents the putative recombinant.

Table 1: Sequences of glycoprotein gene used in this study.

| Accession no. | Country | Host | Year of isolation | Strain/isolate | Genotype | References |
|---|---|---|---|---|---|---|
| AB115921 | Indonesia | Dog | 2001 | SN01-23 | GT1 | Unpublished |
| AF233275 | India | Sheep | | PV11 | GT1 | Unpublished |
| AF298141 | USA | Bat | 1979 | USA7-BT | GT1 | Badrane et al. [24] |
| AF298142 | Poland | Bat | 1985 | EBL1POL | GT5 | Badrane et al. [24] |
| AF298143 | France | Bat | 1989 | EBL1FRA | GT5 | Badrane et al. [24] |
| AF298144 | Finland | Bat | 1986 | EBL2FIN | GT6 | Badrane et al. [24] |
| AF298145 | Holland | Bat | 1986 | EBL2HOL | GT6 | Badrane et al. [24] |
| AF298146 | S. Africa | Bat | 1970 | DuvSAF1 | GT4 | Badrane et al. [24] |
| AF298147 | S. Africa | Bat | 1981 | DuvSAF2 | GT4 | Badrane et al. [24] |
| AF325487 | Malaysia | Human | 1985 | MAL1-HM | GT1 | Badrane and Tordo [7] |
| AF325489 | Nepal | Dog | 1989 | NEP1-DG | GT1 | Badrane and Tordo [7] |
| AF325490 | French | Bovine | 1985 | GUY1-BV | GT1 | Badrane and Tordo [7] |
| AF325491 | Brazil | Bovine | 1986 | BRA1-BV | GT1 | Badrane and Tordo [7] |
| AF325492 | Mexico | Bat | 1987 | MEX2-VP | GT1 | Badrane and Tordo [7] |
| AF325494 | USA | Bat | 1981 | USA8-BT | GT1 | Badrane and Tordo [7] |
| AF325495 | USA | Bat | 1982 | USA9-BT | GT1 | Badrane and Tordo [7] |
| AF401285 | Thailand | | | 8743THA | GT1 | Unpublished |
| AF426297 | Australia | Bat | 1997 | ABLSF12NB | GT7 | Guyatt et al. [25] |
| AF426298 | Australia | Bat | 1997 | ABLSF11KW | GT7 | Guyatt et al. [25] |
| AJ871962 | China | Vaccine | | PM | GT1 | Unpublished |
| AY009098 | China | Human | 1986 | CNX8601 | GT1 | Tang et al. [26] |
| AY009099 | China | Human | 1986 | CNX8511 | GT1 | Tang et al. [26] |
| AY009100 | China | Dog (Vaccine) | 1955 | CTN | GT1 | Tang et al. [26] |
| AY237121 | India | Dog | | RVD | GT1 | Unpublished |
| AY257980 | Thailand | Human | | HM65 | GT1 | Hemachudha et al. [27] |
| AY257982 | Thailand | Human | | HM88 | GT1 | Hemachudha et al. [27] |
| AY257983 | Thailand | Human | | HM208 | GT1 | Hemachudha et al. [27] |
| AY987478 | India | Dog | 1999 | CHAND03 | GT1 | Unpublished |
| D14873 | Japan | Vaccine | | RC-HL | GT1 | Unpublished |
| D16330 | Japan | Vaccine | | RC-HL | GT1 | Ito et al. [28] |
| DQ074978 | India | Dog | | | GT1 | Agrawal et al. [29] |
| DQ076097 | S. Korea | Bovine | | SKRBV0404HC | GT1 | Hyun et al. [30] |
| DQ076099 | S. Korea | Dog | | SKRRD9903YG | GT1 | Hyun et al. [30] |
| DQ767897 | China | Vaccine | | CTN-35 | GT1 | Unpublished |
| DQ849071 | China | Dog | 1994 | GX4 | GT1 | Meng et al. [31] |
| DQ849072 | China | Dog | 1992 | CQ92 | GT1 | Meng et al. [31] |
| L04522 | China | Vaccine (Dog) | 1931 | 3aG | GT1 | Bai et al. [32] |
| L04523 | China | Vaccine (dog) | 1993 | CGX89-1 | GT1 | Bai et al. [32] |
| L40426 | | | | CVS | GT1 | Yelverton et al. [33] |
| M81058 | Algeria | Dog | | ALG1-DG | GT1 | Benmansour et al. [34] |
| M81059 | Algeria | Human | | | GT1 | Benmansour et al. [34] |
| M81060 | Algeria | Human | | | GT1 | Benmansour et al. [34] |
| U03765 | Canada | Vulpes | | 8480FX | GT1 | Nadin-Davis et al. [35] |
| U03766 | Arctic Circle | Dog | 1992 | Arctic A1-1090DG | GT1 | Nadin-Davis et al. [35] |
| U03767 | Canada | Dog | 1993 | Hudson Bay-4055DG | GT1 | Nadin-Davis et al. [35] |
| U11736 | Canada | Arctic Fox | | 91RABN1035 | GT1 | Nadin-Davis et al. [36] |
| U11755 | Canada | Skunk | | 91RABN1578 | GT1 | Nadin-Davis et al. [36] |
| U27214 | USA | Raccoon | | NY 516 | GT1 | Nadin-Davis et al. [37] |
| U27215 | USA | Raccoon | | NY 771 | GT1 | Nadin-Davis et al. [37] |
| U27216 | USA | Raccoon | | FLA 125 | GT1 | Nadin-Davis et al. [37] |
| U27217 | USA | Raccoon | | PA R89 | GT1 | Nadin-Davis et al. [37] |
| U52946 | USA | Bat | 1994 | SHBRV | GT1 | Morimoto et al. [38] |
| X69122 | India | Vaccine | | Flury | GT1 | Unpublished |

Recombination detection in glycoprotein gene of lyssavirus by using 3SEQ.

P | Q | C | P-value | Dunn-Sidak | Breakpoints
M81058 | AY987478 | AF233275 | 0 | 2.08E−08 | 432–440, 1080–1089; 456–496, 1080–1089
M81060 | AY987478 | AF233275 | 1E−12 | 1.31E−07 | 432–440, 1080–1089; 456–496, 1080–1089
AY987478 | M81059 | AY237121 | 0 | 2.81E−11 | 441–455, 1077–1079
AY987478 | M81058 | AY237121 | 0 | 2.13E−13 | 441–455, 1077–1079
AY987478 | M81060 | AY237121 | 0 | 1.13E−13 | 441–455, 1077–1079
AY987478 | AF233275 | AY237121 | 1.3E−10 | 1.88E−05 | 432–455, 1068–1089; 465–518, 1068–1089
AY987478 | L04522 | AY237121 | 1.1E−08 | 1.48E−03 | 627–638, 1077–1089; 663–666, 1077–1089
AY987478 | AF325489 | AY237121 | 0 | 2.71E−15 | 700–701, 1077–1097
AY987478 | U11755 | AY237121 | 3.2E−10 | 4.42E−05 | 717–719, 1077–1082; 729–734, 1077–1082
AY987478 | U11736.2 | AY237121 | 3.3E−09 | 4.61E−04 | 717–719, 1077–1082; 729–734, 1077–1082
AY987478 | DQ849071 | AY237121 | 6.1E−11 | 8.61E−06 | 736–737, 1077–1079
AY987478 | DQ076097 | AY237121 | 1.2E−10 | 1.69E−05 | 630–638, 1077–1089; 699–701, 1077–1089
AY987478 | DQ076099 | AY237121 | 9E−12 | 1.31E−06 | 700–701, 1077–1089; 714–719, 1077–1089
AY987478 | L04523 | AY237121 | 2.2E−09 | 3.04E−04 | 736–737, 1077–1079
AY987478 | X69122 | AY237121 | 4E−12 | 6.00E−07 | 666–669, 1032–1049; 666–669, 1077–1089
AY987478 | AY009098 | AY237121 | 4E−12 | 4.99E−07 | 693–701, 1077–1079; 705–711, 1077–1079
AY987478 | AY009099 | AY237121 | 4E−12 | 4.99E−07 | 693–701, 1077–1079; 705–711, 1077–1079
AY987478 | DQ849072 | AY237121 | 2.1E−11 | 3.02E−06 | 693–701, 1077–1079; 705–711, 1077–1079
AY987478 | AJ871962 | AY237121 | 1E−12 | 7.29E−08 | 750–794, 1077–1089
AY987478 | AF325487 | AY237121 | 0 | 1.36E−08 | 780–794, 1077–1079
AY987478 | L40426 | AY237121 | 4.8E−11 | 6.71E−06 | 750–794, 1077–1089
AY987478 | AF401285 | AY237121 | 0 | 9.99E−10 | 780–795, 1077–1079
AY987478 | AY257983 | AY237121 | 2.3E−11 | 3.27E−06 | 780–795, 1077–1079
AY987478 | AY257980 | AY237121 | 0 | 5.70E−09 | 750–761, 1077–1079; 780–795, 1077–1079
AY987478 | AY257982 | AY237121 | 5.9E−11 | 8.33E−06 | 780–795, 1032–1043; 780–795, 1077–1079
AY987478 | DQ767897 | AY237121 | 1E−07 | 1.46E−02 | 759–767, 972–974
AY987478 | U52946 | AY237121 | 5.3E−08 | 7.43E−03 | 741–748, 900–901; 741–748, 918–938
AY987478 | U03766 | AY237121 | 2.4E−07 | 3.27E−02 | 717–719, 876–889; 717–719, 894–914
AY987478 | U03765 | AY237121 | 2.6E−07 | 3.55E−02 | 717–719, 876–889; 717–719, 894–914
AY237121 | AF233275 | AY987478 | 0 | 1.38E−39 | 432–452, 1077–1089
AY237121 | DQ074978 | AY987478 | 0 | 1.16E−38 | 432–452, 1077–1089
AY237121 | L04522 | AY987478 | 0 | 7.15E−25 | 627–647, 1065–1089
AY237121 | DQ076097 | AY987478 | 0 | 1.47E−08 | 630–647, 1056–1058; 630–647, 1065–1089
AY237121 | U03767 | AY987478 | 1E−11 | 1.42E−06 | 630–638, 1041–1058; 630–638, 1065–1079
AY237121 | AJ871962 | AY987478 | 0 | 6.36E−20 | 642–647, 1041–1058; 642–647, 1065–1089
AY237121 | X69122 | AY987478 | 0 | 3.16E−26 | 642–647, 1041–1049; 654–659, 1041–1049
AY237121 | L40426 | AY987478 | 0 | 1.38E−17 | 642–647, 1041–1058; 642–647, 1065–1089
AY237121 | M81058 | AY987478 | 0 | 9.60E−21 | 441–452, 1065–1079; 618–710, 1065–1079
AY237121 | M81060 | AY987478 | 0 | 6.49E−23 | 441–452, 1065–1079; 618–710, 1065–1079
AY237121 | D14873 | AY987478 | 0 | 6.82E−17 | 685–701, 1065–1085; 705–710, 1065–1085
AY237121 | D16330 | AY987478 | 0 | 5.23E−17 | 685–701, 1065–1085; 705–710, 1065–1085
AY237121 | AY257980 | AY987478 | 5.3E−08 | 7.49E−03 | 705–710, 1041–1046
AY237121 | DQ849071 | AY987478 | 1.9E−09 | 2.68E−04 | 708–710, 1041–1046
AY237121 | L04523 | AY987478 | 1.3E−07 | 1.85E−02 | 708–710, 1041–1046
AY237121 | DQ076099 | AY987478 | 0 | 1.07E−08 | 634–647, 1056–1058; 634–647, 1065–1089
AY237121 | U11755 | AY987478 | 1E−12 | 1.60E−07 | 630–647, 1056–1058; 630–647, 1065–1082
AY237121 | U11736.2 | AY987478 | 0 | 2.29E−08 | 630–647, 1056–1058; 630–647, 1065–1082
AY237121 | DQ767897 | AY987478 | 1.7E−09 | 2.37E−04 | 708–710, 1017–1022; 708–710, 1041–1046
AY237121 | AY009098 | AY987478 | 1.8E−08 | 2.58E−03 | 705–710, 1041–1046; 736–737, 1041–1046
AY237121 | AY009099 | AY987478 | 1.8E−08 | 2.58E−03 | 705–710, 1041–1046; 736–737, 1041–1046
AY237121 | AF325487 | AY987478 | 7.5E−10 | 1.06E−04 | 705–710, 1041–1046; 732–734, 1041–1046
AY237121 | U03766 | AY987478 | 1.2E−09 | 1.75E−04 | 630–638, 1041–1058; 630–638, 1065–1079
AY237121 | U03765 | AY987478 | 2.3E−10 | 3.26E−05 | 630–638, 1041–1058; 630–638, 1065–1079
AY237121 | M81059 | AY987478 | 0 | 1.27E−18 | 441–452, 993–998; 441–452, 1017–1034
AY237121 | AF325490 | AY987478 | 9.2E−09 | 1.30E−03 | 705–710, 993–995; 705–710, 1017–1019
AY237121 | AF325491 | AY987478 | 7E−12 | 9.93E−07 | 705–710, 993–995
AY237121 | AF325492 | AY987478 | 9.6E−09 | 1.34E−03 | 700–701, 993–995; 705–710, 993–995
AY237121 | DQ849072 | AY987478 | 1.3E−07 | 1.83E−02 | 705–710, 924–935; 705–710, 945–950
AY237121 | AY009100 | AY987478 | 3.4E−07 | 4.62E−02 | 708–710, 885–887; 708–710, 924–938
AY237121 | AF401285 | AY987478 | 6.4E−09 | 8.95E−04 | 736–737, 883–887
AY237121 | AY257983 | AY987478 | 9.9E−08 | 1.38E−02 | 732–734, 883–887; 732–734, 1041–1046
M81059 | AY987478 | DQ074978 | 0 | 9.05E−10 | 519–522, 1080–1089
M81058 | AY987478 | DQ074978 | 0 | 4.12E−09 | 519–522, 1080–1089
M81060 | AY987478 | DQ074978 | 0 | 2.80E−08 | 519–522, 1080–1089
AY009100 | M81059 | DQ849071 | 2E−08 | 2.80E−03 | 0–3, 108–119
AY009100 | M81058 | DQ849071 | 3E−08 | 4.20E−03 | 0–3, 108–119
AY009100 | M81060 | DQ849071 | 1.2E−07 | 1.70E−02 | 0–3, 108–119
AY009100 | AJ871962 | DQ849071 | 2.2E−07 | 3.02E−02 | 0–3, 108–110
AY009100 | M81059 | L04523 | 9.6E−09 | 1.34E−03 | 0–3, 108–119
AY009100 | M81058 | L04523 | 1.5E−08 | 2.04E−03 | 0–3, 108–119
AY009100 | M81060 | L04523 | 6.7E−08 | 9.37E−03 | 0–3, 108–119; 0–3, 139–161
AY009100 | AJ871962 | L04523 | 1.3E−07 | 1.88E−02 | 0–3, 108–110

Note: P and Q are putative parent sequences, and C is the putative child sequence in the recombination.
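The Dunn-Sidak column adjusts each uncorrected P-value for the number of triplets tested. A minimal Python sketch of that correction follows. The function name `dunn_sidak` is illustrative, and the number of tests is an assumption not stated in the source: one test per ordered (P, Q, C) triplet of the 53 sequences.

```python
import math

def dunn_sidak(p, m):
    """Dunn-Sidak corrected p-value, 1 - (1 - p)**m, computed stably for tiny p."""
    if p >= 1.0:
        return 1.0
    return -math.expm1(m * math.log1p(-p))

# Assumption (not from the source): every ordered triplet of 53 sequences is tested.
N_SEQ = 53
M_TESTS = N_SEQ * (N_SEQ - 1) * (N_SEQ - 2)

corrected = dunn_sidak(2e-8, M_TESTS)
print(f"{M_TESTS} tests, corrected p = {corrected:.2e}")
```

Under this assumption, an uncorrected P-value of 2E−08 corrects to about 2.8E−03, close to the 2.80E−03 tabulated for that P-value. P-values shown as 0 are presumably below the displayed precision, which would explain why their corrected values are still nonzero.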

KH tests verify the significance of breakpoints estimated by GARD analysis.

Breakpoint | LHS P-value | RHS P-value | Significance
441 | 0.0004 | 0.0004 | 0.01
1089 | 0.0004 | 0.0004 | 0.01

Parameter estimates, dN/dS ratio, likelihood score, and test statistics under models of variable ω ratios among sites for the glycoprotein gene in lyssavirus.

Model | Parameter estimates | dN/dS | Likelihood scores (l) | Model comparison (2Δl, d.f., P) | Positive selection
M0: one ratio | ω = 0.08 | 0.08 | −24586.10 | | None
M1: Nearly neutral | ω0 = 0.05, ω1 = 1 (p0 = 0.87, p1 = 0.13) | 0.17 | −24010.40 | | Not allowed
M2: Positive selection | ω0 = 0.05, ω1 = 1, ω2 = 1 (p0 = 0.87, p1 = 0.06, p2 = 0.07) | 0.17 | −24010.40 | M2 versus M1: 0, d.f. = 2, P = .99 | None
M7: β, Neutral | p = 0.26, q = 2.11 | 0.10 | −23443.16 | | Not allowed
M8: β + ω > 1, Selection | p0 = 0.98, p = 0.28, q = 2.92 (p1 = 0.02), ω = 1.0 | 0.10 | −23434.07 | M7 versus M8: 18.18, d.f. = 2, P = .0001 | See Table 6
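The model comparison column is a likelihood ratio test: twice the difference in log-likelihood between nested models is compared with a chi-square distribution. The Python sketch below reproduces the two comparisons, treating the M7 and M8 likelihood scores as the log-likelihoods −23443.16 and −23434.07 (negative, like those of M0 to M2). The helper `lrt_df2` is illustrative and covers only the d.f. = 2 case used in the table, where the chi-square tail probability has the closed form exp(−x/2).

```python
import math

def lrt_df2(lnl_null, lnl_alt):
    """Return (2 * delta lnL, P) for a likelihood ratio test with 2 degrees of freedom.

    For d.f. = 2 the chi-square survival function is exactly exp(-x / 2).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.exp(-stat / 2.0)

# M7 (null) versus M8 (alternative), log-likelihoods from the table.
stat_m7_m8, p_m7_m8 = lrt_df2(-23443.16, -23434.07)

# M1 (null) versus M2 (alternative): identical log-likelihoods.
stat_m1_m2, p_m1_m2 = lrt_df2(-24010.40, -24010.40)

print(f"M7 vs M8: 2dl = {stat_m7_m8:.2f}, P = {p_m7_m8:.4f}")
print(f"M1 vs M2: 2dl = {stat_m1_m2:.2f}, P = {p_m1_m2:.2f}")
```

The M7 versus M8 statistic comes out at 18.18 with P of about 0.0001, matching the tabulated values. For M1 versus M2 the statistic is 0, so the test gives no support for the positive selection model.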

Positive selection sites in the glycoprotein gene predicted by using Bayes empirical analysis under different PAML models.

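Codon (Dataset I) | Codon (Dataset II) | Amino acid (Dataset I) | Amino acid (Dataset II) | Posterior probability (Dataset I) | Posterior probability (Dataset II) | Post mean ± S.E. (Dataset I) | Post mean ± S.E. (Dataset II)
466 | 466 | A | A | 0.68 | 0.72 | 1.27 ± 0.35 | 1.29 ± 0.34
483 | 483 | V | V | 0.95 | 0.84 | 1.46 ± 0.16 | 1.39 ± 0.26
486 | 486 | T | T | 0.56 | 0.53 | 1.19 ± 0.36 | 1.16 ± 0.36
490 | 490 | Q | Q | 0.82 | 0.80 | 1.38 ± 0.27 | 1.36 ± 0.29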

Dataset I: all 53 nucleotide sequences. Dataset II: AY987478 excluded.
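Which codons count as positively selected depends on the posterior probability cutoff. The short Python sketch below applies a 0.95 cutoff, assumed here from the 95% figure quoted in the abstract, to the posterior probabilities in the table. The names `BEB_POSTERIOR` and `sites_above` are illustrative.

```python
# Posterior probabilities copied from the table: codon -> (Dataset I, Dataset II).
BEB_POSTERIOR = {
    466: (0.68, 0.72),
    483: (0.95, 0.84),
    486: (0.56, 0.53),
    490: (0.82, 0.80),
}

def sites_above(posteriors, dataset_index, cutoff=0.95):
    """Return the codons whose posterior probability reaches the cutoff."""
    return sorted(
        codon for codon, probs in posteriors.items()
        if probs[dataset_index] >= cutoff
    )

dataset_1_hits = sites_above(BEB_POSTERIOR, 0)  # all 53 sequences
dataset_2_hits = sites_above(BEB_POSTERIOR, 1)  # AY987478 excluded
```

Only codon 483 reaches the cutoff in Dataset I, and no codon does once AY987478 is excluded.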

Detection of selection pressure on glycoprotein gene using methods implemented in the Datamonkey website.

Dataset | Mean dN/dS | Positive selection sites (SLAC / FEL / REL) | Negative selection sites (SLAC / FEL / REL) | Codon (P-value)
Dataset I | 0.1226, 0.1278 | 0 / 0 / 0 | 397 / 418 / 0 |
Dataset II | 0.1231, 0.1274 | 0 / 0 / 0 | 391 / 417 / 0 |
Dataset III | 0.1214, 0.1233 | 0 / 1 / 0 | 386 / 416 / 0 | 416 (.0999)

Dataset I: all 53 nucleotide sequences. Dataset II: AY987478 excluded. Dataset III: the six putative recombinants excluded.