Members of the 2012–2014 International Committee on Taxonomy of Viruses (ICTV)
Members of the ICTV Data Subcommittee.
National Center for Biotechnology Information (NCBI) Viral RefSeq Genomes Advisors for members of the order
Members of the NCBI Genome Annotation Virus Working Group.
Sequence determination of complete or coding-complete genomes of viruses is becoming common practice for supporting the work of epidemiologists, ecologists, virologists, and taxonomists. Sequencing duration and costs are rapidly decreasing, sequencing hardware is under modification for use by non-experts, and software is constantly being improved to simplify sequence data management and analysis. Thus, analysis of virus disease outbreaks on the molecular level is now feasible, including characterization of the evolution of individual virus populations in single patients over time. The increasing accumulation of sequencing data creates a management problem for the curators of commonly used sequence databases and an entry retrieval problem for end users. Therefore, utilizing the data to their fullest potential will require setting nomenclature and annotation standards for virus isolates and associated genomic sequences. The National Center for Biotechnology Information’s (NCBI’s) RefSeq is a non-redundant, curated database for reference (or type) nucleotide sequence records that supplies source data to numerous other databases. Building on recently proposed templates for filovirus variant naming [<virus name> (<strain>)/<isolation host-suffix>/<country of sampling>/<year of sampling>/<genetic variant designation>-<isolate designation>], we report consensus decisions from a majority of past and currently active filovirus experts on the eight filovirus type variants and isolates to be represented in RefSeq, their final designations, and their associated sequences.
The National Center for Biotechnology Information (NCBI) RefSeq project was initiated to create a nonredundant and curated set of genomic, transcript, and protein sequence records [
In the case of virological RefSeq records, each viral species was initially represented by only one genome sequence record, and all other genome records for members of the same species, or for different strains, variants, and isolates of the same member of this species were linked to this record as “genome neighbors” [
The process of curating genome sequence data must now be fundamentally reformed, since the number of sequenced viral genomes has increased exponentially over the past decade [
The mononegaviral family
Summary of the current filovirus taxonomy endorsed by the 2012–2014 ICTV
| Current Taxonomy and Nomenclature (Ninth ICTV Report and Updates) |
|---|
| Order |
| Family |
| Genus |
| Species |
| Virus 1: Marburg virus (MARV) |
| Virus 2: Ravn virus (RAVV) |
| Genus |
| Species |
| Virus: Taï Forest virus (TAFV) |
| Species |
| Virus: Reston virus (RESTV) |
| Species |
| Virus: Sudan virus (SUDV) |
| Species |
| Virus: Ebola virus (EBOV) |
| Species |
| Virus: Bundibugyo virus (BDBV) |
| Genus |
| Species |
| Virus: Lloviu virus (LLOV) |
These eight viruses are differentiated from each other by biological characteristics [
Temporary filovirus type viruses and type variants chosen by the 2010–2011 ICTV
| Filovirus Species | Type Virus of Species (Virus Abbreviation) | Type Variant and Isolate of Type Virus of Species | Type Sequence of Type Variant of Type Virus of Species (RefSeq) |
|---|---|---|---|
| | Bundibugyo virus (BDBV) | Unnamed variant represented by isolate “811250”¹ | NC_014373 |
| | Lloviu virus (LLOV) | Unnamed variant represented by isolate “MS-Liver-86/2003”² | NC_016144 |
| | Marburg virus (MARV) | Unnamed variant represented by isolate “Musoke” | NC_001608 |
| | Reston virus (RESTV) | Unnamed variant represented by isolate “Pennsylvania” | NC_004161 |
| | Sudan virus (SUDV) | Unnamed variant represented by isolate “Boniface” [sic]³ | None |
| | Taï Forest virus (TAFV) | Unnamed variant represented by isolate “Côte d’Ivoire”⁴ | NC_014372 |
| | Ebola virus (EBOV) | Unnamed variant represented by isolate “Mayinga” | NC_002549 |
¹ Isolate “811250” is/was not explicitly mentioned in [
These variants and sequences therefore needed to be re-evaluated by filovirus experts. To achieve uniformity and consistency, the current RefSeq entries have to be relabeled to conform to current ICTV taxonomy. In addition, type filovirus variant designations have to be chosen and the individual isolate names have to be adjusted to the filovirus strain/variant/isolate schemes that were recently established [
Genome-based classification of novel filoviruses or filovirus genomic sequences. Viruses are classified in the family
The “gold standard” filovirus type RefSeq entry should be selected on the basis of experimental importance and accessibility and should represent a repository of functional information about a particular filovirus. It is crucial that any functional annotation of a RefSeq entry (e.g., functions of particular genome parts or of genome-encoded proteins) is linked to the actual sequence used in the underlying experiments. The RefSeq entry should contain the best-characterized virus/variant/isolate/sequence, independent of whether this virus, variant, or isolate was the first one discovered or the most widely used experimentally. Importantly, decisions on RefSeq entries do not entail a mandate that future experiments should necessarily be performed with the viruses associated with these entries. However, direct comparisons with RefSeq-associated viruses are highly recommended to further increase the detail associated with the RefSeq entries. These entries should be updated and, if necessary, corrected on a continuous basis by a filovirus RefSeq subcommittee composed of filovirus experts, whose composition is currently under consideration.
The authors of this article confirmed or replaced the current taxonomic type virus variants and isolates and the current filovirus RefSeq entries based on the availability of scientific information characterizing a particular virus. Where scientific information was scarce for all members of an entire taxon, other criteria such as availability, passaging history, or medical importance were used in decision making. Decisions were reached by consensus or simple majority voting, with the understanding that all authors will apply the final decisions reached by the entire group and enforce them in their functions as authors, peer-reviewers, and/or editors.
Only one cuevavirus, Lloviu virus (LLOV), has been described [
In line with filovirus strain/variant/isolate definitions outlined previously [
Full name: Lloviu virus M.schreibersii-wt/ESP/2003/Asturias-Bat86
Shortened name: LLOV/M.sch/ESP/03/Ast-Bat86
Abbreviated name: LLOV/Ast-Bat86
Accordingly, in RefSeq #NC_016144 the definition line “Lloviu virus, complete genome” was changed to “
The genus
Bundibugyo virus (BDBV) is the second-least characterized ebolavirus. Although at least eight isolates of this virus are available [
Full name: Bundibugyo virus H.sapiens-tc/UGA/2007/Butalya-811250
Shortened name: BDBV/H.sap/UGA/07/But-811250
Abbreviated name: BDBV/But-811250
Accordingly, in RefSeq #NC_014373, the definition line “Bundibugyo ebolavirus, complete genome” was changed to “
Ebola virus (EBOV) is the most thoroughly characterized ebolavirus. Dozens of EBOV isolates are available, but the vast majority of published experiments have been performed with isolates “Mayinga” and “Kikwit” (reviewed in [
Full name: Ebola virus H.sapiens-tc/COD/1976/Yambuku-Mayinga
Shortened name: EBOV/H.sap/COD/76/Yam-May
Abbreviated name: EBOV/Yam-May
Accordingly, in RefSeq #NC_002549 the definition line “Zaire ebolavirus, complete genome” was changed to “
Reston virus (RESTV) has caused multiple epizootics among captive macaques (1989–1990, 1992, 1996) and domestic pigs in 2008 (reviewed in [
Full name: Reston virus M.fascicularis-tc/USA/1989/Philippines89-Pennsylvania
Shortened name: RESTV/M.fas/USA/89/Phi89-Pen
Abbreviated name: RESTV/Phi89-Pen
Accordingly, in RefSeq #NC_004161, the definition line “Reston ebolavirus, complete genome” was changed to “
Sudan virus (SUDV) is the second-best characterized ebolavirus. Approximately 15 SUDV isolates have been described, but very few experiments have been performed with any of these isolates. Early experiments focused on isolate “Boneface” (often misspelled “Boniface”). Recently, variant “Gulu” isolate “808892” has become a more popular choice, and data from experiments with this virus continue to accumulate (reviewed in [
Full name: Sudan virus H.sapiens-tc/UGA/2000/Gulu-808892
Shortened name: SUDV/H.sap/UGA/00/Gul-808892
Abbreviated name: SUDV/Gul-808892
Accordingly, in RefSeq #NC_006432, the definition line “Sudan ebolavirus, complete genome” was changed to “
Taï Forest virus (TAFV) is the least characterized ebolavirus. Only one isolate (“807212” = “CI”) was obtained from a female survivor [
We propose the variant designation “Pauléoula” (after the village of Pauléoula, Guiglo Department in Moyen-Cavally Region, Côte d’Ivoire, where TAFV was first found [
Full name: Taï Forest virus H.sapiens-tc/CIV/1994/Pauléoula-CI
Shortened name: TAFV/H.sap/CIV/94/Pau-CI
Abbreviated name: TAFV/Pau-CI
Accordingly, in RefSeq #NC_014372, the definition line “Tai Forest ebolavirus, complete genome” was changed to “
The genus
Marburg virus (MARV) is the most thoroughly characterized marburgvirus. Some 70 MARV isolates are available, but the majority of published experiments have been performed with isolate “Musoke” (reviewed in [
Full name: Marburg virus H.sapiens-tc/KEN/1980/Mt. Elgon-Musoke
Shortened name: MARV/H.sap/KEN/80/MtE-Mus
Abbreviated name: MARV/MtE-Mus
Accordingly, in RefSeq #NC_001608, the definition line “Marburg marburgvirus, complete genome” was changed to “
Ravn virus (RAVV) is a largely uncharacterized marburgvirus that belongs to the same species as MARV. At least three human (“Ravn” = “810040,” “09DCR,” “02Uga”) and four Egyptian rousette isolates (“44Bat,” “188Bat,” “982Bat,” “1304 Bat”) have been obtained. Virtually all RAVV characterization experiments have been performed with “Ravn” = “810040,” which was obtained after at least two passages in SW-13 cells and four passages in Vero E6 cells. Since RAVV is a phylogenetically distinct marburgvirus, we created a RefSeq entry for the “Ravn” isolate, for which we propose the variant designation “Kitum Cave” (after Kenya’s Kitum Cave on Mount Elgon, where RAVV first emerged) and the isolate designation “810040”:
Full name: Ravn virus H.sapiens-tc/KEN/1987/Kitum Cave-810040
Shortened name: RAVV/H.sap/KEN/87/KiC-810040
Abbreviated name: RAVV/KiC-810040
Accordingly, the RefSeq entry was created with the definition line “
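The full, shortened, and abbreviated names listed for each virus in this section appear to be related in a regular way, which the following Python sketch illustrates. It is not an official implementation of the naming scheme. The class `FilovirusName` and its field names are hypothetical, and the abbreviation pattern (genus initial plus the first three letters of the host species epithet, a two-digit year, and no host suffix in the shortened form) is inferred only from the examples given here. The variant and isolate abbreviations (e.g., “KiC”, “May”) are supplied by hand rather than derived, because no derivation rule for them is stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilovirusName:
    """Components of a filovirus designation (hypothetical helper class)."""
    virus: str         # e.g. "Ravn virus"
    virus_abbr: str    # e.g. "RAVV"
    host: str          # isolation host, e.g. "H.sapiens"
    suffix: str        # host suffix as used in the text, e.g. "tc" or "wt"
    country: str       # three-letter country code, e.g. "KEN"
    year: int          # four-digit year of sampling
    variant: str       # genetic variant designation, e.g. "Kitum Cave"
    variant_abbr: str  # supplied by hand, e.g. "KiC"
    isolate: str       # isolate designation, e.g. "810040"
    isolate_abbr: str  # supplied by hand, e.g. "810040" or "May"

    def full(self) -> str:
        return (f"{self.virus} {self.host}-{self.suffix}/{self.country}/"
                f"{self.year}/{self.variant}-{self.isolate}")

    def shortened(self) -> str:
        # Assumed pattern: "H.sapiens" -> "H.sap"; the host suffix is dropped.
        genus_initial, epithet = self.host.split(".", 1)
        host_abbr = f"{genus_initial}.{epithet[:3]}"
        return (f"{self.virus_abbr}/{host_abbr}/{self.country}/"
                f"{self.year % 100:02d}/{self.variant_abbr}-{self.isolate_abbr}")

    def abbreviated(self) -> str:
        return f"{self.virus_abbr}/{self.variant_abbr}-{self.isolate_abbr}"


ravv = FilovirusName(
    virus="Ravn virus", virus_abbr="RAVV",
    host="H.sapiens", suffix="tc",
    country="KEN", year=1987,
    variant="Kitum Cave", variant_abbr="KiC",
    isolate="810040", isolate_abbr="810040",
)
```

With these inputs, `ravv.full()`, `ravv.shortened()`, and `ravv.abbreviated()` reproduce the three RAVV names given above.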
A summary of the proposed designations and RefSeq accession numbers can be found in
Final filovirus type viruses/variants/isolates/sequences.
| Filovirus Species | Type Virus of Species (Virus Abbreviation) | Type Variant and Isolate of Type Virus of Species | Type Sequence of Type Variant of Type Virus of Species (RefSeq) |
|---|---|---|---|
| | Bundibugyo virus (BDBV) | Bundibugyo virus H.sapiens-tc/UGA/2007/Butalya-811250 | NC_014373 |
| | Lloviu virus (LLOV) | Lloviu virus M.schreibersii-wt/ESP/2003/Asturias-Bat86¹ | NC_016144 |
| | Marburg virus (MARV) | Marburg virus H.sapiens-tc/KEN/1980/Mt. Elgon-Musoke | NC_001608 |
| | Reston virus (RESTV) | Reston virus M.fascicularis-tc/USA/1989/Philippines89-Pennsylvania | NC_004161 |
| | Sudan virus (SUDV) | Sudan virus H.sapiens-tc/UGA/2000/Gulu-808892 | NC_006432 |
| | Taï Forest virus (TAFV) | Taï Forest virus H.sapiens-tc/CIV/1994/Pauléoula-CI | NC_014372 |
| | Ebola virus (EBOV) | Ebola virus H.sapiens-tc/COD/1976/Yambuku-Mayinga | NC_002549 |
¹ Note that LLOV has not been isolated in culture yet. “Isolate” here refers to the theoretical isolate, the coding sequences of which would correspond to this RefSeq sequence.
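For readers who handle these designations programmatically, the sketch below shows one way the full names in the table above could be split back into the fields of the naming template. It is a minimal illustration under stated assumptions. The pattern `FULL_NAME`, the dictionary `TYPE_SEQUENCES`, and the function `parse_full_name` are hypothetical names introduced here, and the regular expression is fitted only to the seven names in this table (no strain field, a three-letter country code, and a variant and isolate separated by the last hyphen).

```python
import re

# Pattern fitted to the seven full names in the table above (an assumption,
# not an official grammar for filovirus designations).
FULL_NAME = re.compile(
    r"^(?P<virus>.+ virus) "                  # e.g. "Taï Forest virus"
    r"(?P<host>[A-Z]\.[a-z]+)-(?P<suffix>[a-z]+)/"  # e.g. "H.sapiens-tc"
    r"(?P<country>[A-Z]{3})/"                 # e.g. "CIV"
    r"(?P<year>\d{4})/"                       # e.g. "1994"
    r"(?P<variant>.+)-(?P<isolate>[^-/]+)$"   # e.g. "Pauléoula-CI"
)

# Full name -> RefSeq accession, copied from the table above.
TYPE_SEQUENCES = {
    "Bundibugyo virus H.sapiens-tc/UGA/2007/Butalya-811250": "NC_014373",
    "Lloviu virus M.schreibersii-wt/ESP/2003/Asturias-Bat86": "NC_016144",
    "Marburg virus H.sapiens-tc/KEN/1980/Mt. Elgon-Musoke": "NC_001608",
    "Reston virus M.fascicularis-tc/USA/1989/Philippines89-Pennsylvania": "NC_004161",
    "Sudan virus H.sapiens-tc/UGA/2000/Gulu-808892": "NC_006432",
    "Taï Forest virus H.sapiens-tc/CIV/1994/Pauléoula-CI": "NC_014372",
    "Ebola virus H.sapiens-tc/COD/1976/Yambuku-Mayinga": "NC_002549",
}


def parse_full_name(name: str) -> dict:
    """Split a full designation into template fields; raise if it does not fit."""
    match = FULL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized full designation: {name!r}")
    fields = match.groupdict()
    fields["year"] = int(fields["year"])
    return fields


# Parse every table entry and attach its accession number.
PARSED = {
    accession: parse_full_name(name)
    for name, accession in TYPE_SEQUENCES.items()
}
```

Splitting the variant from the isolate at the last hyphen is a design choice that keeps variant designations containing spaces or periods, such as “Mt. Elgon”, intact. It would misparse an isolate designation that itself contains a hyphen.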
We thank Laura Bollinger (IRF-Frederick) for carefully editing the manuscript. The content of this publication does not necessarily reflect the views or policies of the US Department of the Army, the US Department of Defense or the US Department of Health and Human Services or of the institutions and companies affiliated with the authors. J.H.K. performed this work as an employee of Tunnell Government Services, Inc.; M.G.L. as an employee of Lovelace Respiratory Research Institute; and G.G.O. as an employee of MRI Global; all three subcontractors to Battelle Memorial Institute; and J.C.J., J.K., and J.P. performed this work as employees of Battelle Memorial Institute; all under Battelle Memorial Institute’s prime contract with NIAID, under Contract No. HHSN272200700016I. This research was further supported in part by the Intramural Research Program of the NIH, National Library of Medicine (Y.B., O.B., and J.R.B.), and the Intramural Research Program of the NIH, NIAID (T.H.). This work was also funded under Agreement No. HSHQDC-07-C-00020 awarded by the Department of Homeland Security Science and Technology Directorate (DHS/S&T) for the management and operation of the National Biodefense Analysis and Countermeasures Center (NBACC), a Federally Funded Research and Development Center. This work was partially supported by the Defense Threat Reduction Agency. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the US Department of Homeland Security. In no event shall the DHS, NBACC, or Battelle National Biodefense Institute (BNBI) have any responsibility or liability for any use, misuse, inability to use, or reliance upon the information contained herein. The Department of Homeland Security does not endorse any products or commercial services mentioned in this publication.
All authors were engaged in the discussion about the best possible RefSeq virus variants and sequences. The final decisions presented in the paper were reached by consensus or simple majority voting, with the understanding that all authors will apply the final decisions reached by the entire group and enforce them in their functions as authors, peer-reviewers, and/or editors.
The authors declare no conflict of interest.