Proposal for Human Respiratory Syncytial Virus Nomenclature below the Species Level

Human respiratory syncytial virus (HRSV) is the leading viral cause of serious pediatric respiratory disease, and lifelong reinfections are common. Its 2 major subgroups, A and B, exhibit some antigenic variability, enabling HRSV to circulate annually. Globally, research has increased the number of HRSV genomic sequences available. To ensure accurate molecular epidemiology analyses, we propose a uniform nomenclature for HRSV-positive samples and isolates, and HRSV sequences, namely: HRSV/subgroup identifier/geographic identifier/unique sequence identifier/year of sampling. We also propose a template for submitting associated metadata. Universal nomenclature would help researchers retrieve and analyze sequence data to better understand the evolution of this virus.

Human respiratory syncytial virus (HRSV) is the leading cause of severe respiratory illness in children <5 years of age and is associated with substantial illness from lower respiratory tract infections in industrialized countries and substantial illness and death in low- and middle-income countries (1–5). HRSV also causes severe disease among elderly and high-risk adults (6).
In 2016, HRSV was reclassified by the International Committee on Taxonomy of Viruses (ICTV) into a new family, Pneumoviridae; genus, Orthopneumovirus; and species, Human orthopneumovirus (7). The wider availability of viral sequencing technologies has increased submissions of HRSV sequences to databases (Figure 1), a trend we anticipate will continue.
Although ICTV provides nomenclature standards for virus taxa, there is currently no standardized format for HRSV nomenclature below the species level. Given the current interest in both HRSV and database submissions, a standard nomenclature is needed to simplify studies of the genomic diversity of HRSV strains and variants below the species level. ICTV's taxonomic reassignment provides us a timely opportunity to propose a universal naming convention for HRSV strains, sequences, and isolates, including a framework for database submissions that are rich in contextual information and associated metadata.
Several large laboratory HRSV surveillance and epidemiology studies are currently in progress. These studies include the World Health Organization's Global Respiratory Syncytial Virus (WHO RSV) Surveillance Project (https://www.who.int/influenza/rsv), which conducts large-scale testing for HRSV and extensive sequencing of HRSV-positive clinical specimens from >20 countries worldwide. Focused molecular analyses have helped elucidate HRSV household (8) and local (9) transmission dynamics and may guide development of strategies for the control of HRSV transmission. For example, molecular analysis showed that HRSV in healthcare facilities can be acquired from sources within the facility or introduced from the community (10,11).
In temperate climates, annual HRSV epidemics usually occur in winter months; it remains to be seen how social distancing measures and nonpharmaceutical interventions due to the current coronavirus disease (COVID-19) pandemic will affect global HRSV circulation patterns. One of the 2 major genetic and antigenic HRSV subgroups, A or B, usually predominates in alternating years, but both subgroups can also co-circulate in the same season. Early research has shown that subgroup A HRSV is associated with slightly greater clinical severity than subgroup B (12). Disease severity has been correlated with specific strains, genotypes, or clades, but to date, no consistent association has been established between strains (13–15), genotypes, or clades (16–19) and virulence. Thus, a possible role of different HRSV strains in disease severity remains to be elucidated. The lack of standard nomenclature and the scarcity of rich metadata in databases currently limit and complicate such studies.
Reliable and concise nomenclature systems below the species level are available for measles virus, influenza virus, rotavirus, filovirus isolates (20–23), and many other human viral pathogens. A similar nomenclature system tailored to HRSV and its pathology would support the requirements of researchers and the public health community by minimizing information errors when handling, storing, and shipping HRSV samples and when submitting, searching, and displaying sequencing data and associated metadata. Moreover, consistent nomenclature would improve the ability of researchers to pool and analyze data and associated information from different sources. To fill this need, an international group of researchers, in conjunction with the WHO RSV Global Surveillance Project, proposes a concise nomenclature system for HRSV below the species level.

HRSV Subgroups and Genotype Designations: Status and Outlook
HRSV subgroups A and B exhibit genomewide nucleotide and amino acid divergence (Figure 2, panel A) (25,27). The reference sequences for the 2 subgroups are derived from strains HRSV A2 (Figure 2, panel B). F glycoprotein sequences between the 2 subgroups are well conserved (89% aa identity), whereas the G glycoproteins are the most divergent (53% aa identity between the subgroups) among the HRSV proteins (Figure 2, panel A) and undergo continuous molecular evolution. The ectodomain of the G glycoproteins of both subgroups contains a conserved central domain, representing an important antigenic site, flanked by 2 hypervariable domains (33). Except for the central conserved region, the antigenic cross-reactivity between G glycoproteins of the 2 subgroups is low (26). Because the G ORF exhibits the greatest degree of genetic variability between isolates, it is most commonly used for studies on the molecular evolution of HRSV. The genetic variability of HRSV strains over time has been commonly determined by sequencing the distal C-terminal third of the G ORF, which includes the second hypervariable domain. The variability in the G ORF is characterized by a high rate of nonsynonymous nucleotide changes, suggesting that evolution may be driven by immune pressure, even though this factor may be partially antibody independent (34). It is likely that variability in the G protein contributes to the capacity of HRSV to cause yearly outbreaks in the community (35–37). The nomenclature proposal outlined herein will be useful for the sequence analyses required to follow the molecular evolution of HRSV.
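The amino acid identity values quoted above (89% for F, 53% for G) are pairwise percent identities between aligned protein sequences. As a minimal sketch of how such a value can be computed, the Python function below counts matching residues across the columns of an existing pairwise alignment. The function name is hypothetical, the toy sequences are invented and are not HRSV sequences, and excluding gap-containing columns from the denominator is only one of several conventions; published values may have been derived differently.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two sequences that are already aligned.

    Assumption: columns in which either sequence has a gap are skipped,
    so the denominator is the number of gap-free columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides (invented for illustration only).
identity = percent_identity("MKTA-LV", "MKSAQLV")
print(f"{identity:.1f}% identity")
```

In this toy alignment, 6 columns are gap free and 5 of them match, so the function reports roughly 83% identity.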
In a parallel effort, several research groups are working together on a genotyping proposal to provide a consensus on uniform genotype designations (38,39). HRSV genotyping designations will need to capture the present molecular evolutionary status, be adaptable to changes, and be reevaluated periodically by a global consortium.

Nomenclature Proposal for HRSV Strains and Isolates
For molecular epidemiology studies, a concise standard for short identifiers of specific HRSV sequences, suitable for the short definition lines that give context to a sample and its derived sequence, would be useful.
Ideally, concise standardized identifiers should convey key information about each individual sequence in an alignment or phylogram, including source, date, and type, if known. Here, we aim to define this type of common naming convention for HRSV samples and isolates. We also propose using standard names and appropriate annotations for HRSV genes, provide examples to guide the annotation of sequence data during the sequence submission process, and suggest how to submit metadata associated with the source materials of HRSV sequences. Our nomenclature proposal prioritizes a short, concise definition line that is easy to use in the laboratory, easily readable, and uniform for HRSV across public databases. Additional host, virus, location, or temporal information could, if desired, be submitted in metadata fields, which would allow researchers, epidemiologists, and database users to apply specific metadata filters as needed for data retrieval, for specific applications and analyses, or for displaying designations, such as in dendrograms.
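As an illustration of the proposed pattern HRSV/subgroup identifier/geographic identifier/unique sequence identifier/year of sampling, the following Python sketch composes and checks such a name. It is not part of the proposal itself: the function and class names, the restriction of the subgroup to A or B, the 4-digit year, and the rule that identifiers contain no slash are assumptions made for this example, and the sample values are hypothetical.

```python
import re
from typing import NamedTuple


class HRSVName(NamedTuple):
    subgroup: str      # "A" or "B" (assumed to be the only allowed values)
    location: str      # geographic identifier, free text without "/"
    sequence_id: str   # unique sequence identifier, free text without "/"
    year: int          # year of sampling, assumed to be 4 digits


# Assumed pattern: five slash-separated fields, starting with the literal "HRSV".
_NAME_RE = re.compile(
    r"HRSV/(?P<subgroup>[AB])/(?P<location>[^/]+)/(?P<sequence_id>[^/]+)/(?P<year>\d{4})"
)


def parse_name(name: str) -> HRSVName:
    """Split a definition line into its fields, or raise ValueError."""
    match = _NAME_RE.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a valid HRSV name under the assumed pattern: {name!r}")
    return HRSVName(
        match["subgroup"], match["location"], match["sequence_id"], int(match["year"])
    )


def format_name(subgroup: str, location: str, sequence_id: str, year: int) -> str:
    """Compose a name and validate it by parsing it back."""
    name = f"HRSV/{subgroup}/{location}/{sequence_id}/{year}"
    parse_name(name)  # raises ValueError if any field breaks the pattern
    return name


# Hypothetical sample values, for illustration only.
print(format_name("A", "Exampleland", "001", 2019))
print(parse_name("HRSV/B/Exampleland/S-42/2020"))
```

Parsing the name back into separate fields is what allows sequences to be grouped or filtered by subgroup, location, or year directly from the definition line.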

Terminology for Annotations
To support efficient data analysis, uniform designations must be used at the database submission stage.

Metadata for HRSV Sequence Submissions
Which host data are most pertinent will depend on the interests and objectives of individual study groups. For example, when studying HRSV in a pediatric setting, prematurity may be of interest, but when studying HRSV in an adult setting, researchers may be more interested in whether participants are immunocompromised. We suggest the following information for inclusion in metadata fields for HRSV:

Collection date
We highly recommend that the exact date of specimen collection (DD-Mon-YYYY format; e.g., 17-Feb-2002) be used; if exact date is not known, at least the month and year should be indicated (Mon-YYYY format).
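A submission tool could check collection dates against the 2 recommended formats before upload. The following Python sketch is one possible way to do so; the function name and the returned tuple layout are assumptions made for this example. Month abbreviations are matched against a fixed English list rather than through locale-dependent date parsing.

```python
import datetime
import re
from typing import Optional, Tuple

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Optional 2-digit day, 3-letter month abbreviation, 4-digit year.
_DATE_RE = re.compile(r"(?:(\d{2})-)?([A-Z][a-z]{2})-(\d{4})")


def parse_collection_date(text: str) -> Tuple[int, int, Optional[int]]:
    """Return (year, month, day); day is None for the Mon-YYYY format."""
    match = _DATE_RE.fullmatch(text.strip())
    if match is None or match.group(2) not in MONTHS:
        raise ValueError(f"expected DD-Mon-YYYY or Mon-YYYY, got {text!r}")

    day_text, month_text, year_text = match.groups()
    year = int(year_text)
    month = MONTHS.index(month_text) + 1
    if day_text is None:
        return (year, month, None)

    day = int(day_text)
    datetime.date(year, month, day)  # raises ValueError for impossible dates
    return (year, month, day)


print(parse_collection_date("17-Feb-2002"))  # exact date
print(parse_collection_date("Feb-2002"))     # month and year only
```

Returning the day as None keeps month-only dates distinguishable from exact dates, so that downstream analyses do not silently assume a day of collection.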

Genotype according to the consensus in genotype classification by an HRSV working group [in progress (38,39)]
The working group is associated with the International RSV Society, a special interest group of the International Society for Influenza and other respiratory viruses (https://www.isirv.org/site/index.php/special-interest-groups/international-respiratory-syncytial-virus-society).

Metadata on the patient-host and the clinical disease should be included in the notes field in a structured format
Protected personally identifiable health information will be excluded from metadata submissions.
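The proposal does not prescribe a particular structure for the notes field. As one hypothetical possibility, the Python sketch below serializes host and clinical metadata as sorted "key=value" pairs separated by semicolons and drops fields that could identify a patient. All field names, the list of blocked keys, and the separator characters are assumptions made for this example; a real submission pipeline would need to follow the requirements of the target database and of the applicable privacy regulations.

```python
# Hypothetical deny list of keys treated as personally identifiable.
BLOCKED_KEYS = frozenset(
    {"patient_name", "date_of_birth", "medical_record_number", "address"}
)


def build_notes(metadata: dict) -> str:
    """Serialize metadata as 'key=value; key=value', omitting blocked keys.

    Keys are sorted so that the same metadata always yields the same string.
    Empty values are skipped; values containing the separators are rejected.
    """
    parts = []
    for key in sorted(metadata):
        value = metadata[key]
        if key in BLOCKED_KEYS or value is None or value == "":
            continue
        text = str(value)
        if ";" in text or "=" in text:
            raise ValueError(f"value for {key!r} contains a reserved character")
        parts.append(f"{key}={text}")
    return "; ".join(parts)


# Hypothetical record, for illustration only.
record = {
    "prematurity": "yes",
    "age_months": 7,
    "patient_name": "not for submission",
    "immunocompromised": None,
}
print(build_notes(record))
```

A deny list only removes known identifying fields; an allow list of approved keys would be the more conservative design if the set of permitted metadata fields is fixed in advance.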

Outlook
Molecular surveillance has revealed that multiple HRSV genotypes circulate simultaneously in communities. Circulating genotypes often vary between communities, and circulation patterns within a community can change from year to year. Extended monitoring of circulating viruses is necessary to better understand transmission and molecular evolution (42). As HRSV vaccine candidates and antivirals are being developed, molecular epidemiology studies may reveal potential effects of prevention strategies on viral evolution and possible antibody-escape variants. Timely sharing of HRSV data worldwide through the use of public databases is essential. We propose that sequence data be uploaded to publicly accessible databases, such as NCBI (31). Although NCBI is the most complete repository for HRSV sequence information, studies may require that sequences first be submitted to other databases, such as GISAID (https://www.gisaid.org).