Skip directly to site content Skip directly to page options Skip directly to A-Z link Skip directly to A-Z link Skip directly to A-Z link

Disclaimer: Early release articles are not considered as final versions. Any changes will be reflected in the online version in the month the article is officially released.

Volume 27, Number 5—May 2021

Monitoring SARS-CoV-2 Circulation and Diversity through Community Wastewater Sequencing, the Netherlands and Belgium

Ray Izquierdo-Lara, Goffe Elsinga, Leo Heijnen, Bas B. Oude Munnink, Claudia M.E. Schapendonk, David Nieuwenhuijse, Matthijs Kon, Lu Lu, Frank M. Aarestrup, Samantha Lycett, Gertjan Medema1, Marion P.G. Koopmans1, and Miranda de Graaf1Comments to Author 
Author affiliations: Erasmus University Medical Center, Rotterdam, the Netherlands (R. Izquierdo-Lara, B.B. Oude Munnink, C.M.E. Schapendonk, D. Nieuwenhuijse, M. Kon, M.P.G. Koopmans, M. de Graaf); KWR Water Research Institute, Nieuwegein, the Netherlands (G. Elsinga, L. Heijnen, G. Medema); University of Edinburgh, Edinburgh, Scotland, UK (L. Lu, S. Lycett); Technical University of Denmark, Kongens Lyngby, Denmark (F.M. Aarestrup)

Suggested citation for this article


Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has rapidly become a major global health problem, and public health surveillance is crucial to monitor and prevent virus spread. Wastewater-based epidemiology has been proposed as an addition to disease-based surveillance because virus is shed in the feces of ≈40% of infected persons. We used next-generation sequencing of sewage samples to evaluate the diversity of SARS-CoV-2 at the community level in the Netherlands and Belgium. Phylogenetic analysis revealed the presence of the most prevalent clades (19A, 20A, and 20B) and clustering of sewage samples with clinical samples from the same region. We distinguished multiple clades within a single sewage sample by using low-frequency variant analysis. In addition, several novel mutations in the SARS-CoV-2 genome were detected. Our results illustrate how wastewater can be used to investigate the diversity of SARS-CoV-2 viruses circulating in a community and identify new outbreaks.

Since its discovery, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused >100 million confirmed cases of coronavirus disease (COVID-19). The global effects of SARS-CoV-2 and the need to learn more about its origin and epidemiology have resulted in the sequencing of >416,000 genomes as of January 2021 (1). This work has enabled the identification of groups of viruses that, on the basis of their genetic diversity, can be associated with geographic and temporal patterns of virus spread (2). Nextstrain ( currently divides SARS-CoV-2 diversity into 12 major global clades (19A, 19B, and 20A–20J), on the basis of high prevalence, signature mutations, and geographic spread (3).

Although SARS-CoV-2 primarily affects respiratory tract tissues, it can also replicate in the gastrointestinal tract, as evidenced by in vitro infection of enteroids (4), presence of viral proteins in gastrointestinal epithelium biopsy specimens (5), and detection of infectious virus in stool samples (6). Viral RNA is shed in the feces of ≈40% of infected persons, often for longer periods than the virus can be detected in nasal swab specimens. Detection of SARS-CoV-2 RNA in urine has been observed occasionally (<5% of infected patients) (79).

Because of the rapid spread of SARS-CoV-2, individual screening of clinical cases and study of viral diversity on a population level are challenging. Various reports have demonstrated that enteric and respiratory viruses can be detected in wastewater (1018). This finding has led to the recognition of wastewater-based epidemiology as a potentially valuable tool to assess the spread of the disease at a community level. Recently, the Water Research Institute in the Netherlands and other groups have demonstrated temporal correlations between SARS-CoV-2 RNA titers in sewage and the number of reported cases in a city or county when >26 gene copies per liter could be detected (14,1921). Therefore, sewage testing is currently considered globally to be an adjunct to patient-based surveillance and demonstrates promise as an early warning indicator of increasing virus circulation.

Enhanced surveillance is a key pillar of the current strategy to control the spread of SARS-CoV-2 and includes frequently testing mildly symptomatic persons, investigating infection clusters to identify possible common exposures, and monitoring hospital admission rates. Whole-genome sequencing of SARS-CoV-2 from clinical samples has been adopted as an additional tool to identify clusters. Particularly in geographic areas with minimal virus circulation, sequencing can help identify possible sources, provided that sufficient background sequencing has been performed. So far, little work has been done to correlate SARS-CoV-2 diversity in sewage samples with diversity in patients (22,23). We used next-generation sequencing (NGS) of SARS-CoV-2 from wastewater samples to assess whether these samples reflect the diversity of SARS-CoV-2 circulating within the population of the Netherlands and Belgium.


Sample Preparation

Wastewater specimens were collected as 24-h flow-dependent composite samples and processed as previously described (14). Debris of 100–200 mL of sewage samples was pelleted and the supernatant was concentrated by using 100 kDa Centricon ultrafilters (Millipore Sigma,; in vitro–transcribed dengue virus type-2 RNA was added as an internal extraction control. RNA was extracted by using the Nuclisens kit (bioMérieux, and KingFisher purification system (Thermo Fisher Scientific, (14). RNA was screened by quantitative reverse transcription PCR (qRT-PCR) with 5 primer–probe sets targeting the SARS-CoV-2 nucleocapsid (N) gene (N1–N3) (24), envelope (E) gene for all sarbecoviruses (25), and the internal control.


We performed SARS-CoV-2–specific multiplex PCR for nanopore sequencing as described previously (26). Primers for 89 overlapping amplicons spanning the genome were used in 2 PCR pools. Libraries were generated by using the Oxford Nanopore native barcode kits (Oxford Nanopore Technologies, and sequenced on a R9.4 flow cell.

Illumina sequencing was performed as described previously (27). Amplicons were generated by the multiplex PCR described previously. Amplicons were purified with 0.8X AMPure XP beads (Beckman Coulter, and 100 ng of DNA was converted into paired-end Illumina sequencing libraries by using the KAPA HyperPlus library preparation kit (Roche, We used the KAPA Unique Dual-Indexed Adapters Kit (Roche) to enable subsequent sequencing of multiple libraries in a single Illumina MiSeq version 3 flowcell (2 × 300 cycles) (Illumina,

Nanopore Sequence Analysis

Raw sequence data were processed as previously described (26). We used a snakemake script to demultiplex fastq raw reads by using Porechop (, to trim primers by using Cutadapt (28), and to perform a reference-based alignment by using minimap2 to GISAID sequence EPI_ISL_412973 ( The run was monitored by using RAMPART ( The consensus genome was extracted by using 2 analyses for which positions with a coverage <10X or <30X were replaced with an N. We confirmed mutations in the genome by manually checking the alignment in Ugene (29) and resolved homopolymeric regions by consulting reference genomes. On the basis of previous studies (30), we considered mutations with >30X coverage high quality, whereas mutations >10X and <30X coverage were considered low quality.

Illumina Sequence Analysis

We used a customized Galaxy workflow (31) for all processing, reference-based alignment, and variant analysis.Raw sequencing reads were filtered by using Fastp (32) to remove adaptor contamination, ambiguous bases, low quality reads (Phred score <30), and fragments <50 nt. Reads were mapped against GISAID sequence EPI_ISL_412973 by using the default settings of BWA-MEM (H. Li, unpub. data, Reads were realigned by using the leftalign utility from FreeBayes (E. Garrison, unpub. data, All reads with mapping scores of <30 were discarded. Consensus sequences and variants were generated by using iVar (33). Final consensus sequences (frequency >50%) were constructed by using all mapped reads with a coverage of >5X and Phred score of >30. For detection of low-frequency variants (LFVs), we parameters as follows: minimum coverage of 50X, Phred score >30, and a minimum frequency threshold of 10%. Variant calling was confirmed by manual inspection of the aligned reads in Ugene (29). Variant positions are given with respect to the Wuhan-Hu-1 strain (MN908947) (34). We uploaded all consensus sequences with coverage >50% to GISAID (accession nos. EPI_ISL_539300–25).

Phylogenetic Analysis

The first dataset included all full-length SARS-CoV-2 genomes from the Netherlands (1,544 genomes) and Belgium (888 genomes) from GISAID as of July 8, 2020. The second dataset was a subsample representative of the global diversity of all SARS-CoV-2 sequences in GISAID as of June 1, 2020. This global dataset contained 2,552 subsampled sequences (full length with Ns <5%) to include 1 unique genome per country or state per week. We aligned sequences with >75% genome coverage by using MAFFT ( and inferred maximum-likelihood trees by using the best predicted models general time-reversible plus F plus R3 (global subsample) and general time-reversible plus F plus R2 (Netherlands–Belgium dataset) and bootstrap with 1,000 replicates. Trees were visualized by using Figtree version 1.4.4 ( Clades were assigned by using the Nextclade tool.


Correlation between qRT-PCR and Percentage of Genome Recovered

Previously, sewage samples collected from 6 locations in the Netherlands (and Schiphol Airport) were tested by qRT-PCR to investigate the levels of SARS-CoV-2 RNA (14). To further investigate the genetic diversity of SARS-CoV-2, we subjected 55 wastewater samples obtained from 13 locations in the Netherlands (48 samples) and 7 locations in Belgium (7 samples) with cycle threshold (Ct) values of <36 to whole-genome sequencing by using nanopore technology. The wastewater treatment plants in the Netherlands served ≈200,000–980,000 inhabitants; Schiphol was estimated to serve 54,000 persons (14). The samples covered a period of 70 days (March 25–June 3, 2020); of all 55 samples, 2 (Franeker-92719 and AmsterdamWest-92852) were sequenced by nanopore twice. Of the 55 samples, 24 were also sequenced by Illumina (Table 1).

Figure 1

Quantitative reverse transcription PCR Ct of severe acute respiratory syndrome coronavirus 2 RNA in sewage samples as determined by N gene (N1–N3) and E gene assays against the percentage of genome covered (>10×) by nanopore reads, the Netherlands and Belgium. A) N1 gene; B) N2 gene; C) N3 gene; D) E gene. Ct, cycle threshold.

Figure 1. Quantitative reverse transcription PCR Ctof severe acute respiratory syndrome coronavirus 2 RNA in sewage samples as determined by N gene (N1–N3) and E gene assays against the percentage...

We used 4 primer–probe sets targeting the N (N1–N3) genes and E gene to evaluate the concentration of SARS-CoV-2 in sewage samples (Table 1) (14). The percentage of the genome covered by the assembly of nanopore reads (>10X coverage) ranged from 0% to 99.2%. We found an inverse sigmoidal correlation between the percentage of the genome assembled from nanopore sequencing reads and the N and E gene Ct values (Figure 1). The Ct values at which half of the genome could be obtained were 34.6 for N1, 33.8 for N2, 33.2 for N3, and 32.5 for E. No correlation was observed between Ct values and the percentage of the genome assembled from Illumina reads (Appendix Figure 1).

Consensus Sequences

Figure 2

Phylogenetic analysis of severe acute respiratory syndrome coronavirus 2 genome consensus sequences detected in sewage samples, the Netherlands and Belgium. A) The Netherlands subsample dataset; B) global subsample dataset. Lines with dots in green indicate samples sequenced in this study. Clades (19A, 19B, 20A, 20B, and 20C) were assigned by using the Nextclade tool ( For the global subsample tree, samples in orange indicate the Netherlands sequences. Samples in purple indicate Belgium sequences. Scale bars indicate inferred number of nucleotide substitutions per site.

Figure 2. Phylogenetic analysis of severe acute respiratory syndrome coronavirus 2 genome consensus sequences detected in sewage samples, the Netherlands and Belgium. A) The Netherlands subsample dataset; B) global subsample dataset. Lines...

Figure 3

Phylogenetic trees showing 4 low-frequency variants detected in sewage samples in study of severe acute respiratory syndrome coronavirus 2 circulation and diversity through community wastewater sequencing, the Netherlands and Belgium. A) The Netherlands–Belgium subsample; B) global subsample. Patient sequences containing the mutation are shown in magenta. Lines in green indicate sewage samples sequenced in this study. Clades (19A, 19B, 20A, 20B, and 20C) are indicated in colors at the left of the figure. Blue arrows show the consensus sequences (if available) of the sewage samples in which the low-frequency variant was detected. Scale bars indicate the inferred number of nucleotide substitutions per site.

Figure 3. Phylogenetic trees showing 4 low-frequency variants detected in sewage samples in study of severe acute respiratory syndrome coronavirus 2 circulation and diversity through community wastewater sequencing, the Netherlands and Belgium....

We performed phylogenetic analysis to assess whether consensus sequences from sewage could be associated with clinical samples from the same region. A total of 22 genomes (20 from nanopore and 2 from Illumina runs) with a coverage >75% of the genome were obtained from 20 samples. We used these sequences to infer a maximum-likelihood tree using all sequences from the Netherlands and Belgium available in GISAID and a maximum-likelihood tree using a subset representative of the global diversity of SARS-CoV-2 in GISAID. In general, the sequences from the Netherlands and Belgium grouped into 5 clades (Figure 2, panel A), and most of the sequences belonged to clade 20A (52.0% for the Netherlands and 47.7% for Belgium). The clades 19B and 20C were less prevalent; 8.9% of sequences from the Netherlands belonged to 19B and 1.2% to 20C, whereas 10.4% of Belgium sequences belonged to 19B and 0.3% to 20C. Both trees showed that sewage samples grouped within clades 19A, 20A, and 20B (Figure 2). Samples Franeker-92719 and HeeswijkDinther-92499 clustered with sequences isolated from patients from the same region (Figure 2, panel A), indicating that sewage samples can be linked to specific outbreaks. Included in the phylogenetic trees were 2 samples with 2 consensus sequences (AmsterdamWest-92852 and Franeker-92719), which demonstrated 2-mutation differences between consensus sequences of the same sample (Appendix Table 1). Despite this discrepancy, consensus sequences from the same sample clustered within the same clade (Appendix Figures 2, 3). Some sequences clustered close to the root of the tree, probably because of the presence of multiple strains within 1 sample, which resulted in a combination of mutations in their consensus sequences.

To associate samples with a particular clade or cluster, we compared all consensus sequences, including partial sequences, with the Wuhan-Hu-1 reference isolate. A total of 145 single-nucleotide polymorphisms (SNPs) were detected in our dataset (Appendix Table 1). Of these, 24 SNPs were detected in >1 sequence. We also detected SNPs in the Netherlands sewage sequences with a geographic regional signal, which were present in the Netherlands clinical samples at much higher frequencies than in global or Belgium clinical samples, such as T514C and C1594T (Appendix Table 2).

Finding clade-defining mutations in the consensus sequence suggests the dominance of a certain clade within a sample; the presence of these mutations can also aid in the detection of virus mixtures in a sample. During the period of wastewater-sample collection, Nextstrain defined 5 major clades (19A, 19B, 20A, 20B, and 20C). Each clade is defined by the presence of >2 linked mutations. Clade 19A is the root clade and contains the Wuhan-Hu-1 reference sequence. Both 19B and 20A emerged from 19A, where 2 and 3 linked mutations define these major clades: T28144C and C8782T define 19B; and C3037T, C14408T, and A23403G define 20A. Clades 20B and 20C emerged from 20A, where the trinucleotide substitution GGG28881–28883AAC defines 20B and the linked mutations C1059T and G25563T define 20C. Nucleotide substitution A23403G, a signature mutation of clades 20A, 20B, and 20C that generates the D614G amino-acid substitution in the S glycoprotein, was detected in 83.6% (51/61) of the samples that were sequenced at this region (Appendix Table 1). The GGG28881–28883AAC substitution was detected in 41.9% (18/43) of the sequences. One of the 2 mutations defining the low-prevalence clades 20C and 19B (C1059T and T28144C) was found in 2 and 3 consensus sequences. However, these sequences could not be assigned to these clades because regions containing the additional clade-defining mutations were not sequenced with sufficient coverage. The hCoV-19/env/Netherlands/Amersfoort-92503-N/2020 sequence contained a mix of clade-defining mutations: C1059T, which defines 20C; T28144C, which defines 19B; and GGG28881–28883AAC, which defines 20B. This finding indicates that the obtained consensus sequence does not represent a single strain.

In addition to the clade-defining mutations, we detected 49 and 63 SNPs that were not present in either the Netherlands (1,544 sequences) or Belgium (888 sequences) datasets but were seen in the global dataset (55,074 sequences), although with <1% prevalence (Appendix Table 2). Moreover, we detected 51 novel mutations in sewage consensus sequences that were not previously reported, of which 48 were supported by coverage above the thresholds set for high quality (coverage >30× for Nanopore and coverage >5× and Phred score >30 for Illumina). Discrepancies between consensus sequences of the same sewage sample can occur. AmsterdamWest-92852 was sequenced 3 times and 4 positions varied (Appendix Table 1). These differences are explained by the presence of variant sites in a single sample in similar percentages, which resulted in differences in consensus sequences between sequencing runs.

LFV Analysis

Given that sewage samples are likely to contain a mixture of SARS-CoV-2 strains, we performed a variant analysis with Illumina data to distinguish multiple strains within single samples. By using a coverage >50×, Phred score >30, and a frequency threshold of >10% as settings, we found 21 positions with at least 1 sample containing major and minor variants (Table 2). Of these, 14 mutations resulted in changes at the amino acid level (12 nonsynonymous mutations and 2 deletions). Of note, 8 of these (4497C, 10514C, 11484T, 13046A, 16538_16540delATA, 16777T, 16823T, and 28736A) are novel mutations that did not appear in the Netherlands–Belgium or global datasets. The other 7 variants appeared but demonstrated low prevalence in both datasets (0.002%–0.130%). The most prominent of these was the 28139A mutation in a wastewater sample from March, which was detected in only 4 sequences worldwide and demonstrated both a strong temporal (all detected in March 2020) and regional signal (2 sequences from the Netherlands [EPI_ISL_422640 and EPI_ISL_422880], 1 from Denmark [EPI_ISL_444879], and 1 from Belgium [EPI_ISL_458209]).

Finally, 4 variants (1440A, 11083T, 11109T, and 24862G) appeared at higher levels in both datasets (>0.5%); 11109T and 24862G were 28.5 and 14.3 times more prevalent in the Netherlands dataset than in the global dataset (Table 2). The other variants appeared at similar frequencies in all datasets.

In addition to consensus sequences, LFV analysis is of value in the identification of potential local outbreaks. This identification could be achieved by detecting cluster-defining mutations that are associated with sequences from a particular geographic area. To associate the presence of a minor variant to sequences belonging to unique clusters, we mapped the 4 most prevalent LFVs onto the Netherlands–Belgium subsample and global subsample phylogenetic trees (Figure 3). For 3 variants (1440A, 11109T, and 24862G), the presence of the mutation and their clustering on the phylogenies were clearly associated. However, when 1 of these 3 variants was detected as an LFV in a sewage sample, the consensus sequence of this sample did not group with the cluster of clinical samples that contains the variant. For example, the 24862G variant in sample Tilburg-94339 was detected in 2 unique clusters within clade 20A, whereas its consensus sequence (hCoV-19/env/Netherlands/Tilburg-94339-I/2020) clustered within clade 20B, suggesting the presence of both clades in this sample. Although mutation 11083T was most prevalent in clade 19A, it was also scattered along the trees, indicating poor association with a particular clade.


The use of wastewater sampling as a tool to learn more about the epidemiology and diversity of SARS-CoV-2 at a community level offers many advantages over human sampling. Sewage samples are relatively easy to collect, sampling bias toward severe cases does not occur, ethical issues are limited, and potentially fewer samples are required to determine temporal changes of viral infections in the community (35,36). Nevertheless, comprehensive comparisons with clinical surveillance are required to determine the extent and limits of using sewage as a surveillance or early-warning tool.

We used nanopore and Illumina NGS analysis to study the diversity of SARS-CoV-2 in sewage and compared these results to the viral diversity found in clinical samples. To evaluate this diversity in a comprehensive fashion, we used the Nextstrain clade classification system because it is based on the use of signature mutations to assign sequences to a clade (3), enabling the association of SNPs or LFV to a particular clade, especially for genome sequences with <75% coverage.

Our method enabled us to obtain complete or near-complete genomes from wastewater samples with Ct values of >5 Cts below the limit of detection and partial genomes for samples with higher Ct values. To increase the percentage of genome covered, a threshold of 10× coverage per position was used to generate consensus sequences from nanopore reads. The error rate with this threshold is <0.03%, and most of the mutations (132/145) listed have a coverage of >30×, which produces an error rate of 1/585,000 nt (30).

Of note, we found sewage samples that clustered with sequences isolated from patients of the same region and LFV with a strong regional signal. In a recent study from the United States, wastewater contained SARS-CoV-2 genomes identical to those in clinical samples from the same region (37). Sewage samples can contain a mixture of SARS-CoV-2 viruses, which can be an indication of multiple viruses circulating within a community and perhaps in domestic and livestock animals (3842). We applied a targeted amplification method and thus did not assess the presence of other viruses. Consensus sequence genomes from a wastewater sample can identify the predominant virus strain in a population, which is suitable for locations with few introductions of the virus (22,23). However, this approach is not appropriate for a population in which multiple virus strains are circulating in parallel. Moreover, it might lead to artificial consensus genomes that do not represent an existing virus.

NGS analysis can unravel the diversity of viruses within a complex sample such as wastewater, particularly by using unbiased sequencing of the sewage virome (43). Nevertheless, the detection of variants of a virus in a single sample can be challenging because of the relatively low number of reads obtained for each virus. Targeted amplification and NGS of a small genome region of the virus of interest to determine the prevalence of virus variants within a single wastewater sample is more sensitive and less expensive; use of this approach has been reported for enteroviruses, human mastadenoviruses, and noroviruses (12,18,44). Because the diversity of SARS-CoV-2 is still limited, however, this approach would not be useful since no single small piece of the genome can reliably differentiate between clades or lineages. However, we demonstrated that some LFVs and SNPs can be linked to particular clusters or clades within trees without the need for a complete genome. To confidently determine the presence of a particular cluster within a sample, at least 2 LFVs associated with the cluster should be present at substantial levels. Furthermore, variant analysis can also be used to monitor the prevalence of biologically relevant mutations, such as D614G, which has been shown to increase infectivity in vitro (45) and might be associated with higher transmission and death rates (46; M. Cortey, unpub. data, Within our dataset, clear temporal changes in the prevalence of LFVs or SNPs in sewage samples that correlated with changes in the clinical dataset were not detected during the first wave.

The combination of whole-genome sequencing of clinical samples with epidemiologic data is vital for public health decision-making (26) because it helps identify clusters of infection, new introductions of virus, and the expansion and decline of circulating strains. Cities with large numbers of visitors are expected to experience several introductions of the virus, whereas the opposite is expected for cities with low numbers of visitors. The use of NGS analysis of sewage samples to evaluate viral diversity within a geographic area and its changes over time can aid in decision-making. For example, in scenarios in which a large increase of viral diversity is detected in sewage, suggesting new introductions of virus, appropriate measures can be taken.

Wastewater can also be used to monitor novel mutations. Our consensus and LFV analyses revealed 57 mutations that were not seen in the global database. These novel mutations might not have been detected for several reasons: they represent technical errors; the mutations did not stay within the population; or the mutations are associated with asymptomatic or mild disease, viruses from animal hosts, enteric shedding, or defective genomes. The presence of defective genomes has previously been suggested for the detection of LFVs that generate stop codons in clinical samples (47). Phenotypic studies could help determine the likelihood and biologic relevance of these novel mutations.

In conclusion, this study illustrates the value of NGS analysis of wastewater to approximate the diversity of SARS-CoV-2 circulating in a community. Sequencing of wastewater samples could be a powerful tool to complement clinical surveillance or could be used independently in settings in which wide clinical sequencing is unfeasible. In addition, in-depth NGS analysis of wastewater samples can help in assessing changes in viral diversity, which can indicate the emergence of epidemiologically or clinically relevant mutations and thereby aid public health decision-making.

Mr. Izquierdo-Lara is a PhD student in the Department of Viroscience, Erasmus University Medical Center, Rotterdam, the Netherlands. His research interests are chronic norovirus infections in immunocompromised patients and virus evolution and emergence.



We thank the Water Authorities in the Netherlands (Aa en Maas, Amstel Gooi en Vecht, Delfland, De Dommel, Fryslan, Hollands Noorderkwartier, Stichtse Rijnlanden, Vallei en Veluwe, Evides, Waternet) and Belgium (Aquafin, De Watergroep) for the provision of the sewage samples. We thank Pelle van der Wal for his help with the Illumina MiSeq runs. We gratefully acknowledge the authors originating and submitting laboratories of the global sequences from the GISAID EpiCoV Database (1), on which this research is based.

This work was supported by the European Union’s Horizon H2020 grants VEO (grant no. 874735) and METASTAVA (grant no. 773830), the Erasmus MC foundation, and the Adessium Foundation.



  1. Shu  Y, McCauley  J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveill. 2017;22:30494. DOIPubMed
  2. Rambaut  A, Holmes  EC, O’Toole  Á, Hill  V, McCrone  JT, Ruis  C, et al. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nat Microbiol. 2020;5:14037. DOIPubMed
  3. Hadfield  J, Megill  C, Bell  SM, Huddleston  J, Potter  B, Callender  C, et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34:41213. DOIPubMed
  4. Lamers  MM, Beumer  J, van der Vaart  J, Knoops  K, Puschhof  J, Breugem  TI, et al. SARS-CoV-2 productively infects human gut enterocytes. Science. 2020;369:504. DOIPubMed
  5. Xiao  F, Tang  M, Zheng  X, Liu  Y, Li  X, Shan  H. Evidence for gastrointestinal infection of SARS-CoV-2. Gastroenterology. 2020;158:18311833.e3. DOIPubMed
  6. Xiao  F, Sun  J, Xu  Y, Li  F, Huang  X, Li  H, et al. Infectious SARS-CoV-2 in feces of patient with severe COVID-19. Emerg Infect Dis. 2020;26:19202. DOIPubMed
  7. Parasa  S, Desai  M, Thoguluva Chandrasekar  V, Patel  HK, Kennedy  KF, Roesch  T, et al. Prevalence of gastrointestinal symptoms and fecal viral shedding in patients with coronavirus disease 2019: a systematic review and meta-analysis. JAMA Netw Open. 2020;3:e2011335. DOIPubMed
  8. Jones  DL, Baluja  MQ, Graham  DW, Corbishley  A, McDonald  JE, Malham  SK, et al. Shedding of SARS-CoV-2 in feces and urine and its potential role in person-to-person transmission and the environment-based spread of COVID-19. Sci Total Environ. 2020;749:141364. DOIPubMed
  9. Foladori  P, Cutrupi  F, Segata  N, Manara  S, Pinto  F, Malpei  F, et al. SARS-CoV-2 from faeces to wastewater treatment: What do we know? A review. Sci Total Environ. 2020;743:140444. DOIPubMed
  10. Strubbia  S, Phan  MVT, Schaeffer  J, Koopmans  M, Cotten  M, Le Guyader  FS. Characterization of norovirus and other human enteric viruses in sewage and stool samples through next-generation sequencing. Food Environ Virol. 2019;11:4009. DOIPubMed
  11. Hellmér  M, Paxéus  N, Magnius  L, Enache  L, Arnholm  B, Johansson  A, et al. Detection of pathogenic viruses in sewage provided early warnings of hepatitis A virus and norovirus outbreaks. Appl Environ Microbiol. 2014;80:677181. DOIPubMed
  12. Bisseux  M, Colombet  J, Mirand  A, Roque-Afonso  A-M, Abravanel  F, Izopet  J, et al. Monitoring human enteric viruses in wastewater and relevance to infections encountered in the clinical setting: a one-year experiment in central France, 2014 to 2015. Euro Surveill. 2018;23:1700237. DOIPubMed
  13. Wang  W, Xu  Y, Gao  R, Lu  R, Han  K, Wu  G, et al. Detection of SARS-CoV-2 in different types of clinical specimens. JAMA. 2020;323:18434. DOIPubMed
  14. Medema  G, Heijnen  L, Elsinga  G, Italiaander  R, Brouwer  A. Presence of SARS-coronavirus-2 RNA in sewage and correlation with reported COVID-19 prevalence in the early stage of the epidemic in the Netherlands. Environ Sci Technol Lett. 2020;7:5116. DOI
  15. Heijnen  L, Medema  G. Surveillance of influenza A and the pandemic influenza A (H1N1) 2009 in sewage and surface water in the Netherlands. J Water Health. 2011;9:43442. DOIPubMed
  16. Patel  JC, Diop  OM, Gardner  T, Chavan  S, Jorba  J, Wassilak  SGF, et al. Surveillance to track progress toward polio eradication—worldwide, 2017–2018. MMWR Morb Mortal Wkly Rep. 2019;68:3128. DOIPubMed
  17. Suffredini  E, Iaconelli  M, Equestre  M, Valdazo-González  B, Ciccaglione  AR, Marcantonio  C, et al. Genetic diversity among genogroup II noroviruses and progressive emergence of GII.17 in wastewaters in Italy (2011–2016) revealed by next-generation and Sanger sequencing. Food Environ Virol. 2018;10:14150. DOIPubMed
  18. Fumian  TM, Fioretti  JM, Lun  JH, Dos Santos  IAL, White  PA, Miagostovich  MP. Detection of norovirus epidemic genotypes in raw sewage using next generation sequencing. Environ Int. 2019;123:28291. DOIPubMed
  19. Wu  F, Zhang  J, Xiao  A, Gu  X, Lee  WL, Armas  F, et al. SARS-CoV-2 titers in wastewater are higher than expected from clinically confirmed cases. mSystems. 2020;5:e0061420. DOIPubMed
  20. Randazzo  W, Truchado  P, Cuevas-Ferrando  E, Simón  P, Allende  A, Sánchez  G. SARS-CoV-2 RNA in wastewater anticipated COVID-19 occurrence in a low prevalence area. Water Res. 2020;181:115942. DOIPubMed
  21. Wurtzer  S, Marechal  V, Mouchel  JM, Maday  Y, Teyssou  R, Richard  E, et al. Evaluation of lockdown effect on SARS-CoV-2 dynamics through viral genome quantification in waste water, Greater Paris, France, 5 March to 23 April 2020. Euro Surveill. 2020;25:2000776. DOIPubMed
  22. Rimoldi  SG, Stefani  F, Gigantiello  A, Polesello  S, Comandatore  F, Mileto  D, et al. Presence and infectivity of SARS-CoV-2 virus in wastewaters and rivers. Sci Total Environ. 2020;744:140911. DOIPubMed
  23. Nemudryi  A, Nemudraia  A, Wiegand  T, Surya  K, Buyukyoruk  M, Cicha  C, et al. Temporal detection and phylogenetic assessment of SARS-CoV-2 in municipal wastewater. Cell Rep Med. 2020;1:100098. DOIPubMed
  24. 2019-novel coronavirus (2019-nCoV) real-time rRT-PCR panel primers and probes [cited 2020 Jul 23].
  25. Corman  VM, Landt  O, Kaiser  M, Molenkamp  R, Meijer  A, Chu  DK, et al. Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR. Euro Surveill. 2020;25:2000045. DOIPubMed
  26. Oude Munnink  BB, Nieuwenhuijse  DF, Stein  M, O’Toole  Á, Haverkate  M, Mollers  M, et al.; Dutch-Covid-19 response team. Rapid SARS-CoV-2 whole-genome sequencing and analysis for informed public health decision-making in the Netherlands. Nat Med. 2020;26:140510. DOIPubMed
  27. Richard  M, Kok  A, de Meulder  D, Bestebroer  TM, Lamers  MM, Okba  NMA, et al. SARS-CoV-2 is transmitted via contact and via the air between ferrets. Nat Commun. 2020;11:3496. DOIPubMed
  28. Martin  M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17:102. DOI
  29. Okonechnikov  K, Golosova  O, Fursov  M; UGENE team. Unipro UGENE: a unified bioinformatics toolkit. Bioinformatics. 2012;28:11667. DOIPubMed
  30. Oude Munnink  BB, Nieuwenhuijse  DF, Sikkema  RS, Koopmans  M. Validating whole genome nanopore sequencing, using Usutu virus as an example. J Vis Exp. 2020;157:e60906. DOIPubMed
  31. Afgan  E, Baker  D, Batut  B, van den Beek  M, Bouvier  D, Cech  M, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res. 2018;46(W1):W53744. DOIPubMed
  32. Chen  S, Zhou  Y, Chen  Y, Gu  J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i88490. DOIPubMed
  33. Grubaugh  ND, Gangavarapu  K, Quick  J, Matteson  NL, De Jesus  JG, Main  BJ, et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol. 2019;20:8. DOIPubMed
  34. Wu  F, Zhao  S, Yu  B, Chen  Y-M, Wang  W, Song  Z-G, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:2659. DOIPubMed
  35. Farkas  K, Hillary  LS, Malham  SK, McDonald  JE, Jones  DL. Wastewater and public health: the potential of wastewater surveillance for monitoring COVID-19. Curr Opin Environ Sci Health. 2020;17:1420. DOIPubMed
  36. Michael-Kordatou  I, Karaolia  P, Fatta-Kassinos  D. Sewage analysis as a tool for the COVID-19 pandemic response and management: the urgent need for optimised protocols for SARS-CoV-2 detection and quantification. J Environ Chem Eng. 2020;8:104306. DOIPubMed
  37. Crits-Christoph  A, Kantor  RS, Olm  MR, Whitney  ON, Al-Shayeb  B, Lou  YC, et al. Genome sequencing of sewage detects regionally prevalent SARS-CoV-2 variants. mBio. 2021;12:e0270320. DOIPubMed
  38. Mykytyn  AZ, Lamers  MM, Okba  NMA, Breugem  TI, Schipper  D, van den Doel  PB, et al. Susceptibility of rabbits to SARS-CoV-2. Emerg Microbes Infect. 2021;10:17. DOIPubMed
  39. Oreshkova  N, Molenaar  RJ, Vreman  S, Harders  F, Oude Munnink  BB, Hakze-van der Honing  RW, et al. SARS-CoV-2 infection in farmed minks, the Netherlands, April and May 2020. Euro Surveill. 2020;25:2001005. DOIPubMed
  40. Halfmann  PJ, Hatta  M, Chiba  S, Maemura  T, Fan  S, Takeda  M, et al. Transmission of SARS-CoV-2 in domestic cats. N Engl J Med. 2020;383:5924. DOIPubMed
  41. Schlottau  K, Rissmann  M, Graaf  A, Schön  J, Sehl  J, Wylezich  C, et al. SARS-CoV-2 in fruit bats, ferrets, pigs, and chickens: an experimental transmission study. Lancet Microbe. 2020;1:e21825. DOIPubMed
  42. Shi  J, Wen  Z, Zhong  G, Yang  H, Wang  C, Huang  B, et al. Susceptibility of ferrets, cats, dogs, and other domesticated animals to SARS-coronavirus 2. Science. 2020;368:101620. DOIPubMed
  43. Nieuwenhuijse  DF, Oude Munnink  BB, Phan  MVT, Munk  P, Venkatakrishnan  S, Aarestrup  FM, et al.; Global Sewage Surveillance project consortium. Setting a baseline for global urban virome surveillance in sewage. Sci Rep. 2020;10:13748. DOIPubMed
  44. Lun  JH, Crosbie  ND, White  PA. Genetic diversity and quantification of human mastadenoviruses in wastewater from Sydney and Melbourne, Australia. Sci Total Environ. 2019;675:30512. DOIPubMed
  45. Zhang  L, Jackson  CB, Mou  H, Ojha  A, Peng  H, Quinlan  BD, et al. SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity. Nat Commun. 2020;11:6013. DOIPubMed
  46. Toyoshima  Y, Nemoto  K, Matsumoto  S, Nakamura  Y, Kiyotani  K. SARS-CoV-2 genomic variations associated with mortality rate of COVID-19. J Hum Genet. 2020;65:107582. DOIPubMed
  47. Karamitros  T, Papadopoulou  G, Bousali  M, Mexias  A, Tsiodras  S, Mentis  A. SARS-CoV-2 exhibits intra-host genomic plasticity and low-frequency polymorphic quasispecies. J Clin Virol. 2020;131:104585. DOIPubMed




Suggested citation for this article: Izquierdo-Lara R, Elsinga G, Heijnen L, Oude Munnink BB, Schapendonk CME, Nieuwenhuijse D, et al. Monitoring SARS-CoV-2 circulation and diversity through community wastewater sequencing, the Netherlands and Belgium. Emerg Infect Dis. 2021 May [date cited].

DOI: 10.3201/eid2705.204410

Original Publication Date: March 31, 2021

1These senior authors contributed equally to this article.

Table of Contents – Volume 27, Number 5—May 2021


Please use the form below to submit correspondence to the authors or contact them at the following address:

Miranda de Graaf, Department of Viroscience, Wytemaweg 80, 3015CN, Rotterdam, The Netherlands

Send To

10000 character(s) remaining.


Page created: March 02, 2021
Page updated: March 31, 2021
Page reviewed: March 31, 2021
The conclusions, findings, and opinions expressed by authors contributing to this journal do not necessarily reflect the official position of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions. Use of trade names is for identification only and does not imply endorsement by any of the groups named above.