Scraping the bottom of the barrel: are rare high throughput sequences artifacts?

Bibliographic Details
Published in Fungal ecology Vol. 13; no. C; pp. 221-225
Main Authors Brown, Shawn P., Veach, Allison M., Rigdon-Huss, Anne R., Grond, Kirsten, Lickteig, Spencer K., Lothamer, Kale, Oliver, Alena K., Jumpponen, Ari
Format Journal Article
Language English
Published United Kingdom Elsevier 01.02.2015
ISSN 1754-5048
DOI 10.1016/j.funeco.2014.08.006

Summary: Metabarcoding data generated using next-generation sequencing (NGS) technologies are overwhelmed with rare taxa and skewed in Operational Taxonomic Unit (OTU) frequencies comprised of few dominant taxa. Low frequency OTUs comprise a rare biosphere of singleton and doubleton OTUs, which may include many artifacts. We present an in-depth analysis of global singletons across sixteen NGS libraries representing different ribosomal RNA gene regions, NGS technologies and chemistries. Our data indicate that many singletons (average of 38% across gene regions) are likely artifacts or potential artifacts, but a large fraction can be assigned to lower taxonomic levels with very high bootstrap support (~32% of sequences to genus with ≥90% bootstrap cutoff). Further, many singletons clustered into rare OTUs from other datasets, highlighting their overlap across datasets or the poor performance of clustering algorithms. These data emphasize a need for caution when discarding rare sequence data en masse: such practices may result in throwing the baby out with the bathwater, and underestimating the biodiversity. Yet, the rare sequences are unlikely to greatly affect ecological metrics. As a result, it may be prudent to err on the side of caution and omit rare OTUs prior to downstream analyses.
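As an illustration of the filtering step the summary discusses, the following minimal Python sketch separates singleton and doubleton OTUs from more abundant ones before downstream analysis. It is not the authors' pipeline: the function name `split_rare_otus`, the `max_rare` threshold, and the read counts are all hypothetical and chosen only for demonstration.

```python
def split_rare_otus(otu_counts, max_rare=2):
    """Partition OTUs into rare (1..max_rare total reads) and retained OTUs.

    otu_counts maps an OTU identifier to its total read count.
    max_rare=2 treats singletons and doubletons as rare (an assumption).
    """
    rare = {otu: n for otu, n in otu_counts.items() if 0 < n <= max_rare}
    kept = {otu: n for otu, n in otu_counts.items() if n > max_rare}
    return rare, kept


# Hypothetical read counts per OTU (illustrative values, not data from the study).
otu_counts = {"OTU_1": 950, "OTU_2": 40, "OTU_3": 2, "OTU_4": 1, "OTU_5": 1}

rare, kept = split_rare_otus(otu_counts)

# Singletons are OTUs represented by one read, doubletons by two.
singletons = [otu for otu, n in rare.items() if n == 1]
doubletons = [otu for otu, n in rare.items() if n == 2]

# Share of all reads that would be discarded along with the rare OTUs.
rare_read_fraction = sum(rare.values()) / sum(otu_counts.values())

print("singletons:", singletons)
print("doubletons:", doubletons)
print("retained OTUs:", sorted(kept))
print(f"fraction of reads in rare OTUs: {rare_read_fraction:.4f}")
```

Keeping the rare OTUs in a separate dictionary rather than deleting them outright makes it possible to inspect them later, in line with the summary's caution about discarding rare sequences en masse.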
Bibliography:
SC0004953
USDOE Office of Science (SC), Biological and Environmental Research (BER)