Customization of a DADA2-based pipeline for fungal internal transcribed spacer 1 (ITS1) amplicon data sets

Bibliographic Details
Published in	JCI insight Vol. 7; no. 1
Main Authors	Rolling, Thierry, Zhai, Bing, Frame, John, Hohl, Tobias M., Taur, Ying
Format	Journal Article
Language	English
Published	United States: American Society for Clinical Investigation, 11.01.2022
Subjects	Annotations; Bayesian analysis; Biological variation; Datasets; Discrimination; Fungal infections; Infectious disease; Microbiology; Microbiota; Resource and Technical Advance; Ribosomal DNA; Saccharomyces cerevisiae; Taxonomy; Yeast
ISSN	2379-3708
DOI	10.1172/jci.insight.151663

Summary:	Identification and analysis of fungal communities commonly rely on internal transcribed spacer-based (ITS-based) amplicon sequencing. There is no gold standard used to infer and classify fungal constituents since methodologies have been adapted from analyses of bacterial communities. To achieve high-resolution inference of fungal constituents, we customized a DADA2-based pipeline using a mix of 11 medically relevant fungi. While DADA2 allowed the discrimination of ITS1 sequences differing by single nucleotides, quality filtering, sequencing bias, and database selection were identified as key variables determining the accuracy of sample inference. Due to species-specific differences in sequencing quality, default filtering settings removed most reads that originated from Aspergillus species, Saccharomyces cerevisiae, and Candida glabrata. By fine-tuning the quality filtering process, we achieved an improved representation of the fungal communities. By adapting a wobble nucleotide in the ITS1 forward primer region, we further increased the yield of S. cerevisiae and C. glabrata sequences. Finally, we showed that a BLAST-based algorithm based on the UNITE+INSD or the NCBI NT database achieved a higher reliability in species-level taxonomic annotation compared with the naive Bayesian classifier implemented in DADA2. These steps optimized a robust fungal ITS1 sequencing pipeline that, in most instances, enabled species-level assignment of community members.
Bibliography:	Authorship note: TMH and YT are co-senior authors.
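
The quality-filtering and primer-wobble steps named in the summary can be illustrated with a minimal sketch. It is written in Python rather than with DADA2's R functions, and it is not the authors' code. It assumes that quality filtering uses an expected-error threshold (the idea behind the maxEE argument of DADA2's filterAndTrim); this record does not state which filtering settings the authors changed. The primer string, the example reads, and the function names (expected_errors, passes_filter, primer_regex) are hypothetical and chosen only for illustration.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def expected_errors(quality_string, phred_offset=33):
    """Sum per-base error probabilities, p = 10^(-Q/10), for one read.

    quality_string is the FASTQ quality line (Phred+33 encoding assumed).
    """
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string)


def passes_filter(quality_string, max_ee):
    """Keep a read only if its expected errors do not exceed max_ee."""
    return expected_errors(quality_string) <= max_ee


def primer_regex(primer):
    """Compile a degenerate primer into a regex anchored at the read start."""
    return re.compile("^" + "".join(IUPAC[base] for base in primer.upper()))


if __name__ == "__main__":
    # Hypothetical quality strings: 'I' encodes Q40, '#' encodes Q2.
    for qual in ("IIII", "####"):
        print(qual, expected_errors(qual), passes_filter(qual, max_ee=2.0))

    # Hypothetical primer with one wobble base (W = A or T).
    pattern = primer_regex("ACGWT")
    for read in ("ACGATTTG", "ACGTTTTG", "ACGCTTTG"):
        print(read, bool(pattern.match(read)))
```

Because low-quality bases add expected errors quickly, a strict threshold can discard most reads from taxa that sequence poorly, which is consistent with the species-specific losses described in the summary. A degenerate base lets a single primer pattern match templates that differ at that one position.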