NCBITaxonomy.jl: rapid biological names finding and reconciliation

Bibliographic Details
Published in: BMC ecology and evolution Vol. 25; no. 1; pp. 84 - 5
Main Authors: Poisot, Timothée; Gibb, Rory; Ryan, Sadie J.; Carlson, Colin J.
Format: Journal Article
Language: English
Published: London, BioMed Central (BioMed Central Ltd; BMC), 20.08.2025
ISSN: 2730-7182
DOI: 10.1186/s12862-025-02425-4

Summary: NCBITaxonomy.jl is a Julia package designed to address the complex challenges of taxonomic name reconciliation using a local copy of the NCBI taxonomic backbone (Federhen in Nucleic Acids Res 40:D136–D143, 2012; Schoch et al. in Database 2020:baaa062, 2020). The package provides advanced name matching capabilities that handle common issues in taxonomic data, including synonyms, homonyms, vernacular names, nomenclatural changes, and typographical errors. Core functionalities include case-insensitive search, customizable fuzzy string matching, and taxonomically restricted searches. The package implements a robust exception system that explicitly handles ambiguous matches without interrupting workflow execution, enabling automated processing of large datasets. NCBITaxonomy.jl works with Julia 1.6 and later, uses the Apache Arrow format for efficient local storage, and provides lineage navigation and taxonomic distance functions. The package has been successfully deployed in large-scale projects for automated name reconciliation and cleaning, demonstrating its effectiveness for high-throughput name reconciliation across heterogeneous biological datasets. The design prioritizes programmatic access over command-line usage, making it well suited for integration into bioinformatics pipelines requiring reliable taxonomic standardization.
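
The matching strategy described in the summary (case-insensitive lookup, a fuzzy fallback for typographical errors, and an explicit exception for ambiguous names that does not halt a batch run) can be illustrated with a minimal sketch. This is not the NCBITaxonomy.jl API and is written in Python rather than Julia. The names `reconcile`, `reconcile_all`, `AmbiguousNameError`, and `BACKBONE`, the toy identifiers, and the 0.8 similarity cutoff are all hypothetical, and the fuzzy step uses Python's standard `difflib` rather than whatever string distance the package itself applies.

```python
import difflib


class AmbiguousNameError(Exception):
    """Raised when a name maps to more than one taxon (e.g. a homonym)."""

    def __init__(self, query, candidates):
        super().__init__(f"{query!r} matches several taxa: {candidates}")
        self.query = query
        self.candidates = candidates


# Toy backbone (hypothetical): name -> taxon identifiers.
# The identifiers are placeholders, not real NCBI taxonomy ids.
BACKBONE = {
    "Homo sapiens": [1],
    "human": [1],          # vernacular name pointing at the same taxon
    "Bacillus": [2, 3],    # homonym: one name, two distinct taxa
    "Panthera leo": [4],
}


def reconcile(query, backbone=BACKBONE, cutoff=0.8):
    """Return the single taxon id for `query`, or raise if none or ambiguous."""
    # Case-insensitive index of the known names.
    index = {name.lower(): ids for name, ids in backbone.items()}
    key = query.strip().lower()
    if key not in index:
        # Fuzzy fallback: closest known name above the similarity cutoff.
        close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
        if not close:
            raise KeyError(f"no match for {query!r}")
        key = close[0]
    ids = index[key]
    if len(ids) > 1:
        # Surface the ambiguity explicitly instead of picking one silently.
        raise AmbiguousNameError(query, ids)
    return ids[0]


def reconcile_all(queries):
    """Process a batch, collecting problem names instead of stopping."""
    resolved, flagged = {}, {}
    for q in queries:
        try:
            resolved[q] = reconcile(q)
        except (AmbiguousNameError, KeyError) as err:
            flagged[q] = err
    return resolved, flagged
```

Calling `reconcile_all(["HOMO SAPIENS", "Homo sapeins", "Bacillus"])` on this toy data resolves the first two names to the same identifier and returns the third in the flagged dictionary with its candidate list, so a large batch can finish and the ambiguous cases can be reviewed afterwards.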