Utilizing Descriptive Statements from the Biodiversity Heritage Library to Expand the Hymenoptera Anatomy Ontology
Hymenoptera, the insect order that includes sawflies, bees, wasps, and ants, exhibits an incredible diversity of phenotypes, with over 145,000 species described in a corpus of textual knowledge since Carolus Linnaeus. In the absence of specialized training, often spanning decades, however, these art...
Saved in:
Published in | PloS one Vol. 8; no. 2; p. e55674 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published |
United States
Public Library of Science
18.02.2013
Public Library of Science (PLoS) |
Subjects | |
Online Access | Get full text |
ISSN | 1932-6203 1932-6203 |
DOI | 10.1371/journal.pone.0055674 |
Cover
Summary: | Hymenoptera, the insect order that includes sawflies, bees, wasps, and ants, exhibits an incredible diversity of phenotypes, with over 145,000 species described in a corpus of textual knowledge since Carolus Linnaeus. In the absence of specialized training, often spanning decades, however, these articles can be challenging to decipher. Much of the vocabulary is domain-specific (e.g., Hymenoptera biology), historically without a comprehensive glossary, and contains much homonymous and synonymous terminology. The Hymenoptera Anatomy Ontology was developed to surmount this challenge and to aid future communication related to hymenopteran anatomy, as well as provide support for domain experts so they may actively benefit from the anatomy ontology development. As part of HAO development, an active learning, dictionary-based, natural language recognition tool was implemented to facilitate Hymenoptera anatomy term discovery in literature. We present this tool, referred to as the 'Proofer', as part of an iterative approach to growing phenotype-relevant ontologies, regardless of domain. The process of ontology development results in a critical mass of terms that is applied as a filter to the source collection of articles in order to reveal term occurrence and biases in natural language species descriptions. Our results indicate that taxonomists use domain-specific terminology that follows taxonomic specialization, particularly at superfamily and family level groupings and that the developed Proofer tool is effective for term discovery, facilitating ontology construction. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 Wrote the software: MJY KCS ZP. Conceived and designed the experiments: KCS ZP MJY MAB ARD. Performed the experiments: KCS ZP MAB MJY. Analyzed the data: KCS ZP. Contributed reagents/materials/analysis tools: ZP MJY KCS. Wrote the paper: KCS ARD ZP MAB MJY. Competing Interests: The authors have declared that no competing interests exist. |
ISSN: | 1932-6203 1932-6203 |
DOI: | 10.1371/journal.pone.0055674 |