A sequential naïve Bayes classifier for DNA barcodes
| Published in | Statistical applications in genetics and molecular biology Vol. 13; no. 4; pp. 423 - 434 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | Germany: De Gruyter, 01.08.2014 |
| ISSN | 2194-6302, 1544-6115 |
| DOI | 10.1515/sagmb-2013-0025 |
| Summary: | DNA barcodes are short strands of 255–700 nucleotide bases taken from the cytochrome c oxidase subunit 1 (COI) region of the mitochondrial DNA. It has been proposed that these barcodes may be used to differentiate between biological species. Current methods of species classification use distance measures that depend heavily on both evolutionary model assumptions and a clearly defined “gap” between intra- and interspecies variation. Such distance measures neither quantify classification uncertainty nor indicate how much of the barcode is necessary for classification. We propose a sequential naïve Bayes classifier for species classification to address these limitations. The proposed method is shown to provide accurate species-level classification on real and simulated data. It also quantifies the uncertainty of each classification and addresses how much of the barcode is necessary. |
|---|---|
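
The summary does not give the details of the classifier, so the following is only a minimal sketch of the general idea of a sequential naïve Bayes classifier, not the authors' implementation. It assumes aligned reference barcodes of equal length, a uniform prior over species, Laplace smoothing, and a posterior stopping threshold; the names `train`, `classify_sequentially`, `pseudocount`, and `threshold` are hypothetical. The posterior is updated one base at a time, and classification stops once one species is sufficiently probable, which gives both an uncertainty estimate and the number of bases that were needed.

```python
import math
from collections import Counter

BASES = "ACGT"


def train(reference, pseudocount=1.0):
    """Estimate per-species, per-position base probabilities.

    `reference` maps a species name to a list of aligned, equal-length
    sequences. Laplace smoothing (an assumption of this sketch) keeps
    unseen bases from receiving zero probability.
    """
    model = {}
    for species, seqs in reference.items():
        total = len(seqs) + pseudocount * len(BASES)
        table = []
        for pos in range(len(seqs[0])):
            counts = Counter(seq[pos] for seq in seqs)
            table.append({b: (counts[b] + pseudocount) / total for b in BASES})
        model[species] = table
    return model


def classify_sequentially(model, query, threshold=0.95):
    """Read the query base by base, updating the species posterior.

    Returns (best_species, posterior_of_best, bases_used). Stops early
    once the leading species reaches `threshold`.
    """
    species = list(model)
    log_post = {s: -math.log(len(species)) for s in species}  # uniform prior
    posterior = {s: 1.0 / len(species) for s in species}
    best = max(posterior, key=posterior.get)
    used = 0
    length = min(len(query), min(len(model[s]) for s in species))
    for pos in range(length):
        base = query[pos]
        used = pos + 1
        if base not in BASES:
            continue  # ambiguous base (e.g. N): treated as uninformative
        for s in species:
            log_post[s] += math.log(model[s][pos][base])
        # Normalise in log space to avoid underflow on long barcodes.
        peak = max(log_post.values())
        norm = sum(math.exp(v - peak) for v in log_post.values())
        posterior = {s: math.exp(v - peak) / norm for s, v in log_post.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            break
    return best, posterior[best], used
```

Under these assumptions, the returned posterior serves as the classification uncertainty, and `bases_used` indicates how much of the barcode was read before the threshold was reached.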