Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

Bibliographic Details
Published in	PLoS ONE Vol. 4; no. 12; p. e7891
Main Authors	Kolaczkowski, Bryan; Thornton, Joseph W.
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 09.12.2009
Subjects	Analysis; Attraction; Bayes Theorem; Bayesian analysis; Bias; Branches; Cladistic analysis; Computer Simulation; Data processing; Databases, Genetic; Datasets; Economic models; Empirical analysis; Evolution; Evolution (Biology); Evolutionary Biology; Evolutionary Biology/Genomics; Genetics and Genomics/Comparative Genomics; Humans; Likelihood Functions; Mathematical models; Models, Genetic; Numerical analysis; Phylogenetics; Phylogeny; Probabilistic inference; Probabilistic models; Robustness (mathematics); Simulation; Statistical analysis; Statistical inference; Topology; Trees; Oregon; United States > US
ISSN	1932-6203
DOI	10.1371/journal.pone.0007891

Summary:	Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
Bibliography:	Current address: Biology Department, Dartmouth College, Hanover, New Hampshire, United States of America. Conceived and designed the experiments: BK JWT. Performed the experiments: BK JWT. Analyzed the data: BK JWT. Wrote the paper: BK JWT.