Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination

Phylogenetic networks are models of sequence evolution that go beyond trees, allowing biological operations that are not consistent with tree-like evolution. One of the most important of these biological operations is (single-crossover) recombination between two sequences. An established problem (Ma...

Full description

Saved in:
Bibliographic Details
Published inJournal of computer and system sciences Vol. 70; no. 3; pp. 381 - 398
Main Author Gusfield, Dan
Format Journal Article
LanguageEnglish
Published Elsevier Inc 01.05.2005
Subjects
Online AccessGet full text
ISSN0022-0000
1090-2724
DOI10.1016/j.jcss.2004.12.009

Cover

More Information
Summary:Phylogenetic networks are models of sequence evolution that go beyond trees, allowing biological operations that are not consistent with tree-like evolution. One of the most important of these biological operations is (single-crossover) recombination between two sequences. An established problem (Math. Biosci. 98 (1990) 185; J. Mol. Evol. 36 (1993) 396; Proceedings of the 2003 Workshop on Algorithms in Bioinformatics, Berlin, Germany, Lecture Notes in Computer Science, Springer, Berlin, 2003; J. Math. Biol. 98 (2003) 160; J. Comput. Biol. 8 (2001) 69; Genetics 163 (2003) 375; Genetics 111 (1985) 147) is to find a phylogenetic network that derives an input set of sequences, minimizing the number of recombinations used. No efficient, general algorithm is known for that problem. An efficient algorithm does exist for the problem when the network is constrained to be a “galled-tree”, and the ancestral sequence for the galled-tree is specified in advance (Proceedings of Second CSB Bioinformatics Conference, Los Alamitos, CA, 2003, IEEE Press, New York; J. Bioinform. Comput. Biol. 2(1) (2004) 173; INFORMS J. Comput. 16 (2004) 459). However, the more biologically realistic case is that no ancestral sequence is known in advance, and the only previous algorithmic solution for that case takes exponential time. In this paper we give an efficient solution to the galled-tree problem when no ancestral sequence is known in advance, and show that the solution produced has very strong global optimality properties. We also indicate how these results generalize to other complex biological phenomena such as gene-conversion, lateral gene transfer, hybrid speciation, and back and recurrent mutation.
ISSN:0022-0000
1090-2724
DOI:10.1016/j.jcss.2004.12.009