Sequence Repeats

Bibliographic Details
Published in	Distributed and Sequential Algorithms for Bioinformatics Vol. 23; pp. 161 - 182
Main Author	Erciyes, Kayhan
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2015
Series	Computational Biology
Subjects	Algorithms & data structures; Edit Distance; Life sciences: general issues; Maths for computer scientists; Motif Search; Short Tandem Repeat; Suffix Tree; Tandem Repeat
ISBN	9783319249643 3319249649
ISSN	1568-2684
DOI	10.1007/978-3-319-24966-7_8

Summary:	DNA or protein repeats are subsequences that recur within these molecules. Repeats that are adjacent to each other are called tandem repeats; dispersed repeats are called sequence motifs. Discovering such subsequences has several applications: locating genes, since repeats are frequently found near them; comparing sequences; and disease analysis, since the number of repeats is elevated in certain diseases. Instead of searching for exact repeats, we may be interested in finding approximate repeats, which are encountered more frequently in experiments than exact ones because of mutations in sequences and measurement errors. We may search for repeats in a single sequence or in a set of sequences. In the latter case, the detected repeats also reveal the conserved structures in the set, which can be used to infer phylogenetic relationships. These structures can be discovered by combinatorial and probabilistic algorithms, as we describe. Graph-based methods build a k-partite similarity graph among k input sequences and then search for cliques in this graph. A clique found this way has a vertex in each partition and represents a motif common to all sequences. Distributed algorithms for this purpose are scarce, and we propose two new algorithms for detecting repeating sequences that are easy to experiment with.
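
The graph-based idea in the summary can be illustrated with a minimal sketch. It is not the chapter's algorithm: the function names, the fixed motif length `l`, and the use of Hamming distance with threshold `d` as the similarity criterion are all assumptions made for illustration. Each input sequence contributes one partition whose vertices are the start positions of its length-`l` substrings; two vertices in different partitions are adjacent when their substrings differ in at most `d` positions; a clique with one vertex per partition is reported as a candidate motif common to all sequences.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_motif_cliques(seqs, l, d):
    """Return tuples of start positions, one per sequence, whose l-mers are
    pairwise within Hamming distance d (assumed similarity criterion)."""
    k = len(seqs)
    # Partition i holds every l-mer of seqs[i], indexed by start position.
    parts = [[s[p:p + l] for p in range(len(s) - l + 1)] for s in seqs]

    def similar(i, p, j, q):
        # Edge test of the k-partite similarity graph.
        return hamming(parts[i][p], parts[j][q]) <= d

    cliques = []

    def extend(chosen):
        # chosen[j] is the vertex already picked from partition j.
        i = len(chosen)
        if i == k:
            cliques.append(tuple(chosen))
            return
        for q in range(len(parts[i])):
            # Keep q only if it is adjacent to every vertex chosen so far.
            if all(similar(j, p, i, q) for j, p in enumerate(chosen)):
                chosen.append(q)
                extend(chosen)
                chosen.pop()

    extend([])
    return cliques


if __name__ == "__main__":
    seqs = ["TTACGTAA", "GGACGTCC", "ACGTTTTT"]
    print(find_motif_cliques(seqs, l=4, d=0))  # [(2, 2, 0)], the shared "ACGT"
```

The backtracking search is exponential in the worst case, so this sketch is only suitable for small inputs; it is meant to show the structure of the graph and the clique condition, not an efficient or distributed method.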