Sequence Repeats

DNA or protein repeats are recurring subsequences in these molecules. These repeats may be adjacent to each other in which case they are called tandem repeats or they may be dispersed named as sequence motifs. Discovery of such subsequences has various implications such as locating genes as they are...

Full description

Saved in:
Bibliographic Details
Published inDistributed and Sequential Algorithms for Bioinformatics Vol. 23; pp. 161 - 182
Main Author Erciyes, Kayhan
Format Book Chapter
LanguageEnglish
Published Switzerland Springer International Publishing AG 2015
Springer International Publishing
SeriesComputational Biology
Subjects
Online AccessGet full text
ISBN9783319249643
3319249649
ISSN1568-2684
DOI10.1007/978-3-319-24966-7_8

Cover

More Information
Summary:DNA or protein repeats are recurring subsequences in these molecules. These repeats may be adjacent to each other in which case they are called tandem repeats or they may be dispersed named as sequence motifs. Discovery of such subsequences has various implications such as locating genes as they are frequently found near genes, comparing sequences, or disease analysis as the number of repeats is elevated in certain diseases. Instead of searching for exact repeats, we may be interested in finding approximate repeats, as these are encountered more frequently in experiments than exact ones due to mutations in sequences and erroneous measurements. We may search for repeats in a single sequence or a set of sequences. The detected repeats in the latter case provide also the conserved structures in the set which can be used to infer phylogenetic relationships. Discovery of these structures can be performed by combinatorial and probabilistic algorithms as we describe. Graph-based methods involve building of a k-partite similarity graph among k input sequences and then searching for cliques in this graph. A clique found this way will have a vertex in each partition and represent a common motif in all sequences. The distributed algorithms for this purpose are scarce and we propose two new algorithms to detect repeating sequences which can be easily experimented.
ISBN:9783319249643
3319249649
ISSN:1568-2684
DOI:10.1007/978-3-319-24966-7_8