Sequence Repeats
| Published in | Distributed and Sequential Algorithms for Bioinformatics Vol. 23; pp. 161 - 182 |
|---|---|
| Format | Book Chapter |
| Language | English |
| Published | Switzerland, Springer International Publishing AG, 2015 |
| Series | Computational Biology |
| ISBN | 9783319249643 3319249649 |
| ISSN | 1568-2684 |
| DOI | 10.1007/978-3-319-24966-7_8 |
| Summary: | DNA or protein repeats are recurring subsequences in these molecules. Repeats that are adjacent to each other are called tandem repeats, while dispersed repeats are called sequence motifs. Discovering such subsequences has several applications: locating genes, since repeats are frequently found near them; comparing sequences; and analyzing disease, since the number of repeats is elevated in certain diseases. Instead of searching for exact repeats, we may be interested in finding approximate repeats, which are encountered more frequently in experiments than exact ones because of mutations in the sequences and measurement errors. We may search for repeats in a single sequence or in a set of sequences. In the latter case, the detected repeats also reveal the conserved structures in the set, which can be used to infer phylogenetic relationships. These structures can be discovered by combinatorial and probabilistic algorithms, as we describe. Graph-based methods build a k-partite similarity graph among k input sequences and then search for cliques in this graph. A clique found this way has a vertex in each partition and represents a motif common to all sequences. Distributed algorithms for this purpose are scarce, and we propose two new algorithms to detect repeating sequences that can be easily experimented with. |
|---|---|
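
The graph-based idea in the summary (a k-partite similarity graph whose cliques correspond to motifs shared by all sequences) can be illustrated with a minimal sketch. This is not the chapter's algorithm. The function names, the fixed window length, and the use of Hamming distance as the similarity criterion are all assumptions made for illustration. Each vertex is a window `(sequence index, start position)`, an edge joins two windows from different sequences when they differ in at most `max_mismatches` positions, and a simple backtracking search collects cliques with one vertex per sequence.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def build_similarity_graph(sequences, length, max_mismatches):
    """Build a k-partite graph: one partition of windows per input sequence.

    Hypothetical helper: a vertex is (sequence index, start position).
    Edges are stored as (u, v) with u taken from the earlier sequence.
    """
    partitions = [
        [(s, i) for i in range(len(seq) - length + 1)]
        for s, seq in enumerate(sequences)
    ]

    def word(v):
        return sequences[v[0]][v[1]:v[1] + length]

    edges = set()
    for part_a, part_b in combinations(partitions, 2):
        for u in part_a:
            for v in part_b:
                if hamming(word(u), word(v)) <= max_mismatches:
                    edges.add((u, v))
    return partitions, edges


def find_common_motifs(sequences, length, max_mismatches):
    """Return all cliques that contain exactly one vertex from each partition."""
    partitions, edges = build_similarity_graph(sequences, length, max_mismatches)
    cliques = []

    def extend(chosen, level):
        if level == len(partitions):
            cliques.append(tuple(chosen))
            return
        for v in partitions[level]:
            # v joins the clique only if it is adjacent to every chosen vertex
            if all((u, v) in edges for u in chosen):
                chosen.append(v)
                extend(chosen, level + 1)
                chosen.pop()

    extend([], 0)
    return cliques


seqs = ["ACGTTGCA", "TTACGTAA", "GGACGTCC"]
print(find_common_motifs(seqs, 4, 0))  # [((0, 0), (1, 2), (2, 2))]
```

In this toy input the only window shared exactly by all three sequences is `ACGT`, found at position 0, 2, and 2 respectively. The exhaustive search is exponential in the worst case, so it is only suitable for small examples; raising `max_mismatches` turns it into a search for approximate repeats.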