FMSA: FPGA-Accelerated ClustalW-Based Multiple Sequence Alignment through Pipelined Prefiltering

Bibliographic Details
Published in 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, pp. 177-183
Main Authors Mahram, A., Herbordt, M. C.
Format Conference Proceeding
Language English
Published IEEE 01.04.2012
ISBN 9781467316057
1467316059
DOI 10.1109/FCCM.2012.38

Summary: Multiple Sequence Alignment (MSA) is perhaps second only to sequence alignment in overall importance in Bioinformatics, being critical, e.g., in determining the structure and function of molecules from putative families of sequences. But while pairwise sequence alignment has been the subject of scores of FPGA acceleration studies, MSA has been the subject of only a few. The most important of these accelerate Clustal-W, the most commonly used MSA code, either by implementing the first of three phases (over 90% of the run time) with Dynamic Programming (DP) methods, or by accelerating the third phase, which consumes most of the remaining time. We use a new approach: we apply prefiltering of the kind commonly used in BLAST to perform the initial all-pairs alignments. This results in a speedup of 80× to 190× over the CPU code (8 cores) and a speedup of 2.5× to 8× over DP/FPGA- and GPU-based methods. When combined with a recently published method for phase 3, and using the original software for phase 2, the end-to-end speedup is at least 50× over an 8-core implementation of the original code. The quality is comparable to the original according to a commonly used benchmark suite evaluated with respect to multiple distance metrics.
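
The summary does not spell out how the prefiltering works, so the following is only a rough software sketch of the general idea behind BLAST-style word prefiltering applied to an all-pairs comparison. It is not the authors' FPGA pipeline. The function names `kmers` and `prefilter_pairs`, the word length `k`, and the threshold `min_shared` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k words (substrings) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def prefilter_pairs(seqs, k=3, min_shared=2):
    """Score every pair of sequences by the number of distinct k-mers they share.

    Pairs with fewer than min_shared common words are dropped, so a more
    expensive alignment step would only need to look at the survivors.
    This is an illustrative stand-in, not the scoring used in the paper.
    """
    words = [kmers(s, k) for s in seqs]
    scores = {}
    for i, j in combinations(range(len(seqs)), 2):
        shared = len(words[i] & words[j])
        if shared >= min_shared:
            scores[(i, j)] = shared
    return scores


# Two similar toy protein fragments and one unrelated sequence.
seqs = ["MKVLAAGIVG", "MKVLAAGLVG", "QQQQWWWWEE"]
scores = prefilter_pairs(seqs)
```

Here only the pair (0, 1) survives the filter, since the third sequence shares no 3-letter words with the other two. A real prefilter would typically also extend and score the word hits rather than just count them.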