BLESS: Bandwidth and Locality Enhanced SMEM Seeding Acceleration for DNA Sequencing
In an era marked by the pervasive spread of harmful viruses like COVID-19, the importance of DNA sequencing has grown significantly, given its crucial role in devising effective countermeasures. The seeding process, which aims to find locations of super-maximal exact matches (SMEM) between the DNA s...
Saved in:
| Published in | 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) pp. 582 - 596 |
|---|---|
| Main Authors | , , , , |
| Format | Conference Proceeding |
| Language | English |
| Published |
IEEE
29.06.2024
|
| Subjects | |
| Online Access | Get full text |
| DOI | 10.1109/ISCA59077.2024.00049 |
Cover
| Summary: | In an era marked by the pervasive spread of harmful viruses like COVID-19, the importance of DNA sequencing has grown significantly, given its crucial role in devising effective countermeasures. The seeding process, which aims to find locations of super-maximal exact matches (SMEM) between the DNA samples and reference genome for comparative analysis, has emerged as a major bottleneck due to its memory-intensive characteristics. The learned index approach has been developed that uses machine learning model to partially predict the location of the exact matches, which has effectively reduced the memory access. However, the lack of locality in the current in dexing structure and randomness at runtime of the seeding workload have constrained the memory bandwidth usage and have limited further performance advantage. In this paper, we propose BLESS, a bandwidth and locality enhanced SMEM seeding accelerator for learned-index-based DNA sequence alignment. BLESS is the first domain-specific seeding accelerator to maximize the potential hardware advantage of the learned index approach. We introduce coarse-fine (CF) block data structure, a novel memory mapping of seeding parameters to exploit spatial locality and increase effective bandwidth usage for any memory type, including high bandwidth memory (HBM). We also develop guaranteed search range update (GSRU) algorithm, a method that exploits caching in the search procedure to enable temporal locality and data reuse. Utilizing the CF block and GSRU algorithm, we develop a multi-core seeding accelerator using HBM with context switching and runtime scheduling for maximum core and memory bandwidth utilization. With these improvements, BLESS achieves 35.65 \times and 15.49 \times speedup over the state-of-the-art seeding system BWA-MEME and ERT-ASIC, respectively, in raw system performance. |
|---|---|
| DOI: | 10.1109/ISCA59077.2024.00049 |