Faster computation of left-bounded shortest unique substrings

Bibliographic Details
Published in	Algorithms for molecular biology Vol. 20; no. 1; pp. 11 - 7
Main Authors	Aguiar, Larissa L. M., Louza, Felipe A.
Format	Journal Article
Language	English
Published	London: BioMed Central, 20.06.2025; BioMed Central Ltd; Springer Nature B.V; BMC
Subjects	Algorithms; Arrays; Bioinformatics; Biomedical and Life Sciences; Cellular and Medical Topics; Compact data structures; Computational Biology/Bioinformatics; Data compression; Extraction; Genomes; Grammar; Life Sciences; Numbers; Physiological; Strings
ISSN	1748-7188
DOI	10.1186/s13015-025-00287-5

Summary:	Finding shortest unique substrings (SUS) is a fundamental problem in string processing with applications in bioinformatics. In this paper, we present an algorithm for solving a variant of the SUS problem, the left-bounded shortest unique substrings (LSUS). This variant is particularly important in applications such as PCR primer design. Our algorithm runs in O(n) time using 2n memory words plus n bytes for an input string of length n. Experimental results with real and artificial datasets show that our algorithm is the fastest alternative in practice, being on average two times faster than related work, while using a similar peak memory footprint.
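
To make the problem statement concrete, here is a minimal sketch of the LSUS definition as it is usually stated: for each position i, the shortest substring starting at i that occurs exactly once in the input. This is a naive, brute-force illustration and is not the O(n) algorithm described in the paper; the function names `count_occurrences` and `naive_lsus` are hypothetical and chosen only for this example.

```python
from typing import List, Optional


def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def naive_lsus(text: str) -> List[Optional[int]]:
    """For each start position i, return the length of the shortest
    substring text[i:i+length] that occurs exactly once in text,
    or None if every substring starting at i is repeated.

    Brute force for illustration only (assumption: 0-based positions);
    it is far slower than a linear-time method.
    """
    n = len(text)
    result: List[Optional[int]] = []
    for i in range(n):
        shortest = None
        for length in range(1, n - i + 1):
            if count_occurrences(text, text[i:i + length]) == 1:
                shortest = length
                break
        result.append(shortest)
    return result


if __name__ == "__main__":
    print(naive_lsus("banana"))
```

For the input "banana" the sketch reports lengths 1, 4 and 3 for positions 0, 1 and 2 (the substrings "b", "anan" and "nan") and None for the remaining positions, since every substring starting there also occurs earlier in the string.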