Formally Verified Suffix Array Construction Formally Verified Suffix Array Construction
Suffix arrays are a data structure with numerous real-world applications. They are extensively used in text retrieval and data compression applications, including query suggestion mechanisms in web search, and in bioinformatics tools for DNA sequencing and matching. This wide applicability means tha...
Saved in:
| Published in | Journal of automated reasoning Vol. 69; no. 3; p. 21 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published |
Dordrecht
Springer Netherlands
01.09.2025
Springer Nature B.V |
| Subjects | |
| Online Access | Get full text |
| ISSN | 0168-7433 1573-0670 1573-0670 |
| DOI | 10.1007/s10817-025-09735-8 |
Cover
| Summary: | Suffix arrays are a data structure with numerous real-world applications. They are extensively used in text retrieval and data compression applications, including query suggestion mechanisms in web search, and in bioinformatics tools for DNA sequencing and matching. This wide applicability means that algorithms for constructing suffix arrays are of great practical importance. The SA-IS algorithm is an efficient but conceptually complex suffix array construction technique, and implementing it requires a deep understanding of its underlying theory. As a critical step towards developing a provably correct and efficient implementation, we have developed the SA-IS algorithm in Isabelle/HOL and formally verified that it is equivalent to a mathematical functional specification of suffix arrays, a task that required verifying a wide range of underlying properties of strings and suffixes. We also used Isabelle’s code extraction facilities to extract an executable Haskell implementation of SA-IS, which albeit is inefficient due to using lists and natural numbers rather than arrays and machine words, demonstrates that our verified HOL implementation of SA-IS can be refined to an executable implementation in its current form. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 0168-7433 1573-0670 1573-0670 |
| DOI: | 10.1007/s10817-025-09735-8 |