Space–time trade-offs for finding shortest unique substrings and maximal unique matches
| Published in | Theoretical Computer Science, Vol. 700, pp. 75-88 |
|---|---|
| Main Authors | , , , | 
| Format | Journal Article | 
| Language | English | 
| Published | Elsevier B.V., 14.11.2017 |
| Subjects | |
| ISSN | 0304-3975 1879-2294  | 
| DOI | 10.1016/j.tcs.2017.08.002 | 
Summary: Given a string X[1,n] and a position k ∈ [1,n], a Shortest Unique Substring of X covering k, denoted by S_k, is a substring X[i,j] of X that satisfies the following conditions: (i) i ≤ k ≤ j, (ii) i is the only position where there is an occurrence of X[i,j], and (iii) j − i is minimized. Current best-known algorithms for finding S_k require Θ(n) words of working space and O(n) time. Let τ be a given parameter. We present the following new results.

• Given a k ∈ [1,n], we can compute S_k in O(nτ² log(n/τ)) time using X and an additional O(n/τ) words of working space.
• For every k ∈ [1,n], we can compute S_k in O(nτ² log n) time using X, and an additional O(n/τ) words and 4n + o(n) bits of working space.
• We present an O(nτ log^(c+1) n)-time randomized algorithm that uses n/log^c n words in addition to that mentioned above, where c ≥ 0 is an arbitrary constant. In this case, the reported string is unique and covers k, but, with probability at most n^(−O(1)), may not be the shortest.

By choosing τ = ω(1), our results imply the first sub-linear space (in addition to the input string) solution to these problems. We also present the following two results.

• An algorithm that finds S_k for every k ∈ [1,n] using O(n log σ) bits of working space in O(n log n) time, where σ ≤ n is the number of distinct symbols in X.
• A (4n + o(n))-bit index that can report S_k for any k in O(1) time.

As a consequence of our techniques, we also obtain similar space-and-time trade-offs for the related problem of finding Maximal Unique Matches of two strings (Delcher et al., 1999 [14]).
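
The definition of a Shortest Unique Substring covering k given in the summary can be made concrete with a small brute-force sketch. This is not the paper's algorithm and makes no attempt to match the time or space bounds quoted above; it simply checks conditions (i) to (iii) directly. The function names `count_occurrences` and `shortest_unique_substring` are illustrative choices, and the tie-breaking rule (leftmost start position among equally short candidates) is an assumption, since the summary does not specify one.

```python
def count_occurrences(x: str, pattern: str) -> int:
    """Count occurrences of pattern in x, including overlapping ones."""
    count = 0
    start = x.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character so overlapping occurrences are counted.
        start = x.find(pattern, start + 1)
    return count


def shortest_unique_substring(x: str, k: int) -> tuple[int, int]:
    """Return 1-based (i, j) with X[i, j] a shortest unique substring covering k.

    Brute force over candidate lengths, shortest first. Ties are broken by
    the leftmost start position (an assumption made for this sketch).
    """
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must lie in [1, n]")
    for length in range(1, n + 1):
        # Condition (i): i <= k <= j, where j = i + length - 1 <= n.
        first_start = max(1, k - length + 1)
        last_start = min(k, n - length + 1)
        for i in range(first_start, last_start + 1):
            j = i + length - 1
            # Condition (ii): X[i, j] occurs exactly once in X.
            if count_occurrences(x, x[i - 1:j]) == 1:
                # Condition (iii) holds because lengths are tried in
                # increasing order.
                return i, j
    # X[1, n] itself occurs exactly once, so the loop always returns.
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # For X = "banana": position 4 is covered by "nan" = X[3, 5].
    print(shortest_unique_substring("banana", 4))  # (3, 5)
```

For X = "banana" and k = 6, the same function returns (3, 6), since "a", "na" and "ana" each occur more than once and "nana" is the shortest unique substring ending at the last position.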