FMtree: a fast locating algorithm of FM-indexes for genomic data

Bibliographic Details
Published in Bioinformatics Vol. 34; no. 3; pp. 416-424
Main Authors Cheng, Haoyu, Wu, Ming, Xu, Yun
Format Journal Article
Language English
Published England Oxford University Press 01.02.2018
ISSN 1367-4803
1367-4811
1460-2059
DOI 10.1093/bioinformatics/btx596

Abstract
Motivation: As a fundamental task in bioinformatics, searching for massive numbers of short patterns over a long text has been accelerated by various compressed full-text indexes. These indexes provide searching functionality similar to that of classical indexes, e.g. suffix trees and suffix arrays, while requiring less space. For genomic data, a well-known family of compressed full-text indexes, called FM-indexes, presents unmatched performance in practice. One major drawback of FM-indexes is that their locating operations, which report all occurrence positions of patterns in a given text, are not efficient, especially for patterns with many occurrences.
Results: In this paper, we introduce a novel locating algorithm, FMtree, to quickly retrieve all occurrence positions of any pattern via FM-indexes. When searching for a pattern over a given text, FMtree organizes the search space of the locating operation into a conceptual multiway tree. As a result, multiple occurrence positions of this pattern can be retrieved simultaneously by traversing the multiway tree. Compared with existing locating algorithms, our tree-based algorithm eliminates large numbers of redundant operations and presents better data locality. Experimental results show that FMtree is usually one order of magnitude faster than the state-of-the-art algorithms while remaining memory-efficient.
Availability and implementation: FMtree is freely available at https://github.com/chhylp123/FMtree.
Supplementary information: Supplementary data are available at Bioinformatics online.
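
To make the locating problem in the abstract concrete, the following is a minimal Python sketch of the conventional locate operation on an FM-index, i.e. the per-occurrence baseline that the abstract describes as inefficient. It is not FMtree's multiway-tree traversal and is not taken from the FMtree repository. All names here (`build_fm_index`, `backward_search`, `locate`, `sample_rate`) are hypothetical, and the naive suffix-array construction is only meant for toy inputs.

```python
def build_fm_index(text, sample_rate=4):
    """Build a toy FM-index (illustrative only; assumes '$' is not in text)."""
    t = text + "$"                                  # unique, smallest sentinel
    n = len(t)
    sa = sorted(range(n), key=lambda i: t[i:])      # naive suffix array
    bwt = [t[i - 1] for i in sa]                    # t[-1] is '$' when i == 0
    alphabet = sorted(set(t))
    # C[c]: number of characters in t that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += t.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (n + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    # Keep only suffix-array entries whose text position is a multiple of
    # sample_rate; position 0 is always kept, so every LF walk terminates.
    sampled = {row: pos for row, pos in enumerate(sa) if pos % sample_rate == 0}
    return {"bwt": bwt, "C": C, "occ": occ, "sampled": sampled, "n": n}


def backward_search(fm, pattern):
    """Return the half-open suffix-array interval [lo, hi) matching pattern."""
    lo, hi = 0, fm["n"]
    for c in reversed(pattern):
        if c not in fm["C"]:
            return 0, 0
        lo = fm["C"][c] + fm["occ"][c][lo]
        hi = fm["C"][c] + fm["occ"][c][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


def locate(fm, pattern):
    """Baseline locate: walk each occurrence back to a sampled position."""
    lo, hi = backward_search(fm, pattern)
    positions = []
    for start_row in range(lo, hi):
        row, steps = start_row, 0
        while row not in fm["sampled"]:
            c = fm["bwt"][row]
            row = fm["C"][c] + fm["occ"][c][row]    # LF-mapping: one step left
            steps += 1
        positions.append(fm["sampled"][row] + steps)
    return sorted(positions)


fm = build_fm_index("ACGTACGTACGA", sample_rate=4)
print(locate(fm, "ACG"))  # [0, 4, 8]
```

Each occurrence is resolved by its own independent walk in this sketch, so the cost grows with the number of occurrences times the sampling distance. That per-occurrence cost is what the abstract says FMtree reduces by retrieving multiple positions in a single tree traversal.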
Author Xu, Yun
Cheng, Haoyu
Wu, Ming
Author_xml – sequence: 1
  givenname: Haoyu
  surname: Cheng
  fullname: Cheng, Haoyu
  organization: School of Computer Science, University of Science and Technology of China, Hefei, Anhui, China
– sequence: 2
  givenname: Ming
  surname: Wu
  fullname: Wu, Ming
  organization: School of Computer Science, University of Science and Technology of China, Hefei, Anhui, China
– sequence: 3
  givenname: Yun
  surname: Xu
  fullname: Xu, Yun
  email: xuyun@ustc.edu.cn
  organization: School of Computer Science, University of Science and Technology of China, Hefei, Anhui, China
BackLink https://www.ncbi.nlm.nih.gov/pubmed/28968761 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1002_smtd_202101251
crossref_primary_10_1186_s12859_023_05151_0
crossref_primary_10_1093_bioinformatics_btae409
crossref_primary_10_1186_s13015_021_00204_6
Cites_doi 10.1093/bioinformatics/bts280
10.1038/nmeth0810-576
10.1145/1216370.1216372
10.1016/S0196-6774(03)00087-7
10.1038/nmeth.2221
10.1007/11780441_29
10.1016/j.jda.2010.09.004
10.1007/s00453-010-9443-8
10.1186/s12859-015-0626-9
10.1093/bioinformatics/btu440
10.1137/0222058
10.1007/s00453-013-9782-3
10.1137/S0097539702402354
10.1093/bioinformatics/btp336
10.1038/nmeth.1923
10.1093/nar/gks408
10.1186/1748-7188-8-25
10.1093/bioinformatics/btv662
10.1093/bioinformatics/btv670
10.1145/1070838.1070856
10.1093/nar/gkr1246
10.1016/j.jda.2015.01.006
10.1101/gr.126953.111
ContentType Journal Article
Copyright The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com 2017
The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Discipline Biology
EISSN 1460-2059
1367-4811
EndPage 424
ExternalDocumentID 10.1093/bioinformatics/btx596
28968761
10_1093_bioinformatics_btx596
Genre Research Support, Non-U.S. Gov't
Journal Article
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Language English
License This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)
OpenAccessLink https://proxy.k.utb.cz/login?url=https://academic.oup.com/bioinformatics/article-pdf/34/3/416/25117149/btx596.pdf
PMID 28968761
PQID 1946428685
PQPubID 23479
PageCount 9
PublicationCentury 2000
PublicationDate 20180201
2018-02-01
PublicationDateYYYYMMDD 2018-02-01
PublicationDate_xml – month: 02
  year: 2018
  text: 20180201
  day: 01
PublicationDecade 2010
PublicationPlace England
PublicationPlace_xml – name: England
PublicationTitle Bioinformatics
PublicationTitleAlternate Bioinformatics
PublicationYear 2018
Publisher Oxford University Press
Publisher_xml – name: Oxford University Press
StartPage 416
SubjectTerms Algorithms
Animals
Genome
Genomics - methods
Humans
Mice
Sequence Analysis, DNA - methods
Software
Title FMtree: a fast locating algorithm of FM-indexes for genomic data
URI https://www.ncbi.nlm.nih.gov/pubmed/28968761
https://www.proquest.com/docview/1946428685
https://academic.oup.com/bioinformatics/article-pdf/34/3/416/25117149/btx596.pdf
UnpaywallVersion publishedVersion
Volume 34