Differential direct coding: a compression algorithm for nucleotide sequence data
| Published in | Database : the journal of biological databases and curation Vol. 2009; p. bap013 |
|---|---|
| Main Author | Vey, Gregory |
| Format | Journal Article |
| Language | English |
| Published | England: Oxford University Press, 01.01.2009 |
| Subjects | Algorithms; Genetics; Original Studies |
| ISSN | 1758-0463 |
| DOI | 10.1093/database/bap013 |
| Abstract | While modern hardware can provide vast amounts of inexpensive storage for biological databases, the compression of nucleotide sequence data is still of paramount importance in order to facilitate fast search and retrieval operations through a reduction in disk traffic. This issue becomes even more important in light of the recent increase of very large data sets, such as metagenomes. In this article, I propose the Differential Direct Coding algorithm, a general-purpose nucleotide compression protocol that can differentiate between sequence data and auxiliary data by supporting the inclusion of supplementary symbols that are not members of the set of expected nucleotide bases, thereby offering reconciliation between sequence-specific and general-purpose compression strategies. This algorithm permits a sequence to contain a rich lexicon of auxiliary symbols that can represent wildcards, annotation data and special subsequences, such as functional domains or special repeats. In particular, the representation of special subsequences can be incorporated to provide structure-based coding that increases the overall degree of compression. Moreover, supporting a robust set of symbols removes the requirement of wildcard elimination and restoration phases, resulting in a complexity of O(n) for execution time, making this algorithm suitable for very large data sets. Because this algorithm compresses data on the basis of triplets, it is highly amenable to interpretation as a polypeptide at decompression time. Also, an encoded sequence may be further compressed using other existing algorithms, like gzip, thereby maximizing the final degree of compression. Overall, the Differential Direct Coding algorithm can offer a beneficial impact on disk traffic for database queries and other disk-intensive operations. |
| Author | Vey, Gregory |
| AuthorAffiliation | Department of Biology, Wilfrid Laurier University, 75 University Avenue West, Waterloo ON, Canada N2L 3C5 |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/20157486 (View this record in MEDLINE/PubMed) |
| CitedBy_id | crossref_primary_10_3390_e21111074 crossref_primary_10_1093_gigascience_giac079 crossref_primary_10_1007_s11227_016_1753_4 crossref_primary_10_1007_s13222_012_0098_2 crossref_primary_10_1136_amiajnl_2013_001694 crossref_primary_10_1093_gigascience_giaa119 |
| Cites_doi | 10.1093/bioinformatics/18.12.1696 10.1016/0306-4573(94)90014-0 10.1016/j.jtbi.2008.03.011 10.1093/nar/gkn942 10.1109/JRPROC.1952.273898 10.1093/nar/gkm929 10.1016/0300-9084(96)84763-8 10.1007/11496656_17 10.1186/1471-2105-9-176 10.1109/TIT.1978.1055934 10.1007/978-1-84800-072-8 10.1109/TIT.1977.1055714 10.1109/51.940049 10.1016/j.bulm.2004.10.005 10.1126/science.1093857 10.1016/S0378-4371(01)00661-6 |
| ContentType | Journal Article |
| Copyright | The Author(s) 2009. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Discipline | Biology |
| EISSN | 1758-0463 |
| EndPage | bap013 |
| ExternalDocumentID | 10.1093/database/bap013 PMC2797453 2696197311 20157486 10_1093_database_bap013 |
| Genre | Journal Article |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | http://creativecommons.org/licenses/by-nc/2.5/uk This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. cc-by |
| OpenAccessLink | http://journals.scholarsportal.info/openUrl.xqy?doi=10.1093/database/bap013 |
| PMID | 20157486 |
| PublicationDate | 2009-01-01 |
| PublicationPlace | Oxford, England |
| PublicationTitle | Database : the journal of biological databases and curation |
| PublicationTitleAlternate | Database (Oxford) |
| PublicationYear | 2009 |
| Publisher | Oxford University Press |
| StartPage | bap013 |
| SubjectTerms | Algorithms; Genetics; Original Studies |
| Title | Differential direct coding: a compression algorithm for nucleotide sequence data |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/20157486 https://www.proquest.com/docview/1022142309 https://www.proquest.com/docview/1859588371 https://www.proquest.com/docview/21442526 https://pubmed.ncbi.nlm.nih.gov/PMC2797453 https://academic.oup.com/database/article-pdf/doi/10.1093/database/bap013/16728567/bap013.pdf |
| UnpaywallVersion | publishedVersion |
| Volume | 2009 |
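
The abstract describes coding a sequence one triplet at a time while letting symbols outside the expected nucleotide bases pass through, then optionally applying gzip. The following is a minimal sketch of that general idea only, not the code table or implementation from the article. The byte layout (values 0 to 63 for ACGT triplets, values 128 to 255 for a single ASCII auxiliary symbol) and the names `encode`, `decode`, `pack`, `unpack` and `AUX_OFFSET` are assumptions made for illustration.

```python
import gzip
from itertools import product

BASES = "ACGT"
# Hypothetical code table: each of the 64 ACGT triplets maps to one byte value 0..63.
TRIPLET_TO_CODE = {"".join(t): i for i, t in enumerate(product(BASES, repeat=3))}
CODE_TO_TRIPLET = {i: t for t, i in TRIPLET_TO_CODE.items()}
# Assumed layout: byte values 128..255 carry exactly one ASCII auxiliary symbol.
AUX_OFFSET = 128


def encode(seq: str) -> bytes:
    """Single left-to-right pass: one byte per ACGT triplet, one byte per other symbol."""
    out = bytearray()
    i = 0
    while i < len(seq):
        code = TRIPLET_TO_CODE.get(seq[i:i + 3])
        if code is not None:
            # Three plain bases collapse into a single byte.
            out.append(code)
            i += 3
        else:
            # Wildcards, annotation symbols and leftover bases are stored one per byte.
            ch = ord(seq[i])
            if ch >= 128:
                raise ValueError("this sketch only supports ASCII symbols")
            out.append(AUX_OFFSET + ch)
            i += 1
    return bytes(out)


def decode(data: bytes) -> str:
    """Invert encode(): the value range of each byte says which kind of symbol it is."""
    parts = []
    for b in data:
        if b < 64:
            parts.append(CODE_TO_TRIPLET[b])
        elif b >= AUX_OFFSET:
            parts.append(chr(b - AUX_OFFSET))
        else:
            raise ValueError(f"byte value {b} is unused in this sketch")
    return "".join(parts)


def pack(seq: str) -> bytes:
    """Optional second stage: gzip the triplet-coded bytes."""
    return gzip.compress(encode(seq))


def unpack(blob: bytes) -> str:
    """Undo pack(): gunzip, then decode the triplet-coded bytes."""
    return decode(gzip.decompress(blob))
```

In this sketch a wildcard such as `N` needs no separate elimination and restoration pass, because the byte's value range already distinguishes triplet codes from auxiliary symbols, and both `encode` and `decode` visit each input symbol once. Whether the second gzip stage actually shrinks the output depends on the input; very short sequences can grow because of the gzip header.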