A novel lossless encoding algorithm for data compression–genomics data as an exemplar
Data compression is a challenging and increasingly important problem. As the amount of data generated daily continues to increase, efficient transmission and storage have never been more critical. In this study, a novel encoding algorithm is proposed, motivated by the compression of DNA data and ass...
| Published in | Frontiers in Bioinformatics, Vol. 4, p. 1489704 |
|---|---|
| Main Authors | Al-okaily, Anas; Tbakhi, Abdelghani |
| Format | Journal Article | 
| Language | English | 
| Published | Switzerland: Frontiers Media S.A, 23.01.2025 |
| ISSN | 2673-7647 |
| DOI | 10.3389/fbinf.2024.1489704 | 
| Abstract | Data compression is a challenging and increasingly important problem. As the amount of data generated daily continues to increase, efficient transmission and storage have never been more critical. In this study, a novel encoding algorithm is proposed, motivated by the compression of DNA data and its associated characteristics. The proposed algorithm follows a divide-and-conquer approach: it scans the whole genome, classifies subsequences based on similarities in their content, and bins similar subsequences together. The data in each bin are then compressed independently. This approach differs from the currently known approaches, namely entropy, dictionary, predictive, and transform-based methods. Proof-of-concept performance was evaluated using a benchmark dataset of seventeen genomes ranging in size from kilobytes to gigabytes. The results showed a considerable improvement in the compression of each genome, saving several megabytes compared to state-of-the-art tools. Moreover, the algorithm can be applied to the compression of other data types, mainly text, numbers, images, audio, and video, which are being generated daily in unprecedented volumes. | 
    
|---|---|
    
| Author | Al-okaily, Anas; Tbakhi, Abdelghani | 
    
| AuthorAffiliation | 1 Department of Cell Therapy and Applied Genomics, King Hussein Cancer Center, Amman, Jordan; 2 Department of Pathology and Molecular Medicine, McMaster University, Hamilton, ON, Canada | 
    
    
    
    
| ContentType | Journal Article | 
    
| Copyright | Copyright © 2025 Al-okaily and Tbakhi. | 
    
    
| DocumentTitleAlternate | Al-okaily and Tbakhi | 
    
| EISSN | 2673-7647 | 
    
    
| Genre | Journal Article | 
    
    
| IsDoiOpenAccess | true | 
    
| IsOpenAccess | true | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Keywords | Huffman encoding; genomics; compression; BWT; LZ | 
    
| Language | English | 
    
| License | Copyright © 2025 Al-okaily and Tbakhi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. | 
    
    
| Notes | Reviewed by: Soham Sengupta, St. Jude Children’s Research Hospital, United States. Edited by: Ali Kadhum Idrees, University of Babylon, Iraq. Bhavika Mam, Independent Researcher, Palo Alto, CA, United States. | 
    
| OpenAccessLink | https://doaj.org/article/3c58369ff8fc4259bdd5d3a4439859ee | 
    
| PMID | 39917339 | 
    
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2025-01-23 | 
    
    
| PublicationDecade | 2020 | 
    
| PublicationPlace | Switzerland | 
    
    
| PublicationTitle | Frontiers in bioinformatics | 
    
| PublicationTitleAlternate | Front Bioinform | 
    
| PublicationYear | 2025 | 
    
| Publisher | Frontiers Media S.A | 
    
    
    
    
| StartPage | 1489704 | 
    
| SubjectTerms | Bioinformatics; BWT; compression; genomics; Huffman encoding | 
    
    
| Title | A novel lossless encoding algorithm for data compression–genomics data as an exemplar | 
    
| URI | https://www.ncbi.nlm.nih.gov/pubmed/39917339 https://www.proquest.com/docview/3164398159 https://pubmed.ncbi.nlm.nih.gov/PMC11799261 https://doi.org/10.3389/fbinf.2024.1489704 https://doaj.org/article/3c58369ff8fc4259bdd5d3a4439859ee  | 
    
    
| Volume | 4 | 
    
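
The divide-and-conquer idea summarized in the abstract of this record (scan the sequence, classify subsequences by content, bin similar subsequences together, compress each bin independently) can be illustrated with a minimal sketch. This is not the authors' algorithm: the fixed block size, the "most frequent symbol" classification rule, and the use of `zlib` as the per-bin compressor are all assumptions made here for illustration, and the names `BLOCK_SIZE`, `signature`, `compress`, and `decompress` are hypothetical.

```python
import zlib
from collections import Counter

BLOCK_SIZE = 16  # hypothetical subsequence length; not taken from the paper


def signature(block: str) -> str:
    """Classify a block by its most frequent symbol (ties broken alphabetically).

    This is an assumed stand-in for the paper's similarity-based classification.
    """
    counts = Counter(block)
    return min(counts, key=lambda s: (-counts[s], s))


def compress(seq: str, block_size: int = BLOCK_SIZE) -> dict:
    """Split seq into blocks, bin blocks by signature, compress each bin alone."""
    blocks = [seq[i:i + block_size] for i in range(0, len(seq), block_size)]
    labels = [signature(b) for b in blocks]
    bins = {}
    for label, block in zip(labels, blocks):
        bins.setdefault(label, []).append(block)
    # zlib is a generic compressor used here only as a placeholder encoder.
    packed = {
        label: zlib.compress("".join(members).encode("ascii"))
        for label, members in bins.items()
    }
    # The label sequence records which bin each block went to, in order,
    # which is what makes the scheme reversible.
    return {"block_size": block_size, "labels": "".join(labels), "bins": packed}


def decompress(archive: dict) -> str:
    """Rebuild the original sequence by reading blocks back from their bins."""
    size = archive["block_size"]
    data = {
        label: zlib.decompress(blob).decode("ascii")
        for label, blob in archive["bins"].items()
    }
    cursor = dict.fromkeys(data, 0)
    out = []
    for label in archive["labels"]:
        start = cursor[label]
        # The final (possibly shorter) block sits at the end of its bin,
        # so slicing past the end returns exactly the remaining symbols.
        out.append(data[label][start:start + size])
        cursor[label] = start + size
    return "".join(out)
```

A round trip such as `decompress(compress(seq)) == seq` checks that the sketch is lossless. Whether binning improves the compressed size depends on the data and on the classification rule, so no compression ratio is claimed for this illustration.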