A novel lossless encoding algorithm for data compression–genomics data as an exemplar

Bibliographic Details
Published in Frontiers in Bioinformatics, Vol. 4, p. 1489704
Main Authors Al-okaily, Anas; Tbakhi, Abdelghani
Format Journal Article
Language English
Published Switzerland, Frontiers Media S.A., 23.01.2025
ISSN 2673-7647
DOI 10.3389/fbinf.2024.1489704


Abstract Data compression is a challenging and increasingly important problem. As the amount of data generated daily continues to increase, efficient transmission and storage have never been more critical. In this study, a novel encoding algorithm is proposed, motivated by the compression of DNA data and its associated characteristics. The proposed algorithm follows a divide-and-conquer approach: it scans the whole genome, classifies subsequences based on similarities in their content, and bins similar subsequences together. The data in each bin is then compressed independently. This approach differs from the currently known approaches, namely entropy, dictionary, predictive, and transform-based methods. Proof-of-concept performance was evaluated using a benchmark dataset of seventeen genomes ranging in size from kilobytes to gigabytes. The results showed a considerable improvement in the compression of each genome, saving several megabytes compared to state-of-the-art tools. Moreover, the algorithm can be applied to the compression of other data types, mainly text, numbers, images, audio, and video, which are being generated daily in unprecedented, massive volumes.
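The scan, classify, bin, and compress-per-bin idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the fixed block length, the classification rule (grouping blocks by the set of symbols they contain), the function names, and the use of zlib as the per-bin compressor are all assumptions chosen only to make the general idea concrete.

```python
import zlib
from collections import defaultdict

BLOCK = 8  # hypothetical block length, not taken from the paper


def label_of(block: str) -> str:
    # Hypothetical similarity rule: blocks that use the same set of
    # symbols receive the same label and land in the same bin.
    return "".join(sorted(set(block)))


def compress_binned(seq: str, block: int = BLOCK):
    bins = defaultdict(list)  # label -> blocks, in scan order
    order = []                # label of every block, needed to decode
    for i in range(0, len(seq), block):
        chunk = seq[i:i + block]
        key = label_of(chunk)
        bins[key].append(chunk)
        order.append(key)
    # Each bin is compressed independently (zlib is a stand-in here).
    packed = {k: zlib.compress("".join(v).encode("ascii"))
              for k, v in bins.items()}
    return order, packed, len(seq)


def decompress_binned(order, packed, total_len, block: int = BLOCK):
    streams = {k: zlib.decompress(v).decode("ascii")
               for k, v in packed.items()}
    offsets = dict.fromkeys(streams, 0)
    out = []
    remaining = total_len
    for key in order:
        n = min(block, remaining)  # only the final block can be short
        start = offsets[key]
        out.append(streams[key][start:start + n])
        offsets[key] = start + n
        remaining -= n
    return "".join(out)
```

In this sketch the list of per-block labels must be stored alongside the compressed bins, since it is what allows the original order to be restored; a practical scheme would also need to encode that list compactly for the binning to pay off.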
Author Al-okaily, Anas
Tbakhi, Abdelghani
AuthorAffiliation 1 Department of Cell Therapy and Applied Genomics, King Hussein Cancer Center, Amman, Jordan
2 Department of Pathology and Molecular Medicine, McMaster University, Hamilton, ON, Canada
BackLink https://www.ncbi.nlm.nih.gov/pubmed/39917339 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright Copyright © 2025 Al-okaily and Tbakhi.
EISSN 2673-7647
PMCID PMC11799261
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords Huffman encoding
genomics
compression
BWT
LZ
License Copyright © 2025 Al-okaily and Tbakhi.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Notes Reviewed by: Soham Sengupta, St. Jude Children’s Research Hospital, United States
Edited by: Ali Kadhum Idrees, University of Babylon, Iraq
Bhavika Mam, Independent Researcher, Palo Alto, CA, United States
OpenAccessLink https://doaj.org/article/3c58369ff8fc4259bdd5d3a4439859ee
PMID 39917339
SubjectTerms Bioinformatics
BWT
compression
genomics
Huffman encoding