EC: an efficient error correction algorithm for short reads
| Published in | BMC bioinformatics Vol. 16; no. Suppl 17; p. S2 | 
|---|---|
| Main Authors | Saha, Subrata; Rajasekaran, Sanguthevar | 
| Format | Journal Article | 
| Language | English | 
| Published | London: BioMed Central (BioMed Central Ltd), 7 December 2015 | 
| Subjects | |
| ISSN | 1471-2105 | 
| DOI | 10.1186/1471-2105-16-S17-S2 | 
| Abstract | Background
In highly parallel next-generation sequencing (NGS) techniques, millions to billions of short reads are produced from a genomic sequence in a single run. Owing to the limitations of NGS technologies, the reads may contain errors. The error rate can be reduced by trimming the reads and by correcting their erroneous bases. Correction yields higher-quality data, and the computational complexity of many biological applications is greatly reduced if the reads are corrected first. We have developed a novel error correction algorithm called EC and compared it with four other state-of-the-art algorithms using both real and simulated sequencing reads.
Results
Extensive experiments show that EC is an effective, scalable, and efficient error correction tool. The real reads used in the performance evaluation are Illumina-generated short reads of various lengths; the six experimental datasets are taken from the Sequence Read Archive (SRA) at NCBI. The simulated reads are obtained by picking substrings from random positions of reference genomes. To introduce errors, some bases of the simulated reads are changed to other bases with some probability.
Conclusions
Error correction is a vital problem in biology, especially for NGS data. In this paper we present a novel algorithm, called Error Corrector (EC), for correcting substitution errors in biological sequencing reads. We plan to investigate whether the techniques introduced in this paper can also handle insertion and deletion errors.
Software availability
The implementation is freely available for non-commercial purposes. It can be downloaded from: http://engr.uconn.edu/~rajasek/EC.zip | 
    
|---|---|
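The read-simulation procedure summarized in the abstract (substrings picked from random positions of a reference genome, then some bases changed to other bases with some probability) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `simulate_reads`, the uniform choice of start position, the uniform choice among the three alternative bases, and the single per-base `error_rate` are all assumptions, since the record does not state the actual probabilities used.

```python
import random

BASES = "ACGT"


def simulate_reads(reference, num_reads, read_length, error_rate, seed=0):
    """Sample reads from random positions of `reference` and inject
    substitution errors (hypothetical helper, for illustration only).

    Returns a list of (start, true_read, noisy_read) tuples.
    """
    if read_length > len(reference):
        raise ValueError("read_length must not exceed the reference length")
    rng = random.Random(seed)  # seeded so that runs are reproducible
    reads = []
    for _ in range(num_reads):
        # Pick a start position uniformly so the read fits in the reference.
        start = rng.randrange(len(reference) - read_length + 1)
        true_read = reference[start:start + read_length]
        noisy = []
        for base in true_read:
            if rng.random() < error_rate:
                # Substitution error: replace with one of the other bases.
                noisy.append(rng.choice([b for b in BASES if b != base]))
            else:
                noisy.append(base)
        reads.append((start, true_read, "".join(noisy)))
    return reads
```

Keeping the error-free read next to the noisy one makes it straightforward to score a corrector afterwards, for example by counting how many substituted positions it restores. Insertions and deletions are deliberately not modeled, in line with the abstract's focus on substitution errors.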
    
| ArticleNumber | S2 | 
    
| Audience | Academic | 
    
| Author | Rajasekaran, Sanguthevar; Saha, Subrata | 
    
| AuthorAffiliation | 1 Department of Computer Science & Engineering, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, CT 06269-4155, USA | 
    
| Author_xml | – sequence: 1; givenname: Subrata; surname: Saha; fullname: Saha, Subrata; organization: Department of Computer Science & Engineering, University of Connecticut – sequence: 2; givenname: Sanguthevar; surname: Rajasekaran; fullname: Rajasekaran, Sanguthevar; email: rajasek@engr.uconn.edu; organization: Department of Computer Science & Engineering, University of Connecticut | 
    
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/26678663 (MEDLINE/PubMed record) | 
    
| CitedBy_id | crossref_primary_10_3390_microorganisms9061162 crossref_primary_10_1186_s12864_018_4544_x crossref_primary_10_1186_s12859_017_1784_8  | 
    
| Cites_doi | 10.1093/bioinformatics/btr208 10.1093/bioinformatics/bth205 10.1073/pnas.171285098 10.1093/bioinformatics/btt407 10.1093/bioinformatics/btp379 10.1093/bioinformatics/btq151 10.1093/bioinformatics/btq468 10.1101/gr.111351.110 10.1186/1471-2105-9-128 10.1093/nar/gkg653 10.1101/gr.7337908 10.1186/gb-2010-11-11-r116 10.1101/gr.208902 10.1007/978-3-319-19048-8_25 10.1093/bioinformatics/btq653 10.1093/bioinformatics/bts690 10.1093/bioinformatics/btr170 | 
    
| ContentType | Journal Article | 
    
| Copyright | © 2015 Saha and Rajasekaran. COPYRIGHT 2015 BioMed Central Ltd. | 
    
| Discipline | Biology | 
    
| EISSN | 1471-2105 | 
    
| EndPage | S2 | 
    
| ExternalDocumentID | 10.1186/1471-2105-16-s17-s2 PMC4674864 A453873394 26678663 10_1186_1471_2105_16_S17_S2  | 
    
| Genre | Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't; Journal Article; Research Support, N.I.H., Extramural | 
    
| GrantInformation_xml | – fundername: NLM NIH HHS grantid: R01-LM010101  | 
    
| IsDoiOpenAccess | true | 
    
| IsOpenAccess | true | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | Suppl 17 | 
    
| Keywords | Error Corrector; Hash Table; Reference Genome; Synthetic Dataset; Bloom Filter | 
    
| Language | English | 
    
| License | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. cc-by  | 
    
| OpenAccessLink | http://journals.scholarsportal.info/openUrl.xqy?doi=10.1186/1471-2105-16-S17-S2 | 
    
| PMID | 26678663 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2015-12-07 | 
    
| PublicationDecade | 2010 | 
    
| PublicationPlace | London | 
    
| PublicationPlace_xml | – name: London – name: England  | 
    
| PublicationTitle | BMC bioinformatics | 
    
| PublicationTitleAbbrev | BMC Bioinformatics | 
    
| PublicationTitleAlternate | BMC Bioinformatics | 
    
| PublicationYear | 2015 | 
    
| Publisher | BioMed Central BioMed Central Ltd  | 
    
| Publisher_xml | – name: BioMed Central – name: BioMed Central Ltd  | 
    
| StartPage | S2 | 
    
| SubjectTerms | Algorithms; Bioinformatics; Biomedical and Life Sciences; Comparative analysis; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Computer Simulation; Databases, Nucleic Acid; Genomics; High-Throughput Nucleotide Sequencing - methods; Life Sciences; Microarrays; Sequence Analysis, DNA - methods | 
    
| Title | EC: an efficient error correction algorithm for short reads | 
    
| URI | https://link.springer.com/article/10.1186/1471-2105-16-S17-S2 https://www.ncbi.nlm.nih.gov/pubmed/26678663 https://www.proquest.com/docview/1750429575 https://pubmed.ncbi.nlm.nih.gov/PMC4674864 https://bmcbioinformatics.biomedcentral.com/counter/pdf/10.1186/1471-2105-16-S17-S2  | 
    
| UnpaywallVersion | publishedVersion | 
    
| Volume | 16 | 
    
| hasFullText | 1 | 
    
| inHoldings | 1 | 
    
| isFullTextHit | |
| isPrint | |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=EC%3A+an+efficient+error+correction+algorithm+for+short+reads&rft.jtitle=BMC+bioinformatics&rft.au=Saha%2C+Subrata&rft.au=Rajasekaran%2C+Sanguthevar&rft.date=2015-12-07&rft.issn=1471-2105&rft.eissn=1471-2105&rft.volume=16&rft.issue=S17&rft_id=info:doi/10.1186%2F1471-2105-16-S17-S2&rft.externalDBID=n%2Fa&rft.externalDocID=10_1186_1471_2105_16_S17_S2 | 
    