EC: an efficient error correction algorithm for short reads

Bibliographic Details
Published in	BMC bioinformatics Vol. 16; no. Suppl 17; p. S2
Main Authors	Saha, Subrata; Rajasekaran, Sanguthevar
Format	Journal Article
Language	English
Published	London: BioMed Central Ltd, 07.12.2015
Subjects	Algorithms; Bioinformatics; Biomedical and Life Sciences; Comparative analysis; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Computer Simulation; Databases, Nucleic Acid; Genomics; High-Throughput Nucleotide Sequencing - methods; Life Sciences; Microarrays; Sequence Analysis, DNA - methods; Error Corrector; Hash Table; Reference Genome; Synthetic Dataset; Bloom Filter
ISSN	1471-2105
DOI	10.1186/1471-2105-16-S17-S2

Abstract	Background: In highly parallel next-generation sequencing (NGS), millions to billions of short reads are produced from a genomic sequence in a single run. Owing to limitations of NGS technologies, the reads may contain errors. The error rate can be reduced by trimming the reads and by correcting their erroneous bases. Correction yields higher-quality data, and the computational complexity of many biological applications is greatly reduced if the reads are corrected first. We have developed a novel error correction algorithm called EC and compared it with four other state-of-the-art algorithms using both real and simulated sequencing reads. Results: Extensive experiments show that EC is an effective, scalable, and efficient error correction tool. The real reads used in the performance evaluation are Illumina-generated short reads of various lengths; the six experimental datasets were taken from the Sequence Read Archive (SRA) at NCBI. The simulated reads were obtained by picking substrings from random positions of reference genomes; to introduce errors, some bases of the simulated reads were changed to other bases with some probability. Conclusions: Error correction is a vital problem in biology, especially for NGS data. In this paper we present a novel algorithm, called Error Corrector (EC), for correcting substitution errors in biological sequencing reads. We plan to investigate whether the techniques introduced here can also handle insertion and deletion errors. Software availability: The implementation is freely available for non-commercial purposes. It can be downloaded from http://engr.uconn.edu/~rajasek/EC.zip.
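The read-simulation procedure mentioned in the abstract (substrings drawn from random positions of a reference genome, with some bases then changed to other bases) can be illustrated with a minimal Python sketch. This is not the authors' simulator: the function name `simulate_reads`, its parameters, the uniform choice of start position, and the uniform choice among the three alternative bases are all assumptions made for illustration, since the record does not state the actual error model.

```python
import random

BASES = "ACGT"


def simulate_reads(reference, read_length, num_reads, error_rate, seed=0):
    """Sample reads from random positions and inject substitution errors.

    Hypothetical helper: returns a list of (start, true_read, noisy_read)
    tuples. Each base is independently substituted with probability
    `error_rate` (an assumed, simplified error model).
    """
    if read_length > len(reference):
        raise ValueError("read_length exceeds reference length")
    rng = random.Random(seed)  # seeded so runs are reproducible
    reads = []
    for _ in range(num_reads):
        # Pick a start position so the whole read fits in the reference.
        start = rng.randrange(len(reference) - read_length + 1)
        true_read = reference[start:start + read_length]
        noisy = []
        for base in true_read:
            if rng.random() < error_rate:
                # Substitution only: replace with a different base.
                noisy.append(rng.choice([b for b in BASES if b != base]))
            else:
                noisy.append(base)
        reads.append((start, true_read, "".join(noisy)))
    return reads
```

Keeping the error-free read next to the noisy one is a convenience of this sketch: an error corrector's output can then be compared base by base against the known truth, which is the usual reason for evaluating on simulated data.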
ArticleNumber	S2
Audience	Academic
Author	Rajasekaran, Sanguthevar; Saha, Subrata
AuthorAffiliation	1 Department of Computer Science & Engineering, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, CT 06269-4155, USA
Author_xml	– sequence: 1 givenname: Subrata surname: Saha fullname: Saha, Subrata organization: Department of Computer Science & Engineering, University of Connecticut – sequence: 2 givenname: Sanguthevar surname: Rajasekaran fullname: Rajasekaran, Sanguthevar email: rajasek@engr.uconn.edu organization: Department of Computer Science & Engineering, University of Connecticut
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/26678663$$D View this record in MEDLINE/PubMed
CitedBy_id	crossref_primary_10_3390_microorganisms9061162 crossref_primary_10_1186_s12864_018_4544_x crossref_primary_10_1186_s12859_017_1784_8
Cites_doi	10.1093/bioinformatics/btr208 10.1093/bioinformatics/bth205 10.1073/pnas.171285098 10.1093/bioinformatics/btt407 10.1093/bioinformatics/btp379 10.1093/bioinformatics/btq151 10.1093/bioinformatics/btq468 10.1101/gr.111351.110 10.1186/1471-2105-9-128 10.1093/nar/gkg653 10.1101/gr.7337908 10.1186/gb-2010-11-11-r116 10.1101/gr.208902 10.1007/978-3-319-19048-8_25 10.1093/bioinformatics/btq653 10.1093/bioinformatics/bts690 10.1093/bioinformatics/btr170
ContentType	Journal Article
Copyright	© 2015 Saha and Rajasekaran; © 2015 BioMed Central Ltd.
Discipline	Biology
EISSN	1471-2105
EndPage	S2
ExternalDocumentID	10.1186/1471-2105-16-s17-s2 PMC4674864 A453873394 26678663 10_1186_1471_2105_16_S17_S2
Genre	Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't; Journal Article; Research Support, N.I.H., Extramural
GrantInformation_xml	– fundername: NLM NIH HHS grantid: R01-LM010101
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	Suppl 17
Keywords	Error Corrector; Hash Table; Reference Genome; Synthetic Dataset; Bloom Filter
Language	English
License	This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. cc-by
OpenAccessLink	http://journals.scholarsportal.info/openUrl.xqy?doi=10.1186/1471-2105-16-S17-S2
PMID	26678663
PublicationCentury	2000
PublicationDate	2015-12-07
PublicationDateYYYYMMDD	2015-12-07
PublicationDate_xml	– month: 12 year: 2015 text: 2015-12-07 day: 07
PublicationDecade	2010
PublicationPlace	London
PublicationPlace_xml	– name: London – name: England
PublicationTitle	BMC bioinformatics
PublicationTitleAbbrev	BMC Bioinformatics
PublicationTitleAlternate	BMC Bioinformatics
PublicationYear	2015
Publisher	BioMed Central BioMed Central Ltd
Publisher_xml	– name: BioMed Central – name: BioMed Central Ltd
StartPage	S2
SubjectTerms	Algorithms; Bioinformatics; Biomedical and Life Sciences; Comparative analysis; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Computer Simulation; Databases, Nucleic Acid; Genomics; High-Throughput Nucleotide Sequencing - methods; Life Sciences; Microarrays; Sequence Analysis, DNA - methods
Title	EC: an efficient error correction algorithm for short reads
URI	https://link.springer.com/article/10.1186/1471-2105-16-S17-S2 https://www.ncbi.nlm.nih.gov/pubmed/26678663 https://www.proquest.com/docview/1750429575 https://pubmed.ncbi.nlm.nih.gov/PMC4674864 https://bmcbioinformatics.biomedcentral.com/counter/pdf/10.1186/1471-2105-16-S17-S2
UnpaywallVersion	publishedVersion
Volume	16
hasFullText	1
inHoldings	1
isFullTextHit
isPrint