GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies
| Published in | Bioinformatics (Oxford, England), Vol. 28, no. 11, pp. 1429-1437 |
|---|---|
| Main Authors | Gritsenko, Alexey A.; Nijkamp, Jurgen F.; Reinders, Marcel J.T.; Ridder, Dick de |
| Format | Journal Article | 
| Language | English | 
| Published | Oxford: Oxford University Press, 1 June 2012 |
| ISSN | 1367-4803 (print), 1367-4811 (electronic) |
| DOI | 10.1093/bioinformatics/bts175 | 
| Abstract | Motivation: The increasing availability of second-generation high-throughput sequencing (HTS) technologies has sparked a growing interest in de novo genome sequencing. This in turn has fueled the need for reliable means of obtaining high-quality draft genomes from short-read sequencing data. The millions of reads usually involved in HTS experiments are first assembled into longer fragments called contigs, which are then scaffolded, i.e. ordered and oriented using additional information, to produce even longer sequences called scaffolds. Most existing scaffolders of HTS genome assemblies are not suited for using information other than paired reads to perform scaffolding. They use this limited information to construct scaffolds, often preferring scaffold length over accuracy, when faced with the tradeoff.
Results: We present GRASS (GeneRic ASsembly Scaffolder)—a novel algorithm for scaffolding second-generation sequencing assemblies capable of using diverse information sources. GRASS offers a mixed-integer programming formulation of the contig scaffolding problem, which combines contig order, distance and orientation in a single optimization objective. The resulting optimization problem is solved using an expectation–maximization procedure and an unconstrained binary quadratic programming approximation of the original problem. We compared GRASS with existing HTS scaffolders using Illumina paired reads of three bacterial genomes. Our algorithm constructs a comparable number of scaffolds, but makes fewer errors. This result is further improved when additional data, in the form of related genome sequences, are used.
Availability: GRASS source code is freely available from http://code.google.com/p/tud-scaffolding/.
Contact: a.gritsenko@tudelft.nl
Supplementary information: Supplementary data are available at Bioinformatics online. | 
    
|---|---|
    
| Author | Gritsenko, Alexey A.; Nijkamp, Jurgen F.; Reinders, Marcel J.T.; Ridder, Dick de | 
    
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=25893870 (record in Pascal Francis); https://www.ncbi.nlm.nih.gov/pubmed/22492642 (record in MEDLINE/PubMed) | 
    
| CitedBy_id | crossref_primary_10_1186_s40168_019_0626_5 crossref_primary_10_1093_bioinformatics_btv171 crossref_primary_10_1093_bib_bbab033 crossref_primary_10_3389_fmicb_2014_00769 crossref_primary_10_1142_S0219720019500148 crossref_primary_10_1186_1471_2164_16_S10_S11 crossref_primary_10_1016_j_endm_2018_01_020 crossref_primary_10_1186_1471_2164_14_289 crossref_primary_10_1007_s10462_020_09951_1 crossref_primary_10_1007_s00521_014_1659_0 crossref_primary_10_1186_1471_2105_16_S14_S2 crossref_primary_10_1007_s00453_021_00819_6 crossref_primary_10_1016_j_tcs_2015_06_023 crossref_primary_10_1371_journal_pone_0237087 crossref_primary_10_1007_s42979_022_01198_7 crossref_primary_10_1186_gb_2014_15_3_r42 crossref_primary_10_1093_bib_bbad169 crossref_primary_10_1093_bioinformatics_btu291 crossref_primary_10_1093_bioinformatics_bty773 crossref_primary_10_1186_s12864_016_2579_4 crossref_primary_10_1101_gr_178319_114 crossref_primary_10_1186_1471_2105_15_281 crossref_primary_10_1186_s12859_017_1919_y crossref_primary_10_1089_cmb_2019_0310 crossref_primary_10_1109_TCBB_2018_2858267 crossref_primary_10_24072_pcjournal_128 crossref_primary_10_3389_fgene_2014_00243 crossref_primary_10_3233_IFS_151994  | 
    
| Cites_doi | 10.4056/sigs.541628 10.1093/bioinformatics/btn548 10.1007/BF01188580 10.1186/gb-2009-10-3-r25 10.1093/bioinformatics/btp352 10.1093/molbev/msj030 10.1093/bioinformatics/btq683 10.1093/bioinformatics/btq033 10.1145/360825.360861 10.1101/gr.183201 10.1093/bioinformatics/btr174 10.1101/gr.1536204 10.1093/bioinformatics/btp324 10.1093/bioinformatics/btr562 10.1093/bioinformatics/bth324 10.1145/585265.585267 10.1093/bioinformatics/btn102 10.1126/science.287.5461.2196 10.1101/gr.074492.107 10.1089/cmb.2011.0170 10.1093/nar/30.11.2478 10.1080/10556780701550083 10.1016/j.biosystems.2004.08.002 10.1002/9781118627372 10.1186/1471-2105-11-345  | 
    
| ContentType | Journal Article | 
    
| Copyright | 2015 INIST-CNRS | 
    
| Discipline | Biology | 
    
| EISSN | 1367-4811 | 
    
| EndPage | 1437 | 
    
| Genre | Research Support, Non-U.S. Gov't Journal Article  | 
    
| ISSN | 1367-4803 1367-4811  | 
    
| IsDoiOpenAccess | true | 
    
| IsOpenAccess | true | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 11 | 
    
| Keywords | Algorithm Sequencing  | 
    
| Language | English | 
    
| License | CC BY 4.0 | 
    
| OpenAccessLink | https://academic.oup.com/bioinformatics/article-pdf/28/11/1429/48869328/bioinformatics_28_11_1429.pdf | 
    
| PMID | 22492642 | 
    
| PageCount | 9 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2012-06-01 | 
    
| PublicationDecade | 2010 | 
    
| PublicationPlace | Oxford | 
    
| PublicationTitle | Bioinformatics (Oxford, England) | 
    
| PublicationTitleAlternate | Bioinformatics | 
    
| PublicationYear | 2012 | 
    
| Publisher | Oxford University Press | 
    
| StartPage | 1429 | 
    
| SubjectTerms | Algorithms; Biological and medical sciences; Contig Mapping; Escherichia coli - genetics; Fundamental and applied biological sciences. Psychology; General aspects; Genome, Bacterial; High-Throughput Nucleotide Sequencing - methods; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Pseudomonas syringae - genetics; Sequence Analysis, DNA - methods; Xanthomonadaceae - genetics | 
    
| Title | GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies | 
    
| URI | https://www.ncbi.nlm.nih.gov/pubmed/22492642 https://www.proquest.com/docview/1015466377 https://academic.oup.com/bioinformatics/article-pdf/28/11/1429/48869328/bioinformatics_28_11_1429.pdf  | 
    
| Volume | 28 | 
    
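
Illustrative note: the abstract mentions an unconstrained binary quadratic programming (UBQP) approximation of the scaffolding problem but gives no formulas. The sketch below is not the GRASS formulation. It is a toy example, under assumptions of our own, of how contig orientation alone could be written as a UBQP and solved by brute force. The link tuple format `(i, j, same_orientation, weight)` and the function names `build_qubo`, `cost` and `solve_brute_force` are hypothetical and do not come from the paper or its source code.

```python
from itertools import product


def build_qubo(n, links):
    """Build (Q, offset) so that cost(x) = sum_ij Q[i][j]*x[i]*x[j] + offset.

    x[i] is 0 if contig i keeps its orientation and 1 if it is reversed.
    Each link (i, j, same, w) adds penalty w when it is violated.
    Because x is binary, x[i]*x[i] == x[i], so linear terms sit on the diagonal.
    """
    Q = [[0.0] * n for _ in range(n)]
    offset = 0.0
    for i, j, same, w in links:
        if same:
            # Violated when x_i != x_j: w * (x_i + x_j - 2*x_i*x_j)
            Q[i][i] += w
            Q[j][j] += w
            Q[i][j] -= 2 * w
        else:
            # Violated when x_i == x_j: w * (1 - x_i - x_j + 2*x_i*x_j)
            offset += w
            Q[i][i] -= w
            Q[j][j] -= w
            Q[i][j] += 2 * w
    return Q, offset


def cost(Q, offset, x):
    """Evaluate the quadratic objective for one binary assignment x."""
    n = len(x)
    return offset + sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_brute_force(n, links):
    """Enumerate all 2**n orientations; only feasible for tiny toy inputs."""
    Q, offset = build_qubo(n, links)
    best = min(product((0, 1), repeat=n), key=lambda x: cost(Q, offset, x))
    return best, cost(Q, offset, best)


if __name__ == "__main__":
    # Three contigs with mutually inconsistent links (made-up weights).
    toy_links = [(0, 1, True, 3.0), (1, 2, False, 2.0), (0, 2, True, 1.0)]
    print(solve_brute_force(3, toy_links))
```

In this toy input the three links cannot all be satisfied, so the minimum sacrifices the lowest-weight link. Exhaustive enumeration only stands in for a real solver here; the abstract states that GRASS combines order, distance and orientation in one objective and uses an expectation-maximization procedure, neither of which this sketch attempts.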