GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies

Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 28; no. 11; pp. 1429-1437
Main Authors Gritsenko, Alexey A., Nijkamp, Jurgen F., Reinders, Marcel J.T., Ridder, Dick de
Format Journal Article
Language English
Published Oxford Oxford University Press 01.06.2012
ISSN 1367-4803
EISSN 1367-4811
DOI 10.1093/bioinformatics/bts175

Abstract
Motivation: The increasing availability of second-generation high-throughput sequencing (HTS) technologies has sparked a growing interest in de novo genome sequencing. This in turn has fueled the need for reliable means of obtaining high-quality draft genomes from short-read sequencing data. The millions of reads usually involved in HTS experiments are first assembled into longer fragments called contigs, which are then scaffolded, i.e. ordered and oriented using additional information, to produce even longer sequences called scaffolds. Most existing scaffolders of HTS genome assemblies are not suited for using information other than paired reads to perform scaffolding. They use this limited information to construct scaffolds, often preferring scaffold length over accuracy when faced with the tradeoff.
Results: We present GRASS (GeneRic ASsembly Scaffolder), a novel algorithm for scaffolding second-generation sequencing assemblies capable of using diverse information sources. GRASS offers a mixed-integer programming formulation of the contig scaffolding problem, which combines contig order, distance and orientation in a single optimization objective. The resulting optimization problem is solved using an expectation-maximization procedure and an unconstrained binary quadratic programming approximation of the original problem. We compared GRASS with existing HTS scaffolders using Illumina paired reads of three bacterial genomes. Our algorithm constructs a comparable number of scaffolds, but makes fewer errors. This result is further improved when additional data, in the form of related genome sequences, are used.
Availability: GRASS source code is freely available from http://code.google.com/p/tud-scaffolding/.
Contact: a.gritsenko@tudelft.nl
Supplementary information: Supplementary data are available at Bioinformatics online.
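Illustrative note: the abstract refers to an unconstrained binary quadratic programming (UBQP) approximation. The sketch below is not the GRASS implementation and is not taken from the paper; it only shows, under simplifying assumptions, how a toy contig-orientation problem can be written as a binary quadratic objective. The link tuples, weights and function names (`build_ubqp`, `cost`, `solve_brute_force`) are hypothetical, and the brute-force solver stands in for whatever heuristic a real scaffolder would use.

```python
from itertools import product


def build_ubqp(n, links):
    """Build (Q, const) with cost(x) = sum_ij Q[i][j] * x[i] * x[j] + const.

    x[i] in {0, 1} is the (hypothetical) orientation of contig i.
    Each link is (i, j, same, w): evidence of weight w that contigs i and j
    share an orientation (same=True) or have opposite ones (same=False).
    A violated link contributes w to the cost, a satisfied one contributes 0.
    """
    Q = [[0.0] * n for _ in range(n)]
    const = 0.0
    for i, j, same, w in links:
        sign = 1.0 if same else -1.0
        # same:     w * (x_i - x_j)^2     = w * (x_i + x_j - 2 x_i x_j)
        # opposite: w * (1 - (x_i-x_j)^2) = w * (1 - x_i - x_j + 2 x_i x_j)
        # Linear terms sit on the diagonal because x_i^2 == x_i for binary x_i.
        Q[i][i] += sign * w
        Q[j][j] += sign * w
        Q[i][j] += -2.0 * sign * w
        if not same:
            const += w
    return Q, const


def cost(Q, const, x):
    """Evaluate the quadratic objective for one binary assignment x."""
    n = len(x)
    return const + sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_brute_force(n, links):
    """Enumerate all 2^n assignments; only feasible for very small n."""
    Q, const = build_ubqp(n, links)
    best = min(product((0, 1), repeat=n), key=lambda x: cost(Q, const, x))
    return best, cost(Q, const, best)


if __name__ == "__main__":
    # Three contigs with mutually inconsistent links (toy data).
    toy_links = [(0, 1, True, 3.0), (1, 2, False, 2.0), (0, 2, True, 1.0)]
    print(solve_brute_force(3, toy_links))
```

In this toy instance the three links cannot all be satisfied, so the minimum-cost assignment violates only the weakest one. Exhaustive enumeration grows as 2^n, which is why practical UBQP solvers rely on heuristics or relaxations instead.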
Keywords Algorithm; Sequencing
License CC BY 4.0
Open Access Yes
Peer Reviewed Yes
PMID 22492642
Page Count 9
Subjects Algorithms; Biological and medical sciences; Contig Mapping; Escherichia coli - genetics; Fundamental and applied biological sciences. Psychology; General aspects; Genome, Bacterial; High-Throughput Nucleotide Sequencing - methods; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Pseudomonas syringae - genetics; Sequence Analysis, DNA - methods; Xanthomonadaceae - genetics
URI https://www.ncbi.nlm.nih.gov/pubmed/22492642
https://www.proquest.com/docview/1015466377
https://academic.oup.com/bioinformatics/article-pdf/28/11/1429/48869328/bioinformatics_28_11_1429.pdf