Bayesian Model-Based Approaches for Solexa Sequencing Data

Bibliographic Details
Published in Advances in Statistical Bioinformatics, pp. 126-137
Main Authors Mitra, Riten; Mueller, Peter; Ji, Yuan
Format Book Chapter
Language English
Published Cambridge University Press 10.06.2013
ISBN 1107027527
9781107027527
DOI 10.1017/CBO9781139226448.007

Abstract

Introduction

Recent advances in next-generation sequencing have hugely impacted biological research through high-throughput platforms that generate megabases of sequence data per day. These technologies improve both speed and cost and have found applications in genotyping, protein-DNA interactions (Barski et al., 2007; Mikkelsen et al., 2007), transcriptome analysis (Friedländer et al., 2008; Hafner et al., 2008; Vera et al., 2008), and de novo genome assembly (Chaisson and Pevzner, 2008). In this chapter, we focus on the Illumina/Solexa sequencing platform. However, data from other technologies have similar characteristics, and we expect models similar to the one presented here to remain useful for these technologies as well.

Solexa sequencing (www.illumina.com) produces millions of short reads that are amplified by polymerase chain reaction (PCR) and fluorescently labeled. For each short read, the fluorescent intensity measurements are stored in an I × 4 matrix, where I is the length of the read (e.g., I = 36). Such a matrix corresponds to a colony. The positions i = 1, …, I in the short read are sequenced in cycles by a biochemical procedure called sequencing-by-synthesis. As a result, each row of the colony matrix contains the measurements from one cycle of the experiment, in which a single base of the sequence is synthesized. At each cycle, all four nucleotides (A, C, G, and T), labeled with four different fluorescent dyes, are probed, thus producing a vector of four fluorescent intensity scores.
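
The colony data structure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the intensities are simulated, the function names `simulate_colony` and `naive_base_call` are hypothetical, and the base caller simply picks the brightest channel in each cycle. It is not the Bayesian model developed in the chapter, which the abstract does not specify.

```python
import random

BASES = "ACGT"  # column order of the colony matrix (an assumption for this sketch)


def simulate_colony(read, signal=1000.0, noise=50.0, seed=0):
    """Build a toy I x 4 colony matrix for a known read.

    Row i holds the four channel intensities observed in cycle i.
    The signal and noise levels are invented for illustration only.
    """
    rng = random.Random(seed)
    colony = []
    for base in read:
        # Background intensity in all four channels.
        row = [abs(rng.gauss(0.0, noise)) for _ in BASES]
        # The channel of the base synthesized in this cycle lights up.
        row[BASES.index(base)] += signal
        colony.append(row)
    return colony


def naive_base_call(colony):
    """Call each position as the channel with the largest intensity.

    This is a deliberately simple stand-in, not the chapter's method.
    """
    return "".join(
        BASES[max(range(len(BASES)), key=row.__getitem__)] for row in colony
    )


read = "ACGTTGCA"
colony = simulate_colony(read)   # I = 8 rows, 4 columns
called = naive_base_call(colony)
print(called)
```

With real data the four channels are correlated and the signal degrades over cycles, which is one reason a model-based treatment of the intensity matrix can be preferable to a per-row maximum.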
Author_xml – sequence: 1
  givenname: Riten
  surname: Mitra
  fullname: Mitra, Riten
  organization: University of Texas
– sequence: 2
  givenname: Peter
  surname: Mueller
  fullname: Mueller, Peter
  organization: University of Texas
– sequence: 3
  givenname: Yuan
  surname: Ji
  fullname: Ji, Yuan
  organization: NorthShore University Health-System
ContentType Book Chapter
Copyright Cambridge University Press 2013
Copyright_xml – notice: Cambridge University Press 2013
DOI 10.1017/CBO9781139226448.007
Discipline Biology
EISBN 9781139226448
1139226444
EndPage 137
ExternalDocumentID 9781139226448_xml_CBO9781139226448A014
ISBN 1107027527
9781107027527
IsPeerReviewed false
IsScholarly false
Language English
LinkModel OpenURL
PageCount 12
ParticipantIDs cambridge_corebooks_9781139226448_xml_CBO9781139226448A014
cambridge_cbo_9781139226448_xml_CBO9781139226448A014
PublicationCentury 2000
PublicationDate 20130610
20130605
PublicationDateYYYYMMDD 2013-06-10
2013-06-05
PublicationDate_xml – month: 06
  year: 2013
  text: 20130610
  day: 10
PublicationDecade 2010
PublicationSubtitle Models and Integrative Inference for High-Throughput Data
PublicationTitle Advances in Statistical Bioinformatics
PublicationYear 2013
Publisher Cambridge University Press
Publisher_xml – name: Cambridge University Press
SSID ssj0000886933
Score 1.4183646
SourceID cambridge
SourceType Publisher
StartPage 126
Title Bayesian Model-Based Approaches for Solexa Sequencing Data
URI http://dx.doi.org/10.1017/CBO9781139226448.007
https://doi.org/10.1017/CBO9781139226448.007?locatt=mode:legacy