Bayesian coclustering of Anopheles gene expression time series: study of immune defense response to multiple experimental challenges

Bibliographic Details
Published in Proceedings of the National Academy of Sciences - PNAS Vol. 102; no. 47; pp. 16939-16944
Main Authors Heard, N.A.; Holmes, C.C.; Stephens, D.A.; Hand, D.J.; Dimopoulos, G.
Format Journal Article
Language English
Published United States, National Academy of Sciences, 22.11.2005
ISSN 0027-8424
1091-6490
DOI 10.1073/pnas.0408393102

Summary: We present a method for Bayesian model-based hierarchical coclustering of gene expression data and use it to study the temporal transcription responses of an Anopheles gambiae cell line upon challenge with multiple microbial elicitors. The method fits statistical regression models to the gene expression time series for each experiment and performs coclustering on the genes by optimizing a joint probability model, characterizing gene coregulation between multiple experiments. We compute the model using a two-stage Expectation-Maximization-type algorithm, first fixing the cross-experiment covariance structure and using efficient Bayesian hierarchical clustering to obtain a locally optimal clustering of the gene expression profiles and then, conditional on that clustering, carrying out Bayesian inference on the cross-experiment covariance using Markov chain Monte Carlo simulation to obtain an expectation. For the problem of model choice, we use a cross-validatory approach to decide between individual experiment modeling and varying levels of coclustering. Our method successfully generates tightly coregulated clusters of genes that are implicated in related processes and therefore can be used for analysis of global transcript responses to various stimuli and prediction of gene functions.
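Illustrative note: the clustering stage mentioned in the summary can be pictured with a much simplified sketch. It is not the authors' implementation. It assumes a single experiment, replaces the paper's regression models with a toy model (one Gaussian mean per time point, integrated out under a conjugate prior), and omits the cross-experiment covariance and the Markov chain Monte Carlo stage entirely. The names `SIGMA2`, `TAU2`, `log_marginal`, and `agglomerate` are hypothetical.

```python
import math
from itertools import combinations

SIGMA2 = 0.25  # assumed known noise variance (hypothetical value)
TAU2 = 4.0     # assumed prior variance of each cluster mean (hypothetical value)


def log_marginal(profiles):
    """Log marginal likelihood of one cluster of equal-length time series.

    Toy model (an assumption, not the paper's regression model):
    y_it ~ N(mu_t, SIGMA2) with mu_t ~ N(0, TAU2), mu_t integrated out.
    """
    n = len(profiles)
    total = 0.0
    for column in zip(*profiles):  # one column per time point
        s = sum(column)
        ss = sum(y * y for y in column)
        total -= 0.5 * n * math.log(2 * math.pi)
        # log-determinant of SIGMA2 * I + TAU2 * ones * ones^T
        total -= 0.5 * ((n - 1) * math.log(SIGMA2) + math.log(SIGMA2 + n * TAU2))
        # quadratic form with the inverse of that covariance matrix
        total -= 0.5 * (ss - TAU2 * s * s / (SIGMA2 + n * TAU2)) / SIGMA2
    return total


def agglomerate(series):
    """Greedy bottom-up merging; returns the best-scoring partition visited."""

    def score(cluster):
        return log_marginal([series[i] for i in cluster])

    clusters = [[i] for i in range(len(series))]
    scores = [score(c) for c in clusters]
    best_total = sum(scores)
    best_partition = [list(c) for c in clusters]
    while len(clusters) > 1:
        # pick the pair whose merge changes the total log marginal the most
        choice = None
        for a, b in combinations(range(len(clusters)), 2):
            merged = score(clusters[a] + clusters[b])
            gain = merged - scores[a] - scores[b]
            if choice is None or gain > choice[0]:
                choice = (gain, a, b, merged)
        _, a, b, merged = choice
        clusters[a] = clusters[a] + clusters[b]
        scores[a] = merged
        del clusters[b], scores[b]  # b > a, so index a stays valid
        total = sum(scores)
        if total > best_total:
            best_total = total
            best_partition = [list(c) for c in clusters]
    return best_partition


# Made-up profiles: two rising series and two falling series.
toy_series = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [-1.0, -2.0, -3.0],
    [-0.9, -2.1, -3.1],
]
partition = agglomerate(toy_series)
```

On the made-up profiles above, the merge path runs from four singletons down to one cluster, and the partition with the highest total log marginal likelihood is kept. A uniform prior over partitions is assumed, so only the marginal likelihood drives the choice.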
Author contributions: N.A.H., C.C.H., and D.A.S. designed research and performed research; N.A.H. and G.D. analyzed data; and N.A.H., C.C.H., D.A.S., D.J.H., and G.D. wrote the paper.
Conflict of interest statement: No conflicts declared.
This paper was submitted directly (Track II) to the PNAS office.
Abbreviations: CV, cross-validation; EM, Expectation-Maximization; MC, Monte Carlo; L.m., Listeria monocytogenes; M.l., Micrococcus luteus; S.t., Salmonella typhimurium.
To whom correspondence may be addressed. E-mail: n.heard@imperial.ac.uk or gdimopou@jhsph.edu.
Edited by James O. Berger, Duke University, Durham, NC