Reconstructing the temporal ordering of biological samples using microarray data

Bibliographic Details
Published in	Bioinformatics (Oxford, England) Vol. 19; no. 7; pp. 842 - 850
Main Authors	Magwene, Paul M.; Lizardi, Paul; Kim, Junhyong
Format	Journal Article
Language	English
Published	Oxford: Oxford University Press, 01.05.2003; Oxford Publishing Limited (England)
Subjects	Algorithms; Biological and medical sciences; Biological samples; Caulobacter crescentus - genetics; Caulobacter crescentus - metabolism; Computer Simulation; Deoxyribonucleic acid; DNA; Fundamental and applied biological sciences. Psychology; Gene Expression Profiling - methods; General aspects; Heterogeneity; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Models, Genetic; Oligonucleotide Array Sequence Analysis - methods; Saccharomyces cerevisiae - genetics; Saccharomyces cerevisiae - metabolism; Sample Size; Sequence Analysis, DNA - methods; Time Factors; Time series; Transcription, Genetic - genetics; Statistical analysis; Graph; Computer program; DNA chip; Gene expression; Web site; Bioinformatics; Algorithm
ISSN	1367-4803, 1367-4811
DOI	10.1093/bioinformatics/btg081

Summary:	Motivation: Accurate time series for biological processes are difficult to estimate due to problems of synchronization, temporal sampling and rate heterogeneity. Methods are needed that can utilize multi-dimensional data, such as those resulting from DNA microarray experiments, in order to reconstruct time series from unordered or poorly ordered sets of observations. Results: We present a set of algorithms for estimating temporal orderings from unordered sets of sample elements. The techniques we describe are based on modifications of a minimum-spanning tree calculated from a weighted, undirected graph. We demonstrate the efficacy of our approach by applying these techniques to an artificial data set as well as several gene expression data sets derived from DNA microarray experiments. In addition to estimating orderings, the techniques we describe also provide useful heuristics for assessing relevant properties of sample datasets such as noise and sampling intensity, and we show how a data structure called a PQ-tree can be used to represent uncertainty in a reconstructed ordering. Availability: Academic implementations of the ordering algorithms are available as source code (in the programming language Python) on our web site, along with documentation on their use. The artificial ‘jelly roll’ data set upon which the algorithm was tested is also available from this web site. The publicly available gene expression data may be found at http://genome-www.stanford.edu/cellcycle/ and http://caulobacter.stanford.edu/CellCycle/. Contact: junhyong@sas.upenn.edu. * To whom correspondence should be addressed, at present address: Department of Biology, University of Pennsylvania, Philadelphia, PA, USA.
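Illustration:	The summary describes orderings derived from a minimum-spanning tree over a weighted, undirected graph of samples. A minimal Python sketch of that general idea follows; it is not the authors' published code. It assumes Euclidean distance between expression profiles as the edge weight and uses the longest path through the tree (its diameter path) as the estimated ordering. Both choices, and all function names, are illustrative assumptions, since the record does not state which modifications of the tree the paper applies.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete distance graph.

    Returns an adjacency mapping: node -> list of (neighbor, weight).
    """
    adjacency = defaultdict(list)
    if not points:
        return adjacency
    # best[v] = (distance to the tree, closest tree node) for nodes not yet added
    best = {v: (euclidean(points[0], points[v]), 0) for v in range(1, len(points))}
    while best:
        v = min(best, key=lambda k: best[k][0])
        d, u = best.pop(v)
        adjacency[u].append((v, d))
        adjacency[v].append((u, d))
        for w in best:
            dw = euclidean(points[v], points[w])
            if dw < best[w][0]:
                best[w] = (dw, v)
    return adjacency


def farthest(adjacency, start):
    """Return the node farthest from start (by path weight) and the parent map."""
    dist = {start: 0.0}
    parent = {start: None}
    stack = [start]
    while stack:
        u = stack.pop()
        for v, d in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + d
                parent[v] = u
                stack.append(v)
    return max(dist, key=dist.get), parent


def diameter_path_ordering(points):
    """Assumed heuristic: order samples along the longest path of the tree."""
    if len(points) < 2:
        return list(range(len(points)))
    adjacency = minimum_spanning_tree(points)
    a, _ = farthest(adjacency, 0)       # one end of the diameter path
    b, parent = farthest(adjacency, a)  # the other end, with parents back to a
    path = []
    node = b
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


# Hypothetical unordered samples lying along a one-dimensional trajectory.
samples = [(3.0, 0.0), (0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (1.0, 0.0)]
order = diameter_path_ordering(samples)
print([samples[i][0] for i in order])
```

The direction of the recovered ordering is arbitrary, because the tree carries no information about which end is the start. Samples that sit on side branches of the tree are not placed by this sketch; the summary indicates that the paper represents that kind of uncertainty with a PQ-tree, which is not implemented here.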