Stability-Based Comparison of Class Discovery Methods for DNA Copy Number Profiles

Bibliographic Details
Published in	PloS one Vol. 8; no. 12; p. e81458
Main Authors	Brito, Isabel; Hupé, Philippe; Neuvial, Pierre; Barillot, Emmanuel
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 05.12.2013
Subjects	Algorithms; Arrays; Bioinformatics; Breast cancer; Cancer; Chromosomes; Cluster Analysis; Clustering; Comparative analysis; Comparative Genomic Hybridization - methods; Computational Biology - methods; Computer Science; Copy number; Cytogenetics; Data analysis; Data processing; Deoxyribonucleic acid; DNA; Gene Dosage; Genetic testing; Genomes; Hybridization; Informatics; Mathematics; Medical research; Metastasis; Methods; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide; Representations; Stability analysis; Statistics; Tumors; France
ISSN	1932-6203
DOI	10.1371/journal.pone.0081458

Summary:	Array-CGH can be used to determine DNA copy number, imbalances in which are a fundamental factor in the genesis and progression of tumors. The discovery of classes with similar patterns of array-CGH profiles therefore adds to our understanding of cancer and the treatment of patients. Various input data representations for array-CGH, dissimilarity measures between tumor samples and clustering algorithms may be used for this purpose. The choice between procedures is often difficult. An evaluation procedure is therefore required to select the best class discovery method (a combination of one input data representation, one dissimilarity measure and one clustering algorithm) for array-CGH. Robustness of the resulting classes is a common requirement, but no stability-based comparison of class discovery methods for array-CGH profiles has previously been reported. We applied several class discovery methods and evaluated the stability of their solutions with a modified version of Bertoni's χ²-based test [1]. Our version relaxes the independence assumption required by Bertoni's original χ²-based test. We conclude that the most robust classes of array-CGH profiles are produced by using Minimal Regions of alteration (a concept introduced by [2]) as the input data representation, sim [3] or agree [4] as the dissimilarity measure, and average group distance in the clustering algorithm. The software is available from http://bioinfo.curie.fr/projects/cgh-clustering. It has also been partly integrated into "Visualization and analysis of array-CGH" (VAMP) [5]. The data sets used are publicly available from ACTuDB [6].
Bibliography:	PMCID: PMC3855312. Conceived and designed the experiments: IB PH PN EB. Performed the experiments: IB PH PN EB. Analyzed the data: IB PH PN EB. Contributed reagents/materials/analysis tools: IB PH PN EB. Wrote the paper: IB PH PN EB. Competing Interests: The authors have declared that no competing interests exist.