A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data

Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 30; no. 12; pp. i78-i86
Main Authors Hajirasouliha, Iman; Mahmoody, Ahmad; Raphael, Benjamin J.
Format Journal Article
Language English
Published England Oxford University Press 15.06.2014
ISSN 1367-4803
1367-4811
DOI 10.1093/bioinformatics/btu284

Summary:

Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA sequencing reads containing the variant allele. However, these clustering approaches do not consider that the population frequencies of different tumor subpopulations are correlated by their shared ancestry in the same population of cells.

Results: We introduce the binary tree partition (BTP), a novel combinatorial formulation of the problem of constructing the subpopulations of tumor cells from the variant allele frequencies of somatic mutations. We show that finding a BTP is an NP-complete problem; derive an approximation algorithm for an optimization version of the problem; and present a recursive algorithm to find a BTP with errors in the input. We show that the resulting algorithm outperforms existing clustering approaches on simulated and real sequencing data.

Availability and implementation: Python and MATLAB implementations of our method are available at http://compbio.cs.brown.edu/software/

Contact: braphael@cs.brown.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
The authors wish it to be known that in their opinion, the first two authors should be regarded as Joint First Authors.