Learning regulatory programs by threshold SVD regression
| Published in | Proceedings of the National Academy of Sciences - PNAS Vol. 111; no. 44; pp. 15675 - 15680 |
|---|---|
| Main Authors | X.M., L.X., and W.H.W. (Wing Hung Wong) |
| Format | Journal Article |
| Language | English |
| Published | United States: National Academy of Sciences, 04.11.2014 |
| ISSN | 0027-8424, 1091-6490 |
| DOI | 10.1073/pnas.1417808111 |
| Summary: | Significance: With the increase in high-throughput data in genomic studies, the study of regulatory relationships between multidimensional predictors and responses is becoming a common task. Although high-dimensional data hold promise for revealing rich and complex regulations, it remains challenging to infer the relations between tens of thousands of responses and thousands of predictors, as the desired signal must be searched among an overwhelming number of irrelevant responses. Here we show that by formulating the regulatory programs as hidden-intermediate nodes in a linear network, a sparsity-inducing modeling and inference approach is effective in extracting the regulatory relations among very high-dimensional responses and predictors, even when the sample size is much lower. |
| | We formulate a statistical model for the regulation of global gene expression by multiple regulatory programs and propose a thresholding singular value decomposition (T-SVD) regression method for learning such a model from data. Extensive simulations demonstrate that this method offers improved computational speed and higher sensitivity and specificity over competing approaches. The method is used to analyze microRNA (miRNA) and long noncoding RNA (lncRNA) data from The Cancer Genome Atlas (TCGA) consortium. The analysis yields previously unidentified insights into the combinatorial regulation of gene expression by noncoding RNAs, as well as findings that are supported by evidence from the literature. |
| Bibliography: | http://dx.doi.org/10.1073/pnas.1417808111. Author contributions: X.M., L.X., and W.H.W. designed research; X.M. and L.X. performed research; X.M. and L.X. contributed new reagents/analytic tools; X.M. analyzed data; and X.M., L.X., and W.H.W. wrote the paper. Contributed by Wing Hung Wong, September 18, 2014 (sent for review August 3, 2014; reviewed by Hongyu Zhao). X.M. and L.X. contributed equally to this work. Reviewers included: H.Z., Yale University. |