Graph embedding and unsupervised learning predict genomic sub-compartments from HiC chromatin interaction data

Bibliographic Details
Published in	Nature communications Vol. 11; no. 1; pp. 1173 - 11
Main Authors	Ashoor, Haitham; Chen, Xiaowen; Rosikiewicz, Wojciech; Wang, Jiahui; Cheng, Albert; Wang, Ping; Ruan, Yijun; Li, Sheng
Format	Journal Article
Language	English
Published	London Nature Publishing Group UK 03.03.2020 Nature Publishing Group Nature Portfolio
Subjects	38/91; 45/23; 631/114/1305; 631/114/1314; 631/114/794; Algorithms; Artificial neural networks; Chromatin; Chromatin - genetics; Chromatin - metabolism; Cluster Analysis; Clustering; Compartments; Computer applications; Computer Graphics; Data Analysis; Embedding; Epigenome; Gene Expression; Genomes; Genomics - methods; Humanities and Social Sciences; Humans; K562 Cells; Learning; Machine learning; Markov Chains; multidisciplinary; Neural networks; Neural Networks, Computer; Predictions; Reproducibility of Results; Science; Science (multidisciplinary); Unsupervised learning; Unsupervised Machine Learning
ISSN	2041-1723
DOI	10.1038/s41467-020-14974-x

More Information
Summary:	Chromatin interaction studies can reveal how the genome is organized into spatially confined sub-compartments in the nucleus. However, accurately identifying sub-compartments from chromatin interaction data remains a challenge in computational biology. Here, we present Sub-Compartment Identifier (SCI), an algorithm that uses graph embedding followed by unsupervised learning to predict sub-compartments using Hi-C chromatin interaction data. We find that the network topological centrality and clustering performance of SCI sub-compartment predictions are superior to those of hidden Markov model (HMM) sub-compartment predictions. Moreover, using orthogonal Chromatin Interaction Analysis by in-situ Paired-End Tag Sequencing (ChIA-PET) data, we confirmed that SCI sub-compartment prediction outperforms HMM. We show that SCI-predicted sub-compartments have distinct epigenetic marks, transcriptional activities, and transcription factor enrichment. In addition, we present a deep neural network to predict sub-compartments using epigenome, replication timing, and sequence data. Our neural network produces more accurate sub-compartment predictions when SCI-determined sub-compartments are used as labels for training. Accurate identification of sub-compartments from chromatin interaction data remains a challenge. Here, the authors introduce an algorithm combining graph embedding and unsupervised learning to predict sub-compartments using Hi-C data.
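
The summary describes a two-stage idea: embed the Hi-C interaction graph, then cluster the embedded bins without labels. The following is a minimal sketch of that general pattern only, not SCI itself. This record does not state which embedding or clustering method SCI uses, so a one-dimensional spectral embedding (power iteration on the normalized contact graph) and a two-cluster k-means are swapped in as stand-ins. The toy contact matrix and the function names `toy_contacts`, `spectral_coordinate`, and `two_means` are hypothetical and were chosen for illustration.

```python
import math


def toy_contacts():
    """Hypothetical 6-bin contact matrix: two blocks of 3 bins that interact
    strongly within a block (10) and weakly across blocks (1)."""
    blocks = [0, 0, 0, 1, 1, 1]
    return [[0.0 if i == j else (10.0 if blocks[i] == blocks[j] else 1.0)
             for j in range(6)] for i in range(6)]


def spectral_coordinate(contacts, iterations=200):
    """Return one embedding coordinate per bin: the leading non-trivial
    eigenvector of the normalized contact graph, found by power iteration.
    Assumes a symmetric matrix in which every bin has at least one contact."""
    n = len(contacts)
    degree = [sum(row) for row in contacts]
    # S = D^-1/2 A D^-1/2 has eigenvalues in [-1, 1]; shifting to (S + I) / 2
    # makes them non-negative so power iteration finds the largest one.
    shifted = [[0.5 * (contacts[i][j] / math.sqrt(degree[i] * degree[j])
                       + (1.0 if i == j else 0.0))
                for j in range(n)] for i in range(n)]
    # The trivial top eigenvector of S is proportional to sqrt(degree).
    norm = math.sqrt(sum(degree))
    trivial = [math.sqrt(d) / norm for d in degree]
    vec = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        # Remove the trivial component, multiply by the matrix, renormalize.
        dot = sum(v * t for v, t in zip(vec, trivial))
        vec = [v - dot * t for v, t in zip(vec, trivial)]
        vec = [sum(shifted[i][j] * vec[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(v * v for v in vec))
        vec = [v / length for v in vec]
    return vec


def two_means(values, iterations=20):
    """One-dimensional k-means with k = 2, initialized at the min and max."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


contacts = toy_contacts()
labels = two_means(spectral_coordinate(contacts))
print(labels)  # bins in the same toy block should share a label
```

On real data the number of clusters, the embedding dimension, and the normalization of the contact matrix would all need to be chosen deliberately; the sketch fixes them only to keep the example small.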