Graph embedding and unsupervised learning predict genomic sub-compartments from HiC chromatin interaction data

Chromatin interaction studies can reveal how the genome is organized into spatially confined sub-compartments in the nucleus. However, accurately identifying sub-compartments from chromatin interaction data remains a challenge in computational biology. Here, we present Sub-Compartment Identifier (SC...

Full description

Saved in:
Bibliographic Details
Published inNature communications Vol. 11; no. 1; pp. 1173 - 11
Main Authors Ashoor, Haitham, Chen, Xiaowen, Rosikiewicz, Wojciech, Wang, Jiahui, Cheng, Albert, Wang, Ping, Ruan, Yijun, Li, Sheng
Format Journal Article
LanguageEnglish
Published London Nature Publishing Group UK 03.03.2020
Nature Publishing Group
Nature Portfolio
Subjects
Online AccessGet full text
ISSN2041-1723
2041-1723
DOI10.1038/s41467-020-14974-x

Cover

More Information
Summary:Chromatin interaction studies can reveal how the genome is organized into spatially confined sub-compartments in the nucleus. However, accurately identifying sub-compartments from chromatin interaction data remains a challenge in computational biology. Here, we present Sub-Compartment Identifier (SCI), an algorithm that uses graph embedding followed by unsupervised learning to predict sub-compartments using Hi-C chromatin interaction data. We find that the network topological centrality and clustering performance of SCI sub-compartment predictions are superior to those of hidden Markov model (HMM) sub-compartment predictions. Moreover, using orthogonal Chromatin Interaction Analysis by in-situ Paired-End Tag Sequencing (ChIA-PET) data, we confirmed that SCI sub-compartment prediction outperforms HMM. We show that SCI-predicted sub-compartments have distinct epigenetic marks, transcriptional activities, and transcription factor enrichment. Moreover, we present a deep neural network to predict sub-compartments using epigenome, replication timing, and sequence data. Our neural network predicts more accurate sub-compartment predictions when SCI-determined sub-compartments are used as labels for training. Accurate identification of sub-compartments from chromatin interaction data remains a challenge. Here, the authors introduce an algorithm combining graph embedding and unsupervised learning to predict sub-compartments using Hi-C data.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:2041-1723
2041-1723
DOI:10.1038/s41467-020-14974-x