Clustering of cancer data based on Stiefel manifold for multiple views

Bibliographic Details
Published in	BMC Bioinformatics, Vol. 22, no. 1, pp. 1-15
Main Authors	Tian, Jing; Zhao, Jianping; Zheng, Chunhou
Format	Journal Article
Language	English
Published	London, BioMed Central, 25.05.2021; BioMed Central Ltd; Springer Nature B.V.; BMC
Subjects	Algorithms; Bioinformatics; Biomedical and Life Sciences; Cancer; Cancer data; Clustering; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Data analysis; Data structures; Datasets; Diagnosis; DNA methylation; Experiments; Gene expression; Information management; Innovations; Life Sciences; Linear search algorithm; Manifolds (mathematics); Medical research; Methods; Microarrays; Multi-view clustering; Optimization; Optimization model; Patients; Sample variance; Search algorithms; Stiefel manifold; Structural analysis; China
ISSN	1471-2105
DOI	10.1186/s12859-021-04195-4

Summary:	Background: In recent years, various sequencing techniques have been used to collect biomedical omics datasets, and multiple types of omics data can usually be obtained from a single patient sample. Clustering of omics data plays an indispensable role in biological and medical research and helps reveal data structures across multiple collections. Nevertheless, clustering omics data poses many challenges, chiefly the high dimensionality of the data and the small sample size. It is therefore difficult to find a suitable integration method for the structural analysis of multiple datasets. Results: In this paper, a multi-view clustering method based on the Stiefel manifold (MCSM) is proposed. MCSM comprises three core steps. First, we established a binary optimization model for the simultaneous clustering problem. Second, we solved the optimization problem with a linear search algorithm on the Stiefel manifold. Finally, we integrated the clustering results obtained from three omics data types using the k-nearest neighbor method. We applied this approach to four cancer datasets from TCGA. The results show that our method outperforms several state-of-the-art methods that rely on the assumption that the underlying cluster structure is the same across omics types. Conclusion: In particular, our approach performs better than the compared approaches when the underlying clusters are inconsistent. For patients with different subtypes, both consistent and differential clusters can be identified at the same time.
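
The summary's second step, optimization over the Stiefel manifold with a search along a descent direction, can be illustrated with a minimal sketch. This is not the MCSM algorithm: the record does not give the paper's objective or update rule, so the code below assumes a stand-in objective, f(X) = -trace(XᵀAX) with A symmetric, and uses a standard Riemannian gradient step with a QR retraction and Armijo backtracking. All function names and parameters (`qr_retract`, `riemannian_grad`, `stiefel_line_search`, `step0`, `shrink`, `c`) are hypothetical.

```python
import numpy as np


def qr_retract(Y):
    """Map a full-rank n x k matrix to one with orthonormal columns (thin QR)."""
    Q, R = np.linalg.qr(Y)
    # Flip column signs so diag(R) is positive, which makes the factor unique.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs


def riemannian_grad(X, G):
    """Project the Euclidean gradient G onto the tangent space at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0


def stiefel_line_search(A, k, iters=200, step0=1.0, shrink=0.5, c=1e-4, seed=0):
    """Minimize the assumed objective f(X) = -trace(X^T A X) s.t. X^T X = I."""
    rng = np.random.default_rng(seed)
    X = qr_retract(rng.standard_normal((A.shape[0], k)))

    def f(Z):
        return -np.trace(Z.T @ A @ Z)

    for _ in range(iters):
        fX = f(X)
        grad = riemannian_grad(X, -2.0 * A @ X)  # Euclidean gradient is -2AX
        gnorm2 = float(np.sum(grad * grad))
        if gnorm2 < 1e-12:
            break  # first-order stationary point reached
        # Armijo backtracking: shrink the step until f decreases sufficiently.
        step, accepted = step0, False
        while step > 1e-12:
            X_new = qr_retract(X - step * grad)
            if f(X_new) <= fX - c * step * gnorm2:
                accepted = True
                break
            step *= shrink
        if not accepted:
            break  # no acceptable step found; keep the current iterate
        X = X_new
    return X
```

For a diagonal test matrix such as `A = np.diag([5.0, 4.0, 1.0, 0.5, 0.1])`, calling `stiefel_line_search(A, k=2)` returns a 5 x 2 matrix with orthonormal columns, and `np.trace(X.T @ A @ X)` should approach the sum of the two largest eigenvalues. The QR retraction is chosen here only because it is simple; other retractions (for example, one based on the polar decomposition) would serve equally well in a sketch like this.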