A deep generative approach to cancer prognosis: MMD-VAE for multi-omics data fusion
| Published in | Network modeling and analysis in health informatics and bioinformatics (Wien) Vol. 14; no. 1; p. 94 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | Vienna; Springer Vienna; 31.08.2025; Springer Nature B.V. |
| ISSN | 2192-6670, 2192-6662 |
| DOI | 10.1007/s13721-025-00578-2 |
| Summary: | Cancer progression triggers molecular changes that disrupt cellular functions, affecting DNA, proteins, and other biomolecules. To gain deeper insight into the development of breast cancer, a multi-omics strategy is crucial for integrating diverse molecular data. In this study, we explore the use of a variational autoencoder (VAE) and a maximum mean discrepancy VAE (MMD-VAE) to integrate and analyze multi-omics data related to breast cancer. This deep learning framework supports supervised feature extraction, dimensionality reduction, and classification of breast cancer samples. Standard VAEs may fall short in capturing meaningful latent representations because of the restrictive nature of the KL divergence, which can hinder the effective integration of complex datasets. MMD-VAE replaces the KL divergence with maximum mean discrepancy, offering a more adaptable and informative latent space that improves classification and prognosis performance. We evaluated VAE and MMD-VAE at three different sample sizes and demonstrated their robustness in breast cancer classification, molecular subtype grouping, and survival analysis. We also employed a VAE to address the imbalance of the combined di-omics and tri-omics data in breast cancer studies. Our evaluation included random forests, partial least squares, naive Bayes, decision trees, neural networks, and lasso regularization. Our results show that the proposed framework can effectively integrate multi-omics data and improve breast cancer analysis. Compared with VAE, MMD-VAE performs better at clustering molecular subtypes, and combining t-SNE with MMD-VAE enhances data preservation and the separation of molecular subtypes. MMD-VAE achieved higher accuracy than the standard VAE, capturing complex associations across multi-omics levels and improving interpretability. In large-scale studies, it has good predictive power and is computationally efficient. MMD-VAE captures the interactions among multiple omics datasets, increasing accuracy and interpretability in cancer research. The random forest and lasso regularization classifiers achieved the highest prediction accuracy, with AUC values of 0.99. The study provides a novel approach to breast cancer research and has potential applications in precision medicine. |
|---|---|
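The summary states that MMD-VAE replaces the KL divergence term with maximum mean discrepancy. As a rough illustration of what that regularizer measures, the sketch below estimates the squared MMD between a batch of latent codes and samples from a standard normal prior. It is not the authors' implementation: the Gaussian kernel, the bandwidth choice, the latent dimension, and the names `rbf_kernel` and `mmd_squared` are all assumptions made for this example, since the record does not specify them.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(z, z_prior, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    k_zz = rbf_kernel(z, z, bandwidth)
    k_pp = rbf_kernel(z_prior, z_prior, bandwidth)
    k_zp = rbf_kernel(z, z_prior, bandwidth)
    return k_zz.mean() + k_pp.mean() - 2.0 * k_zp.mean()


# Hypothetical data: random vectors stand in for encoder outputs.
rng = np.random.default_rng(0)
latent_dim = 8  # assumed latent size, not taken from the article
prior_samples = rng.standard_normal((200, latent_dim))
matched_codes = rng.standard_normal((200, latent_dim))        # follows the prior
shifted_codes = rng.standard_normal((200, latent_dim)) + 2.0  # far from the prior

bandwidth = np.sqrt(latent_dim)  # assumed heuristic bandwidth
mmd_matched = mmd_squared(matched_codes, prior_samples, bandwidth)
mmd_shifted = mmd_squared(shifted_codes, prior_samples, bandwidth)
print(f"MMD^2 (codes match prior):   {mmd_matched:.4f}")
print(f"MMD^2 (codes shifted by 2):  {mmd_shifted:.4f}")
```

In an MMD-VAE-style objective, a term of this kind would be added to the reconstruction loss in place of the per-sample KL penalty, so the aggregate distribution of latent codes is compared with the prior rather than each individual posterior.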