Latent space representation of electronic health records for clustering dialysis-associated kidney failure subtypes

Bibliographic Details
Published in: Computers in Biology and Medicine, Vol. 183, p. 109243
Main Authors: Onthoni, Djeane Debora; Lin, Ming-Yen; Lan, Kuei-Yuan; Huang, Tsung-Hsien; Lin, Hong-Ming; Chiou, Hung-Yi; Hsu, Chih-Cheng; Chung, Ren-Hua
Format: Journal Article
Language: English
Published: United States, Elsevier Ltd, 01.12.2024
ISSN: 0010-4825, 1879-0534
DOI: 10.1016/j.compbiomed.2024.109243

Summary: Kidney failure manifests in various forms, from sudden-onset conditions such as Acute Kidney Injury (AKI) to progressive ones such as Chronic Kidney Disease (CKD). Because these forms share overlapping comorbidities and clinical features, including treatment modalities such as dialysis, we sought to design and validate an end-to-end framework for clustering kidney failure subtypes. Our emphasis was on dialysis, using a comprehensive dataset from the UK Biobank (UKB). We transformed raw Electronic Health Record (EHR) data into standardized matrices that incorporate patient demographics, clinical visit data, and a new feature, visit time-gaps. This matrix structure was obtained using a dedicated data cutting method. A convolution autoencoder (ConvAE) model mapped the matrices into a latent space, which was then clustered using Principal Component Analysis (PCA) and K-means. The transformation model reduced data dimensionality, thereby accelerating computation, and the derived latent space clustered well. Cluster analysis identified two distinct groups: a CKD-majority group (cluster 1) and a mixed group of non-CKD and some CKD subtypes (cluster 0). Cluster 1 exhibited notably low survival probability, suggesting that it predominantly represented severe CKD. In contrast, cluster 0, with substantially higher survival probability, likely includes milder CKD forms and severe AKI. Our end-to-end framework differentiates kidney failure subtypes in the UKB dataset and, by integrating diverse data sources, offers a more holistic understanding of kidney failure to support patient management and targeted therapeutic interventions.
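The final clustering stage described in the summary can be illustrated with a minimal sketch. It assumes the ConvAE latent vectors are already available as plain lists of floats (the toy values below are invented), it skips the PCA step, and it uses a hand-written two-cluster K-means rather than whatever implementation the authors used.

```python
import math


def kmeans(points, k=2, max_iters=50):
    """Plain Lloyd's K-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical 2-D latent vectors standing in for ConvAE output.
latent = [[0.1, 0.2], [5.0, 5.1], [0.0, 0.1], [5.2, 4.9], [0.2, 0.0], [4.8, 5.0]]
labels, centroids = kmeans(latent, k=2)
print(labels)
```

The cluster indices produced here are arbitrary labels; they carry no relation to the "cluster 0" and "cluster 1" groups reported in the summary.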
• A new EHR preprocessing method utilizing visit time-gap features to improve the analysis of temporal patterns in the sequence.
• Optimized convolution autoencoder (ConvAE) for clustering tasks with a low-dimensionality latent space.
• The cluster findings reveal insights into dialysis-related kidney failure manifestations and progression for tailored care.
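The visit time-gap feature mentioned above can be sketched as follows. This record does not state how the gaps are defined or encoded, so the choices here are assumptions: gaps are measured in days between consecutive visits, the first visit gets a gap of 0, and the function name `visit_time_gaps` and the sample dates are hypothetical.

```python
from datetime import date


def visit_time_gaps(visit_dates):
    """Return days elapsed since the previous visit; the first visit gets 0."""
    ordered = sorted(visit_dates)
    if not ordered:
        return []
    gaps = [0]
    for prev, curr in zip(ordered, ordered[1:]):
        gaps.append((curr - prev).days)
    return gaps


# Invented visit dates for one patient, deliberately out of order.
visits = [date(2010, 3, 1), date(2010, 1, 10), date(2010, 3, 15)]
gaps = visit_time_gaps(visits)
print(gaps)  # [0, 50, 14]
```

Each gap could then be stored alongside the corresponding visit row in the patient matrix, so that a sequence model sees irregular spacing between visits as well as their order.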