AI-driven glomerular morphology quantification: a novel pipeline for assessing basement membrane thickness and podocyte foot process effacement in kidney diseases

Bibliographic Details
Published in: Computer methods and programs in biomedicine Vol. 268; p. 108842
Main Authors: Yamashita, Michifumi; Piaseczna, Natalia; Takahashi, Akira; Kiyozawa, Daisuke; Tatsumoto, Narihito; Kaneko, Shohei; Zurek, Natalia; Gertych, Arkadiusz
Format: Journal Article
Language: English
Published: Ireland, Elsevier B.V., 01.08.2025
ISSN: 0169-2607, 1872-7565
DOI: 10.1016/j.cmpb.2025.108842

Summary:
•Hand-annotated electron microscopy images provided by the consensus of 3 nephrologists and 2 nephropathologists enabled methods development.
•The DeepLabV3+ model accurately segmented the glomerular basement membrane, podocytes, erythrocytes and other glomerular ultrastructures in electron microscopy images and outperformed models reported in the literature.
•The AI-powered pipeline accurately measured glomerular basement membrane (GBM) thickness and estimated the percentage of podocyte foot process effacement (%PFPE).
•GBM measurements and %PFPE estimates provided by the pipeline were in high agreement with those derived from the consensus masks and those provided by the nephropathologist.

Measuring the thickness of the glomerular basement membrane (GBM) and assessing the percentage of podocyte foot process effacement (%PFPE) are important for diagnosing non-neoplastic kidney diseases. However, when performed manually by nephropathologists using electron microscopy (EM) images, these assessments are hindered by the lack of universally standardized guidelines, leading to technical challenges. We have developed a novel deep learning (DL)-based pipeline which has the potential to reduce human error and enhance the consistency and efficiency of GBM thickness and %PFPE quantification.

This study utilized 196 EM images from kidney biopsies (representing 21 different kidney diseases from 83 subjects), which were manually annotated by consensus of 3 nephrologists and 2 nephropathologists to provide ground truth (GT) masks of GBMs, podocytes, red blood cells and other glomerular ultrastructures. Of these, 165 images were used to develop two DL models (DeepLabV3+ and U-Net architectures) for EM image segmentation. The models were then evaluated on the remaining 31 images and compared for segmentation accuracy, and the predicted GBM and podocyte masks were analyzed by algorithms in the pipeline that automatically measured the corrected harmonic mean of GBM thickness (cmGBM) and estimated the %PFPE.
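
For readers unfamiliar with harmonic-mean thickness measurements, the following is a minimal sketch of how such a value might be computed from a list of membrane intercept lengths. The abstract does not state which correction the authors apply. The sketch assumes the stereological factor 8/(3π) commonly used with the orthogonal intercept method, and the function name and the nanometre unit are hypothetical.

```python
import math


def corrected_harmonic_mean(intercepts_nm):
    """Return (8 / (3 * pi)) * harmonic mean of the intercept lengths.

    ASSUMPTION: the 8 / (3 * pi) factor is the correction commonly used
    with the orthogonal intercept method; the abstract does not specify
    the correction used in the published pipeline.
    """
    # Keep only positive lengths; zero or negative values cannot be inverted.
    values = [float(x) for x in intercepts_nm if x > 0]
    if not values:
        raise ValueError("need at least one positive intercept length")

    # Harmonic mean: number of measurements divided by the sum of reciprocals.
    harmonic_mean = len(values) / sum(1.0 / x for x in values)

    # Apply the assumed stereological correction factor.
    return (8.0 / (3.0 * math.pi)) * harmonic_mean


if __name__ == "__main__":
    # Hypothetical intercept lengths in nanometres, for illustration only.
    print(round(corrected_harmonic_mean([280.0, 310.0, 295.0, 450.0]), 1))
```

The harmonic mean gives more weight to short intercepts than the arithmetic mean does, which limits the influence of obliquely cut, artificially thick membrane segments.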

The automated measurements were statistically compared with the corresponding cmGBM and %PFPE values obtained from the consensus GBM and podocyte GT masks. The goal was to identify differences between the measurements provided by these three methods. Statistical evaluations were carried out using the intraclass correlation coefficient (ICC) and Bland-Altman plots, which estimated the bias and limits of agreement (LoAs) between the GT and DL mask-based measurements.

In the 31 test set images, the DeepLabV3+ model achieved a global accuracy (gACC) of 92.8% and a weighted intersection over union (wIoU) of 0.869, outperforming the U-Net model, which recorded a gACC of 88.9% and a wIoU of 0.800. For GBM thickness measurements, the cmGBM derived from DeepLabV3+ masks exhibited excellent agreement with GT mask-based measurements (ICC = 0.991, p < 0.001), whereas the U-Net model showed good agreement (ICC = 0.881, p < 0.001). The %PFPE estimates obtained using the DL-generated podocyte masks were highly consistent with those based on GT, with ICC values of 0.926 and 0.928 for DeepLabV3+ and U-Net, respectively. The Bland-Altman plots revealed a positive bias in the cmGBM and %PFPE obtained from the masks generated by the DeepLabV3+ model, and a negative bias in the cmGBM and %PFPE obtained from the masks generated by the U-Net model. However, the DeepLabV3+ masks provided narrower LoA ranges than the U-Net masks for measuring cmGBM.

This study highlights the potential of AI to address the limitations of manual assessments of glomerular ultrastructures in EM images by providing comprehensive, objective and accurate measurements of GBM thickness and %PFPE estimates. Our pipeline with DeepLabV3+ demonstrated robust EM image segmentation efficiency and excellent reliability of measurements when compared to expert ground truth. Further refinement of this AI-driven method is warranted to advance the diagnostic capabilities and standardization of AI in nephropathology.
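
The evaluation quantities named in the abstract can be illustrated with a short sketch. It assumes the usual definitions: global accuracy as the fraction of correctly labelled pixels, weighted IoU as the per-class IoU weighted by each class's share of ground-truth pixels, and Bland-Altman limits of agreement as the bias ± 1.96 standard deviations of the paired differences. The function names and sample values are hypothetical, and the abstract does not confirm that the authors used exactly these formulas.

```python
import statistics


def global_accuracy_and_weighted_iou(gt_labels, pred_labels):
    """Return (gACC, wIoU) for two equal-length flat sequences of pixel labels."""
    if len(gt_labels) != len(pred_labels) or len(gt_labels) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    total = len(gt_labels)
    pairs = list(zip(gt_labels, pred_labels))

    # Global accuracy: fraction of pixels whose predicted label matches GT.
    accuracy = sum(1 for g, p in pairs if g == p) / total

    # Weighted IoU (assumed definition): per-class IoU weighted by the
    # class's share of ground-truth pixels.
    weighted_iou = 0.0
    for cls in set(gt_labels):
        intersection = sum(1 for g, p in pairs if g == cls and p == cls)
        gt_count = sum(1 for g, _ in pairs if g == cls)
        pred_count = sum(1 for _, p in pairs if p == cls)
        union = gt_count + pred_count - intersection
        weighted_iou += (gt_count / total) * (intersection / union)
    return accuracy, weighted_iou


def bland_altman(reference, predicted):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need at least two paired measurements")
    # Differences are predicted minus reference, so a positive bias means
    # the model-based measurement is larger on average.
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    gt = ["gbm", "gbm", "podocyte", "podocyte", "rbc", "rbc"]
    pred = ["gbm", "podocyte", "podocyte", "podocyte", "rbc", "rbc"]
    print(global_accuracy_and_weighted_iou(gt, pred))
    print(bland_altman([100.0, 200.0, 300.0, 400.0], [110.0, 190.0, 320.0, 400.0]))
```

A narrower interval between the two limits of agreement indicates that individual model-based measurements stay closer to the reference, independently of the sign of the bias.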