De Novo Identification and Visualization of Important Cell Populations for Classic Hodgkin Lymphoma Using Flow Cytometry and Machine Learning

Bibliographic Details
Published in: American Journal of Clinical Pathology, Vol. 156, no. 6, pp. 1092-1102
Main Authors: Simonson, Paul D; Wu, Yue; Wu, David; Fromm, Jonathan R; Lee, Aaron Y
Format: Journal Article
Language: English
Published: US, Oxford University Press, 01.12.2021
ISSN: 0002-9173; 1943-7722
DOI: 10.1093/ajcp/aqab076

Summary: Abstract
Objectives: Automated classification of flow cytometry data has the potential to reduce errors and accelerate flow cytometry interpretation. We desired a machine learning approach that is accurate, is intuitively easy to understand, and highlights the cells that are most important in the algorithm’s prediction for a given case.
Methods: We developed an ensemble of convolutional neural networks for classification and visualization of impactful cell populations in detecting classic Hodgkin lymphoma using two-dimensional (2D) histograms. Data from 977 and 245 clinical flow cytometry cases were used for training and testing, respectively. Seventy-eight nongated 2D histograms were created per flow cytometry file. Shapley additive explanation (SHAP) values were calculated to determine the most impactful 2D histograms and regions within histograms. SHAP values from all 78 histograms were then projected back to the original cell data for gating and visualization using standard flow cytometry software.
Results: The algorithm achieved 67.7% recall (sensitivity), 82.4% precision, and 0.92 area under the receiver operating characteristic curve. Visualization of the important cell populations for individual predictions demonstrated correlations with known biology.
Conclusions: The method presented enables model explainability while highlighting important cell populations in individual flow cytometry specimens, with potential applications in both diagnosis and discovery of previously overlooked key cell populations.
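
The histogram and back-projection steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 13 measured parameters per event (the summary does not state the channel count; 13 is chosen only because it yields 78 unordered pairs), values already scaled to [0, 1], and a hypothetical `bin_scores` array standing in for per-bin SHAP values. The function names `pairwise_histograms` and `project_to_events` are also hypothetical.

```python
from itertools import combinations

import numpy as np


def pairwise_histograms(events, bins=32, value_range=(0.0, 1.0)):
    """Build one ungated 2D histogram per unordered pair of channels.

    events: array of shape (n_events, n_channels), assumed scaled to value_range.
    Returns the list of channel-index pairs and an array of shape
    (n_pairs, bins, bins).
    """
    n_channels = events.shape[1]
    pairs = list(combinations(range(n_channels), 2))
    hists = np.stack([
        np.histogram2d(events[:, i], events[:, j], bins=bins,
                       range=[value_range, value_range])[0]
        for i, j in pairs
    ])
    return pairs, hists


def project_to_events(events, pairs, bin_scores, value_range=(0.0, 1.0)):
    """Give each event the sum of the scores of the bins it falls into.

    bin_scores: array of shape (n_pairs, bins, bins); a stand-in for
    per-bin importance values such as SHAP values (assumption).
    """
    bins = bin_scores.shape[1]
    lo, hi = value_range
    # Bin index of every event in every channel; clipping keeps values at
    # the upper edge (or outside the range) inside the valid index range.
    idx = np.clip(((events - lo) / (hi - lo) * bins).astype(int), 0, bins - 1)
    per_event = np.zeros(events.shape[0])
    for k, (i, j) in enumerate(pairs):
        per_event += bin_scores[k, idx[:, i], idx[:, j]]
    return per_event
```

Under these assumptions, events with high summed scores are the ones that could then be gated and displayed in standard flow cytometry software, which is the role the summary assigns to the projected SHAP values.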