Potential applications and performance of machine learning techniques and algorithms in clinical practice: A systematic review


Bibliographic Details
Published in: International journal of medical informatics (Shannon, Ireland), Vol. 159; p. 104679
Main Authors: Nwanosike, Ezekwesiri Michael; Conway, Barbara R; Merchant, Hamid A; Hasan, Syed Shahzad
Format: Journal Article
Language: English
Published: Ireland, Elsevier B.V., 01.03.2022
ISSN: 1386-5056, 1872-8243
DOI: 10.1016/j.ijmedinf.2021.104679

Summary:
• Excellent performance metrics of ML do not guarantee adoption in clinical practice.
• To date, most studies have fallen short in terms of external validation of ML models.
• Studies on ML have a high to moderate risk of bias in reporting.
• ML algorithms deployed for clinical use are much fewer than expected.

The advent of clinically adapted machine learning algorithms can solve numerous problems ranging from disease diagnosis and prognosis to therapy recommendations. This systematic review examines the performance of machine learning (ML) algorithms and evaluates the progress made to date towards their implementation in clinical practice.

Databases (PubMed, MEDLINE, Scopus, Google Scholar, Cochrane Library and WHO Covid-19 database) were searched systematically to identify original articles published between January 2011 and October 2021. Studies reporting ML techniques in clinical practice involving humans and ML algorithms with a performance metric were considered.

Of 873 unique articles identified, 36 studies were eligible for inclusion. The XGBoost (extreme gradient boosting) algorithm showed the highest potential for clinical applications (n = 7 studies); this was followed jointly by the random forest algorithm, logistic regression, and the support vector machine (n = 5 studies each). Prediction of outcomes (n = 33) received the most attention, in particular for inflammatory diseases (n = 7), followed by cancer and neuropsychiatric disorders (n = 5 for each) and Covid-19 (n = 4). Thirty-three of the thirty-six included studies passed more than 50% of the selected quality assessment criteria in the TRIPOD checklist. In contrast, none of the studies achieved an ideal overall bias rating of ‘low’ based on the PROBAST checklist. Only three studies showed evidence of the deployment of ML algorithm(s) in clinical practice.

ML is potentially a reliable tool for clinical decision support. Although advocated widely in clinical practice, work is still in progress to validate clinically adapted ML algorithms. Improving quality standards, transparency, and interpretability of ML models will further lower the barriers to acceptability.