Isolation forest-voting fusion-multioutput: A stroke risk classification method based on the multidimensional output of abnormal sample detection

Bibliographic Details
Published in: Computer methods and programs in biomedicine, Vol. 253, p. 108255
Main Authors: He, Hai; Yang, Haibo; Mercaldo, Francesco; Santone, Antonella; Huang, Pan
Format: Journal Article
Language: English
Published: Ireland, Elsevier B.V., 01.08.2024
ISSN: 0169-2607, 1872-7565
DOI: 10.1016/j.cmpb.2024.108255

Summary:
• This article is the first to explore how abnormal samples in stroke screening data affect the prediction of the stroke risk grade and stroke occurrence.
• For the first time, an isolation forest-voting fusion-multioutput (IF-VF-MO) predictive classification model is built. It can comprehensively and accurately predict the various stroke risk levels and stroke occurrence, and it also provides multidimensional auxiliary diagnostic information to medical staff.
• The composite feature score index is used to analyze the importance of different risk factors in the screening data for identifying all stroke risk levels.

Stroke has become a major disease threatening the health of people around the world. It is characterized by high incidence, high fatality, and a high recurrence rate. At this stage, stroke screening based on electronic medical records suffers from poor recognition accuracy and insufficient recognition of stroke risk levels. These problems arise from the systematic errors of medical equipment and the characteristics of the collectors during electronic medical record collection. Errors can also occur due to misreporting or underreporting by the collection personnel and the strong subjectivity of the evaluation indicators.

This paper proposes an isolation forest-voting fusion-multioutput algorithm model. First, the collected screening data undergo numerical processing and normalization. The composite feature score index proposed in this paper is used to analyze the importance of risk factors, and the isolation forest algorithm is then used to detect abnormal samples. Next, the voting fusion algorithm proposed in this article performs decision fusion prediction classification. Finally, the model outputs multidimensional results (risk factor importance score, abnormal sample label, risk level classification, and stroke prediction) that doctors and medical staff can use as auxiliary decision information.

The isolation forest-voting fusion-multioutput algorithm proposed in this article classifies samples into five categories (zero risk, low risk, high risk, ischemic stroke (TIA), and hemorrhagic stroke (HE)). The average accuracy rate of stroke prediction reached 79.59 %. The proposed model can not only accurately identify the various categories of stroke risk levels and predict stroke but can also output multidimensional auxiliary decision-making information to help medical staff make decisions, thereby greatly improving the screening efficiency.
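
The pipeline described in the summary (normalization, abnormal sample detection with an isolation forest, voting-based decision fusion, and a multidimensional output) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. The synthetic data, the three base classifiers, the hard-voting rule, the use of random forest feature importances in place of the paper's composite feature score index, and the rule that classes 3 and 4 count as "stroke" are all assumptions made here for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# The five categories named in the summary, encoded as integers 0..4.
RISK_LABELS = ["zero risk", "low risk", "high risk",
               "ischemic stroke", "hemorrhagic stroke"]


def make_synthetic_screening_data(n_samples=500, n_features=8, seed=0):
    """Hypothetical stand-in for screening records (NOT real patient data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, len(RISK_LABELS), size=n_samples)
    # Feature means shift with the class so the classes are learnable.
    X = rng.normal(loc=y[:, None] * 0.8, scale=1.0, size=(n_samples, n_features))
    # Corrupt about 5 % of rows to imitate misreported records.
    bad = rng.choice(n_samples, size=n_samples // 20, replace=False)
    X[bad] += rng.normal(0.0, 8.0, size=(len(bad), n_features))
    return X, y


def run_pipeline(X, y, contamination=0.05, seed=0):
    # Step 1: normalization to [0, 1].
    X_scaled = MinMaxScaler().fit_transform(X)

    # Step 2: abnormal sample detection (+1 = normal, -1 = abnormal).
    detector = IsolationForest(contamination=contamination, random_state=seed)
    anomaly_label = detector.fit_predict(X_scaled)
    inliers = anomaly_label == 1

    # Step 3: decision fusion by majority vote (assumed base learners).
    voter = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
            ("knn", KNeighborsClassifier(n_neighbors=5)),
        ],
        voting="hard",
    )
    voter.fit(X_scaled[inliers], y[inliers])  # train on normal samples only
    risk_level = voter.predict(X_scaled)

    # Step 4: multidimensional output.
    return {
        # Assumption: random forest importances replace the composite score index.
        "importance": voter.named_estimators_["rf"].feature_importances_,
        "anomaly_label": anomaly_label,
        "risk_level": risk_level,
        # Assumption: the two stroke classes (3 and 4) mean "stroke predicted".
        "stroke_predicted": risk_level >= 3,
    }


if __name__ == "__main__":
    X, y = make_synthetic_screening_data()
    out = run_pipeline(X, y)
    print("abnormal samples flagged:", int((out["anomaly_label"] == -1).sum()))
    print("first five risk levels:", [RISK_LABELS[i] for i in out["risk_level"][:5]])
```

The sketch fits and predicts on the same synthetic data only to show the shape of the four outputs. It says nothing about the accuracy reported in the article, and a real evaluation would need held-out data.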