Identifying risk factors for adverse diseases using dynamic rare association rule mining


Bibliographic Details
Published in Expert systems with applications Vol. 113; pp. 233 - 263
Main Authors Borah, Anindita, Nath, Bhabesh
Format Journal Article
Language English
Published New York Elsevier Ltd 15.12.2018
Elsevier BV
ISSN 0957-4174
1873-6793
DOI 10.1016/j.eswa.2018.07.010

Summary:
• Devising a tree structure for efficient generation of the complete set of patterns.
• Development of a single-pass dynamic rare association rule mining algorithm.
• Evaluation of the algorithm based on transaction modification and threshold update.
• Analysis of risk factors for three clinical diseases using the proposed approach.
• Comparison with existing approaches using synthetic and real-life datasets.

The increase in mortality rate due to life-threatening diseases has become an issue of concern in today’s world. Early detection and diagnosis of diseases thus become necessary to reduce the severity of their side effects. Computational intelligence techniques like rare association rule mining can be extensively used for the analysis of diseases. This paper introduces an efficient technique to identify the symptoms and risk factors for three adverse diseases (cardiovascular disease, hepatitis and breast cancer) in terms of rare association rules. Existing research on rare association rule mining is based on the notion that the entire data to be operated on is available at the onset of the mining process. In practice, medical databases may be modified over time through the addition of new records or the deletion of previous records. Moreover, the user may switch to a new threshold for generating the desired set of rare association rules when the database is updated. A straightforward yet inefficient solution for generating the current set of rare association rules would be to re-execute the entire mining algorithm from scratch for each modified batch of data and each updated threshold. The algorithm proposed in this study is capable of generating the new set of rare association rules from updated medical databases in a single database scan, without re-executing the entire mining process. It can efficiently handle transaction insertion and deletion, and it also gives the user the flexibility to generate the new set of rare association rules when the threshold is updated. Experimental analysis illustrates the advantage of the proposed approach over the traditional approach of repeatedly mining the entire updated database.
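The idea of updating rare patterns without re-mining can be illustrated with a minimal Python sketch. This is not the tree-based algorithm of the paper, whose details are not given in this record; it is a naive stand-in that keeps a table of itemset counts so that insertions, deletions and threshold changes need no rescan of earlier records. The class name `IncrementalRareMiner`, the `max_len` cap, the two thresholds `min_rare_sup` and `max_freq_sup`, and the sample items are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


class IncrementalRareMiner:
    """Naive illustration: stored itemset counts make updates rescan-free."""

    def __init__(self, max_len=2):
        self.max_len = max_len    # largest itemset size tracked (assumption)
        self.counts = Counter()   # itemset -> number of supporting transactions
        self.n_transactions = 0

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_len in one transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def insert(self, transaction):
        for itemset in self._itemsets(transaction):
            self.counts[itemset] += 1
        self.n_transactions += 1

    def delete(self, transaction):
        # Assumes this transaction was inserted earlier.
        for itemset in self._itemsets(transaction):
            self.counts[itemset] -= 1
            if self.counts[itemset] == 0:
                del self.counts[itemset]
        self.n_transactions -= 1

    def rare_itemsets(self, min_rare_sup, max_freq_sup):
        """Return itemsets with min_rare_sup <= relative support < max_freq_sup."""
        if self.n_transactions == 0:
            return {}
        result = {}
        for itemset, count in self.counts.items():
            support = count / self.n_transactions
            if min_rare_sup <= support < max_freq_sup:
                result[itemset] = support
        return result


# Hypothetical patient records; item names are illustrative only.
miner = IncrementalRareMiner(max_len=2)
for record in [["smoker", "high_bp"]] * 3 + [["smoker", "chest_pain"]]:
    miner.insert(record)

print(miner.rare_itemsets(0.2, 0.5))   # rare patterns in the initial data
miner.delete(["smoker", "high_bp"])    # database update, no rescan needed
print(miner.rare_itemsets(0.2, 0.7))   # threshold update reuses the same counts
```

Enumerating all itemsets per transaction grows quickly with transaction length, which is why this sketch caps the itemset size; a compact tree structure, as the summary describes, is the kind of design that avoids this blow-up.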