Discovering high utility itemsets with multiple minimum supports


Bibliographic Details
Published in Intelligent data analysis Vol. 18; no. 6; pp. 1027 - 1047
Main Authors Ryang, Heungmo; Yun, Unil; Ryu, Keun Ho
Format Journal Article
Language English
Published London, England SAGE Publications 01.01.2014
ISSN 1088-467X, 1571-4128
DOI 10.3233/IDA-140683

Summary: Association rule mining generally uses a single minimum support threshold for the whole database. This model implicitly assumes that all items in the database have the same nature. In real applications, however, items can differ in nature; medical datasets, for example, contain information on both diseases and the symptoms or statuses related to them. Association rule mining therefore needs to consider multiple minimum supports. Association rule mining with multiple minimum supports discovers rules for all items by reflecting their characteristics. Although this model can identify meaningful association rules, including rare item rules, it considers neither the importance of items, such as the fatality rate of diseases, nor the attributes of items, such as the duration of symptoms, because it treats each item with equal importance and represents the occurrences of items in transactions as binary values. In this paper, we propose a novel tree structure, called MHU-Tree (Multiple item supports with High Utility Tree), which is constructed with a single scan. Moreover, we propose an algorithm, named MHU-Growth (Multiple item supports with High Utility Growth), for mining high utility itemsets with multiple minimum supports. Experimental results show that MHU-Growth outperforms the previous algorithm on both real and synthetic datasets and can discover useful rules from a medical dataset.
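
The problem setting described in the summary can be illustrated with a small brute-force sketch. This is not MHU-Growth and does not build an MHU-Tree; it simply enumerates every itemset over toy data. The transactions, unit profits, per-item minimum supports, and utility threshold are all hypothetical. Two further assumptions are made that the record itself does not state: the minimum support of an itemset is taken to be the lowest minimum support among its items (a common convention in work on multiple minimum supports), and the utility of an itemset is the sum of quantity times unit profit over the transactions that contain it.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
    {"b": 4},
]
# Hypothetical per-item importance (unit profit).
unit_profit = {"a": 5, "b": 2, "c": 10}
# Hypothetical per-item minimum supports, as absolute transaction counts.
item_min_support = {"a": 3, "b": 2, "c": 1}
# Hypothetical minimum utility threshold.
MIN_UTILITY = 20


def itemset_min_support(itemset):
    """Assumed convention: the lowest minimum support among the itemset's items."""
    return min(item_min_support[item] for item in itemset)


def containing(itemset):
    """Transactions that contain every item of the itemset."""
    return [t for t in transactions if all(item in t for item in itemset)]


def support(itemset):
    """Number of transactions containing the itemset."""
    return len(containing(itemset))


def utility(itemset):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    return sum(
        t[item] * unit_profit[item]
        for t in containing(itemset)
        for item in itemset
    )


def mine():
    """Brute-force search: keep itemsets meeting both thresholds."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            s, u = support(itemset), utility(itemset)
            if s >= itemset_min_support(itemset) and u >= MIN_UTILITY:
                result[itemset] = (s, u)
    return result


if __name__ == "__main__":
    for itemset, (s, u) in mine().items():
        print(itemset, "support =", s, "utility =", u)
```

In this toy data the itemset ("b", "c") appears in only one transaction, yet it is retained because item "c" has a low minimum support and the itemset's utility is high, which is the kind of rare but important pattern the summary refers to. Exhaustive enumeration grows exponentially with the number of items, which is why tree-based methods are used in practice.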