Rare association rule mining for data stream

Bibliographic Details
Published in: International Conference on Computing and Communication Technologies, pp. 1-6
Main Authors: Vanamala, Sunitha; Sree, L. Padma; Bhavani, S. Durga
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2014
DOI: 10.1109/ICCCT2.2014.7066696

Summary: Immense volumes of data are loaded into repositories from various applications. Moreover, data often arrives continuously as a stream that cannot be stored in a repository because of its varying characteristics. Frequent itemset mining has been studied thoroughly by many researchers, but these algorithms do not discover important rare items. In many cases, contradictions or exceptions also offer useful associations. Recently, researchers have begun to focus on discovering such associations, called rare associations. Rare itemsets can be obtained by setting a low support threshold, but doing so generates a huge number of rules. Rare association rule mining on data streams is therefore a challenging area of research. In this paper we propose an approach that analyzes a data stream to identify interesting rare association rules. Rare association rule mining is the process of identifying associations that have low support but occur with high confidence. Rare association rules are useful in many applications, such as detection of fraudulent credit card usage, anomaly detection in networks, detection of network failures, analysis of educational data, and medical diagnosis. The proposed algorithm is implemented using a sliding window technique over the data stream, with the data represented in a vertical bit-sequence format. Its advantage is that it requires only a single scan to discover all rare associations. The proposed algorithm performs better in terms of both memory and time.
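
To illustrate the ideas named in the summary (a sliding window, vertical bit sequences, and rules with low support but high confidence), here is a minimal Python sketch. It is not the authors' algorithm, which this record does not describe in detail. The names `SlidingWindowBits` and `rare_rules`, the thresholds `min_sup`, `max_sup`, and `min_conf`, the restriction to single-item rules, and the sample transactions are all assumptions made for this example.

```python
from itertools import combinations


class SlidingWindowBits:
    """Hypothetical helper: one bit sequence per item over the last `window` transactions."""

    def __init__(self, window):
        self.window = window
        self.mask = (1 << window) - 1   # keeps only the newest `window` bits
        self.bits = {}                  # item -> int used as a bit sequence
        self.seen = 0                   # number of transactions currently in the window

    def add(self, transaction):
        items = set(transaction)
        # Shift every sequence left by one; the oldest bit is dropped by the mask.
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask
            if shifted or item in items:
                self.bits[item] = shifted
            else:
                del self.bits[item]     # item no longer occurs inside the window
        # Set the newest bit for every item in the arriving transaction.
        for item in items:
            self.bits[item] = self.bits.get(item, 0) | 1
        self.seen = min(self.seen + 1, self.window)

    def count(self, itemset):
        # A bitwise AND marks the transactions that contain every item of the set.
        acc = self.mask
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")

    def support(self, itemset):
        return self.count(itemset) / self.seen if self.seen else 0.0


def rare_rules(win, min_sup, max_sup, min_conf):
    """Rules x -> y with min_sup <= support < max_sup and confidence >= min_conf."""
    rules = []
    for a, b in combinations(sorted(win.bits), 2):
        sup = win.support((a, b))
        if not (min_sup <= sup < max_sup):
            continue
        for x, y in ((a, b), (b, a)):
            conf = win.count((a, b)) / win.count((x,))
            if conf >= min_conf:
                rules.append((x, y, sup, conf))
    return rules


# Made-up stream; each transaction is processed once as it arrives.
win = SlidingWindowBits(window=4)
for t in [{"bread", "milk"}, {"bread", "milk"}, {"bread", "caviar", "vodka"},
          {"bread", "milk"}, {"bread"}]:
    win.add(t)
rules = rare_rules(win, min_sup=0.2, max_sup=0.5, min_conf=0.9)
```

In this sketch the lower bound `min_sup` filters out itemsets that never occur in the window, while the upper bound `max_sup` keeps only itemsets that are not frequent. Support counts come from bitwise operations on the stored sequences, so earlier transactions never need to be rescanned.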