EFIM: A Highly Efficient Algorithm for High-Utility Itemset Mining

Bibliographic Details
Published in	Advances in Artificial Intelligence and Soft Computing, Vol. 9413, pp. 530-546
Main Authors	Zida, Souleymane; Fournier-Viger, Philippe; Lin, Jerry Chun-Wei; Wu, Cheng-Wei; Tseng, Vincent S.
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2015
Series	Lecture Notes in Computer Science
Subjects	Artificial intelligence; Computer vision; Health & safety aspects of computing; High-utility mining; Itemset mining; Pattern mining
ISBN	9783319270593 3319270591
ISSN	0302-9743 1611-3349
DOI	10.1007/978-3-319-27060-9_44

More Information
Summary:	High-utility itemset mining (HUIM) is an important data mining task with wide applications. In this paper, we propose a novel algorithm named EFIM (EFficient high-utility Itemset Mining), which introduces several new ideas to discover high-utility itemsets more efficiently in terms of both execution time and memory. EFIM relies on two upper bounds, named sub-tree utility and local utility, to prune the search space more effectively. It also introduces a novel array-based utility counting technique, named Fast Utility Counting, to calculate these upper bounds in linear time and space. Moreover, to reduce the cost of database scans, EFIM proposes efficient database projection and transaction merging techniques. An extensive experimental study on various datasets shows that EFIM is in general two to three orders of magnitude faster and consumes up to eight times less memory than the state-of-the-art algorithms d²HUP, HUI-Miner, HUP-Miner, FHM and UP-Growth+.
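Illustration:	The summary names upper-bound pruning and array-based utility counting but does not define them. The sketch below is therefore not EFIM and not the authors' code. It assumes a hypothetical toy database and uses transaction-weighted utilization (TWU), a standard and looser upper bound from the HUIM literature, accumulated in an array with one bin per item during a single database scan. Sub-tree utility, local utility, database projection and transaction merging are not implemented, and the brute-force enumeration after pruning is for illustration only.

```python
from itertools import combinations

# Hypothetical toy database (not from the paper): each transaction maps
# an item id to its utility in that transaction (quantity x unit profit).
TRANSACTIONS = [
    {0: 5, 1: 2, 2: 1},
    {1: 4, 2: 3},
    {0: 10, 2: 6, 3: 5},
    {3: 2},
]
NUM_ITEMS = 4


def utility_bins(transactions, num_items):
    """One scan over the database: bins[i] accumulates the total utility
    of every transaction that contains item i (the TWU of item i)."""
    bins = [0] * num_items
    for transaction in transactions:
        transaction_utility = sum(transaction.values())
        for item in transaction:
            bins[item] += transaction_utility
    return bins


def itemset_utility(itemset, transactions):
    """Exact utility: sum the itemset's utilities over the transactions
    that contain all of its items."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] for item in itemset)
    return total


def mine(transactions, num_items, minutil):
    """Drop items whose TWU is below minutil (no superset of such an item
    can be high-utility), then check the remaining itemsets exhaustively."""
    bins = utility_bins(transactions, num_items)
    kept = [item for item in range(num_items) if bins[item] >= minutil]
    result = {}
    for size in range(1, len(kept) + 1):
        for itemset in combinations(kept, size):
            utility = itemset_utility(itemset, transactions)
            if utility >= minutil:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    print(utility_bins(TRANSACTIONS, NUM_ITEMS))
    print(mine(TRANSACTIONS, NUM_ITEMS, minutil=20))
```

With a minimum utility of 20, item 1 is pruned because its bin value falls below the threshold, so no itemset containing it is ever evaluated. The tighter bounds named in the summary would presumably prune more candidates than TWU does, but their definitions are given in the full text, not in this record.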