Sequential Pattern Mining with Wildcards

Bibliographic Details
Published in 2010 22nd IEEE International Conference on Tools with Artificial Intelligence, Vol. 1, pp. 241-247
Main Authors Fei Xie, Xindong Wu, Xuegang Hu, Jun Gao, Dan Guo, Yulian Fei, Ertian Hua
Format Conference Proceeding
Language English
Published IEEE 01.10.2010
ISBN 1424488176, 9781424488179
ISSN 1082-3409
DOI 10.1109/ICTAI.2010.42

Summary: Sequential pattern mining is an important research task in many domains, such as biological science. In this paper, we study the problem of mining frequent patterns from sequences with wildcards. The user can specify the gap constraints flexibly. Given a subject sequence, a minimal support threshold, and a gap constraint, we aim to find the frequent patterns whose supports in the sequence are no less than the given threshold. We design an efficient mining algorithm, MAIL, that uses the candidate occurrences of a pattern's prefix to compute the pattern's support, which avoids rescanning the sequence. We present two pruning strategies to improve the completeness and the time efficiency of MAIL. Experiments show that MAIL mines 2 times more patterns than one of its peers and runs, on average, 12 times faster than another peer.
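
The idea of computing a pattern's support from the occurrences of its prefix can be illustrated with a minimal Python sketch. This is not the MAIL algorithm itself, whose details are not given in this record. It assumes that a gap constraint is a pair [min_gap, max_gap] bounding the number of wildcard positions between consecutive pattern characters, and that every distinct tuple of matching positions counts as one occurrence. The names extend_counts, count_occurrences, min_gap, and max_gap are hypothetical, and how the paper turns an occurrence count into a support value is not stated here.

```python
from typing import List


def extend_counts(seq: str, prefix_counts: List[int], ch: str,
                  min_gap: int, max_gap: int) -> List[int]:
    """Extend a prefix by one character without re-matching the prefix.

    prefix_counts[j] is the number of occurrences of the prefix whose last
    character is matched at position j. The result is the same kind of
    table for the pattern prefix + ch.
    """
    counts = [0] * len(seq)
    for i, c in enumerate(seq):
        if c != ch:
            continue
        # Assumption: the previous pattern character sits at position j
        # with min_gap <= i - j - 1 <= max_gap wildcards in between.
        lo = max(0, i - max_gap - 1)
        hi = i - min_gap - 1
        for j in range(lo, hi + 1):
            counts[i] += prefix_counts[j]
    return counts


def count_occurrences(seq: str, pattern: str,
                      min_gap: int, max_gap: int) -> int:
    """Count gap-constrained occurrences of pattern in seq."""
    if not pattern:
        return 0
    # Occurrences of the first character alone, indexed by end position.
    counts = [1 if c == pattern[0] else 0 for c in seq]
    for ch in pattern[1:]:
        counts = extend_counts(seq, counts, ch, min_gap, max_gap)
    return sum(counts)


if __name__ == "__main__":
    seq = "AACAC"
    a_counts = [1 if c == "A" else 0 for c in seq]
    # Both one-character extensions reuse the same table for the prefix "A".
    print(sum(extend_counts(seq, a_counts, "C", 0, 1)))  # 3
    print(sum(extend_counts(seq, a_counts, "A", 0, 1)))  # 2
    print(count_occurrences(seq, "AAC", 0, 1))           # 2
```

Under these assumptions, a miner would keep a pattern as a candidate when its count, normalized however the support is defined, meets the threshold, and would extend only those candidates by one more character.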