Sequential Pattern Mining with Wildcards
Published in | 2010 22nd IEEE International Conference on Tools with Artificial Intelligence Vol. 1; pp. 241 - 247 |
---|---|
Main Authors | Fei Xie, Xindong Wu, Xuegang Hu, Jun Gao, Dan Guo, Yulian Fei, Ertian Hua |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.10.2010 |
Subjects | |
ISBN | 1424488176 9781424488179 |
ISSN | 1082-3409 |
DOI | 10.1109/ICTAI.2010.42 |
Abstract | Sequential pattern mining is an important research task in many domains, such as biological science. In this paper, we study the problem of mining frequent patterns from sequences with wildcards, where the user can flexibly specify the gap constraints. Given a subject sequence, a minimal support threshold, and a gap constraint, we aim to find the frequent patterns whose supports in the sequence are no less than the given threshold. We design an efficient mining algorithm, MAIL, which uses the candidate occurrences of a pattern's prefix to compute the pattern's support and so avoids rescanning the sequence. We present two pruning strategies to improve the completeness and the time efficiency of MAIL. Experiments show that MAIL mines 2 times more patterns than one of its peers and is, on average, 12 times faster than another peer. |
---|---|
Author | Yulian Fei; Xuegang Hu; Xindong Wu; Jun Gao; Dan Guo; Fei Xie; Ertian Hua |
Author_xml | 1. Fei Xie (xiefei9815057@sina.com), Coll. of Comput. Sci. & Info. Eng., Hefei Univ. of Tech., Hefei, China; 2. Xindong Wu (xwu@cs.uvm.edu), Coll. of Comput. Sci. & Info. Eng., Hefei Univ. of Tech., Hefei, China; 3. Xuegang Hu, Coll. of Comput. Sci. & Info. Eng., Hefei Univ. of Tech., Hefei, China; 4. Jun Gao, Coll. of Comput. Sci. & Info. Eng., Hefei Univ. of Tech., Hefei, China; 5. Dan Guo, Coll. of Comput. Sci. & Info. Eng., Hefei Univ. of Tech., Hefei, China; 6. Yulian Fei, Coll. of Comput. Sci. & Info. Eng., Zhejiang Gongshang Univ., Hangzhou, China; 7. Ertian Hua, Coll. of Comput. Sci. & Info. Eng., Zhejiang Gongshang Univ., Hangzhou, China |
ContentType | Conference Proceeding |
DBID | 6IE 6IH CBEJK RIE RIO |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EndPage | 247 |
ExternalDocumentID | 5670041 |
Genre | orig-research |
GroupedDBID | 23M 29O 6IE 6IF 6IH 6IK 6IL 6IN AAWTH ABLEC ACGFS ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP M43 OCL RIE RIL RIO |
ID | FETCH-LOGICAL-i175t-b8ca8fce1e590a7f72a82067c1b856a9cba88cf71da73c82aa1b228fcfa661f03 |
IEDL.DBID | RIE |
IngestDate | Wed Aug 27 03:03:20 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i175t-b8ca8fce1e590a7f72a82067c1b856a9cba88cf71da73c82aa1b228fcfa661f03 |
PageCount | 7 |
ParticipantIDs | ieee_primary_5670041 |
PublicationCentury | 2000 |
PublicationDate | 2010-Oct. |
PublicationDateYYYYMMDD | 2010-10-01 |
PublicationDate_xml | – month: 10 year: 2010 text: 2010-Oct. |
PublicationDecade | 2010 |
PublicationTitle | 2010 22nd IEEE International Conference on Tools with Artificial Intelligence |
PublicationTitleAbbrev | ictai |
PublicationYear | 2010 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0000527470 ssj0020523 |
Score | 1.504646 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 241 |
SubjectTerms | Algorithm design and analysis; Bioinformatics; candidate occurrence pruning; Complexity theory; DNA; Genomics; one-off condition; Pattern matching; Postal services; sequential pattern mining; wildcard |
Title | Sequential Pattern Mining with Wildcards |
URI | https://ieeexplore.ieee.org/document/5670041 |
Volume | 1 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | IEEE |
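
Illustrative note on the abstract: the abstract describes computing a pattern's support from the occurrences of its prefix rather than rescanning the sequence. The Python sketch below illustrates only that general idea and is not the MAIL algorithm, whose details this record does not give. It assumes that "support" is the raw number of occurrences and that the gap constraint allows between `min_gap` and `max_gap` wildcards between consecutive pattern characters. It ignores the one-off condition listed among the subject terms, and all function and parameter names are hypothetical.

```python
from collections import defaultdict


def end_positions(seq, ch):
    """Occurrences of a one-character pattern: end index -> count."""
    return {i: 1 for i, c in enumerate(seq) if c == ch}


def extend(seq, prefix_ends, ch, min_gap, max_gap):
    """Extend known prefix occurrences by one character.

    prefix_ends maps the end index of a prefix occurrence to the number
    of prefix occurrences ending there. Only the gap window after each
    end index is inspected; the prefix itself is never re-matched.
    """
    ends = defaultdict(int)
    for end, count in prefix_ends.items():
        lo = end + min_gap + 1
        hi = min(end + max_gap + 1, len(seq) - 1)
        for i in range(lo, hi + 1):
            if seq[i] == ch:
                ends[i] += count
    return dict(ends)


def count_occurrences(seq, pattern, min_gap, max_gap):
    """Count occurrences of pattern in seq under the assumed gap constraint."""
    if not pattern:
        return 0
    ends = end_positions(seq, pattern[0])
    for ch in pattern[1:]:
        ends = extend(seq, ends, ch, min_gap, max_gap)
    return sum(ends.values())


if __name__ == "__main__":
    # Pattern "AC" with 0 to 1 wildcards between A and C.
    print(count_occurrences("AACAC", "AC", 0, 1))
```

A miner built on this idea would keep the `prefix_ends` map for each frequent pattern and call `extend` once per candidate next character, comparing the resulting count with the support threshold.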