IMine: Index Support for Item Set Mining

Bibliographic Details
Published in IEEE Transactions on Knowledge and Data Engineering, Vol. 21, no. 4, pp. 493-506
Main Authors Baralis, E., Cerquitelli, T., Chiusano, S.
Format Journal Article
Language English
Published New York, NY IEEE 01.04.2009
IEEE Computer Society
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1041-4347
EISSN 1558-2191
DOI 10.1109/TKDE.2008.180

Abstract This paper presents the IMine index, a general and compact structure that provides tight integration of item set extraction in a relational DBMS. Since no constraint is enforced during the index creation phase, IMine provides a complete representation of the original database. To reduce the I/O cost, data accessed together during the same extraction phase are clustered on the same disk block. The IMine index structure can be efficiently exploited by different item set extraction algorithms. In particular, IMine data access methods currently support the FP-growth and LCM v.2 algorithms, but they can straightforwardly support the enforcement of various constraint categories. The IMine index has been integrated into the PostgreSQL DBMS and exploits its physical-level access methods. Experiments, run for both sparse and dense data distributions, show the efficiency of the proposed index and its linear scalability, also for large datasets. Item set mining supported by the IMine index shows performance always comparable with, and sometimes better than, state-of-the-art algorithms accessing data in flat files.
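
For readers unfamiliar with the task the abstract refers to, the following is a minimal sketch of what item set extraction computes: every item set that appears in at least a given number of transactions. It is a brute-force illustration only, not the IMine index, FP-growth, or LCM v.2; the function name `frequent_item_sets`, the `min_support` parameter, and the sample transactions are assumptions introduced here and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def frequent_item_sets(transactions, min_support):
    """Return {item set: support} for item sets found in >= min_support transactions.

    Brute force: enumerates every subset of every transaction, so it is only
    suitable for tiny inputs. Hypothetical helper, for illustration only.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # drop duplicate items within a transaction
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Toy transactional dataset (made up for this example).
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
]
result = frequent_item_sets(transactions, min_support=2)
for item_set, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(item_set), support)
```

Algorithms such as FP-growth avoid this exhaustive enumeration by working on a compressed representation of the transactions; according to the abstract, IMine stores such a representation inside the DBMS so that it does not have to be rebuilt from a flat file for each extraction.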
Author Affiliation Baralis, E.; Cerquitelli, T.; Chiusano, S.: Dipt. di Autom. e Inf., Politec. di Torino, Torino
CODEN ITKEEH
Copyright 2009 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009
Discipline Engineering
Computer Science
Applied Sciences
Keywords Data mining
Item set extraction
Indexing
Content access
Data analysis
Scalability
Database
Very large databases
Data distribution
Information extraction
Database management system
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
SubjectTerms Algorithms
Applied sciences
Association rules
Blocking
Categories
Clustering algorithms
Computer science; control theory; systems
Computer systems and distributed systems. User interface
Costs
Data base management systems
Data mining
Data processing. List processing. Character string processing
Data structures
Exact sciences and technology
Extraction
Imines
Indexes
Indexing
Itemset Extraction
Memory organisation. Data processing
Mining
Relational data bases
Relational databases
Scalability
Software
Studies
Transaction databases
URI https://ieeexplore.ieee.org/document/4609383
https://www.proquest.com/docview/912031283
https://www.proquest.com/docview/34614842
https://www.proquest.com/docview/875021552