Frequent item set generation based on transaction hashing

Hashing and pruning is a popular association rule mining technique for improving the performance of the traditional Apriori algorithm. It uses a hash function to reduce the size of the candidate item set. Direct Hashing and Pruning (DHP) and Perfect Hashing and Pruning (PHP) are the basic has...

Bibliographic Details
Published in Confluence 2014: proceedings of the 5th International Conference the Next Generation Information Technology Summit: 25-26 September 2014, Amity University, Uttar Pradesh, pp. 182-187
Main Authors Agarwal, Jyoti; Singh, Archana
Format Conference Proceeding
Language English
Published IEEE 01.09.2014
DOI 10.1109/CONFLUENCE.2014.6949340

Abstract Hashing and pruning is a popular association rule mining technique for improving the performance of the traditional Apriori algorithm. It uses a hash function to reduce the size of the candidate item set. Direct Hashing and Pruning (DHP) and Perfect Hashing and Pruning (PHP) are the basic hashing algorithms, and researchers have proposed many others, each with its own pros and cons. DHP suffers from collisions and requires additional database scans to count the frequency of the collided item sets. PHP eliminates the collision problem, but it enlarges the hash table, which requires a large amount of memory, and it uses a complex hash function. The main objective of this paper is to reduce the number of collisions and the database scans needed to count the frequency of collided item sets, while ensuring that the size of the hash table does not increase. A new algorithm, Transaction Hashing and Pruning (THP), is proposed. THP arranges the item sets in vertical format and, after finding the bucket number of a candidate k-item set, hashes the transaction IDs (TIDs) of that candidate item set into the bucket. THP overcomes the item set collision problem of DHP and the large hash table problem of PHP. Experimental results are also presented in the paper.
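The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no hash function or bucket layout, so bucket_number(), the bucket count, and the choice to key each bucket entry by its item set are all assumptions made for illustration. The sketch converts transactions to vertical format (item to TID set), computes a bucket number for each candidate 2-item set, and stores the TIDs of that candidate in the bucket, so that support can be read off the TID list without another database scan even when two candidates share a bucket.

```python
from collections import defaultdict
from itertools import combinations


def to_vertical(transactions):
    """Map each item to the set of transaction IDs (TIDs) that contain it."""
    vertical = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            vertical[item].add(tid)
    return vertical


def bucket_number(itemset, item_order, n_buckets):
    """Hypothetical hash (an assumption): combine item ranks, take the modulus."""
    value = 0
    for item in itemset:
        value = value * 10 + item_order[item]
    return value % n_buckets


def frequent_pairs(transactions, min_support, n_buckets=7):
    """Return {pair: support} for frequent 2-item sets, using TID buckets."""
    vertical = to_vertical(transactions)
    # Keep only frequent single items and give each a rank for hashing.
    frequent_items = sorted(
        item for item, tids in vertical.items() if len(tids) >= min_support
    )
    item_order = {item: rank for rank, item in enumerate(frequent_items, start=1)}

    # Each bucket stores the TIDs of the candidates hashed into it, keyed by
    # candidate (assumed layout), so colliding candidates stay distinguishable.
    buckets = defaultdict(dict)
    for pair in combinations(frequent_items, 2):
        tids = vertical[pair[0]] & vertical[pair[1]]
        if tids:
            buckets[bucket_number(pair, item_order, n_buckets)][pair] = tids

    # Support is the length of the stored TID set; prune below min_support.
    return {
        pair: len(tids)
        for entries in buckets.values()
        for pair, tids in entries.items()
        if len(tids) >= min_support
    }


# Toy data, invented for illustration only.
example = {1: {"A", "B", "C"}, 2: {"A", "C"}, 3: {"A", "D"}, 4: {"B", "C"}}
print(frequent_pairs(example, min_support=2))
```

With the toy data, items A, B, and C are frequent at a minimum support of 2, and the pairs (A, C) and (B, C) each occur in two transactions. Forcing every candidate into one bucket (n_buckets=1) gives the same result, which is the collision behavior the sketch is meant to show.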
Author Agarwal, Jyoti
Singh, Archana
Author_xml – sequence: 1
  givenname: Jyoti
  surname: Agarwal
  fullname: Agarwal, Jyoti
  email: jagarwal1@amity.edu
  organization: AITEM, Amity Univ., Noida, India
– sequence: 2
  givenname: Archana
  surname: Singh
  fullname: Singh, Archana
  email: asingh27@amity.edu
  organization: ASET, Amity Univ., Noida, India
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/CONFLUENCE.2014.6949340
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 9781479942367
1479942367
1479942375
9781479942374
EndPage 187
ExternalDocumentID 6949340
Genre orig-research
GroupedDBID 6IE
6IF
6IK
6IL
6IN
AAJGR
AAWTH
ADFMO
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
IEGSK
IERZE
OCL
RIE
RIL
ID FETCH-LOGICAL-i175t-60f246a544721beda79c6f85f6cfce3a876fbe8b2656c51e6c716c6df9831c243
IEDL.DBID RIE
IngestDate Wed Aug 27 04:46:21 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i175t-60f246a544721beda79c6f85f6cfce3a876fbe8b2656c51e6c716c6df9831c243
PageCount 6
ParticipantIDs ieee_primary_6949340
PublicationCentury 2000
PublicationDate 2014-Sept.
PublicationDateYYYYMMDD 2014-09-01
PublicationDate_xml – month: 09
  year: 2014
  text: 2014-Sept.
PublicationDecade 2010
PublicationTitle Confluence 2014 : proceedings of the 5th International Conference the Next Generation Information Technology Summit : 25-26 September 2014, Amity University, Uttar Pradesh
PublicationTitleAbbrev CONFLUENCE
PublicationYear 2014
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0001968113
Score 1.5834434
SourceID ieee
SourceType Publisher
StartPage 182
SubjectTerms Association rules
Clustering algorithms
Data Mining
Hashing and Pruning
Information technology
Memory management
Next generation networking
Transaction Hashing
Title Frequent item set generation based on transaction hashing
URI https://ieeexplore.ieee.org/document/6949340
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE