Frequent item set generation based on transaction hashing
| Published in | Confluence 2014: proceedings of the 5th International Conference the Next Generation Information Technology Summit: 25-26 September 2014, Amity University, Uttar Pradesh, pp. 182-187 |
|---|---|
| Main Authors | , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.09.2014 |
| DOI | 10.1109/CONFLUENCE.2014.6949340 |
| Summary: | Hashing and pruning is a popular association rule mining technique for improving the performance of the traditional Apriori algorithm. The hashing technique uses a hash function to reduce the size of the candidate item set. Direct Hashing & Pruning (DHP) and Perfect Hashing & Pruning (PHP) are the basic hashing algorithms, and researchers have proposed many others. Each algorithm has its own pros and cons. The DHP algorithm suffers from collisions and requires additional database scans to count the frequency of collided item sets. The PHP algorithm eliminates the collision problem, but it increases the size of the hash table, which requires a large amount of memory, and it uses a complex hash function. The main objective of this paper is to reduce the number of collisions and the number of database scans needed to count the frequency of collided item sets, while ensuring that the size of the hash table does not increase. A new algorithm, Transaction Hashing and Pruning (THP), is proposed in this paper. THP arranges the item sets in vertical format and, after finding the bucket number of a candidate k-item set, hashes the transaction IDs (TIDs) of that candidate item set into that bucket. The THP algorithm overcomes the item set collision problem of DHP and the large hash table problem of PHP. Experimental results are also presented in the paper. |
|---|---|
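
The summary describes THP only at a high level, so the following Python sketch is one possible reading of the idea, not the authors' implementation. It converts transactions to vertical format (item to TID set), computes a bucket number for each candidate 2-item set, and stores that candidate's TIDs in the bucket. The hash function `bucket_of`, the table size `n_buckets`, the per-bucket dictionary keyed by item set, and all function names are assumptions made for illustration; the record does not specify any of them.

```python
from collections import defaultdict
from itertools import combinations


def to_vertical(transactions):
    """Map each item to the set of transaction IDs (TIDs) that contain it."""
    vertical = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def bucket_of(itemset, order, n_buckets):
    """Hypothetical hash: positional code over item ranks, modulo table size."""
    h = 0
    for item in itemset:
        h = h * 10 + order[item]
    return h % n_buckets


def frequent_pairs(transactions, min_support, n_buckets=7):
    """Return {2-item set: support} using TID sets hashed into buckets."""
    vertical = to_vertical(transactions)

    # Frequent 1-item sets come straight from the TID set sizes.
    frequent_items = sorted(
        item for item, tids in vertical.items() if len(tids) >= min_support
    )
    order = {item: rank for rank, item in enumerate(frequent_items, start=1)}

    # Assumed layout: bucket number -> {candidate item set: its TIDs}.
    # Keeping TIDs per candidate means colliding candidates stay
    # distinguishable, so no extra database scan is needed to count them.
    table = defaultdict(dict)
    for pair in combinations(frequent_items, 2):
        tids = vertical[pair[0]] & vertical[pair[1]]
        if tids:
            table[bucket_of(pair, order, n_buckets)][pair] = tids

    # Prune: support of a candidate is the number of TIDs stored for it.
    return {
        itemset: len(tids)
        for bucket in table.values()
        for itemset, tids in bucket.items()
        if len(tids) >= min_support
    }
```

For example, with transactions `{1: {"A", "B", "C"}, 2: {"A", "B", "D"}, 3: {"A", "C"}, 4: {"B", "C"}, 5: {"A", "B"}}` and `min_support=3`, only the pair `("A", "B")` survives pruning, with support 3. Setting `n_buckets=1` forces every candidate into the same bucket and gives the same answer, which illustrates how storing TIDs per candidate sidesteps the collision recount that the summary attributes to DHP.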