Utility-Driven Data Analytics Algorithm for Transaction Modifications Using Pre-Large Concept With Single Database Scan


Bibliographic Details
Published in: IEEE Transactions on Big Data, Vol. 11, no. 5, pp. 2792-2808
Main Authors: Yun, Unil; Kim, Hanju; Cho, Myungha; Ryu, Taewoong; Park, Seungwan; Kim, Doyoon; Kim, Doyoung; Lee, Chanhee; Pedrycz, Witold
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.10.2025
ISSN: 2332-7790, 2372-2096
DOI: 10.1109/TBDATA.2025.3556615

Summary: Utility-driven pattern analysis is a fundamental method for analyzing noteworthy patterns with high utility for diverse quantitative transactional databases. Recently, various approaches have emerged to handle large, dynamic database environments more efficiently by reducing the number of data scans and pattern expansion operations with the pre-large concept. However, existing pre-large-based high utility pattern mining methods either fail to handle real-time transaction modifications or require additional data scans to validate candidate patterns. In this paper, we propose a novel efficient utility-driven pattern mining algorithm using the pre-large concept for transaction modifications. Our method incorporates a single-scan-based framework through the management of actual utility values and discovers high utility patterns without candidate generation for efficient utility-driven dynamic data analysis in the modification environment. We compared the performance of the proposed method with state-of-the-art methods through extensive performance evaluation utilizing real and synthetic datasets. According to the evaluation results and a case study, the suggested method performs a minimum of 1.5 times faster than state-of-the-art methods alongside minimal compromise in memory, and it scaled well with increases in database size. Further statistical analyses indicate that the proposed method reduces the pattern search space compared to the previous method while delivering a complete set of accurate results without loss.
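
For readers unfamiliar with the terms in the summary, the following is a minimal sketch of what "utility" and the "pre-large concept" usually mean in high utility pattern mining. It is not the algorithm proposed in the paper: it recomputes everything from scratch on toy data, whereas the paper describes a single-scan method without candidate generation. The data, the function names, and the threshold values (`upper`, `lower`) are all hypothetical and chosen for illustration only.

```python
# Hypothetical toy data: unit profit per item, and transactions mapping item -> quantity.
PROFIT = {"a": 5, "b": 2, "c": 1}
DB = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 3},
    {"a": 2, "c": 1},
]


def transaction_utility(tx, profit):
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * profit[item] for item, qty in tx.items())


def pattern_utility(pattern, db, profit):
    """Actual utility of a pattern: summed over transactions containing every item."""
    total = 0
    for tx in db:
        if all(item in tx for item in pattern):
            total += sum(tx[item] * profit[item] for item in pattern)
    return total


def classify(pattern, db, profit, upper, lower):
    """Label a pattern by its share of the total database utility.

    Assumed convention: ratio >= upper is 'large' (high utility),
    lower <= ratio < upper is 'pre-large', anything else is 'small'.
    """
    total = sum(transaction_utility(tx, profit) for tx in db)
    ratio = pattern_utility(pattern, db, profit) / total
    if ratio >= upper:
        return "large"
    if ratio >= lower:
        return "pre-large"
    return "small"


if __name__ == "__main__":
    for pattern in [("a",), ("b",), ("c",), ("a", "c")]:
        print(pattern, classify(pattern, DB, PROFIT, upper=0.4, lower=0.3))
```

In pre-large approaches generally, patterns in the pre-large band are kept as a buffer so that, after the database changes, a pattern that crosses the upper threshold can often be recognized without rescanning the original data.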