Utility-Driven Data Analytics Algorithm for Transaction Modifications Using Pre-Large Concept With Single Database Scan

Bibliographic Details
Published in	IEEE Transactions on Big Data, Vol. 11, no. 5, pp. 2792-2808
Main Authors	Yun, Unil; Kim, Hanju; Cho, Myungha; Ryu, Taewoong; Park, Seungwan; Kim, Doyoon; Kim, Doyoung; Lee, Chanhee; Pedrycz, Witold
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.10.2025
Subjects	Algorithms; Big Data; Data analysis; Data mining; Data structures; Dynamic databases; high utility patterns; Pattern analysis; Pattern search; Performance evaluation; pre-large concept; Real time; Real-time systems; Runtime; single database scan; Statistical analysis; Streams; Synthetic data; transaction modifications; Upper bound
ISSN	2332-7790 2372-2096
DOI	10.1109/TBDATA.2025.3556615

More Information
Summary:	Utility-driven pattern analysis is a fundamental method for discovering patterns with high utility in diverse quantitative transactional databases. Recently, various approaches have used the pre-large concept to handle large, dynamic database environments more efficiently by reducing the number of data scans and pattern expansion operations. However, existing pre-large-based high utility pattern mining methods either fail to handle real-time transaction modifications or require additional data scans to validate candidate patterns. In this paper, we propose an efficient utility-driven pattern mining algorithm that uses the pre-large concept to handle transaction modifications. Our method adopts a single-scan framework that manages actual utility values and discovers high utility patterns without candidate generation, enabling efficient utility-driven analysis of dynamic data in the modification environment. We compared the proposed method with state-of-the-art methods through extensive performance evaluation on real and synthetic datasets. According to the evaluation results and a case study, the proposed method runs at least 1.5 times faster than state-of-the-art methods with minimal compromise in memory, and it scales well as database size increases. Further statistical analyses indicate that the proposed method reduces the pattern search space compared to the previous method while still delivering the complete and accurate set of results.