A scalable bottom-up data mining algorithm for relational databases

Bibliographic Details
Published in: Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243), pp. 206-209
Main Authors: Giuffrida, G.; Cooper, L.G.; Chu, W.W.
Format: Conference Proceeding
Language: English
Published: IEEE, 1998
ISBN: 0818685751, 9780818685750
ISSN: 1099-3371
DOI: 10.1109/SSDM.1998.688125

Summary: Machine learning induction algorithms are difficult to scale to very large databases because of their memory-bound nature. Using virtual memory results in a significant performance degradation. To overcome such shortcomings, we developed a classification rule induction algorithm for relational databases. Our algorithm uses a bottom-up rule generation strategy that is more effective for mining databases having large cardinality of nominal variables. We have successfully used our algorithm to mine a retail grocery database containing more than 1.6 million records in about 5 hours on a dual Pentium processor PC.
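
The summary does not spell out the algorithm, so the following is only a toy illustration of the general idea of bottom-up (specific-to-general) rule generation over a relational table, not the authors' method. It assumes a hypothetical `sales` table with nominal columns `store` and `promo` and a class column `outcome`; the function name `bottom_up_rules` and the `min_support` and `min_conf` thresholds are likewise invented for the example. Counting is pushed into the database with `GROUP BY`, so only aggregate rows reach Python memory.

```python
import sqlite3
from itertools import combinations


def bottom_up_rules(conn, table, attrs, target, min_support=2, min_conf=0.8):
    """Toy specific-to-general rule enumeration (illustrative only).

    Table and column names are interpolated into SQL, so they must come
    from trusted code, never from user input.
    """
    rules = []
    # Start with the full attribute set (most specific antecedents) and
    # generalize by dropping attributes one level at a time.
    for size in range(len(attrs), 0, -1):
        for subset in combinations(attrs, size):
            cols = ", ".join(subset)
            query = (
                f"SELECT {cols}, {target}, COUNT(*) FROM {table} "
                f"GROUP BY {cols}, {target}"
            )
            totals, counts = {}, {}
            for row in conn.execute(query):
                key, label, n = row[:size], row[size], row[size + 1]
                totals[key] = totals.get(key, 0) + n  # records matching antecedent
                counts[(key, label)] = n              # ... that also have this class
            for (key, label), n in counts.items():
                support = totals[key]
                confidence = n / support
                if support >= min_support and confidence >= min_conf:
                    rules.append((dict(zip(subset, key)), label, support, confidence))
    return rules


# Hypothetical data, made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, promo TEXT, outcome TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "yes", "high"),
        ("A", "yes", "high"),
        ("A", "no", "low"),
        ("B", "no", "low"),
        ("B", "yes", "high"),
    ],
)
rules = bottom_up_rules(conn, "sales", ["store", "promo"], "outcome")
for antecedent, label, support, confidence in rules:
    print(antecedent, "=>", label, f"(support={support}, confidence={confidence:.2f})")
```

This sketch enumerates every attribute subset, which is only workable for a handful of columns; a practical system would prune the search, and the paper itself should be consulted for the actual strategy.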