Scaling up Data Mining Algorithms for Big Data
| Published in | International Journal For Multidisciplinary Research Vol. 7; no. 1 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | 19.01.2025 |
| ISSN | 2582-2160 |
| DOI | 10.36948/ijfmr.2025.v07i01.34838 |
| Summary: | The rapid development of science and technology and the continual replacement of digital equipment have ushered in today’s era of big data. Automatically discovering and extracting hidden knowledge, in the form of patterns, from such data is known as data mining. However, the big data era has brought a series of challenges to data mining techniques, including excessive processing time, insufficient memory capacity, and excessive power consumption. The aim of this paper is to study how data mining algorithms can be scaled up for big data, using Random Forest and Naive Bayes as case studies. The background and applications of data mining, big data, and cloud computing are briefly introduced, together with the basic principles of Random Forest and Naive Bayes and the MapReduce model in cloud computing. The feasibility of parallelizing Random Forest and Naive Bayes is then studied. Two parallel algorithms based on MapReduce, one for Random Forest and one for Naive Bayes, are developed and implemented on the Hadoop platform. Finally, the parallel algorithms are validated experimentally, and their execution efficiency is analyzed using results obtained on data sets of different sizes and with different numbers of clusters. The results show that the proposed methods perform well and can be applied to big data processing. |
|---|---|
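
The summary states that Naive Bayes can be parallelized with MapReduce but gives no detail of the authors' implementation. As a rough illustration only, the sketch below shows one common way the idea can work: each data partition is counted independently (map), and the partial counts are summed (reduce). It runs in a single Python process rather than on Hadoop, and every name in it (`map_partition`, `reduce_counts`, `train`, `predict`, the toy weather data, the smoothing parameter `alpha`) is a hypothetical choice made for this example, not something taken from the paper.

```python
from collections import Counter
from functools import reduce
import math


def map_partition(rows):
    """Map step: count labels and (feature index, value, label) triples in one partition."""
    counts = Counter()
    for features, label in rows:
        counts[("label", label)] += 1
        for i, value in enumerate(features):
            counts[("feat", i, value, label)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: merge two partial count tables by summing the counts per key."""
    return a + b


def train(partitions):
    """Count every partition independently, then merge the partial counts."""
    return reduce(reduce_counts, map(map_partition, partitions), Counter())


def predict(counts, features, labels, values_per_feature, alpha=1.0):
    """Return the label with the highest log-posterior, using add-alpha smoothing."""
    total = sum(counts[("label", y)] for y in labels)
    best_label, best_score = None, -math.inf
    for y in labels:
        n_y = counts[("label", y)]
        score = math.log((n_y + alpha) / (total + alpha * len(labels)))
        for i, value in enumerate(features):
            numerator = counts[("feat", i, value, y)] + alpha
            denominator = n_y + alpha * values_per_feature[i]
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Toy data (assumed for illustration), split into two partitions as if
# stored on two different nodes.
partitions = [
    [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")],
    [(("sunny", "mild"), "no"), (("rainy", "hot"), "yes"), (("rainy", "mild"), "yes")],
]
model = train(partitions)
prediction = predict(model, ("rainy", "mild"), ["yes", "no"], [2, 2])
print(prediction)
```

Because the model consists only of counts, and addition is associative and commutative, the merged result does not depend on how the rows are split across partitions. That property is what makes the training step a natural fit for a map-and-reduce formulation.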