Performance Comparison of Bagging Classifiers Between Multi-Core And Single-Core Processors Using Python Multiprocessing

Bibliographic Details
Published in: 2024 2nd International Conference on Advancements and Key Challenges in Green Energy and Computing (AKGEC), pp. 1-6
Main Authors: Rawat, Aman; Krishna, Charu; Kumar, Divya
Format: Conference Proceeding
Language: English
Published: IEEE, 21.11.2024
DOI: 10.1109/AKGEC62572.2024.10867840

Summary: In recent years, the demand for efficient and scalable machine learning algorithms has surged. Bagging (Bootstrap Aggregating) is a widely used ensemble technique that combines multiple base classifiers to enhance predictive accuracy and mitigate overfitting. However, when implementing bagging classifiers in Python, a crucial consideration is how effectively they use multi-core processors. Python faces parallelism limitations due to the Global Interpreter Lock (GIL) present in its C implementation (CPython). The GIL prevents Python threads from fully exploiting multiple CPU cores, thereby limiting their concurrency. Consequently, relying solely on regular threads would not fully harness the potential of multi-core processors. To overcome this limitation, we recommend leveraging Python's multiprocessing module. Unlike threads, which share the same memory space and are subject to the GIL, multiprocessing creates separate processes, each with its own Python interpreter instance. As a result, these processes can run in parallel, effectively utilizing multiple CPU cores. Our study compares the performance of bagging classifiers on multi-core versus single-core processors using Python multiprocessing. Our experiments reveal that multiprocessing significantly improves execution time, particularly for CPU-bound tasks. By leveraging multiple cores, bagging classifiers can better generalize to unseen data and achieve enhanced predictive accuracy.
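
The approach described in the summary can be illustrated with a minimal sketch. This is not the authors' code: the toy dataset, the decision-stump base learner, and the names `make_data`, `train_stump`, `fit_bagging`, and `predict` are all assumptions made for illustration, using only the Python standard library. Each base classifier is fitted on its own bootstrap sample, and `multiprocessing.Pool` spreads those fits across worker processes, each with its own interpreter and its own GIL.

```python
import random
import time
from collections import Counter
from multiprocessing import Pool


def make_data(n=200, seed=0):
    """Hypothetical toy data: the label is 1 when x0 + x1 > 1."""
    rng = random.Random(seed)
    X = [(rng.random(), rng.random()) for _ in range(n)]
    y = [int(a + b > 1.0) for a, b in X]
    return X, y


def train_stump(task):
    """Fit one decision stump on a bootstrap sample (runs in a worker)."""
    X, y, seed = task
    rng = random.Random(seed)  # per-estimator seed keeps results reproducible
    idx = [rng.randrange(len(X)) for _ in range(len(X))]  # with replacement
    Xb = [X[i] for i in idx]
    yb = [y[i] for i in idx]
    best = None  # (errors, feature, threshold, label_if_above)
    for f in range(len(Xb[0])):
        for t in sorted({row[f] for row in Xb}):
            for above in (0, 1):
                errors = sum(
                    (above if row[f] > t else 1 - above) != label
                    for row, label in zip(Xb, yb)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    return best[1:]  # (feature, threshold, label_if_above)


def fit_bagging(X, y, n_estimators=8, n_procs=1):
    """Train the ensemble serially (n_procs=1) or in separate processes."""
    tasks = [(X, y, seed) for seed in range(n_estimators)]
    if n_procs == 1:
        return [train_stump(task) for task in tasks]
    with Pool(processes=n_procs) as pool:
        return pool.map(train_stump, tasks)


def predict(stumps, row):
    """Aggregate the stumps' predictions by majority vote."""
    votes = Counter(above if row[f] > t else 1 - above for f, t, above in stumps)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":  # guard needed where workers are spawned, not forked
    X, y = make_data()
    for procs in (1, 2):
        start = time.perf_counter()
        stumps = fit_bagging(X, y, n_procs=procs)
        elapsed = time.perf_counter() - start
        acc = sum(predict(stumps, r) == label for r, label in zip(X, y)) / len(y)
        print(f"processes={procs} time={elapsed:.3f}s train_accuracy={acc:.2f}")
```

Because every estimator draws its bootstrap sample from its own seed, the fitted ensemble in this sketch does not depend on the number of processes; only the wall-clock time changes. On a workload this small, process start-up and data transfer may outweigh any gain, so a measurable speed-up should be expected only for larger, CPU-bound training jobs.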