Performance Comparison of Bagging Classifiers Between Multi-Core And Single-Core Processors Using Python Multiprocessing
| Published in | 2024 2nd International Conference on Advancements and Key Challenges in Green Energy and Computing (AKGEC) pp. 1 - 6 |
|---|---|
| Main Authors | , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 21.11.2024 |
| DOI | 10.1109/AKGEC62572.2024.10867840 |
| Summary: | In recent years, the demand for efficient and scalable machine learning algorithms has surged. Bagging (Bootstrap Aggregating) stands out as a widely used ensemble technique that combines multiple base classifiers to enhance predictive accuracy and mitigate overfitting. However, when implementing bagging classifiers in Python, a crucial consideration lies in their effective utilization of multi-core processors. Python, as a language, faces parallelism limitations due to the Global Interpreter Lock (GIL) present in its C implementation (CPython). The GIL restricts Python threads from fully exploiting multiple CPU cores, thereby limiting their concurrency. Consequently, relying solely on regular threads for concurrency would not fully harness the potential of multi-core processors. To overcome this limitation, we recommend leveraging Python's multiprocessing module. Unlike threads, which share the same memory space and are subject to the GIL, multiprocessing creates separate processes, each with its own Python interpreter instance. As a result, these processes can run in parallel, effectively utilizing multiple CPU cores. Our study delves into the performance comparison of bagging classifiers on multi-core processors versus single-core using Python multiprocessing. Our experiments reveal that multiprocessing significantly improves execution time, particularly for CPU-bound tasks. By leveraging multiple cores, bagging classifiers can better generalize to unseen data and achieve enhanced predictive accuracy. |
|---|---|
| DOI: | 10.1109/AKGEC62572.2024.10867840 |
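
The approach described in the summary, training the base classifiers of a bagging ensemble in separate processes rather than threads, can be illustrated with a minimal sketch. This is not the authors' code: the toy data set, the decision-stump base learner, and the names `make_data`, `fit_stump`, `fit_bagging`, and `predict` are all hypothetical and chosen only for illustration. The sketch uses only the standard library, with `multiprocessing.Pool` spreading the CPU-bound fits across worker processes and `processes=1` serving as the single-core baseline.

```python
import random
import time
from collections import Counter
from multiprocessing import Pool


def make_data(n, seed=0):
    """Toy 1-D data (hypothetical): label is 1 when x > 0.5, with ~10% label noise."""
    rng = random.Random(seed)
    X = [rng.random() for _ in range(n)]
    y = [int(x > 0.5) ^ int(rng.random() < 0.1) for x in X]
    return X, y


def fit_stump(args):
    """Fit one decision stump on a bootstrap sample (CPU-bound, pure Python)."""
    X, y, seed = args
    rng = random.Random(seed)
    # Bootstrap: draw len(X) indices with replacement.
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    xs = [X[i] for i in idx]
    ys = [y[i] for i in idx]
    best_t, best_err = 0.0, len(xs) + 1
    # Brute-force search over 101 candidate thresholds in [0, 1].
    for t in (k / 100 for k in range(101)):
        err = sum(int(x > t) != label for x, label in zip(xs, ys))
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def fit_bagging(X, y, n_estimators, processes=1):
    """Train the ensemble serially (processes=1) or in a process pool."""
    tasks = [(X, y, seed) for seed in range(n_estimators)]
    if processes == 1:
        return [fit_stump(task) for task in tasks]
    # Each worker is a separate interpreter process, so the GIL of one
    # worker does not block the others.
    with Pool(processes=processes) as pool:
        return pool.map(fit_stump, tasks)


def predict(thresholds, x):
    """Majority vote over the stumps' individual predictions."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    X, y = make_data(1000)
    for procs in (1, 2):
        start = time.perf_counter()
        fit_bagging(X, y, n_estimators=8, processes=procs)
        print(f"processes={procs}: {time.perf_counter() - start:.2f}s")
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module. Timings depend on the machine and on process start-up overhead, so for a workload this small the pooled run is not guaranteed to be faster; the benefit is expected to grow as each fit becomes more expensive.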