Balancing Throughput and Fair Execution of Multi-DNN Workloads on Heterogeneous Embedded Devices

Bibliographic Details
Published in	IEEE Transactions on Emerging Topics in Computing, Vol. 13, no. 2, pp. 409-422
Main Authors	Karatzas, Andreas; Anagnostopoulos, Iraklis
Format	Journal Article
Language	English
Published	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.04.2025
Subjects	Artificial neural networks; Computation; Computational modeling; Embedded systems; Graphics processing units; Heterogeneous embedded systems; Machine learning; multi-DNN workload; Pipelines; reinforcement learning (RL); Task analysis; Throughput; throughput optimization; Vectors; Workload; Workloads
ISSN	2168-6750
DOI	10.1109/TETC.2024.3407055

Summary:	The rise of Deep Neural Networks (DNNs) has resulted in complex workloads employing multiple DNNs concurrently. This trend introduces unique challenges related to workload distribution, particularly in heterogeneous embedded systems. Current run-time managers struggle to efficiently utilize all computing components on these platforms, resulting in two major problems. First, the system throughput deteriorates due to contention on the computing resources. Second, not all DNNs are affected equally, leading to inconsistent performance levels across different models. To address these challenges, we introduce FairBoost, a framework for efficient and fair multi-DNN inference on heterogeneous embedded systems. FairBoost employs Reinforcement Learning (RL) to efficiently manage multi-DNN workloads. Additionally, it incorporates a novel numerical representation of DNN layers via a Vector Quantized Variational Auto-Encoder (VQ-VAE). Finally, it enables knowledge transfer to similar heterogeneous embedded systems without retraining or fine-tuning. Experimental evaluation of FairBoost over 18 DNNs and various multi-DNN scenarios shows an average throughput/fairness improvement of ×3.24. Additionally, FairBoost facilitates knowledge transfer from the initial platform, Orange Pi 5, to a new system, Odroid N2+, without any retraining or fine-tuning, achieving similar gains.