Robust Distributed Learning Against Both Distributional Shifts and Byzantine Attacks

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 36, No. 5, pp. 7955-7969
Main Authors: Zhou, Guanqiang; Xu, Ping; Wang, Yue; Tian, Zhi
Format: Journal Article
Language: English
Published: United States: IEEE, 01.05.2025
ISSN: 2162-237X
EISSN: 2162-2388
DOI: 10.1109/TNNLS.2024.3436149

Summary: In distributed learning systems, robustness threats may arise from two major sources. On the one hand, due to distributional shifts between training data and test data, the trained model could exhibit poor out-of-sample performance. On the other hand, a portion of working nodes might be subject to Byzantine attacks, which could invalidate the learning result. In this article, we propose a new research direction that jointly considers distributional shifts and Byzantine attacks. We illuminate the major challenges in addressing these two issues simultaneously. Accordingly, we design a new algorithm that equips distributed learning with both distributional robustness and Byzantine robustness. Our algorithm is built on recent advances in distributionally robust optimization (DRO) as well as norm-based screening (NBS), a robust aggregation scheme against Byzantine attacks. We provide convergence proofs for the proposed algorithm in three cases of the learning model being nonconvex, convex, and strongly convex, shedding light on its convergence behavior and resilience to Byzantine attacks. In particular, we deduce that any algorithm employing NBS (including ours) cannot converge when the fraction of Byzantine nodes is 1/3 or higher, rather than 1/2, as commonly believed in the current literature. The experimental results verify our theoretical findings (on the breakpoint of NBS, among others) and also demonstrate the effectiveness of our algorithm against both robustness issues, justifying our choice of NBS over other widely used robust aggregation schemes. To the best of our knowledge, this is the first work to address distributional shifts and Byzantine attacks simultaneously.
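
For intuition on the two building blocks named in the summary: generic DRO replaces the empirical risk min_\theta E_{\hat{P}}[\ell(\theta;\xi)] with a worst-case risk min_\theta \sup_{P \in \mathcal{U}(\hat{P})} E_P[\ell(\theta;\xi)] over an ambiguity set \mathcal{U}(\hat{P}) around the empirical distribution; the paper's exact formulation is not reproduced in this record. The sketch below illustrates the NBS aggregation idea only: a server sorts the gradient vectors received from workers by Euclidean norm, screens out the largest ones up to an assumed Byzantine budget, and averages the survivors. The function name, the budget parameter, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def norm_based_screening(grads, byzantine_frac):
    """Aggregate worker gradients by norm-based screening (NBS), sketched.

    grads:          list of 1-D numpy arrays, one gradient per worker node
    byzantine_frac: assumed fraction of Byzantine nodes; per the paper's
                    breakpoint result, NBS cannot tolerate 1/3 or more
    """
    m = len(grads)
    num_drop = int(np.floor(byzantine_frac * m))   # gradients to screen out
    norms = np.array([np.linalg.norm(g) for g in grads])
    keep = np.argsort(norms)[: m - num_drop]       # keep the smallest norms
    return np.mean([grads[i] for i in keep], axis=0)

# Toy example: 8 honest workers plus 2 Byzantine nodes sending
# blown-up gradients; the large-norm attacks get screened out.
rng = np.random.default_rng(0)
honest = [rng.normal(0.0, 1.0, size=5) for _ in range(8)]
byzantine = [rng.normal(0.0, 100.0, size=5) for _ in range(2)]
agg = norm_based_screening(honest + byzantine, byzantine_frac=0.2)
print(agg)  # close to the mean of the honest gradients
```

Screening by norm, rather than coordinate-wise medians or trimmed means, is what the paper credits for its robustness guarantees; the 1/3 breakpoint above applies to any aggregator of this type.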