DHC: Distributed Homomorphic Compression for Gradient Aggregation in Allreduce
| Published in | IEEE International Conference on Communications (ICC 2025), pp. 1 - 6 |
|---|---|
| Main Authors | Liao, Lida; Lin, Zhengli; Chen, Haodong; Zhu, Longlong; Liu, Hongyan; Yu, Jiashuo; Zhang, Dong; Wu, Chunming |
| Author Affiliations | Liao, Lin, Chen, Zhu, Liu, Yu, and Zhang: College of Computer and Data Science, Fuzhou University; Wu: Zhejiang University |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 08.06.2025 |
| Subjects | Computational modeling; Image classification; Image coding; Indexing; Integer linear programming; Libraries; Memory management; Natural language processing; Throughput; Training |
| Online Access | https://ieeexplore.ieee.org/document/11161970 |
| ISSN | 1938-1883 (electronic) |
| EISBN | 9798331505219 |
| Discipline | Engineering |
| DOI | 10.1109/ICC52391.2025.11161970 |
| Abstract | Distributed training is critical for efficiently developing deep neural networks (DNNs) on tasks like image classification and natural language processing. However, as model and dataset sizes continue to grow, high communication overhead during gradient exchanges has become a major bottleneck in distributed training. Although existing homomorphic compression frameworks effectively reduce communication overhead, their reliance on centralized architectures makes them unsuitable for the mainstream decentralized AllReduce architecture. To address this, we propose DHC, a framework for homomorphic gradient compression in AllReduce architectures. Its key idea is HG-Sketch, which leverages multi-level index tables for direct in-network aggregation of compressed gradients, thereby eliminating additional computational overhead. Additionally, DHC introduces an index-sharing method to optimize memory usage on programmable switches. Furthermore, we establish an Integer Linear Programming (ILP) model to optimize the deployment strategy of programmable switches, further enhancing in-network aggregation capabilities. Experimental results demonstrate that DHC achieves a 3.8× increase in aggregation speed and a 4.2× improvement in aggregation throughput. |
|---|---|
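This record does not include the paper's code, and the internals of HG-Sketch's multi-level index tables are not described here. As an illustration of the core property the abstract relies on — that a homomorphic (linear) compressor lets switches aggregate compressed gradients directly, without decompressing — the following minimal Count-Sketch example shows that summing two compressed gradients equals compressing their sum. All names, sizes, and the choice of Count Sketch are hypothetical illustrations, not the authors' implementation.

```python
# Minimal sketch of "homomorphic" gradient compression: a Count Sketch is a
# linear map, so adding two workers' sketch tables counter-by-counter (what an
# in-network aggregator would do) equals sketching the summed gradient.
import numpy as np

ROWS, WIDTH = 4, 256          # sketch depth and width (hypothetical sizes)
DIM = 10_000                  # gradient dimensionality (hypothetical)
rng = np.random.default_rng(0)

# Fixed hash/sign tables shared by all workers and aggregators.
bucket = rng.integers(0, WIDTH, size=(ROWS, DIM))   # h_r(i): coordinate -> counter
sign = rng.choice([-1.0, 1.0], size=(ROWS, DIM))    # s_r(i): random sign flip

def compress(grad: np.ndarray) -> np.ndarray:
    """Count-Sketch a dense gradient into a ROWS x WIDTH counter table."""
    table = np.zeros((ROWS, WIDTH))
    for r in range(ROWS):
        np.add.at(table[r], bucket[r], sign[r] * grad)  # accumulate collisions
    return table

def decompress(table: np.ndarray) -> np.ndarray:
    """Estimate each coordinate as the median of its sign-corrected counters."""
    est = np.stack([sign[r] * table[r, bucket[r]] for r in range(ROWS)])
    return np.median(est, axis=0)

g1 = rng.standard_normal(DIM)
g2 = rng.standard_normal(DIM)

# Linearity: aggregating compressed gradients matches compressing the
# aggregated gradient, so a switch never needs to decompress.
assert np.allclose(compress(g1) + compress(g2), compress(g1 + g2))

approx = decompress(compress(g1) + compress(g2))
print("relative recovery error:",
      np.linalg.norm(approx - (g1 + g2)) / np.linalg.norm(g1 + g2))
```

Because the sketch is linear, the aggregation step reduces to elementwise addition over small fixed-size tables — exactly the kind of operation a programmable switch can perform at line rate.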
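The abstract also mentions an ILP model for choosing where aggregation runs. The paper's actual decision variables and constraints are not given in this record; a generic sketch of such a switch-deployment ILP, under assumed notation, might read:

```latex
\begin{aligned}
\max_{x}\quad & \sum_{s \in S} \sum_{f \in F} t_f\, x_{fs}
  && \text{(traffic aggregated in-network)}\\
\text{s.t.}\quad & \sum_{f \in F} m_f\, x_{fs} \le M_s, \quad \forall s \in S
  && \text{(per-switch sketch memory budget)}\\
& \sum_{s \in S} x_{fs} \le 1, \quad \forall f \in F
  && \text{(each flow aggregated at most once)}\\
& x_{fs} \le a_{fs}, \quad x_{fs} \in \{0,1\}
  && \text{(only on switches the flow traverses)}
\end{aligned}
```

Here $S$ is the set of programmable switches, $F$ the set of gradient flows in the AllReduce topology, $a_{fs} \in \{0,1\}$ indicates whether flow $f$ traverses switch $s$, $t_f$ is the flow's traffic volume, $m_f$ the switch memory its sketch occupies, and $M_s$ the memory capacity of switch $s$. All symbols are hypothetical placeholders, not the paper's notation.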