DHC: Distributed Homomorphic Compression for Gradient Aggregation in Allreduce
| Published in | IEEE International Conference on Communications (ICC 2025), pp. 1 - 6 |
|---|---|
| Main Authors | Liao, Lida; Lin, Zhengli; Chen, Haodong; Zhu, Longlong; Liu, Hongyan; Yu, Jiashuo; Zhang, Dong; Wu, Chunming |
| Author Affiliations | Liao, Lin, Chen, Zhu, Liu, Yu, and Zhang: College of Computer and Data Science, Fuzhou University; Wu: Zhejiang University |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 08.06.2025 |
| Subjects | Computational modeling; Image classification; Image coding; Indexing; Integer linear programming; Libraries; Memory management; Natural language processing; Throughput; Training |
| Online Access | https://ieeexplore.ieee.org/document/11161970 |
| ISSN | 1938-1883 (electronic) |
| EISBN | 9798331505219 |
| Discipline | Engineering |
| DOI | 10.1109/ICC52391.2025.11161970 |
| Abstract | Distributed training is critical for efficiently developing deep neural networks (DNNs) on tasks like image classification and natural language processing. However, as model and dataset sizes continue to grow, high communication overhead during gradient exchanges has become a major bottleneck in distributed training. Although existing homomorphic compression frameworks effectively reduce communication overhead, their reliance on centralized architectures makes them unsuitable for the mainstream decentralized AllReduce architecture. To address this, we propose DHC, a framework for homomorphic gradient compression in AllReduce architectures. Its key idea is HG-Sketch, which leverages multi-level index tables for direct in-network aggregation of compressed gradients, thereby eliminating additional computational overhead. Additionally, DHC introduces an index-sharing method to optimize memory usage on programmable switches. Furthermore, we establish an Integer Linear Programming (ILP) model to optimize the deployment strategy of programmable switches, further enhancing in-network aggregation capabilities. Experimental results demonstrate that DHC achieves a 3.8× increase in aggregation speed and a 4.2× improvement in aggregation throughput. |
|---|---|
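This record does not include the paper's code, and the internals of HG-Sketch's multi-level index tables are not described here. As an illustration of the core property the abstract relies on — that a homomorphic (linear) compressor lets switches aggregate compressed gradients directly, without decompressing — the following minimal Count-Sketch example shows that summing two compressed gradients equals compressing their sum. All names, sizes, and the choice of Count Sketch are hypothetical illustrations, not the authors' implementation.

```python
# Minimal sketch of "homomorphic" gradient compression: a Count Sketch is a
# linear map, so adding two workers' sketch tables counter-by-counter (what an
# in-network aggregator would do) equals sketching the summed gradient.
import numpy as np

ROWS, WIDTH = 4, 256          # sketch depth and width (hypothetical sizes)
DIM = 10_000                  # gradient dimensionality (hypothetical)
rng = np.random.default_rng(0)

# Fixed hash/sign tables shared by all workers and aggregators.
bucket = rng.integers(0, WIDTH, size=(ROWS, DIM))   # h_r(i): coordinate -> counter
sign = rng.choice([-1.0, 1.0], size=(ROWS, DIM))    # s_r(i): random sign flip

def compress(grad: np.ndarray) -> np.ndarray:
    """Count-Sketch a dense gradient into a ROWS x WIDTH counter table."""
    table = np.zeros((ROWS, WIDTH))
    for r in range(ROWS):
        np.add.at(table[r], bucket[r], sign[r] * grad)  # accumulate collisions
    return table

def decompress(table: np.ndarray) -> np.ndarray:
    """Estimate each coordinate as the median of its sign-corrected counters."""
    est = np.stack([sign[r] * table[r, bucket[r]] for r in range(ROWS)])
    return np.median(est, axis=0)

g1 = rng.standard_normal(DIM)
g2 = rng.standard_normal(DIM)

# Linearity: aggregating compressed gradients matches compressing the
# aggregated gradient, so a switch never needs to decompress.
assert np.allclose(compress(g1) + compress(g2), compress(g1 + g2))

approx = decompress(compress(g1) + compress(g2))
print("relative recovery error:",
      np.linalg.norm(approx - (g1 + g2)) / np.linalg.norm(g1 + g2))
```

Because the sketch is linear, the aggregation step reduces to elementwise addition over small fixed-size tables — exactly the kind of operation a programmable switch can perform at line rate.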
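The abstract also mentions an ILP model for choosing where aggregation runs. The paper's actual decision variables and constraints are not given in this record; a generic sketch of such a switch-deployment ILP, under assumed notation, might read:

```latex
\begin{aligned}
\max_{x}\quad & \sum_{s \in S} \sum_{f \in F} t_f\, x_{fs}
  && \text{(traffic aggregated in-network)}\\
\text{s.t.}\quad & \sum_{f \in F} m_f\, x_{fs} \le M_s, \quad \forall s \in S
  && \text{(per-switch sketch memory budget)}\\
& \sum_{s \in S} x_{fs} \le 1, \quad \forall f \in F
  && \text{(each flow aggregated at most once)}\\
& x_{fs} \le a_{fs}, \quad x_{fs} \in \{0,1\}
  && \text{(only on switches the flow traverses)}
\end{aligned}
```

Here $S$ is the set of programmable switches, $F$ the set of gradient flows in the AllReduce topology, $a_{fs} \in \{0,1\}$ indicates whether flow $f$ traverses switch $s$, $t_f$ is the flow's traffic volume, $m_f$ the switch memory its sketch occupies, and $M_s$ the memory capacity of switch $s$. All symbols are hypothetical placeholders, not the paper's notation.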