AdaGL: Adaptive Learning for Agile Distributed Training of Gigantic GNNs

Bibliographic Details
Published in 2023 60th ACM/IEEE Design Automation Conference (DAC), pp. 1 - 6
Main Authors Zhang, Ruisi; Javaheripi, Mojan; Ghodsi, Zahra; Bleiweiss, Amit; Koushanfar, Farinaz
Format Conference Proceeding
Language English
Published IEEE 09.07.2023
Subjects Adaptive learning; Design automation; Explosions; Linear programming; Servers; Training
Online Access https://ieeexplore.ieee.org/document/10248003
DOI 10.1109/DAC56929.2023.10248003

Abstract Distributed GNN training on contemporary massive and densely connected graphs requires information aggregation from all neighboring nodes, which leads to an explosion of inter-server communications. This paper proposes AdaGL, a highly scalable end-to-end framework for rapid distributed GNN training. AdaGL's novelty lies in our adaptive-learning-based graph-allocation engine as well as our use of multi-resolution coarse representations of dense graphs. As a result, AdaGL achieves an unprecedented level of balanced server computation while minimizing the communication overhead. Extensive proof-of-concept evaluations on billion-scale graphs show AdaGL attains ∼30-40% faster convergence compared with prior art.
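For context on the allocation problem the abstract describes (spread the graph's nodes across servers so that per-server computation stays balanced while cross-server neighbor aggregation stays cheap), the minimal sketch below may help. It is purely illustrative: a plain greedy heuristic, not AdaGL's adaptive-learning allocation engine or its multi-resolution coarsening; the function name greedy_partition, the balance_penalty knob, and the toy graph are hypothetical choices made for this example.

from collections import defaultdict

def greedy_partition(edges, num_nodes, num_servers, balance_penalty=1.0):
    """Greedily assign nodes to servers, preferring servers that already
    hold neighbors (less inter-server aggregation traffic) while penalizing
    servers that are already heavily loaded (computation imbalance)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    assignment = {}            # node -> server id
    load = [0] * num_servers   # nodes placed on each server so far

    # Place high-degree nodes first so dense regions anchor the partition.
    for node in sorted(range(num_nodes), key=lambda n: -len(adj[n])):
        best = max(
            range(num_servers),
            key=lambda s: sum(1 for nb in adj[node] if assignment.get(nb) == s)
            - balance_penalty * load[s],
        )
        assignment[node] = best
        load[best] += 1
    return assignment

if __name__ == "__main__":
    # Two triangles joined by a single edge; prints a node -> server mapping.
    toy_edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
    print(greedy_partition(toy_edges, num_nodes=6, num_servers=2))

In this sketch the trade-off is controlled by balance_penalty: larger values favor even server loads, smaller values favor co-locating neighbors on one server.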
Authors
– Zhang, Ruisi (ruz032@ucsd.edu), University of California, San Diego
– Javaheripi, Mojan (mojan@ucsd.edu), University of California, San Diego
– Ghodsi, Zahra (zahra@purdue.edu), Purdue University
– Bleiweiss, Amit (amit.bleiweiss@intel.com), Intel Lab
– Koushanfar, Farinaz (farinaz@ucsd.edu), University of California, San Diego
EISBN 9798350323481
Genre orig-research