Input Feature Pruning for Accelerating GNN Inference on Heterogeneous Platforms

Bibliographic Details
Published in: Proceedings - International Conference on High Performance Computing, pp. 282-291
Main Authors: Yik, Jason; Kuppannagari, Sanmukh R.; Zeng, Hanqing; Prasanna, Viktor K.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2022
ISSN: 2640-0316
DOI: 10.1109/HiPC56025.2022.00045


Abstract Graph Neural Networks (GNNs) are an emerging class of machine learning models which utilize structured graph information and node features to reduce high-dimensional input data to low-dimensional embeddings, from which predictions can be made. Due to the compounding effect of aggregating neighbor information, GNN inferences require raw data from many times more nodes than are targeted for prediction. Thus, on heterogeneous compute platforms, inference latency can be largely subject to the inter-device communication cost of transferring input feature data to the GPU/accelerator before computation has even begun. In this paper, we analyze the trade-off effect of pruning input features from GNN models, reducing the volume of raw data that the model works with to lower communication latency at the expense of an expected decrease in the overall model accuracy. We develop greedy and regression-based algorithms to determine which features to retain for optimal prediction accuracy. We evaluate pruned model variants and find that they can reduce inference latency by up to 80% with an accuracy loss of less than 5% compared to non-pruned models. Furthermore, we show that the latency reductions from input feature pruning can be extended under different system variables such as batch size and floating point precision.
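The greedy retention algorithm mentioned in the abstract can be illustrated with a minimal sketch of greedy forward feature selection. This is not the authors' implementation: the nearest-centroid scorer below stands in for the GNN, and all names and data (`centroid_score`, `greedy_select`, the toy matrices) are hypothetical.

```python
# Minimal sketch of greedy input-feature selection: repeatedly keep the
# feature column that most improves a validation score, until a pruning
# budget of k features is reached. The nearest-centroid classifier is a
# stand-in proxy scorer, NOT the GNN from the paper; all names and data
# here are illustrative.

def centroid_score(X_tr, y_tr, X_va, y_va, cols):
    """Validation accuracy using only the feature columns in `cols`."""
    classes = sorted(set(y_tr))
    centroids = {
        c: [sum(x[j] for x, y in zip(X_tr, y_tr) if y == c) /
            sum(1 for y in y_tr if y == c) for j in cols]
        for c in classes
    }
    correct = 0
    for x, y in zip(X_va, y_va):
        # Predict the class whose centroid is nearest in the kept columns.
        pred = min(classes, key=lambda c: sum(
            (x[j] - centroids[c][i]) ** 2 for i, j in enumerate(cols)))
        correct += pred == y
    return correct / len(y_va)

def greedy_select(X_tr, y_tr, X_va, y_va, k):
    """Greedily pick k feature indices that maximize validation accuracy."""
    selected = []
    remaining = list(range(len(X_tr[0])))
    for _ in range(k):
        best = max(remaining, key=lambda j: centroid_score(
            X_tr, y_tr, X_va, y_va, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: column 0 carries the class signal, columns 1-2 are noise,
# so the greedy pass retains feature 0 first.
X_tr = [[0, 5, 1], [0, 2, 9], [1, 7, 3], [1, 1, 4]]
y_tr = [0, 0, 1, 1]
X_va = [[0, 9, 4], [1, 0, 5]]
y_va = [0, 1]
print(greedy_select(X_tr, y_tr, X_va, y_va, k=1))  # -> [0]
```

The abstract also describes a regression-based alternative for choosing which features to retain; a common form of that idea fits a linear model and ranks features by coefficient magnitude, but the paper's exact formulation should be consulted for details.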
Author_xml – sequence: 1
  givenname: Jason
  surname: Yik
  fullname: Yik, Jason
  email: jyik@g.harvard.edu
  organization: Harvard University
– sequence: 2
  givenname: Sanmukh R.
  surname: Kuppannagari
  fullname: Kuppannagari, Sanmukh R.
  email: sanmukh.kuppannagari@case.edu
  organization: Case Western Reserve University
– sequence: 3
  givenname: Hanqing
  surname: Zeng
  fullname: Zeng, Hanqing
  email: zengh@meta.com
  organization: Meta AI
– sequence: 4
  givenname: Viktor K.
  surname: Prasanna
  fullname: Prasanna, Viktor K.
  email: prasanna@usc.edu
  organization: University of Southern California
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/HiPC56025.2022.00045
Discipline Computer Science
EISBN 9781665494236
1665494239
EISSN 2640-0316
EndPage 291
ExternalDocumentID 10106342
Genre orig-research
GrantInformation_xml – fundername: National Science Foundation
  funderid: 10.13039/100000001
PageCount 10
PublicationDate 2022-Dec.
PublicationTitle Proceedings - International Conference on High Performance Computing
PublicationTitleAbbrev HIPC
PublicationYear 2022
Publisher IEEE
StartPage 282
SubjectTerms accuracy/performance trade-off
Analytical models
Computational modeling
data science algorithms
graph neural network
High performance computing
input feature pruning
Machine learning
Prediction algorithms
Predictive models
Solid modeling
Title Input Feature Pruning for Accelerating GNN Inference on Heterogeneous Platforms
URI https://ieeexplore.ieee.org/document/10106342