Beyond Histogram Comparison: Distribution-Aware Simple-Path Graph Kernels

Bibliographic Details
Published in	IEEE Transactions on Artificial Intelligence, Vol. 6, no. 8, pp. 2119-2132
Main Authors	Ye, Wei; Tang, Shuhao; Tian, Hao; Chen, Qijun
Format	Journal Article
Language	English
Published	IEEE 01.08.2025
Subjects	Accuracy; Artificial intelligence; Computational modeling; Gaussian distributions; Graph kernels; Kernel; Measurement; Neural language models; Probabilistic logic; Probabilistic Minkowski distance; Probability distribution; Simple path; Training; Vectors
ISSN	2691-4581
DOI	10.1109/TAI.2025.3539642

Abstract	R-convolution graph kernels are conventional methods for graph classification. They decompose graphs into substructures and aggregate the pairwise substructure similarities into a graph similarity. However, substructure similarity is based on graph isomorphism, which not only yields binary similarity values but also ignores the probability distribution of substructures within each graph. Moreover, simple sum aggregation ignores the differences between the probability distributions of substructures across graphs. These drawbacks lead to inaccurate graph similarity. To resolve these problems, we propose a new method called the distribution-aware simple-path (DASP) graph kernel. Neural language models are employed to capture the probability distribution of substructures (specifically, simple paths) in each graph. A new metric called the probabilistic Minkowski distance is developed to capture the probability distribution differences of simple paths across graphs. To further improve performance, the label alphabet is expanded to enlarge the corpus of simple paths for the neural language models and DASP. Experiments demonstrate that DASP achieves the best classification accuracy on all the selected graph benchmark datasets.
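
Illustrative note: the histogram-style comparison that the abstract criticizes can be sketched as follows. This is only a minimal sketch of a baseline R-convolution kernel over simple paths, where two paths count as similar only if their label sequences match exactly (a binary similarity) and all matches are summed. It is not the DASP kernel, and it does not implement the probabilistic Minkowski distance, whose definition is not given in this record. The function names, the adjacency-dict graph representation, and the `max_len` parameter are assumptions made for the example.

```python
from collections import Counter


def simple_path_label_counts(adj, labels, max_len):
    """Count label sequences of all simple paths with at most max_len edges.

    adj: dict mapping each node to an iterable of its neighbours (hypothetical format).
    labels: dict mapping each node to its label.
    In an undirected graph every path with at least one edge is visited once
    from each end, so it contributes two (mutually reversed) sequences.
    """
    counts = Counter()

    def extend(path, visited):
        # Record the label sequence of the current simple path.
        counts[tuple(labels[v] for v in path)] += 1
        if len(path) - 1 == max_len:
            return
        for nxt in adj[path[-1]]:
            if nxt not in visited:  # a simple path never repeats a node
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in adj:
        extend([start], {start})
    return counts


def histogram_kernel(counts_a, counts_b):
    """Sum aggregation with binary (exact-match) substructure similarity."""
    return sum(c * counts_b[seq] for seq, c in counts_a.items())


# Two tiny labelled graphs: a triangle and a three-node path.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
triangle_labels = {0: "C", 1: "C", 2: "O"}
chain = {0: [1], 1: [0, 2], 2: [1]}
chain_labels = {0: "C", 1: "O", 2: "C"}

h_triangle = simple_path_label_counts(triangle, triangle_labels, max_len=1)
h_chain = simple_path_label_counts(chain, chain_labels, max_len=1)
print(histogram_kernel(h_triangle, h_chain))
```

Because the kernel is a plain dot product of raw counts, it sees neither how probable a path is within its own graph nor how the two path distributions differ, which are the two gaps the abstract says DASP is designed to address.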
Author details	1. Ye, Wei (ORCID 0000-0002-3784-7788; yew@tongji.edu.cn), College of Electronic and Information Engineering, Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai, China
	2. Tang, Shuhao (ORCID 0009-0009-6471-4364; tangsh2022@tongji.edu.cn), College of Electronic and Information Engineering, Tongji University, Shanghai, China
	3. Tian, Hao (ORCID 0009-0007-0271-7135; 2133036@tongji.edu.cn), College of Electronic and Information Engineering, Tongji University, Shanghai, China
	4. Chen, Qijun (ORCID 0000-0001-5644-1188; qjchen@tongji.edu.cn), College of Electronic and Information Engineering, Tongji University, Shanghai, China
CODEN	ITAICB
Discipline	Computer Science
Funding	National Natural Science Foundation of China (grant 62176184; funder ID 10.13039/501100001809); National Key Research and Development Program of China (grant 2020AAA0108100; funder ID 10.13039/501100012166); Fundamental Research Funds for the Central Universities (funder ID 10.13039/501100012226)
IsPeerReviewed	true
IsScholarly	true
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
URI	https://ieeexplore.ieee.org/document/10877856