Beyond Histogram Comparison: Distribution-Aware Simple-Path Graph Kernels

Bibliographic Details
Published in IEEE Transactions on Artificial Intelligence, Vol. 6, no. 8, pp. 2119-2132
Main Authors Ye, Wei; Tang, Shuhao; Tian, Hao; Chen, Qijun
Format Journal Article
Language English
Published IEEE 01.08.2025
ISSN 2691-4581
DOI 10.1109/TAI.2025.3539642

Abstract R-convolution graph kernels are conventional methods for graph classification. They decompose graphs into substructures and aggregate all the substructure similarity as graph similarity. However, the substructure similarity is based on graph isomorphism, which not only leads to binary similarity values but also cannot be aware of the probability distribution of substructures in each graph. Moreover, the simple sum aggregation is not aware of the probability distribution differences of substructures across graphs. These drawbacks cause inaccurate graph similarity. To resolve these problems, we propose a new method called the distribution-aware simple-path (DASP) graph kernel. The neural language models are employed to capture the probability distribution of substructures (specifically, simple paths) in each graph. A new metric called probabilistic Minkowski distance is developed to capture the probability distribution differences of simple paths across graphs. To further improve the performance, the label alphabet is expanded to enlarge the corpus of simple paths for the neural language models and DASP. Experiments demonstrate that DASP achieves the best classification accuracy on all the selected graph benchmark datasets.
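
As a rough illustration of the conventional histogram-style comparison that the abstract criticizes (not the DASP method itself, whose details are not given in this record), the sketch below enumerates labeled simple paths in two small graphs and sums binary matches between them. The function names, the `max_len` parameter, the adjacency-dict graph representation, and the use of comparable integer vertex ids are all assumptions made for this example.

```python
from collections import Counter


def simple_path_label_counts(adj, labels, max_len):
    """Count label sequences of simple paths with at most max_len edges.

    adj    : dict mapping an integer vertex id to a list of neighbor ids
             (undirected graph, hypothetical representation)
    labels : dict mapping a vertex id to its label
    """
    counts = Counter()

    def extend(path):
        # Each undirected path is reached from both ends; count it only in
        # the orientation whose first vertex id is the smaller one.
        if path[0] <= path[-1]:
            seq = tuple(labels[v] for v in path)
            # Reading a path backwards gives the same substructure, so use
            # one canonical orientation of the label sequence as the key.
            counts[min(seq, seq[::-1])] += 1
        if len(path) - 1 < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:  # simple path: no repeated vertices
                    extend(path + [nxt])

    for start in adj:
        extend([start])
    return counts


def histogram_path_kernel(g1, g2, max_len=2):
    """Sum of binary (match / no match) similarities over all path pairs."""
    c1 = simple_path_label_counts(g1[0], g1[1], max_len)
    c2 = simple_path_label_counts(g2[0], g2[1], max_len)
    # A pair of paths contributes 1 only if the label sequences are identical.
    return sum(n * c2[seq] for seq, n in c1.items())


# Path graph A-B-A and a single edge A-B.
g_path = ({0: [1], 1: [0, 2], 2: [1]}, {0: "A", 1: "B", 2: "A"})
g_edge = ({0: [1], 1: [0]}, {0: "A", 1: "B"})
print(histogram_path_kernel(g_path, g_edge))
```

Because each pair of paths contributes either 1 or 0, this baseline compares raw count histograms only; it carries no information about how likely a path is within its graph, which is the gap the abstract says DASP targets.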
Author Details
1. Ye, Wei (ORCID 0000-0002-3784-7788), yew@tongji.edu.cn, College of Electronic and Information Engineering, Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai, China
2. Tang, Shuhao (ORCID 0009-0009-6471-4364), tangsh2022@tongji.edu.cn, College of Electronic and Information Engineering, Tongji University, Shanghai, China
3. Tian, Hao (ORCID 0009-0007-0271-7135), 2133036@tongji.edu.cn, College of Electronic and Information Engineering, Tongji University, Shanghai, China
4. Chen, Qijun (ORCID 0000-0001-5644-1188), qjchen@tongji.edu.cn, College of Electronic and Information Engineering, Tongji University, Shanghai, China
CODEN ITAICB
Funding National Natural Science Foundation of China (grant 62176184; funder ID 10.13039/501100001809)
National Key Research and Development Program of China (grant 2020AAA0108100; funder ID 10.13039/501100012166)
Fundamental Research Funds for the Central Universities of China (funder ID 10.13039/501100012226)
Peer Reviewed Yes
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
Subjects Accuracy; Artificial intelligence; Computational modeling; Gaussian distribution; Gaussian distributions; graph kernels; Kernel; Measurement; neural language models; Probabilistic logic; probabilistic Minkowski distance; Probability distribution; simple path; Training; Vectors
Vectors
Title Beyond Histogram Comparison: Distribution-Aware Simple-Path Graph Kernels
URI https://ieeexplore.ieee.org/document/10877856
Volume 6
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 2691-4581
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0002512227
  issn: 2691-4581
  databaseCode: RIE
  dateStart: 20200101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
– providerCode: PRVHPJ
  databaseName: ROAD: Directory of Open Access Scholarly Resources
  customDbUrl:
  eissn: 2691-4581
  dateEnd: 99991231
  omitProxy: true
  ssIdentifier: ssj0002512227
  issn: 2691-4581
  databaseCode: M~E
  dateStart: 20200101
  isFulltext: true
  titleUrlDefault: https://road.issn.org
  providerName: ISSN International Centre
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1LS8NAEB5sT16sj4r1UfbgxUNimk324a3U1lZpEWyht7BJZkGEVEqK4MHf7m4epQqCp4SwS5bZXb6Z3ZnvA7j2YuErqwFIaWwCFBpyR8pUO1okNFEpD1DbeufpjI0XweMyXFbF6kUtDCIWyWfo2tfiLj9dJRt7VGZ2uOBchKwBDS5YWay1PVCxQO37vL6K9OTtvD8xAaAfujSkkgX-D-jZ0VIpoGTUglk9iDKD5M3d5LGbfP7iZ_z3KA_hoHIqSb9cBUewh9kxtGrBBlLt3xOYlPUqpOAGsWlZZLCVIbwj95ZDt5K_cvofao3k5dWSBzvPxk0kD5bamjzhOjNw2obFaDgfjJ1KS8FJTLyZOyqOPU19hZorpmmQMoPb3LIJBthj2NMqlWnKGGpkWlPNJDVPPxAMTZ-E01NoZqsMz4BQgcqgvAxobLVKmMJezJJQmP8w4_6pDtzUZo7eS8qMqAg1PBmZKYnslETVlHSgbQ2406603fkf3y9g33YvM_AuoZmvN3hlvII87kJj-jXsFmviG-dJtzI
linkProvider IEEE
linkToHtml http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1bS8MwFA46H_TFeZk4r3nwxYfWrbk1vo3p3NwFwQ32VpL2BEToZHQI_nqTXsYUBJ9aStqGk4TvnOSc70PopqXDQDkNQEK0DVAIE56UifFMGJNYJYKCcfXO4wnvz-jznM3LYvW8FgYA8uQz8N1tfpafLOKV2yqzKzwUImR8G-0wSikryrXWWyoOqoNAVIeRLXk37QxsCBgwnzAiOQ1-gM-GmkoOJr06mlTdKHJI3v1Vpv346xdD47_7eYD2S7cSd4p5cIi2ID1C9UqyAZcr-BgNiooVnLODuMQs3F0LEd7jB8eiWwpgeZ1PtQT8-ubog70X6yjiJ0dujYewTC2gNtCs9zjt9r1STcGLbcSZeUrrliGBAiMUN4Qm3CK3cHyCFNoc2kYlMkk4BwPcGGK4JPYa0JCDfScW5ATV0kUKpwiTEJTFeUmJdmolXEFb85iF9j_cOoCqiW4rM0cfBWlGlAcbLRnZIYnckETlkDRRwxlwo11hu7M_nl-j3f50PIpGg8nwHO25TxX5eBeoli1XcGl9hExf5TPjG1VmuU4
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Beyond+Histogram+Comparison%3A+Distribution-Aware+Simple-Path+Graph+Kernels&rft.jtitle=IEEE+transactions+on+artificial+intelligence&rft.au=Ye%2C+Wei&rft.au=Tang%2C+Shuhao&rft.au=Tian%2C+Hao&rft.au=Chen%2C+Qijun&rft.date=2025-08-01&rft.pub=IEEE&rft.eissn=2691-4581&rft.volume=6&rft.issue=8&rft.spage=2119&rft.epage=2132&rft_id=info:doi/10.1109%2FTAI.2025.3539642&rft.externalDocID=10877856
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=2691-4581&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=2691-4581&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=2691-4581&client=summon