Scalable Representation Learning for Dynamic Heterogeneous Information Networks via Metagraphs

Bibliographic Details
Published in	ACM Transactions on Information Systems, Vol. 40, no. 4, pp. 1-27
Main Authors	Fang, Yang; Zhao, Xiang; Huang, Peixin; Xiao, Weidong; de Rijke, Maarten
Format	Journal Article
Language	English
Published	2022-10-01
ISSN	1046-8188 (print); 1558-2868 (electronic)
DOI	10.1145/3485189

Abstract	Content representation is a fundamental task in information retrieval. Representation learning aims to capture the features of an information object in a low-dimensional space. Most research on representation learning for heterogeneous information networks (HINs) focuses on static HINs. In practice, however, networks are dynamic and subject to constant change. In this article, we propose a novel and scalable representation learning model, M-DHIN, to explore the evolution of a dynamic HIN. We regard a dynamic HIN as a series of snapshots with different time stamps. We first use a static embedding method to learn the initial embeddings of a dynamic HIN at the first time stamp. We describe the features of the initial HIN via metagraphs, which retain more structural and semantic information than traditional path-oriented static models. We also adopt a complex embedding scheme to better distinguish between symmetric and asymmetric metagraphs. Unlike traditional models that process an entire network at each time stamp, we build a so-called change dataset that includes only nodes involved in a triadic closure or opening process, as well as newly added or deleted nodes. We then use the same metagraph-based mechanism to train on the change dataset. As a result, M-DHIN scales to large dynamic HINs: it models the entire HIN only once, and afterwards only the changed parts need to be processed over time. Existing dynamic embedding models only represent the existing snapshots and cannot predict the future network structure. To equip M-DHIN with this ability, we introduce an LSTM-based deep autoencoder model that processes the evolution of the graph via an LSTM encoder and outputs the predicted graph. Finally, we evaluate M-DHIN on real-life datasets and demonstrate that it significantly and consistently outperforms state-of-the-art models.
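The change dataset described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`adjacency`, `change_nodes`), the representation of a snapshot as a list of undirected edges, and the exact reading of "triadic closure" and "triadic opening" used here are assumptions made for illustration, and node and edge types (the heterogeneous part of a HIN) are ignored.

```python
def adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def change_nodes(edges_prev, edges_next):
    """Return the nodes that would enter a change dataset between two snapshots.

    Assumptions (hypothetical, not taken from the paper):
      * a node exists in a snapshot only if it has at least one edge there;
      * triadic closure: an edge (u, w) appears while u and w already shared
        a neighbour v in the previous snapshot;
      * triadic opening: an edge (u, w) of a triangle u-v-w disappears while
        the edges (u, v) and (v, w) survive.
    """
    prev_adj, next_adj = adjacency(edges_prev), adjacency(edges_next)
    prev_e = {frozenset(e) for e in edges_prev}
    next_e = {frozenset(e) for e in edges_next}

    # Newly added and deleted nodes.
    changed = (set(next_adj) - set(prev_adj)) | (set(prev_adj) - set(next_adj))

    # Triadic closure: a new edge closes an open triad from the previous snapshot.
    for e in next_e - prev_e:
        if len(e) != 2:  # skip self-loops
            continue
        u, w = tuple(e)
        for v in prev_adj.get(u, set()) & prev_adj.get(w, set()):
            changed |= {u, v, w}

    # Triadic opening: a removed edge turns a triangle into an open triad.
    for e in prev_e - next_e:
        if len(e) != 2:
            continue
        u, w = tuple(e)
        for v in prev_adj.get(u, set()) & prev_adj.get(w, set()):
            if frozenset((u, v)) in next_e and frozenset((v, w)) in next_e:
                changed |= {u, v, w}

    return changed


# Toy snapshots: a-c closes the triad a-b-c, x-z opens the triangle x-y-z,
# node n is added and node r is deleted. Nodes p and q are untouched by these rules.
snapshot_t0 = [("a", "b"), ("b", "c"), ("x", "y"), ("y", "z"), ("x", "z"), ("p", "q"), ("q", "r")]
snapshot_t1 = [("a", "b"), ("b", "c"), ("a", "c"), ("x", "y"), ("y", "z"), ("p", "q"), ("p", "n")]
changed = change_nodes(snapshot_t0, snapshot_t1)
```

Under these assumptions only the nodes in `changed` would be re-embedded at the second time stamp, which is the source of the scalability claim: the cost per snapshot depends on how much of the network changed rather than on its full size.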
Affiliations	Fang, Yang; Zhao, Xiang; Huang, Peixin; Xiao, Weidong: National University of Defense Technology, Changsha, China. de Rijke, Maarten (ORCID 0000-0002-1086-0202): University of Amsterdam, Amsterdam, The Netherlands
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
URI	https://dl.acm.org/doi/pdf/10.1145/3485189