Scalable Representation Learning for Dynamic Heterogeneous Information Networks via Metagraphs

Bibliographic Details
Published in: ACM Transactions on Information Systems, Vol. 40, no. 4, pp. 1-27
Main Authors: Fang, Yang; Zhao, Xiang; Huang, Peixin; Xiao, Weidong; de Rijke, Maarten
Format: Journal Article
Language: English
Published: 01.10.2022
ISSN: 1046-8188, 1558-2868
DOI: 10.1145/3485189

More Information
Summary: Content representation is a fundamental task in information retrieval. Representation learning aims to capture the features of an information object in a low-dimensional space. Most research on representation learning for heterogeneous information networks (HINs) focuses on static HINs. In practice, however, networks are dynamic and subject to constant change. In this article, we propose a novel and scalable representation learning model, M-DHIN, to explore the evolution of a dynamic HIN. We regard a dynamic HIN as a series of snapshots with different time stamps. We first use a static embedding method to learn the initial embeddings of a dynamic HIN at the first time stamp. We describe the features of the initial HIN via metagraphs, which retain more structural and semantic information than traditional path-oriented static models. We also adopt a complex embedding scheme to better distinguish between symmetric and asymmetric metagraphs. Unlike traditional models that process an entire network at each time stamp, we build a so-called change dataset that only includes nodes involved in a triadic closure or opening process, as well as newly added or deleted nodes. We then use the above metagraph-based mechanism to train on the change dataset. As a result of this setup, M-DHIN is scalable to large dynamic HINs, since it models the entire HIN only once and afterwards processes only the changed parts over time. Existing dynamic embedding models only express the existing snapshots and cannot predict the future network structure. To equip M-DHIN with this ability, we introduce an LSTM-based deep autoencoder model that processes the evolution of the graph via an LSTM encoder and outputs the predicted graph. Finally, we evaluate the proposed model, M-DHIN, on real-life datasets and demonstrate that it significantly and consistently outperforms state-of-the-art models.
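
Illustrative note: the summary describes a change dataset built from nodes that are newly added, deleted, or involved in a triadic closure or opening between two snapshots. The sketch below shows one way such a node set could be collected. It is not the authors' implementation. It assumes undirected, untyped edges (node and edge types of a HIN are ignored), it only sees nodes that appear in at least one edge, and it approximates triadic closure and opening as triangles that appear or disappear between snapshots. The function names `adjacency`, `triangles`, and `change_nodes` are hypothetical.

```python
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles(adj):
    """Return all closed triads as frozensets of three nodes."""
    found = set()
    for u, neighbours in adj.items():
        for v, w in combinations(neighbours, 2):
            if w in adj[v]:
                found.add(frozenset((u, v, w)))
    return found


def change_nodes(edges_prev, edges_next):
    """Collect the nodes that changed between two snapshots (assumed simplification)."""
    adj_prev, adj_next = adjacency(edges_prev), adjacency(edges_next)
    nodes_prev, nodes_next = set(adj_prev), set(adj_next)

    added = nodes_next - nodes_prev
    deleted = nodes_prev - nodes_next

    tri_prev, tri_next = triangles(adj_prev), triangles(adj_next)
    closed = tri_next - tri_prev   # triangles that appeared (closure)
    opened = tri_prev - tri_next   # triangles that disappeared (opening)
    triadic = set().union(*closed, *opened)

    return {
        "added": added,
        "deleted": deleted,
        "triadic": triadic,
        "all": added | deleted | triadic,
    }
```

Only the nodes in the `"all"` set would be passed to training at the later time stamp, which is the source of the scalability claimed in the summary: the full network is embedded once, and each subsequent snapshot contributes only its changed part.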