TDT: Tensor Based Directed Truss Decomposition


Bibliographic Details
Published in: Data Engineering, pp. 1828-1840
Main Authors: Li, Guojing; Zhu, Yuanyuan; Ma, Junchao; Zhong, Ming; Qian, Tieyun; Yu, Jeffrey Xu
Format: Conference Proceeding
Language: English
Published: IEEE, 19.05.2025
ISSN: 2375-026X
DOI: 10.1109/ICDE65448.2025.00140

Summary: Truss decomposition finds the hierarchy of all k-trusses in a graph for k ≥ 2. Existing GPU-based algorithms first compute edge support by counting, in parallel, the number of triangles each edge is contained in, and then iteratively peel off the edges with the smallest support, updating the support of the affected edges in parallel. However, these algorithms perform truss decomposition on undirected graphs, which incurs large storage cost and numerous triangle-existence checks during support update. Moreover, they are built on CUDA, so they cannot naturally adapt to emerging hardware accelerators or support end-to-end downstream graph machine learning (ML) tasks. In this paper, we propose a truss decomposition framework based on tensors (TDT), which leverages the parallelism of heterogeneous hardware backends to speed up the computation and integrates seamlessly with downstream graph ML tasks. We first convert the original input graph into a directed graph and represent it with compacted tensors. We then perform truss decomposition on the tensorized directed graph using efficient tensor operators. This directed-graph storage model not only saves storage space but also naturally supports efficient support computation and update during truss decomposition. To further accelerate truss decomposition, we also partition vertex neighbors into blocks to balance the computation workload and optimize key steps of our framework such as support computation and update. Extensive experimental studies show that our Python-based TDT algorithm not only achieves a 2.3×-8.5× speedup in most cases over state-of-the-art CUDA-based algorithms, but also efficiently handles large graphs with hundreds of millions of nodes and billions of edges, where the baselines fail due to their large storage cost. Our source code is publicly available at https://github.com/LiGuojing194/TDTdecomposition.
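As context for the summary above, the classic support-then-peel scheme it builds on can be sketched in a few lines of Python. This is a minimal serial sketch, not the paper's method: dense NumPy adjacency stands in for TDT's compacted directed-graph tensors, the peeling loop is sequential rather than parallel, and the function names `edge_support` and `truss_decomposition` are illustrative, not taken from the TDT source.

```python
import numpy as np

def edge_support(edges, n):
    """Triangle count per edge via one matrix product: on the undirected
    adjacency matrix U, support(u, v) equals the number of common
    neighbors of u and v, i.e. (U @ U)[u, v].
    `edges` is an (m, 2) array of pairs (u, v) with u < v."""
    U = np.zeros((n, n), dtype=np.int64)
    U[edges[:, 0], edges[:, 1]] = 1
    U[edges[:, 1], edges[:, 0]] = 1
    return (U @ U)[edges[:, 0], edges[:, 1]]

def truss_decomposition(edges, n):
    """Serial peeling loop: repeatedly remove a minimum-support edge,
    assign it trussness k (its support + 2 at removal time), and
    decrement the support of every edge that shared a triangle with it.
    Returns a dict mapping each edge (u, v) to its trussness."""
    adj = {u: set() for u in range(n)}
    for u, v in edges.tolist():
        adj[u].add(v)
        adj[v].add(u)
    supp = dict(zip(map(tuple, edges.tolist()),
                    edge_support(edges, n).tolist()))
    trussness, k = {}, 2
    while supp:
        peel = [e for e, s in supp.items() if s <= k - 2]
        if not peel:
            k += 1          # no edge is peelable at this level; raise k
            continue
        while peel:
            e = peel.pop()
            if e not in supp:
                continue    # already peeled earlier in this round
            u, v = e
            trussness[e] = k
            del supp[e]
            adj[u].discard(v)
            adj[v].discard(u)
            # Each surviving common neighbor w loses one triangle
            # through (u, v); decrement both of its incident edges.
            for w in adj[u] & adj[v]:
                for f in (tuple(sorted((u, w))), tuple(sorted((v, w)))):
                    if f in supp:
                        supp[f] -= 1
                        if supp[f] <= k - 2:
                            peel.append(f)
    return trussness
```

For example, on the complete graph K4 every edge lies in two triangles, so every edge has support 2 and trussness 4; the GPU algorithms the summary refers to parallelize exactly the two hot spots here, the batched support computation and the per-round support updates.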