TDT: Tensor Based Directed Truss Decomposition


Bibliographic Details
Published in: Data Engineering, pp. 1828-1840
Main Authors: Li, Guojing; Zhu, Yuanyuan; Ma, Junchao; Zhong, Ming; Qian, Tieyun; Yu, Jeffrey Xu
Format: Conference Proceeding
Language: English
Published: IEEE, 19.05.2025
ISSN: 2375-026X
DOI: 10.1109/ICDE65448.2025.00140

Summary: Truss decomposition finds the hierarchy of all k-trusses in a graph for k ≥ 2. Existing GPU-based algorithms first compute edge support by counting, in parallel, the number of triangles each edge is contained in, and then iteratively peel off the edges with the smallest support, updating the support of the affected edges in parallel. However, these algorithms perform truss decomposition on undirected graphs, which incurs large storage cost and numerous triangle-existence checks during support update. Moreover, they are built on CUDA, so they cannot naturally adapt to emerging hardware accelerators or support end-to-end downstream graph machine learning (ML) tasks. In this paper, we propose a truss decomposition framework based on tensors (TDT), which leverages the parallelism of heterogeneous hardware backends to speed up the computation and integrates seamlessly with downstream graph ML tasks. We first convert the original input graph into a directed graph and represent it with compacted tensors. We then perform truss decomposition on the tensorized directed graph using efficient tensor operators. This directed-graph storage model not only saves storage space but also naturally supports efficient support computation and update during truss decomposition. To further accelerate truss decomposition, we also partition vertex neighbors into blocks to balance the computation workload and optimize key steps of our framework such as support computation and update. Extensive experimental studies show that our Python-based TDT algorithm not only achieves a 2.3×-8.5× speedup in most cases over state-of-the-art CUDA-based algorithms, but also efficiently handles large graphs with hundreds of millions of nodes and billions of edges, where the baselines fail due to their large storage cost. Our source code is publicly available at https://github.com/LiGuojing194/TDTdecomposition.
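As context for the summary above, the classic support-then-peel scheme it builds on can be sketched in a few lines of Python. This is a minimal serial sketch, not the paper's method: dense NumPy adjacency stands in for TDT's compacted directed-graph tensors, the peeling loop is sequential rather than parallel, and the function names `edge_support` and `truss_decomposition` are illustrative, not taken from the TDT source.

```python
import numpy as np

def edge_support(edges, n):
    """Triangle count per edge via one matrix product: on the undirected
    adjacency matrix U, support(u, v) equals the number of common
    neighbors of u and v, i.e. (U @ U)[u, v].
    `edges` is an (m, 2) array of pairs (u, v) with u < v."""
    U = np.zeros((n, n), dtype=np.int64)
    U[edges[:, 0], edges[:, 1]] = 1
    U[edges[:, 1], edges[:, 0]] = 1
    return (U @ U)[edges[:, 0], edges[:, 1]]

def truss_decomposition(edges, n):
    """Serial peeling loop: repeatedly remove a minimum-support edge,
    assign it trussness k (its support + 2 at removal time), and
    decrement the support of every edge that shared a triangle with it.
    Returns a dict mapping each edge (u, v) to its trussness."""
    adj = {u: set() for u in range(n)}
    for u, v in edges.tolist():
        adj[u].add(v)
        adj[v].add(u)
    supp = dict(zip(map(tuple, edges.tolist()),
                    edge_support(edges, n).tolist()))
    trussness, k = {}, 2
    while supp:
        peel = [e for e, s in supp.items() if s <= k - 2]
        if not peel:
            k += 1          # no edge is peelable at this level; raise k
            continue
        while peel:
            e = peel.pop()
            if e not in supp:
                continue    # already peeled earlier in this round
            u, v = e
            trussness[e] = k
            del supp[e]
            adj[u].discard(v)
            adj[v].discard(u)
            # Each surviving common neighbor w loses one triangle
            # through (u, v); decrement both of its incident edges.
            for w in adj[u] & adj[v]:
                for f in (tuple(sorted((u, w))), tuple(sorted((v, w)))):
                    if f in supp:
                        supp[f] -= 1
                        if supp[f] <= k - 2:
                            peel.append(f)
    return trussness
```

For example, on the complete graph K4 every edge lies in two triangles, so every edge has support 2 and trussness 4; the GPU algorithms the summary refers to parallelize exactly the two hot spots here, the batched support computation and the per-round support updates.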