A Space-Efficient Parallel Algorithm for Counting Exact Triangles in Massive Networks

Bibliographic Details
Published in 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems, pp. 527-534
Main Authors Arifuzzaman, Shaikh; Khan, Maleq; Marathe, Madhav
Format Conference Proceeding
Language English
Published IEEE 01.08.2015
DOI 10.1109/HPCC-CSS-ICESS.2015.301

Summary: Finding the number of triangles in a network (graph) is an important problem in the mining and analysis of complex networks. Massive networks emerging from numerous application areas pose a significant challenge in network analytics, since these networks consist of millions, or even billions, of nodes and edges. Such massive networks necessitate the development of efficient parallel algorithms. There exist several MapReduce based algorithms and only one MPI (Message Passing Interface) based distributed-memory parallel algorithm for counting triangles. MapReduce based algorithms generate prohibitively large intermediate data. The MPI based algorithm can work on quite large networks; however, the overlapping partitions employed by the algorithm limit its ability to deal with very massive networks. In this paper, we present a space-efficient MPI based parallel algorithm for counting the exact number of triangles in massive networks. The algorithm divides the network into non-overlapping partitions. Our results demonstrate up to a 25-fold space saving over the algorithm with overlapping partitions. This space efficiency allows the algorithm to deal with networks that are 25 times larger. We present a novel approach that reduces communication cost drastically (by up to 90%), leading to an algorithm that is both space- and runtime-efficient. Our adaptation of a parallel partitioning scheme, which computes a novel weight function, adds further to the efficiency of the algorithm. Denoting the average degree of nodes and the number of partitions by d and P, respectively, our algorithm achieves up to an O(P^2)-factor space efficiency over existing MapReduce based algorithms and up to a d-factor (approx.) space efficiency over the algorithm with overlapping partitioning.
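
To make the idea of exact triangle counting over non-overlapping partitions concrete, here is a minimal single-process sketch in Python. It is not the paper's MPI algorithm: it uses the standard degree-ordered neighbor-intersection technique, and the round-robin partitioning, the function names (`build_oriented`, `count_triangles`, `count_triangles_partitioned`), and the `remote_lookups` counter are assumptions made only for illustration. The record does not describe the paper's weight function or communication scheme, so neither is reproduced here.

```python
from collections import defaultdict


def build_oriented(edges):
    """Return, for each node, the set of neighbors ranked later by (degree, id).

    Assumes an undirected graph given as (u, v) pairs with comparable node ids.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    ranked = sorted(adj, key=lambda v: (len(adj[v]), v))
    order = {v: i for i, v in enumerate(ranked)}
    return {v: {w for w in adj[v] if order[w] > order[v]} for v in adj}


def count_triangles(edges):
    """Exact triangle count: each triangle is found once, at its lowest-ranked node."""
    higher = build_oriented(edges)
    return sum(len(higher[v] & higher[w]) for v in higher for w in higher[v])


def count_triangles_partitioned(edges, num_parts):
    """Count triangles with nodes split into disjoint (non-overlapping) parts.

    Hypothetical illustration: each part counts only triangles whose
    lowest-ranked node it owns. A lookup of a neighbor list owned by another
    part stands in for a message in a distributed-memory setting.
    """
    higher = build_oriented(edges)
    nodes = sorted(higher)
    parts = [nodes[i::num_parts] for i in range(num_parts)]  # assumed round-robin split
    owner = {v: p for p, part in enumerate(parts) for v in part}
    per_part = []
    remote_lookups = 0
    for p, part in enumerate(parts):
        local = 0
        for v in part:
            for w in higher[v]:
                if owner[w] != p:
                    remote_lookups += 1  # neighbor list lives in another part
                local += len(higher[v] & higher[w])
        per_part.append(local)
    return sum(per_part), per_part, remote_lookups


if __name__ == "__main__":
    # Complete graph on nodes 0..3 (4 triangles) plus a pendant edge (3, 4).
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
    print(count_triangles(edges))
    print(count_triangles_partitioned(edges, 2))
```

Because every triangle is attributed to exactly one node, the per-part counts add up to the global count without any node being stored in more than one part. The cost of that choice is the cross-part lookups, which is the kind of communication the summary says the paper reduces.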