A Cost-Efficient Auto-Scaling Algorithm for Large-Scale Graph Processing in Cloud Environments with Heterogeneous Resources

Bibliographic Details
Published in: IEEE Transactions on Software Engineering, Vol. 47, no. 8, pp. 1729-1741
Main Authors: Heidari, Safiollah; Buyya, Rajkumar
Format: Journal Article
Language: English
Published: New York, IEEE, 01.08.2021
IEEE Computer Society
ISSN: 0098-5589, 1939-3520
DOI: 10.1109/TSE.2019.2934849

Summary: The graph processing model is being adopted extensively in domains such as online gaming, social media, scientific computing and the Internet of Things (IoT). Since general-purpose data processing tools such as MapReduce have been shown to be inefficient for iterative graph processing, many frameworks have been developed in recent years to facilitate analytics and computation on large-scale graphs. However, whether such frameworks use a distributed or a single-machine architecture, dynamic scalability is always a major concern. It becomes even more important when scalability is tied to monetary cost, as it is in public clouds. The pay-as-you-go model used by public cloud providers lets users pay only for the resources they utilize. Nevertheless, processing large-scale graphs in such environments has been less studied, and most frameworks are implemented for commodity clusters, where users are not charged for the resources they consume. In this paper, we develop algorithms that take advantage of resource heterogeneity in cloud environments. Using these algorithms, the system can automatically adjust the number and types of virtual machines according to the computation requirements of convergent graph applications, improving performance and reducing the monetary cost of the entire operation. In addition, a smart profiling mechanism, together with a novel dynamic repartitioning approach, helps to distribute graph partitions expeditiously. The method is shown to outperform popular frameworks such as Giraph and to reduce the dollar cost by more than 50 percent compared to Giraph.