A Cost-Efficient Auto-Scaling Algorithm for Large-Scale Graph Processing in Cloud Environments with Heterogeneous Resources

Bibliographic Details
Published in	IEEE Transactions on Software Engineering, Vol. 47, no. 8, pp. 1729-1741
Main Authors	Heidari, Safiollah; Buyya, Rajkumar
Format	Journal Article
Language	English
Published	New York: IEEE, 01.08.2021; IEEE Computer Society
Subjects	Algorithms; auto-scaling; Cloud computing; Clustering algorithms; Computational modeling; cost saving; Data processing; Graphs; Heterogeneity; heterogeneous resources; Heuristic algorithms; Internet of Things; Iterative methods; large-scale graph processing; Partitions (mathematics); Scalability; Software algorithms; Virtual environments
ISSN	0098-5589 1939-3520
DOI	10.1109/TSE.2019.2934849

More Information
Summary:	The graph processing model is being adopted extensively in domains such as online gaming, social media, scientific computing and the Internet of Things (IoT). Since general-purpose data processing tools such as MapReduce have been shown to be inefficient for iterative graph processing, many frameworks have been developed in recent years to facilitate analytics and computation on large-scale graphs. However, regardless of whether such frameworks use a distributed or a single-machine architecture, dynamic scalability is always a major concern. It becomes even more important when scalability is correlated with monetary cost, as it is in public clouds. The pay-as-you-go model used by public cloud providers enables users to pay only for the resources they utilize. Nevertheless, processing large-scale graphs in such environments has been less studied, and most frameworks are implemented for commodity clusters, where users are not charged for the resources they consume. In this paper, we develop algorithms that take advantage of resource heterogeneity in cloud environments. Using these algorithms, the system can automatically adjust the number and types of virtual machines according to the computation requirements of convergent graph applications, improving performance and reducing the monetary cost of the entire operation. In addition, a smart profiling mechanism, together with a novel dynamic repartitioning approach, helps to distribute graph partitions expeditiously. The method is shown to outperform popular frameworks such as Giraph and to reduce the dollar cost by more than 50 percent compared to Giraph.
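
The idea of adjusting the number and types of virtual machines to the current computation requirement can be illustrated with a small sketch. This is not the algorithm from the paper: the VM catalog, its names, memory sizes and hourly prices are hypothetical, and the selection rule (a dynamic program that finds the cheapest VM mix whose total memory covers a given requirement) is an assumed stand-in chosen only to show how heterogeneous instance types can lower cost as a convergent application needs less capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMType:
    name: str
    memory_gb: int
    price_per_hour: float


# Hypothetical catalog: names, sizes and prices are illustrative assumptions,
# not values taken from the paper or from any cloud provider.
CATALOG = [
    VMType("small", 4, 0.05),
    VMType("medium", 8, 0.09),
    VMType("large", 16, 0.17),
]


def cheapest_fleet(required_gb, catalog=CATALOG):
    """Return (hourly_cost, {vm_name: count}) for the cheapest VM mix
    whose total memory is at least required_gb."""
    if required_gb <= 0:
        return 0.0, {}
    # best[m] holds the cheapest (cost, fleet) covering at least m GB.
    best = [(0.0, {})]
    for m in range(1, required_gb + 1):
        options = []
        for vm in catalog:
            # Add one VM of this type to the best fleet for the remainder.
            prev_cost, prev_fleet = best[max(0, m - vm.memory_gb)]
            fleet = dict(prev_fleet)
            fleet[vm.name] = fleet.get(vm.name, 0) + 1
            options.append((prev_cost + vm.price_per_hour, fleet))
        best.append(min(options, key=lambda option: option[0]))
    return best[required_gb]


# Assumed memory requirement per superstep; it shrinks as the
# computation converges and fewer vertices remain active.
for step, needed_gb in enumerate([20, 12, 4]):
    cost, fleet = cheapest_fleet(needed_gb)
    print(step, needed_gb, fleet, round(cost, 2))
```

Re-evaluating the fleet at each superstep lets the hourly cost fall along with the requirement. A real system would also have to weigh the cost of moving graph partitions between machines, which this sketch ignores.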