Migration-Aware Genetic Optimization for MapReduce Scheduling and Replica Placement in Hadoop

Bibliographic Details
Published in	Journal of Grid Computing, Vol. 16, no. 2, pp. 265-284
Main Authors	Guerrero, Carlos; Lera, Isaac; Juiz, Carlos
Format	Journal Article
Language	English
Published	Dordrecht: Springer Netherlands, 01.06.2018; Springer Nature B.V.
Subjects	Classification; Computer Science; Data centers; Genetic algorithms; Management of Computing and Information Systems; Migration; Optimization; Placement; Processor Architectures; Replication; Scheduling; Sorting algorithms; User Interfaces and Human Computer Interaction; Workload; Multi-objective optimization; Replica placement; Resource management; MapReduce scheduling; Hadoop
ISSN	1570-7873 1572-9184
DOI	10.1007/s10723-018-9432-8

Summary:	This work addresses the optimization of file locality, file availability, and replica migration cost in a Hadoop architecture. Our optimization algorithm is based on the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) and simultaneously determines file block placement, with a variable replication factor, and MapReduce job scheduling. The proposal was tested in experiments covering three data center sizes (8, 16, and 32 nodes) with the same workload and the same set of files (150 files, 3519 file blocks). In general, a placement policy with a variable replication factor yields larger improvements in all three optimization objectives. By contrast, a job scheduling policy improves these objectives only when it is combined with a variable replication factor. The results also show that migration cost is a suitable optimization objective: improvements of up to 34% were observed between experiments.