Graph-based Model and Algorithm for Minimizing Big Data Movement in a Cloud Environment

Abstract : In this paper, we discuss load balancing and data placement strategies in heterogeneous Cloud environments. Load balancing is crucial in large-scale data processing applications, especially in a distributed heterogeneous context like the Cloud. The main goal of data placement strategies is to improve overall performance by reducing data movement among the participating datacenters, taking into account the dependencies between datasets. Typically, datacenters are geographically distributed and differ in characteristics such as processing speed and storage capacity, among other technical considerations. Load balancing and efficient data placement on Cloud systems are critical problems that are difficult to address simultaneously, especially in the emerging heterogeneous clusters. In this context, we propose a threshold-based load balancing algorithm, which first balances the load between datacenters and afterwards minimizes the overhead of data exchanges. The proposed approach is divided into three phases. First, the dependencies between the datasets are identified. Second, the load threshold of each datacenter is estimated based on its processing speed and storage capacity. Third, the load balancing between the datacenters is managed through the threshold parameters. The heterogeneity of the datacenters and the dependencies between the datasets are both taken into account. Our experimental results show that our approach can efficiently reduce the frequency of data movement and maintain a good load balance between the datacenters.
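The three phases named in the abstract can be illustrated with a short sketch. The abstract does not give the dependency model, the threshold formula, or the placement rule, so everything below is an assumption made for illustration and is not the authors' algorithm: dependencies are counted as the number of tasks that use two datasets together, each threshold is a share of the total load proportional to speed times capacity, and placement is a simple greedy heuristic. All function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def dependency_weights(tasks):
    """Phase 1 (assumed model): for each dataset pair, count the tasks using both."""
    weights = defaultdict(int)
    for datasets in tasks.values():
        for a, b in combinations(sorted(datasets), 2):
            weights[(a, b)] += 1
    return weights


def load_thresholds(datacenters, total_load):
    """Phase 2 (assumed formula): split the total load in proportion to
    speed * capacity, never exceeding a datacenter's storage capacity."""
    score = {dc: speed * cap for dc, (speed, cap) in datacenters.items()}
    total = sum(score.values())
    return {dc: min(datacenters[dc][1], total_load * score[dc] / total)
            for dc in datacenters}


def place(sizes, tasks, datacenters):
    """Phase 3 (assumed greedy rule): place large datasets first, preferring the
    datacenter under its threshold that already holds the most dependent data."""
    weights = dependency_weights(tasks)
    thresholds = load_thresholds(datacenters, sum(sizes.values()))
    load = {dc: 0.0 for dc in datacenters}
    placement = {}
    for ds in sorted(sizes, key=lambda d: (-sizes[d], d)):
        # Affinity of ds to each datacenter: total weight of its dependencies
        # on datasets that are already placed there.
        affinity = {dc: 0 for dc in datacenters}
        for (a, b), w in weights.items():
            other = b if a == ds else a if b == ds else None
            if other in placement:
                affinity[placement[other]] += w
        fits = [dc for dc in datacenters if load[dc] + sizes[ds] <= thresholds[dc]]
        candidates = fits or list(datacenters)  # fall back if no threshold fits
        best = max(candidates,
                   key=lambda dc: (affinity[dc], thresholds[dc] - load[dc]))
        placement[ds] = best
        load[best] += sizes[ds]
    return placement


def cross_datacenter_tasks(placement, tasks):
    """Count tasks whose datasets span more than one datacenter."""
    return sum(1 for ds in tasks.values() if len({placement[d] for d in ds}) > 1)


# Hypothetical input: dataset sizes, tasks and the datasets they read,
# and datacenters given as (processing speed, storage capacity).
sizes = {"d1": 40, "d2": 20, "d3": 25, "d4": 5}
tasks = {"t1": {"d1", "d2"}, "t2": {"d1", "d2"}, "t3": {"d3", "d4"}}
datacenters = {"fast": (2.0, 100), "slow": (1.0, 100)}

placement = place(sizes, tasks, datacenters)
print(placement, cross_datacenter_tasks(placement, tasks))
```

In this toy input the thresholds come out as 60 for the faster datacenter and 30 for the slower one, and the two groups of dependent datasets end up co-located, so no task has to pull data across datacenters. A different threshold formula or placement rule would be a drop-in replacement for the corresponding function.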
Document type :
Journal articles

https://hal-mines-paristech.archives-ouvertes.fr/hal-01711063
Contributor : Claire Medrala
Submitted on : Friday, February 16, 2018 - 4:47:21 PM
Last modification on : Thursday, September 24, 2020 - 4:36:04 PM

Identifiers

  • HAL Id : hal-01711063, version 1

Citation

Yassir Samadi, Mostapha Zbakh, Claude Tadonki. Graph-based Model and Algorithm for Minimizing Big Data Movement in a Cloud Environment. International Journal of High Performance Computing and Networking, Inderscience, In press. ⟨hal-01711063⟩
