Graph-based Model and Algorithm for Minimizing Big Data Movement in a Cloud Environment

Abstract: In this paper, we discuss load balancing and data placement strategies in heterogeneous Cloud environments. Load balancing is crucial in large-scale data processing applications, especially in a distributed heterogeneous context like the Cloud. The main goal of data placement strategies is to improve overall performance by reducing data movement among the participating datacenters while taking the dependencies between datasets into account. Typically, datacenters are geographically distributed based on their characteristics, such as processing speed and storage capacity, among other technical considerations. Load balancing and efficient data placement on Cloud systems are critical problems that are difficult to address simultaneously, especially in emerging heterogeneous clusters. In this context, we propose a threshold-based load balancing algorithm, which first balances the load between datacenters and then minimizes the overhead of data exchanges. The proposed approach is divided into three phases. First, the dependencies between the datasets are identified. Second, the load threshold of each datacenter is estimated based on its processing speed and storage capacity. Third, the load balancing between the datacenters is managed through the threshold parameters. Both the heterogeneity of the datacenters and the dependencies between the datasets are taken into account. Our experimental results show that our approach can efficiently reduce the frequency of data movement while keeping the load well balanced between the datacenters.
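
The three phases named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no formulas, so the grouping by connected components, the weighted threshold formula, the `speed_weight` parameter, the greedy placement rule, and all function and field names below are assumptions made for illustration only.

```python
from collections import defaultdict


def dependency_groups(datasets, dependencies):
    """Phase 1 (assumed form): group datasets linked by dependencies,
    i.e. the connected components of the dependency graph."""
    adj = defaultdict(set)
    for a, b in dependencies:
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for d in datasets:
        if d in seen:
            continue
        seen.add(d)
        stack, group = [d], []
        while stack:
            cur = stack.pop()
            group.append(cur)
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


def load_thresholds(datacenters, total_load, speed_weight=0.5):
    """Phase 2 (assumed formula): split the total load in proportion to a
    weighted mix of relative processing speed and relative storage capacity."""
    total_speed = sum(dc["speed"] for dc in datacenters.values())
    total_cap = sum(dc["capacity"] for dc in datacenters.values())
    return {
        name: total_load * (speed_weight * dc["speed"] / total_speed
                            + (1 - speed_weight) * dc["capacity"] / total_cap)
        for name, dc in datacenters.items()
    }


def place(groups, sizes, thresholds):
    """Phase 3 (assumed heuristic): place the largest groups first, each on
    the datacenter with the most remaining headroom under its threshold.
    Keeping a group together avoids moving data between its members."""
    load = {name: 0.0 for name in thresholds}
    placement = {}
    for group in sorted(groups, key=lambda g: -sum(sizes[d] for d in g)):
        target = max(thresholds, key=lambda n: thresholds[n] - load[n])
        for d in group:
            placement[d] = target
        load[target] += sum(sizes[d] for d in group)
    return placement, load


# Hypothetical input: dataset sizes, one dependency, two unequal datacenters.
sizes = {"a": 4, "b": 2, "c": 3, "d": 1}
dependencies = [("a", "b")]
datacenters = {
    "dc1": {"speed": 2.0, "capacity": 100.0},
    "dc2": {"speed": 1.0, "capacity": 50.0},
}

groups = dependency_groups(list(sizes), dependencies)
thresholds = load_thresholds(datacenters, total_load=sum(sizes.values()))
placement, load = place(groups, sizes, thresholds)
```

In this sketch the dependent datasets `a` and `b` always land on the same datacenter, and the faster, larger datacenter receives a proportionally higher threshold. A greedy rule like this does not guarantee that every load stays under its threshold; it only steers placement toward it.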
Document type:
Journal article

https://hal-mines-paristech.archives-ouvertes.fr/hal-01711063
Contributor: Claire Medrala
Submitted on: Friday, 16 February 2018 - 16:47:21
Last modified on: Monday, 12 November 2018 - 11:04:59

Identifiers

  • HAL Id: hal-01711063, version 1

Citation

Yassir Samadi, Mostapha Zbakh, Claude Tadonki. Graph-based Model and Algorithm for Minimizing Big Data Movement in a Cloud Environment. International Journal of High Performance Computing and Networking, Inderscience, In press. ⟨hal-01711063⟩
