Conference paper

Static Compilation Analysis for Host-Accelerator Communication Optimization

Abstract: We present an automatic, static program transformation that schedules and generates efficient memory transfers between a computer host and its hardware accelerator, addressing a well-known performance bottleneck. Our automatic approach uses two simple heuristics: perform transfers to the accelerator as early as possible, and delay transfers back from the accelerator as long as possible. We implemented this transformation as a middle-end compilation pass in the PIPS/Par4All compiler. In the generated code, redundant communications due to data reuse between kernel executions are avoided. Instructions that initiate transfers are scheduled effectively at compile time. We present experimental results obtained with the Polybench 2.0 suite, some Rodinia benchmarks, and a real numerical simulation. We obtain an average speedup of 4 to 5 when compared to a naïve parallelization using a modern GPU with Par4All, HMPP, and PGI, and 3.5 when compared to an OpenMP version using a 12-core multiprocessor.
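The two heuristics in the abstract can be illustrated with a small toy model. The sketch below is not the PIPS/Par4All pass: it only handles straight-line programs, treats every array as a whole, and uses a made-up statement encoding (`host_write`, `host_read`, `kernel`, `h2d`, `d2h`) together with hypothetical function names chosen for this example. It places each host-to-device copy right after the host write that produces the data, postpones each device-to-host copy until just before the host reads the data, and compares the result with a naïve schedule that copies around every kernel call.

```python
# Toy model only: the statement encoding and all function names are assumptions
# made for this illustration, not the PIPS/Par4All implementation.

def kernel_reads_before_next_host_write(program, start, name):
    """True if a kernel reads `name` before the host (or a kernel) overwrites it."""
    for stmt in program[start:]:
        if stmt[0] == "host_write" and stmt[1] == name:
            return False
        if stmt[0] == "kernel":
            if name in stmt[1]:      # kernel reads the array
                return True
            if name in stmt[2]:      # kernel overwrites it without reading it
                return False
    return False


def schedule_transfers(program):
    """Insert h2d copies as early as possible and d2h copies as late as possible."""
    out = []
    host_valid, dev_valid = set(), set()   # arrays with an up-to-date copy on each side
    for i, stmt in enumerate(program):
        kind = stmt[0]
        if kind == "host_write":
            name = stmt[1]
            out.append(stmt)
            host_valid.add(name)
            dev_valid.discard(name)
            # Early upload: copy right after the producing host write,
            # but only if a later kernel will actually read the data.
            if kernel_reads_before_next_host_write(program, i + 1, name):
                out.append(("h2d", name))
                dev_valid.add(name)
        elif kind == "host_read":
            name = stmt[1]
            # Late download: copy back only when the host really needs the data.
            if name not in host_valid and name in dev_valid:
                out.append(("d2h", name))
                host_valid.add(name)
            out.append(stmt)
        elif kind == "kernel":
            _, reads, writes = stmt
            for name in reads:
                if name not in dev_valid:  # fallback for data never written in `program`
                    out.append(("h2d", name))
                    dev_valid.add(name)
            out.append(stmt)
            for name in writes:            # results stay on the device for reuse
                dev_valid.add(name)
                host_valid.discard(name)
    return out


def naive_transfers(program):
    """Copy every input before and every output after each kernel call."""
    out = []
    for stmt in program:
        if stmt[0] == "kernel":
            _, reads, writes = stmt
            out.extend(("h2d", n) for n in reads)
            out.append(stmt)
            out.extend(("d2h", n) for n in writes)
        else:
            out.append(stmt)
    return out


def count_transfers(schedule):
    return sum(1 for s in schedule if s[0] in ("h2d", "d2h"))


# Three kernel launches that reuse `a` and update `b` in place.
program = [
    ("host_write", "a"),
    ("host_write", "b"),
    ("kernel", ("a", "b"), ("b",)),
    ("kernel", ("a", "b"), ("b",)),
    ("kernel", ("a", "b"), ("b",)),
    ("host_read", "b"),
]

if __name__ == "__main__":
    print("naive:", count_transfers(naive_transfers(program)))
    print("scheduled:", count_transfers(schedule_transfers(program)))
```

On this example program the naïve schedule issues nine copies while the scheduled version issues three, because `a` and `b` remain on the device between kernel launches. A real compiler pass would also have to handle loops, branches, and partial array regions, which this sketch ignores.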
Document type:
Conference paper

https://hal-mines-paristech.archives-ouvertes.fr/hal-00743496
Contributor: Claire Medrala
Submitted on: Friday, October 19, 2012 - 11:52:01
Last modified on: Thursday, September 24, 2020 - 16:36:01
Long-term archiving on: Sunday, January 20, 2013 - 03:38:38

File

A-476.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00743496, version 1

Citation

Mehdi Amini, Fabien Coelho, François Irigoin, Ronan Keryell. Static Compilation Analysis for Host-Accelerator Communication Optimization. LCPC'2011 : The 24th International Workshop on Languages and Compilers for Parallel Computing, Sep 2011, Fort Collins, Colorado, United States. pp. 237-251. ⟨hal-00743496⟩

Metrics

Record views

344

File downloads

757