C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell, A Linear Algebra Framework for Static High Performance Fortran Code Distribution, Scientific Programming, pp.3-27, 1997.
DOI : 10.1155/1997/195689

D. Aubert, M. Amini, and R. David, A Particle-Mesh Integrator for Galactic Dynamics Powered by GPGPUs, International Conference on Computational Science : Part I, ICCS '09, 2009.
DOI : 10.1007/978-3-642-01970-8_88

C. Augonnet, S. Thibault, R. Namyst, and P. Wacrenier, StarPU : A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Concurrency and Computation : Practice and Experience, Special Issue : Euro-Par, pp.187-198, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00384363

F. Bodin and S. Bihan, Heterogeneous Multicore Parallel Programming for Graphics Processing Units, Scientific Programming, vol.17, issue.4, pp.325-336, 2009.
DOI : 10.1155/2009/784893

Y. Chen, X. Cui, and H. Mei, Large-scale FFT on GPU clusters, Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10, pp.315-324, 2010.
DOI : 10.1145/1810085.1810128

B. Creusillet and F. Irigoin, Interprocedural Array Region Analyses, International Journal of Parallel Programming, vol.24, issue.6, pp.513-546, 1996.
DOI : 10.1007/BF03356758
URL : https://hal.archives-ouvertes.fr/hal-00752611

W. Fang, B. He, and Q. Luo, Database compression on graphics processors, Proceedings of the VLDB Endowment, pp.670-680, 2010.
DOI : 10.14778/1920841.1920927
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.172.8125

P. Feautrier, Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, 1988.
DOI : 10.1051/ro/1988220302431

H. M. Gerndt and H. P. Zima, Optimizing communication in SUPERB, Proceedings of the joint international conference on Vector and parallel processing, 1990.

C. Gong, R. Gupta, and R. Melhem, Compilation Techniques for Optimizing Communication on Distributed-Memory Systems, 1993 International Conference on Parallel Processing, ICPP'93 Vol2, 1993.
DOI : 10.1109/ICPP.1993.58
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.759

T. D. Han and T. S. Abdelrahman, hiCUDA : a high-level directive-based language for GPU programming, Proceedings of GPGPU-2, 2009.

HPC Project, Par4All initiative for automatic parallelization

F. Irigoin, P. Jouvelot, and R. Triolet, Semantical interprocedural parallelization : an overview of the PIPS project, ICS '91, pp.244-251, 1991.
URL : https://hal.archives-ouvertes.fr/hal-00984684

S. Lee and R. Eigenmann, OpenMPC: Extended OpenMP Programming and Tuning for GPUs, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-11, 2010.
DOI : 10.1109/SC.2010.36

S. Lee, S.-J. Min, and R. Eigenmann, OpenMP to GPGPU : a compiler framework for automatic translation and optimization, PPoPP '09, pp.101-110, 2009.

S. Ohshima, S. Hirasawa, and H. Honda, OMPCUDA : OpenMP Execution Framework for CUDA Based on Omni OpenMP Compiler, Beyond Loop Level Parallelism in OpenMP : Accelerators, Tasking and More, pp.161-173, 2010.
DOI : 10.1007/978-3-642-13217-9_13

C. Pereira da Silva, L. F. Cupertino, D. Chevitarese, M. A. C. Pacheco et al., Exploring data streaming to improve 3D FFT implementation on multiple GPUs, International Symposium on Computer Architecture and High Performance Computing Workshops, 2010.

M. Wolfe, Implementing the PGI Accelerator model, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, pp.43-50, 2010.
DOI : 10.1145/1735688.1735697