C. Ancourt and F. Irigoin, Scanning polyhedra with DO loops, pp.39-50, 1991.
URL : https://hal.archives-ouvertes.fr/hal-00752774

A. I. Barvinok, A polynomial time algorithm for counting integral points in polyhedra when the dimension is fixed, pp.566-572, 1993.

C. Bastoul, Improving Data Locality in Static Control Programs, 2004.

P. Bonnot, F. Lemonnier, G. Edelin, G. Gaillat, O. Ruch et al., Definition and SIMD implementation of a multi-processing architecture approach on FPGA, Design, Automation and Test in Europe, pp.610-615, 2008.

P. Clauss, Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: Applications to analyze and transform scientific programs, 1996.
URL : https://hal.archives-ouvertes.fr/hal-01100306

S. Coleman and K. S. McKinley, Tile size selection using cache organization and data layout, pp.279-290, 1995.
DOI : 10.1145/223428.207162
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.128.9167

B. Creusillet and F. Irigoin, Interprocedural Array Region Analyses, International Journal of Parallel Programming, vol.24, issue.6, pp.513-546, 1996.
DOI : 10.1007/BF03356758
URL : https://hal.archives-ouvertes.fr/hal-00752611

G. Genest, R. Chamberlain, and R. J. Bruce, Programming an FPGA-based Super Computer Using a C-to-VHDL Compiler: DIME-C, Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007), pp.280-286, 2007.
DOI : 10.1109/AHS.2007.89

Z. Guo, W. Najjar, and B. Buyukkurt, Efficient hardware code generation for FPGAs, ACM Transactions on Architecture and Code Optimization, vol.5, issue.1, pp.1-26, 2008.
DOI : 10.1145/1369396.1369402

T. D. Han and T. S. Abdelrahman, hiCUDA: a high-level directive-based language for GPU programming, Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2, pp.52-61, 2009.
DOI : 10.1145/1513895.1513902

A. Hartono, M. M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy et al., Parametric multi-level tiling of imperfectly nested loops, Proceedings of the 23rd International Conference on Supercomputing, ICS '09, pp.147-157, 2009.
DOI : 10.1145/1542275.1542301
URL : https://hal.archives-ouvertes.fr/hal-00645328

F. Irigoin, P. Jouvelot, and R. Triolet, Semantical interprocedural parallelization: An overview of the PIPS project, 1991 International Conference on Supercomputing, 1991.
URL : https://hal.archives-ouvertes.fr/hal-00984684

M. S. Lam, E. E. Rothberg, and M. E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp.63-74, 1991.

S. Lee, S. J. Min, and R. Eigenmann, OpenMP to GPGPU: a compiler framework for automatic translation and optimization, PPoPP '09: Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.101-110, 2009.

A. Leung, N. Vasilache, B. Meister, M. Baskaran, D. Wohlford et al., A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, pp.51-61, 2010.
DOI : 10.1145/1735688.1735698
URL : https://hal.archives-ouvertes.fr/inria-00551084

C. Liao, D. J. Quinlan, R. Vuduc, and T. Panas, Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization, Proc. Int'l. Wkshp. Languages and Compilers for Parallel Computing, 2009.
DOI : 10.1007/978-3-642-13374-9_21

M. Wolfe, Implementing the PGI Accelerator model, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, pp.43-50, 2010.
DOI : 10.1145/1735688.1735697