Scanning polyhedra with do loops, pp.39-50, 1991. ,
URL : https://hal.archives-ouvertes.fr/hal-00752774
A polynomial time algorithm for counting integral points in polyhedra when the dimension is fixed, pp.566-572, 1993. ,
Improving Data Locality in Static Control Programs, 2004. ,
Definition and SIMD implementation of a multi-processing architecture approach on FPGA, Design Automation and Test in Europe, pp.610-615, 2008. ,
Counting solutions to linear and nonlinear constraints through ehrhart polynomials: Applications to analyze and transform scientific programs, 1996. ,
URL : https://hal.archives-ouvertes.fr/hal-01100306
Tile size selection using cache organization and data layout, pp.279-290, 1995. ,
DOI : 10.1145/223428.207162
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.128.9167
Interprocedural Array Region Analyses, International Journal of Parallel Programming, vol.2, issue.3, pp.513-546, 1996. ,
DOI : 10.1007/BF03356758
URL : https://hal.archives-ouvertes.fr/hal-00752611
Programming an FPGA-based Super Computer Using a C-to-VHDL Compiler: DIME-C, Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007), pp.280-286, 2007. ,
DOI : 10.1109/AHS.2007.89
Efficient hardware code generation for FPGAs, ACM Transactions on Architecture and Code Optimization, vol.5, issue.1, pp.1-26, 2008. ,
DOI : 10.1145/1369396.1369402
CUDA, Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2, pp.52-61, 2009. ,
DOI : 10.1145/1513895.1513902
Parametric multi-level tiling of imperfectly nested loops, Proceedings of the 23rd international conference on Conference on Supercomputing, ICS '09, pp.147-157, 2009. ,
DOI : 10.1145/1542275.1542301
URL : https://hal.archives-ouvertes.fr/hal-00645328
Semantical interprocedural parallelization: An overview of the PIPS project, 1991 International Conference on Supercomputing, 1991. ,
URL : https://hal.archives-ouvertes.fr/hal-00984684
The cache performance and optimizations of blocked algorithms, Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp.63-74, 1991. ,
Openmp to gpgpu: a compiler framework for automatic translation and optimization, PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp.101-110, 2009. ,
A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, pp.51-61 ,
DOI : 10.1145/1735688.1735698
URL : https://hal.archives-ouvertes.fr/inria-00551084
Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization, Proc. Int'l. Wkshp. Languages and Compilers for Parallel Computing, 2009. ,
DOI : 10.1007/978-3-642-13374-9_21
Implementing the PGI Accelerator model, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, pp.43-50, 2010. ,
DOI : 10.1145/1735688.1735697