Scalable NUMA-Aware Wilson-Dirac on Supercomputers - Mines Paris
Conference paper, Year: 2017

Scalable NUMA-Aware Wilson-Dirac on Supercomputers

Abstract

We revisit the Wilson-Dirac operator, also referred to as Dslash, on NUMA many-core vector machines and seek an efficient supercomputing implementation. Quantum ChromoDynamics (QCD) is the theory of the strong nuclear force, and its discrete formalism is Lattice Quantum ChromoDynamics (LQCD). Wilson-Dirac is the major computing kernel in LQCD, where special attention is paid to large-scale simulations. The corresponding computing demand is tremendous at various levels, from storage to floating-point operations, hence the crucial need for powerful supercomputers. Designing efficient LQCD codes on modern (mostly hybrid) supercomputers requires efficiently exploiting all available levels of parallelism, including accelerators. Since Wilson-Dirac is a coarse-grain stencil computation performed on a huge volume of data, any performance and scalability investigation must carefully address memory accesses and interprocessor communication overheads. To lower the latter, explicit shared-memory implementations should be considered at the level of a compute node, since this leads to a less complex data communication graph and thus (at least intuitively) reduces the overall communication latency. We focus on this aspect and propose a novel, efficient NUMA-aware scheduling, together with a combination of the major HPC strategies for large-scale LQCD. We reach nearly optimal performance on a single core and a significant scalability improvement on several NUMA nodes. Then, using a classical domain decomposition approach, we extend our scheduling to a large cluster of many-core nodes, thus illustrating the global efficiency of our hybrid implementation.
Main file
A-661.pdf (652.39 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01529268, version 1 (30-05-2017)

Identifiers

  • HAL Id: hal-01529268, version 1

Cite

Claude Tadonki. Scalable NUMA-Aware Wilson-Dirac on Supercomputers. The 2017 International Conference on High Performance Computing & Simulation (HPCS 2017), Jul 2017, Genoa, Italy. pp.315-324. ⟨hal-01529268⟩
