SRC11. Electron Dynamics Simulation with Time-Dependent Density Functional Theory on Large Scale Many-Core Systems
Student: Yuta Hirokawa (University of Tsukuba)
Supervisor: Taisuke Boku (University of Tsukuba)
Abstract: Many-core processors such as the Intel Xeon Phi (Knights Corner, KNC) and GPUs provide new solutions for HPC systems. However, it is difficult to achieve high computational efficiency on them because their characteristics differ from those of traditional processors such as Intel Xeon CPUs. In this study, I port an electron dynamics simulation, a real scientific code, to three types of many-core processors. The stencil computation kernel, which dominates the total computation time, is optimized at the single-thread level using explicit vectorization with 256/512-bit SIMD instructions. As a result, the stencil computation achieves up to 591.2 GFLOPS with two KNCs and 215.1 GFLOPS with a SPARC64 XIfx, while a single KNL (Knights Landing) chip achieves 707.2 GFLOPS. For the entire code, cooperative computation with KNC and Xeon achieves 1.97 times the performance of CPU-only execution in the strong-scaling case.
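For readers unfamiliar with stencil kernels, the following is a minimal sketch of the general technique, not the kernel used in this work. It assumes a real-valued field on a periodic 3D grid and a simple 7-point Laplacian; the function name `laplacian7`, the helper `wrap`, and the memory layout (z as the fastest-varying index) are all hypothetical. Instead of the explicit SIMD instructions mentioned in the abstract, it only arranges the inner loop with unit stride and no index wrapping so that a compiler may vectorize it.

```c
#include <stddef.h>

/* Hypothetical helper: wrap index i into [0, n) for periodic boundaries. */
static size_t wrap(long i, long n)
{
    return (size_t)((i % n + n) % n);
}

/* Illustrative 7-point Laplacian on a periodic nx*ny*nz grid.
 * in and out are flat arrays with z as the fastest-varying index;
 * h is the grid spacing. This is an assumed layout, not the paper's. */
void laplacian7(const double *in, double *out,
                long nx, long ny, long nz, double h)
{
    const double inv_h2 = 1.0 / (h * h);
    const size_t sy = (size_t)nz;               /* stride between y rows   */
    const size_t sx = (size_t)ny * (size_t)nz;  /* stride between x planes */

    for (long x = 0; x < nx; ++x) {
        const size_t xm = wrap(x - 1, nx), xp = wrap(x + 1, nx);
        for (long y = 0; y < ny; ++y) {
            const size_t ym = wrap(y - 1, ny), yp = wrap(y + 1, ny);

            /* Row pointers for the center row and its x/y neighbors. */
            const double *c   = in + (size_t)x * sx + (size_t)y * sy;
            const double *cxm = in + xm * sx + (size_t)y * sy;
            const double *cxp = in + xp * sx + (size_t)y * sy;
            const double *cym = in + (size_t)x * sx + ym * sy;
            const double *cyp = in + (size_t)x * sx + yp * sy;
            double *o = out + (size_t)x * sx + (size_t)y * sy;

            /* Interior z points: unit stride and no wrapping, so the
             * compiler is free to vectorize this loop. */
            for (long z = 1; z < nz - 1; ++z) {
                o[z] = (cxm[z] + cxp[z] + cym[z] + cyp[z]
                        + c[z - 1] + c[z + 1] - 6.0 * c[z]) * inv_h2;
            }

            /* The two z boundary points need periodic wrapping. */
            const long zb[2] = { 0, nz - 1 };
            for (int k = 0; k < 2; ++k) {
                const long z = zb[k];
                o[z] = (cxm[z] + cxp[z] + cym[z] + cyp[z]
                        + c[wrap(z - 1, nz)] + c[wrap(z + 1, nz)]
                        - 6.0 * c[z]) * inv_h2;
            }
        }
    }
}
```

Splitting the boundary points out of the inner loop keeps the hot loop free of modulo operations, which is one common way to make a stencil amenable to SIMD execution; a production kernel would typically use a wider stencil and hand-written vector code.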
Two-page extended abstract: pdf