Programming Intel's 2nd Generation Xeon Phi (Knights Landing)
Event Type
Tutorial
Accelerators
Advanced
Intermediate
Programming Systems
Location
255-B
Description
Intel's next-generation Xeon Phi, Knights Landing (KNL), brings many changes from the first generation, Knights Corner (KNC). The new processor supports self-hosted nodes, connects cores via a mesh topology rather than a ring, and uses a new memory technology, MCDRAM. It is based on Intel’s x86 technology, with wide vector units and multiple hardware threads per core. KNL introduces new challenges for the programmer because of the multiple configuration options for MCDRAM.
This tutorial is designed for experienced programmers familiar with MPI and OpenMP. We’ll review the KNL architecture and discuss the differences between KNC and KNL. We'll then cover the impact of the different MCDRAM memory configurations (flat, cache, hybrid) and the different cluster configuration modes (all-to-all, quadrant, sub-NUMA quadrant). We will also provide recommendations for MPI task layout when using KNL with the Intel Omni-Path fabric.
As in past tutorials, we will focus on using compiler reports and directives to improve vectorization and to implement efficient memory access. We will also showcase new Intel VTune Amplifier XE capabilities that allow for in-depth memory access analysis and hybrid code profiling. Hands-on exercises will be executed on the KNL-upgraded Stampede system at the Texas Advanced Computing Center (TACC).
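As a rough illustration of the directive-based vectorization mentioned above, the sketch below marks a simple loop with the OpenMP `simd` directive. It is not taken from the tutorial materials: the function name `saxpy` and its parameters are hypothetical, chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical example kernel: y[i] = a * x[i] + y[i].
 * The restrict qualifiers promise the compiler that x and y do not
 * overlap, which removes one common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    /* Ask the compiler to vectorize this loop. The directive has no
     * effect unless OpenMP SIMD support is enabled at compile time. */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The directive typically takes effect only when OpenMP SIMD support is switched on, for example with `-qopenmp-simd` for the Intel compilers or `-fopenmp-simd` for GCC. Whether the loop was actually vectorized is best confirmed from the compiler's optimization report rather than assumed.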








