Advanced OpenMP: Performance and 4.5 Features
Presenters
Event Type
Tutorial
Accelerators
Advanced
Intermediate
Programming Systems
Location
255-C
Description
With the increasing prevalence of multicore processors, shared-memory programming models are essential. OpenMP is a popular, portable, widely supported, and easy-to-use shared-memory model. Developers usually find OpenMP easy to learn. However, they are often disappointed with the performance and scalability of the resulting code. This disappointment stems not from shortcomings of OpenMP, but rather from the lack of depth with which it is employed. Our “Advanced OpenMP Programming” tutorial addresses this critical need by exploring the implications of possible OpenMP parallelization strategies, in terms of both correctness and performance.
While we quickly review the basics of OpenMP programming, we assume attendees understand basic parallelization concepts. We focus on performance aspects, such as data and thread locality on NUMA architectures, false sharing, and exploitation of vector units. We discuss language features in depth, with emphasis on advanced features such as tasking and cancellation. We close by presenting the directives for attached compute accelerators. Throughout all topics, we present the recent additions of OpenMP 4.5 and extensions that have subsequently been adopted by the OpenMP Language Committee.