02. Scalable Communication Architectures for GPU-Centric Systems
Authors: Benjamin Klenk (University of Heidelberg), Holger Fröning (University of Heidelberg)
Abstract: Heterogeneity in computing has enabled higher performance and increased energy efficiency in recent years. Accelerators, such as GPUs, have been deployed to offload compute-intensive tasks. However, the CPU has remained the main processor, delegating work to the GPU and handling communication with other nodes. This model is changing: the latest GPUs take on more responsibilities through hardware-assisted unified memory and integrated NVLink, elevating the GPU to a peer device.
Allowing GPUs to autonomously source and sink network traffic seems promising, as costly interactions with CPUs can be avoided. Communication can be offloaded to dedicated networking hardware or performed on the GPU on top of shared memory models like Nvidia’s NVLink. This poster shows that for GPU-to-GPU traffic, offloading is always superior to CPU-controlled communication. We also present early results and insights on communication processed on the GPU without additional networking hardware.
Two-page extended abstract: pdf