Extended Task Queuing: Active Messages for Heterogeneous Systems
Session: Task-Oriented Runtimes
Session Chair: Zoran Budimlic
Event Type: Paper
Accelerators
Clouds and Distributed Computing
Heterogeneous Systems
Intermediate
Introductory
OS and Runtime Systems
System Software
Location: 355-E
Description: Accelerators have emerged as an important component of modern cloud, datacenter, and HPC computing environments. However, launching tasks on remote accelerators across a network remains unwieldy, forcing programmers to send data in large chunks to amortize the transfer and launch overhead. By combining advances in intra-node accelerator unification with one-sided Remote Direct Memory Access (RDMA) communication primitives, it is possible to efficiently implement lightweight tasking across distributed-memory systems.
This paper introduces Extended Task Queuing (XTQ), an RDMA-based active messaging mechanism for accelerators in distributed-memory systems. XTQ's direct NIC-to-accelerator communication decreases inter-node GPU task launch latency by 10-15% for small-to-medium sized messages and ameliorates CPU message servicing overheads. These benefits are shown in the context of MPI accumulate, reduce, and allreduce operations with up to 64 nodes. Finally, we illustrate how XTQ can improve the performance of popular deep learning workloads implemented in the Computational Network Toolkit (CNTK).
Paper provided by the IEEE Computer Society
Paper also available from the ACM Digital Library
Authors
Michael LeBeane (presenting)








