SC16 Salt Lake City, UT

DS1. Characterizing and Improving Power and Performance of HPC Networks

Student: Taylor L. Groves (University of New Mexico)
Advisor: Dorian C. Arnold (University of New Mexico)
Abstract: Networks are the backbone of modern HPC systems. They serve as a critical piece of infrastructure, tying together applications, analytics, storage, and visualization. Despite this importance, we have not fully explored how evolving communication paradigms and network design will impact scientific workloads. As networks expand in the race towards exascale, a principled approach should be taken to reexamine this relationship, so that the community better understands (1) characteristics and trends in HPC communication, (2) how best to design HPC networks, and (3) future opportunities to save power or enhance performance. Our thesis is that by developing new models, benchmarks, and monitoring techniques, we can better understand how to improve the performance and power efficiency of HPC systems, with a focus on networks. This dissertation highlights opportunities for improving network performance and power efficiency, while uncovering pitfalls (and mitigation strategies) brought about by shifting trends in HPC communication.

