92. An I/O Load Balancing Framework for Large-Scale Applications (IO 2.0)
Authors: Sarah M. Neuwirth (University of Heidelberg), Feiyi Wang (Oak Ridge National Laboratory), Sarp Oral (Oak Ridge National Laboratory), Ulrich Bruening (University of Heidelberg)
Abstract: Designed for capacity and capability, HPC storage systems are inherently complex and shared among multiple concurrent jobs competing for resources. The lack of centralized coordination and control often renders the end-to-end I/O paths vulnerable to load imbalance and contention. With the emergence of data-intensive HPC applications, storage systems face additional pressure on performance and scalability. IO is a topology-aware load balancing library for mitigating resource contention. This work introduces IO 2.0, a dynamic, shared load balancing framework based on IO. It transparently intercepts file creation calls at runtime to balance the workload over all available storage targets. Using IO 2.0 requires no source code modification and is independent of any I/O middleware. We demonstrate the effectiveness of our framework on the Titan system with a synthetic benchmark in a noisy production environment.
Two-page extended abstract: pdf
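Illustrative sketch: the abstract describes balancing newly created files over all available storage targets. The Python snippet below is a minimal illustration of that general idea using a simple least-loaded policy. It is not the framework's actual algorithm, which is described as topology-aware and is not specified here. The class name TargetBalancer, the method place, and the target names are hypothetical.

```python
import heapq


class TargetBalancer:
    """Hypothetical least-loaded placement over a set of storage targets.

    Assumption: load is approximated by the number of files assigned so far.
    A real framework would also account for topology and observed contention.
    """

    def __init__(self, target_ids):
        # Min-heap of (assigned_file_count, target_id); all targets start idle.
        self._heap = [(0, target) for target in sorted(target_ids)]
        heapq.heapify(self._heap)

    def place(self):
        """Pick the target for the next created file and record the assignment."""
        load, target = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (load + 1, target))
        return target


# Hypothetical target names; six file creations spread over three targets.
balancer = TargetBalancer(["target0", "target1", "target2"])
placements = [balancer.place() for _ in range(6)]
```

In an interception-based design such as the one the abstract outlines, a call like place() would run whenever the application creates a file, so the application code itself stays unchanged.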