SRC24. Discovering Energy Resource Usage Patterns on Scientific Clusters
Student: Matthew Bae (Harvey Mudd College)
Supervisor: Wucherl Yoo (Lawrence Berkeley National Laboratory)
Abstract: As scientific clusters grow, so do data volumes, machine counts, and the degree of exploited parallelism, and the hardware components within a cluster interact more and more. As a result, system resource usage patterns are becoming harder to detect. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL) runs the Simple Linux Utility for Resource Management (SLURM) on its clusters, which produces logs describing the characteristics of jobs. On Cori, LBNL's Cray XC40 supercomputer, we can also read energy counters for jobs, which allows us to analyze patterns in energy consumption. We show that energy consumption patterns emerge as a function of variables such as CPU load and CPU utilization.
Two-page extended abstract: pdf
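As a rough illustration of the kind of analysis the abstract describes, the sketch below parses pipe-delimited job records and correlates CPU utilization with average power draw. It is not the authors' method. The column layout is only modeled on job accounting output, the `UsedCPUSeconds` column is a hypothetical pre-converted field, the sample rows are made up, and the helper names `parse_jobs` and `pearson` are assumptions introduced for this example.

```python
import csv
import io
import math

# Hypothetical sample in a pipe-delimited layout modeled on job accounting
# output. The values are invented for illustration only.
SAMPLE = """JobID|ElapsedRaw|AllocCPUS|UsedCPUSeconds|ConsumedEnergyRaw
1001|100|4|100|20000
1002|100|4|200|30000
1003|100|4|300|40000
1004|100|4|400|50000
"""


def parse_jobs(text):
    """Return a list of (cpu_utilization, average_power_watts), one per job."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="|"):
        elapsed = float(rec["ElapsedRaw"])   # wall-clock seconds
        cores = float(rec["AllocCPUS"])      # allocated cores
        if elapsed <= 0 or cores <= 0:
            continue  # skip jobs with no usable runtime
        # Assumed definition: CPU time used over allocated core time.
        utilization = float(rec["UsedCPUSeconds"]) / (elapsed * cores)
        # Energy in joules divided by seconds gives average watts.
        power = float(rec["ConsumedEnergyRaw"]) / elapsed
        rows.append((utilization, power))
    return rows


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


jobs = parse_jobs(SAMPLE)
r = pearson(jobs)
print(f"jobs: {len(jobs)}, utilization vs. power correlation: {r:.3f}")
```

On real logs the relationship would be noisier than in this deliberately linear sample, and a single correlation coefficient is only a first look before grouping jobs by application, node count, or other variables.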