SLURM User Group Meeting
Authors: Morris Jette (SchedMD LLC)
Abstract: Slurm is an open source workload manager used on many TOP500 systems. It provides a rich set of features including topology-aware optimized resource allocation, the ability to expand and shrink jobs on demand, the ability to power down idle nodes and restart them as needed, hierarchical bank accounts with fair-share job prioritization, and many resource limits. The meeting will consist of three parts: the Slurm development team will present details about changes in the new version 16.05, describe the Slurm roadmap, and solicit user feedback. Everyone interested in Slurm use and/or development is encouraged to attend.
Long Description: Slurm is a free open source workload manager in widespread use today with a steadily growing customer base. As of the November 2015 TOP500 list, Slurm was used on 5 of the top 10 systems. Slurm is vendor-neutral with about 250 individual contributors from a multitude of computer vendors, national laboratories, and universities. SC is our best venue to gather such a diverse global community.
The Slurm BOF has been held at the previous five SC conferences with attendance increasing each year (approximately 45, 60, 80, 120, and 170 people in the previous five meetings). The format has been similar each year: developers present information about recent work to users and gather requirements for future work.
The goals of the Slurm BOF are to inform users about recent developments and plans for future work, and to gather requirements for that work. There are two major releases of Slurm each year and the advances in each are substantial. SC is a great venue to keep the user community informed about these developments.
The first presentation will highlight changes in the Slurm version 16.05 release in May 2016, including support for the Intel Knights Landing processor, improved mechanisms to coordinate scheduling of GPUs and CPUs, and job dependencies between individual tasks within job arrays.
A second presentation will highlight changes planned for future releases of Slurm, especially version 17.02, to be released in February 2017. Those enhancements include support for jobs with heterogeneous resource requirements and management of federated clusters of resources.
We also seek user guidance, both at the BOF and via a survey, to help prioritize future development. The survey covers the following:
* Slurm user (yes or no)
* Computer description (node counts and vendors)
* Typical workload (job sizes and run times)
* Current features that are most important to you
* Additional features desired (priority ordered)
* Interested in participating in Slurm consortium?
* Other comments
Conference Presentation: pdf