OpenStack for HPC: Best Practices for Optimizing Software-Defined Infrastructure

Authors: Mr. Jonathan Mills (NASA)

BP Abstract: OpenStack is becoming increasingly capable for management of HPC infrastructure and support of HPC workloads. However, performance overheads, network integration, and system complexity all combine to pose a daunting challenge. The optimal outcome is to achieve all the benefits of software-defined infrastructure without paying any of the penalties. How can we get closer to achieving this? This BOF is aimed at architects, administrators and software engineers who are interested in designing and deploying OpenStack infrastructure for HPC, but are unsure how to begin. Speakers from OpenStack’s Scientific Working Group will share their experiences and engage the audience in discussions.

Long Description: Cloud Computing represents one of the most significant shifts in IT, and the projects that comprise OpenStack are the de facto standard for putting cloud technologies and methodologies within reach. The level of interest in applying OpenStack to HPC and Research Computing reflects the already strong representation of scientific OpenStack deployments amongst the research community. In response to the emergence of this community, the OpenStack Foundation has recently created a Scientific Working Group. Over seventy people attended the inaugural working group meeting in April 2016 and actively participated in collaborative note taking. There was strong representation from SC stalwarts including several US computing centers, current NSF projects, national labs and prominent educational and research institutions, as well as their European and APJ counterparts.
It is the intent of this BOF to provide the broader HPC community with an overview of the challenges of supporting HPC workloads with OpenStack and of the best practices adopted by members of the OpenStack community. Dealing with the complexity of Neutron networking, particularly with Neutron’s many component technologies such as SR-IOV, VXLAN, VLAN, Distributed Virtual Routers and SNAT, will be an important topic. The need to access parallel filesystems from within OpenStack tenant networks is a popular refrain, so another topic will be the use of storage systems such as Ceph, GPFS, and Lustre for shared filesystems, virtual machine root disks, virtual machine images, and scratch space. Other HPC-centric topics revolve around accounting and scheduling, including practical resource allocation approaches with the on-demand IaaS model. Through an open and thoughtful exchange, we intend to begin developing a shared understanding and vision of how open cloud computing solutions can best support existing and emerging uses in a range of research disciplines.
The sponsors of this BOF represent a range of those leading the charge in the OpenStack community. The group (who will start the session with a series of lightning talks) includes:
* Stig Telfer (Cambridge, co-chair scientific-wg)
* Blair Bethwaite (Monash, co-chair scientific-wg)
* Jonathan Mills (NASA)
* Mike Lowe (Indiana)
* Tim Randles (LANL)
* Robert Budden (PSC)
They cover a wide range of architectural approaches, including bare-metal and virtualisation (both machine and OS), along with integrated high-performance interconnects, storage and computational accelerators. This group is well placed to set the scene for HPC on OpenStack and to answer and discuss audience questions, including what does and does not work, and where challenges remain.
This BOF has not been previously held at SC. However, at SC15, a BOF entitled "Virtualization and Clouds in HPC" was widely attended, filling the room to capacity, with many more potential attendees refused entry due to lack of space.
The tangible outcomes of this BOF will be:
* A public Etherpad of collaborative notes and chat from during and after the session
* Recruitment of new members to the OpenStack Scientific Working Group
* An FAQ added to OpenStack documentation which summarizes the major contributions and limitations of OpenStack in HPC
* A list of opportunities for the OpenStack Scientific Working Group to work towards addressing these challenges

