Experimental Infrastructure and Methodology for HPC Cloud Research

Authors: Dr. Kate Keahey (Argonne National Laboratory)

BP Abstract: The confluence of cloud computing, HPC, and Big Data is attracting growing research interest, with challenges ranging from systems research and networking, through resource and power management, to new algorithms and innovative applications. New testbeds have been established to support this research and have attracted an active community of users. The objective of this BOF is to provide a forum that brings together operators of experimental testbeds for HPC and cloud computing with their current and prospective users to discuss current and future capabilities, user requirements, and experimental methodology.

Long Description: The emergence of cloud computing as critical infrastructure for scientific, enterprise, and commercial computing has happened primarily in the commercial space, which has limited the availability of research infrastructure. Funding agencies around the globe have responded with a variety of programs to create a range of experimental cloud infrastructures. While each program has its own goals and target user bases, these infrastructures have many things in common. In particular, all operators of experimental cloud infrastructures work with user communities to assess the changing requirements for current research, explore methodology issues, and discuss interoperability and federation. The objective of this BOF is to provide a forum where this discussion can be continued. This BOF was previously held at SC14 (“Experimental Infrastructures for Open Cloud Research”), where it was very well attended (50-100 people) and highly successful in its objective of creating a community of experimental infrastructure operators. The discussion was subsequently continued at an NSF workshop, with a report integrating input from both venues available under the link above. The collaborations formed then have continued at a variety of international events and resulted in concrete steps towards federation, of which the Chameleon identity federation with GENI is one example. The testbeds that were only being built at the time of the last workshop have now been fully operational for over a year, which makes this a particularly good time to reconnect with the community of operators and users and discuss what new requirements have emerged. As was the case last time, we expect representatives of a number of existing projects, including the European Grid5000 project, the NSF FutureGrid project, and the GENI community, to participate. This BOF is relevant to many of the attendees of SC16.
For individuals from an HPC center or vendor who are wrestling with the role cloud computing can play in supporting their users, or who are responsible for building, supporting, or employing cloud infrastructure, the BOF will provide insights into the architectural alternatives. For academic researchers who study large-scale systems, whether HPC or cloud, and are stymied either by a lack of local resources or by limitations on computer science research within production HPC environments, the BOF will provide information about existing infrastructure. Researchers, practitioners, and vendors in distributed or grid computing who regularly attend SC will learn about cloud testbeds, a valuable resource for their current work, and about cloud computing, a closely related area. By providing a mechanism for participants to communicate with each other through in-person interactions, this BOF will be a catalyst for participants to influence experimental infrastructure projects and their policies, as well as to learn from one another. While we do not anticipate holding a formal survey, to facilitate participation the attendees will share an electronic document with defined contribution categories soliciting input on topics such as hardware requirements, testbed capabilities, artifacts (traces, workloads, etc.), and methodology challenges (reproducibility issues, etc.). After the BOF we will summarize this feedback in the form of a report.

