Presentation


Paper: Transient Guarantees: Maximizing the Value of Idle Cloud Capacity


Session: Clouds & Job Scheduling

Session Chair: Ali R. Butt

Authors

Supreeth Shastri

Amr Rizk

David Irwin

Event Type

Paper

Event Tags

Clouds and Distributed Computing

Intermediate

Performance

Time: Thursday, November 17th, 3:30pm - 4pm

Location: 355-BC

Description: To reduce waste, platforms have begun to offer idle capacity in the form of transient servers, which they may unilaterally revoke, for much lower prices (∼70-90% less) than on-demand servers. However, the revocation characteristics of transient servers, namely their volatility and predictability, influence their performance, since they affect the overhead of the fault-tolerance mechanisms that applications use to handle revocations. Unfortunately, current cloud platforms offer no guarantees on revocation characteristics, which makes it difficult for users to optimally configure (and value) transient servers.

To address the problem, we propose the abstraction of a transient guarantee, which offers probabilistic assurances on revocation characteristics. We present policies for partitioning a variable amount of idle capacity into classes with different transient guarantees to maximize performance and value. We then implement and evaluate these policies on job traces from a production Google cluster. We show that our approach can increase the aggregate revenue from idle server capacity by ∼6.5× compared to existing approaches.
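
The description notes that revocation volatility affects the overhead of fault-tolerance mechanisms. As a rough illustration only, the sketch below uses Young's classic first-order approximation for periodic checkpointing; this is an assumption made here for illustration and is not necessarily the model used in the paper. The function names, the 5-minute checkpoint cost, and the mean-time-to-revocation values are all hypothetical.

```python
import math


def optimal_checkpoint_interval(checkpoint_cost_s, mean_time_to_revocation_s):
    """Young's first-order approximation: tau = sqrt(2 * delta * MTTR).

    Assumes the checkpoint cost (delta) is small relative to the mean
    time to revocation (MTTR); both are given in seconds.
    """
    return math.sqrt(2.0 * checkpoint_cost_s * mean_time_to_revocation_s)


def overhead_fraction(checkpoint_cost_s, mean_time_to_revocation_s):
    """Approximate fraction of time lost when checkpointing at the interval above.

    Sums the time spent writing checkpoints (delta / tau) and the expected
    re-execution after a revocation (tau / (2 * MTTR)).
    """
    tau = optimal_checkpoint_interval(checkpoint_cost_s, mean_time_to_revocation_s)
    return checkpoint_cost_s / tau + tau / (2.0 * mean_time_to_revocation_s)


if __name__ == "__main__":
    checkpoint_cost_s = 300.0  # hypothetical 5-minute checkpoint
    # Hypothetical mean times to revocation: 6 hours, 1 day, 1 week.
    for hours in (6, 24, 168):
        mttr_s = hours * 3600.0
        tau = optimal_checkpoint_interval(checkpoint_cost_s, mttr_s)
        frac = overhead_fraction(checkpoint_cost_s, mttr_s)
        print(f"MTTR {hours:4d} h: checkpoint every {tau / 60:6.1f} min, "
              f"overhead ~{frac:.1%}")
```

Under these assumptions, the estimated overhead falls as the mean time to revocation grows, which is one way to see why a server class with an assured, lower revocation rate could be worth more to a user than an otherwise identical class without that assurance.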


Paper provided by the IEEE Computer Society
Paper also available from the ACM Digital Library

Authors

Supreeth Shastri (presenting)

University of Massachusetts

Amr Rizk

University of Massachusetts

David Irwin

University of Massachusetts
