Orchestrating the Deployment of Computations in the Cloud with Conductor


SLIDE 1

Orchestrating the Deployment of Computations in the Cloud with Conductor

Alexander Wieder, Pramod Bhatotia, Ansley Post, Rodrigo Rodrigues

NSDI 2012, 27.04.2012

SLIDE 2

Options for Processing Data in the Cloud

[Diagram labels: Client; Amazon Web Services; EC2; S3; local]

What's the best strategy to use cloud services?

SLIDE 3

Why is choosing the best strategy challenging?

Variety of services and providers with different

  • Pricing models
  • Performance characteristics
  • Locations
  • Interfaces

Hybrid deployments

  • Use own infrastructure and/or multiple different services at the same time

Dynamics during runtime

  • Performance variations
  • Spot markets

SLIDE 4

Conductor Goals

Simplify the management of cloud resources:

  • Automation: Automatically optimize resource allocation
  • Transparency: Use multiple different services seamlessly
  • Adaptivity: Automatically adapt to dynamics
      • Performance variations
      • Variable resource cost on spot markets

SLIDE 5

Outline

  • Conductor System Overview
  • Modeling Computations
  • Using Cloud Resources Transparently
  • Evaluation

SLIDE 6

High Level System Design

[Diagram labels: Controller; LP Solver; Frameworks (Dryad); submit job to framework; submit job to Conductor; LP-based execution model; execution plan; allocate resources; run job; monitor execution]

How can we model computations?

How can we transparently use cloud resources?

SLIDE 7

Outline

  • Conductor System Overview
  • Modeling Computations
  • Using Cloud Resources Transparently
  • Evaluation

SLIDE 8

Modeling Computations

  • Hard to model computations in the general case
  • Unknown:
      • Data access patterns
      • Processing time
      • Scalability
  • Feasible for specific programming models, e.g., MapReduce

SLIDE 9

Modeling MapReduce Computations

How can we model MapReduce computations?

  • Data-parallel processing
  • Mostly linear dependencies:
      • Performance
      • Resources
      • Cost

➔ Problem calls for a formulation as a linear program!
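To make the step from "mostly linear dependencies" to a linear program more concrete, here is a minimal sketch of what such a program could look like. It is an illustration only, not the formulation used by Conductor; every symbol in it is an assumption introduced here: $n_{r,t}$ is the number of nodes of resource type $r$ used in time step $t$, $c_r$ is the cost of one such node per time step, $p_r$ is the amount of data one such node processes per time step, $D$ is the total input size, and $T$ is the deadline in time steps.

```latex
\begin{align*}
\text{minimize}\quad & \sum_{t=1}^{T} \sum_{r} c_r \, n_{r,t} \\
\text{subject to}\quad & \sum_{t=1}^{T} \sum_{r} p_r \, n_{r,t} \;\ge\; D, \\
& n_{r,t} \ge 0 \quad \text{for all } r,\ t .
\end{align*}
```

Both the objective (monetary cost) and the constraint (all data processed before the deadline) are linear in $n_{r,t}$, which is what makes an LP solver applicable. A realistic model would need further constraints, for example for storage and data transfer, which this sketch leaves out.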

SLIDE 10

Modeling MapReduce Computations

Computation steps:

  • Storing data
  • Transferring data
  • Processing data
  • Migrating data

Graph-based model:

  • Vertices: data storage and processing
  • Edges: data transfer

[Diagram labels: Data Upload; Storage Providers (S3, local); Computation Providers]
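The graph-based model can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not Conductor's data structures: the class names, the `gb_per_hour` attribute, and the transfer rates are all hypothetical and chosen only to show vertices for storage and processing connected by data-transfer edges.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Vertex:
    name: str
    kind: str  # "storage" or "compute"


@dataclass(frozen=True)
class Edge:
    src: Vertex
    dst: Vertex
    gb_per_hour: float  # hypothetical transfer rate of this link


@dataclass
class ComputationGraph:
    vertices: List[Vertex] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def add_vertex(self, name: str, kind: str) -> Vertex:
        vertex = Vertex(name, kind)
        self.vertices.append(vertex)
        return vertex

    def connect(self, src: Vertex, dst: Vertex, gb_per_hour: float) -> None:
        self.edges.append(Edge(src, dst, gb_per_hour))

    def transfer_hours(self, src: Vertex, dst: Vertex, data_gb: float) -> float:
        """Time to move data_gb along the edge src -> dst (linear model)."""
        for edge in self.edges:
            if edge.src == src and edge.dst == dst:
                return data_gb / edge.gb_per_hour
        raise ValueError(f"no edge from {src.name} to {dst.name}")


# Hypothetical example: upload from the client to S3, then read from EC2.
graph = ComputationGraph()
client = graph.add_vertex("client-local", "storage")
s3 = graph.add_vertex("S3", "storage")
ec2 = graph.add_vertex("EC2", "compute")
graph.connect(client, s3, gb_per_hour=8.0)   # made-up upload rate
graph.connect(s3, ec2, gb_per_hour=32.0)     # made-up read rate

upload_hours = graph.transfer_hours(client, s3, 32.0)
```

Because transfer time is linear in the amount of data, per-edge quantities like `upload_hours` are the kind of term that can appear directly in a linear program.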

SLIDE 11

Outline

  • Conductor System Overview ✔
  • Modeling Computations ✔
  • Using Cloud Resources Transparently
  • Evaluation

SLIDE 12

Deploying Jobs on the Cloud

[Diagram labels: Frameworks (Dryad); Resource Abstraction Layer; Storage; Computation; uniform key-value interface; backend-specific interface; migrate and upload; local HD on VM; S3]
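The idea of a uniform key-value interface in front of backend-specific storage can be sketched as follows. This is a minimal illustration, not Conductor's actual abstraction layer: the names `KeyValueStore`, `InMemoryStore`, `LocalDiskStore`, and `migrate` are hypothetical, and the in-memory class only stands in for a remote object store such as S3 so that the example makes no network calls.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Dict, List
from urllib.parse import quote, unquote


class KeyValueStore(ABC):
    """Uniform interface that a framework would program against."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def keys(self) -> List[str]: ...


class InMemoryStore(KeyValueStore):
    """Stand-in for a remote object store (assumption: no real S3 calls)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]

    def keys(self) -> List[str]:
        return list(self._data)


class LocalDiskStore(KeyValueStore):
    """Backend that keeps each value in one file on a local disk."""

    def __init__(self, root: str) -> None:
        self._root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key: str) -> str:
        # Percent-encode the key so that "/" cannot escape the root directory.
        return os.path.join(self._root, quote(key, safe=""))

    def put(self, key: str, value: bytes) -> None:
        with open(self._path(key), "wb") as handle:
            handle.write(value)

    def get(self, key: str) -> bytes:
        with open(self._path(key), "rb") as handle:
            return handle.read()

    def keys(self) -> List[str]:
        return [unquote(name) for name in os.listdir(self._root)]


def migrate(src: KeyValueStore, dst: KeyValueStore) -> int:
    """Copy every key from src to dst; returns the number of keys copied."""
    count = 0
    for key in src.keys():
        dst.put(key, src.get(key))
        count += 1
    return count


remote = InMemoryStore()
remote.put("input/part-0", b"points A")
remote.put("input/part-1", b"points B")
local = LocalDiskStore(tempfile.mkdtemp())
moved = migrate(remote, local)
```

Since `migrate` only uses the uniform interface, the same function moves data in either direction between any two backends, which is the property the slide's "migrate and upload" label relies on.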

SLIDE 13

Outline

  • Conductor System Overview ✔
  • Modeling Computations ✔
  • Using Cloud Resources Transparently ✔
  • Evaluation

SLIDE 14

Evaluation

Questions we answer in the evaluation:

  • Can Conductor find optimal execution plans?
  • Can Conductor efficiently adapt to dynamics?
  • Can Conductor enable hybrid deployments?
  • What overheads does Conductor impose?

see paper

SLIDE 15

Evaluation: Finding Optimal Execution Plans

Scenario:

  • Job: k-means clustering, 32GB input data
  • Resources: EC2, S3
  • Deadline: 6h
  • Minimize monetary cost

Goal:

  • Automatically select resources
  • Manage data transfer
  • Launch job
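The trade-off in this scenario (meet a 6h deadline for 32GB at minimal cost) can be illustrated with a small brute-force search. This is a sketch under stated assumptions, not Conductor's LP-based planner: the node types, their throughputs and prices are made up, processing is assumed to scale linearly with the number of nodes, and billing is assumed to be per started node-hour.

```python
import math
from typing import Dict, NamedTuple, Optional


class NodeType(NamedTuple):
    gb_per_hour: float   # hypothetical processing rate of one node
    cents_per_hour: int  # hypothetical price of one node-hour


class Plan(NamedTuple):
    node_type: str
    nodes: int
    hours: int
    cost_cents: int


def cheapest_plan(data_gb: float, deadline_h: int,
                  node_types: Dict[str, NodeType]) -> Plan:
    """Try every node count per type and keep the cheapest plan that
    finishes within the deadline (hours are billed per started hour)."""
    best: Optional[Plan] = None
    for name, node in node_types.items():
        min_nodes = math.ceil(data_gb / (node.gb_per_hour * deadline_h))
        max_nodes = math.ceil(data_gb / node.gb_per_hour)  # done in one hour
        for nodes in range(min_nodes, max_nodes + 1):
            hours = math.ceil(data_gb / (nodes * node.gb_per_hour))
            plan = Plan(name, nodes, hours, nodes * hours * node.cents_per_hour)
            if best is None or plan.cost_cents < best.cost_cents:
                best = plan
    if best is None:
        raise ValueError("no node types given")
    return best


# Made-up node types; the scenario numbers (32GB, 6h) are from the slide.
HYPOTHETICAL_TYPES = {
    "small": NodeType(gb_per_hour=1.0, cents_per_hour=10),
    "large": NodeType(gb_per_hour=4.0, cents_per_hour=35),
}
plan = cheapest_plan(32.0, 6, HYPOTHETICAL_TYPES)
```

With these made-up numbers the search selects two "large" nodes for four hours. Enumeration works for a toy case like this one; an LP formulation is what keeps the problem tractable once storage, transfer, and many time steps are included.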

SLIDE 16

Evaluation: Finding Optimal Execution Plans

Storing 1/3 on S3 and 2/3 on EC2 is optimal

SLIDE 17

Evaluation: Adapting to Dynamics

Observed resource performance in the cloud can vary for several reasons:

  • Interference with co-located VM instances
  • Network congestion
  • Failures

Scenario:

  • EC2 performance ~3x overestimated

Conductor doesn't allocate enough resources to finish before deadline

SLIDE 18

Evaluation: Adapting to Dynamics

[Figure: job progress and allocated nodes over time, with the deadline marked; Conductor updated the deployment after 1h]
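The effect of an overestimated processing rate, and of re-planning after one hour, can be reproduced with simple arithmetic. This is a minimal sketch, not Conductor's monitoring logic: the rates of 3.0 and 1.0 GB per node-hour are hypothetical values chosen to match the "~3x overestimated" scenario, and linear scaling with the number of nodes is assumed.

```python
import math


def nodes_needed(data_gb: float, gb_per_node_hour: float, hours_left: float) -> int:
    """Smallest node count that processes data_gb within hours_left,
    assuming throughput scales linearly with the number of nodes."""
    if hours_left <= 0:
        raise ValueError("deadline already passed")
    return math.ceil(data_gb / (gb_per_node_hour * hours_left))


DATA_GB = 32.0        # from the evaluation scenario
DEADLINE_H = 6.0      # from the evaluation scenario
ESTIMATED_RATE = 3.0  # hypothetical estimate, GB per node-hour
ACTUAL_RATE = 1.0     # hypothetical observed rate, ~3x lower

# Initial plan, based on the overestimated rate.
initial_nodes = nodes_needed(DATA_GB, ESTIMATED_RATE, DEADLINE_H)

# After one hour of monitoring, progress reveals the actual rate.
elapsed_h = 1.0
processed_gb = initial_nodes * ACTUAL_RATE * elapsed_h
remaining_gb = DATA_GB - processed_gb

# Without re-planning, the job would finish long after the deadline.
projected_finish_h = elapsed_h + remaining_gb / (initial_nodes * ACTUAL_RATE)

# Re-plan for the remaining data and the remaining time.
updated_nodes = nodes_needed(remaining_gb, ACTUAL_RATE, DEADLINE_H - elapsed_h)
```

Under these assumptions the initial plan uses 2 nodes and would take 16 hours in total, while the updated plan allocates 6 nodes for the remaining 5 hours.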

SLIDE 19

Evaluation: Adapting to Spot Market Prices

Can Conductor help cut costs by leveraging spot resources?

SLIDE 20

Evaluation: Adapting to Spot Market Prices

Methodology:

  • Simulate job deployment using EC2 spot instances
  • Spot pricing history over ~4 weeks
  • Conductor uses an oracle or simple pricing predictor

[Figure legend: regular, oracle, predictor]
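The three strategies in the figure legend can be compared with a toy simulation. This is a sketch under assumptions, not the evaluation code: the slides do not say how the "simple pricing predictor" works, so the predictor below (run when the last observed price is at or below the average seen so far, or when no slack is left) is a stand-in, and the prices are made up. A job is modelled as a number of single-node hours that must be scheduled within the price window, with on-demand capacity as a fallback.

```python
from typing import List


def cost_regular(work_hours: int, on_demand: float) -> float:
    """Always use regular (on-demand) instances."""
    return work_hours * on_demand


def cost_oracle(prices: List[float], work_hours: int, on_demand: float) -> float:
    """Knows all future prices and picks the cheapest hours in the window."""
    hourly = sorted(min(price, on_demand) for price in prices)
    return sum(hourly[:work_hours])


def cost_predictor(prices: List[float], work_hours: int, on_demand: float) -> float:
    """Stand-in predictor: assume the next price equals the last observed one."""
    if len(prices) < work_hours:
        raise ValueError("window shorter than the job")
    cost = 0.0
    remaining = work_hours
    for hour, actual in enumerate(prices):
        if remaining == 0:
            break
        history = prices[:hour]
        predicted = history[-1] if history else on_demand
        cheap = bool(history) and predicted <= sum(history) / len(history)
        must_run = len(prices) - hour <= remaining  # no slack left
        if cheap or must_run:
            cost += min(actual, on_demand)  # fall back to on-demand if pricier
            remaining -= 1
    return cost


# Made-up hourly spot prices and on-demand price (dollars per hour).
SPOT_PRICES = [0.05, 0.03, 0.09, 0.02, 0.04, 0.08]
ON_DEMAND = 0.10
WORK_HOURS = 3

regular = cost_regular(WORK_HOURS, ON_DEMAND)
oracle = cost_oracle(SPOT_PRICES, WORK_HOURS, ON_DEMAND)
predictor = cost_predictor(SPOT_PRICES, WORK_HOURS, ON_DEMAND)
```

By construction the oracle is a lower bound and the regular strategy an upper bound on the predictor's cost, so the gap between `oracle` and `predictor` shows how much a better price forecast could still save.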

SLIDE 21

Outline

  • Conductor System Overview ✔
  • Modeling Computations ✔
  • Using Cloud Resources Transparently ✔
  • Evaluation ✔

SLIDE 22

Summary and Conclusion

Observation: Making best use of the cloud is hard!

Conductor's approach:

  • LP-based system model
  • Optimize for user goals
  • Resource abstraction layers
  • Adapt during runtime

Evaluation results: Conductor can efficiently manage cloud deployments

Future work: Apply Conductor's approach to other frameworks

SLIDE 23

Thanks for your Attention!