Cost-Efficient Resource Management for Scientific Workflows on the Cloud



SLIDE 1

Ilia Pietri, School of Computer Science, University of Manchester, U.K.

Cost-Efficient Resource Management for Scientific Workflows on the Cloud

SLIDE 2

  • Layers: Scientific Workflows, Virtual Machines (VMs), Hosts
  • Actors: User, Cloud Provider
  • Goals: Meet User Constraints, Provide Energy Efficiency

Overview

SLIDE 3
  • Scientific Workflows

– DAGs*
– Data dependency constraints
– Gaps in the schedule

  • Goal:

– Increase resource utilisation and achieve cost-efficient provisioning

  • Estimation of resource needs
  • CPU frequency selection

*DAG: Directed Acyclic Graph

[Figure: example workflow DAG with tasks A to G]

Problem Description

SLIDE 4
  • A wide range of possible configurations

– Number of resources
– CPU frequency

  • New pricing schemes

– Pricing for CPU provisioning
– Faster resources cost more

  • Which option to choose?

– Extra resources at a lower frequency
– Fewer resources but faster

Motivation

SLIDE 5

Motivation

SLIDE 6

Motivation

SLIDE 7
  • Assuming the user is interested in executing a scientific workflow

– How many resources to provision?
– Cost vs performance
    User: cost-efficient execution, as quickly as possible
    Provider: minimize energy costs
– Performance modelling: estimate workflow execution time (and related costs) for different numbers of slots

Determining the Amount of Resources

SLIDE 8
  • Level-based Estimation Model

– Workflow structure and individual job characteristics
– Task assignment to levels

  • Top-Down Approach
  • Bottom-Up Approach

– Overall workflow performance estimation

  • Based on level characteristics

[Figure: example workflow DAG with tasks A to H]

Pietri I., Juve G., Deelman E., Sakellariou R., A Performance Model to Estimate Execution Time of Scientific Workflows on the Cloud. In 9th WORKS @ SC14. 2014.

A Performance Model to Estimate Execution Time of Scientific Workflows on the Cloud
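
As a rough illustration of the level-based idea, the sketch below assigns tasks to levels top-down and sums a per-level time estimate for a given number of slots. The DAG edges, the runtimes, and the per-level rule (waves of the level's mean runtime) are assumptions made for this example; they are not taken from the slides or the cited paper.

```python
from collections import defaultdict
from math import ceil

def top_down_levels(parents):
    """Level of a task = 1 + deepest level among its parents (entry tasks get level 0)."""
    levels = {}

    def level(task):
        if task not in levels:
            levels[task] = 1 + max((level(p) for p in parents[task]), default=-1)
        return levels[task]

    for task in parents:
        level(task)
    return levels

def estimate_makespan(parents, runtime, slots):
    """Sum per-level estimates. Assumed rule: a level with n tasks runs in
    ceil(n / slots) waves, each lasting the mean runtime of that level."""
    by_level = defaultdict(list)
    for task, lvl in top_down_levels(parents).items():
        by_level[lvl].append(runtime[task])
    total = 0.0
    for times in by_level.values():
        waves = ceil(len(times) / slots)
        total += waves * (sum(times) / len(times))
    return total

# Hypothetical workflow: task -> list of parent tasks.
parents = {"A": [], "B": [], "C": ["A"], "D": ["A"],
           "E": ["A", "B"], "F": ["C", "D"], "G": ["E", "F"]}
runtime = {task: 10.0 for task in parents}

for slots in (1, 2, 3):
    print(slots, estimate_makespan(parents, runtime, slots))
```

A bottom-up variant would compute levels from the exit tasks instead; only `top_down_levels` would change.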

SLIDE 11

Performance Modelling for Montage

SLIDE 12
  • Estimation of makespan for

– Different numbers of slots
– Different CPU frequencies: $runtime_f = [\beta_t (f_{max}/f - 1) + 1] \cdot runtime_{f_{max}}$, where $\beta_t \in [0, 1]$ is the CPU-boundedness of task $t$

  • Use of estimated makespan to predict

– User monetary costs
– Energy costs

Using the model in practice
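
A minimal sketch of the frequency-scaling step, using the β-model in the form where runtime grows as the frequency drops, i.e. a slowdown factor of β(f_max/f − 1) + 1. The function name and the sample numbers are illustrative assumptions.

```python
def runtime_at(f, f_max, runtime_fmax, beta):
    """Runtime at frequency f, given the runtime at f_max and CPU-boundedness beta in [0, 1]."""
    return (beta * (f_max / f - 1.0) + 1.0) * runtime_fmax

# A task that takes 100 s at 2.0 GHz, evaluated at 1.0 GHz.
for beta in (0.0, 0.5, 1.0):
    print(beta, runtime_at(1.0, 2.0, 100.0, beta))
```

With β = 0 the task is insensitive to frequency; with β = 1 halving the frequency doubles the runtime.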

SLIDE 13
  • The user is interested in executing a workflow within a deadline

  • Given an initial schedule*

– Lower the CPU frequency used to execute the tasks and achieve energy savings, utilising the slack time

Improving the Schedule for Energy Savings

* an allocation of the tasks to the resources

SLIDE 14
  • Slack time: maximum delay in task execution without exceeding the deadline [1]

− $slackTime_t = \min_{s \in succ_t} (spareTime_{t \to s} + slackTime_s)$
− For the exit node: $slackTime_{t_{exit}} = deadline - finishTime_{t_{exit}}$
− Spare time is the maximum delay in task execution without affecting the successors' start times: $spareTime_{t \to s} = startTime_s - finishTime_t - comCost_{t \to s}$

Slack Time
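
The recursive slack-time definition can be sketched as follows. The three-task schedule, its times, and the function and variable names are hypothetical; tasks without successors are treated as exit nodes.

```python
def slack_times(succ, start, finish, comm, deadline):
    """Slack per task: min over successors of (spare time to s + slack of s).
    For a task with no successors, slack = deadline - finish time."""
    slack = {}

    def spare(t, s):
        # Delay of t that leaves the start time of successor s untouched.
        return start[s] - finish[t] - comm.get((t, s), 0.0)

    def slack_of(t):
        if t not in slack:
            if not succ[t]:
                slack[t] = deadline - finish[t]
            else:
                slack[t] = min(spare(t, s) + slack_of(s) for s in succ[t])
        return slack[t]

    for t in succ:
        slack_of(t)
    return slack

# Hypothetical schedule: A and B both feed C.
succ = {"A": ["C"], "B": ["C"], "C": []}
start = {"A": 0.0, "B": 0.0, "C": 12.0}
finish = {"A": 10.0, "B": 5.0, "C": 20.0}
comm = {("A", "C"): 2.0, ("B", "C"): 3.0}

print(slack_times(succ, start, finish, comm, deadline=25.0))
```

Here C can slip by 5 time units before the deadline, A inherits exactly that slack because it has no spare time of its own, and B adds its 4 units of spare time on top.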

[1] R. Sakellariou and H. Zhao, “A low-cost rescheduling policy for efficient mapping of workflows on grid systems,” Scientific Programming, vol. 12, no. 4, pp. 253–262, 2004

SLIDE 15

Algorithms with DVFS Techniques

Example

[Figure: schedule of tasks 1 to 8 on processors P0, P1 and P2]

SLIDE 16

Algorithms with DVFS Techniques

Example

[Figure: schedule of tasks 1 to 8 on processors P0, P1 and P2 with the deadline marked. Annotations: data transfer 6->8 finishes; execution of task 8 starts; execution of task 8 finishes.]

SLIDE 17

Algorithms with DVFS Techniques

Example

[Figure: schedule of tasks 1 to 8 on processors P0, P1 and P2. Annotations: penalty on execution time; start time of task 8 is affected.]

SLIDE 18
  • Frequency scaling may not always be energy-efficient

– Power savings, but longer execution time: $runtime_f = [\beta_t (f_{max}/f - 1) + 1] \cdot runtime_{f_{max}}$
– Idle power to be considered

Energy Savings

[Figure: power vs. time, labelled with runtime_fmax, time_idle, runtime_f and P_fmax, P_idle, P_f]
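
To make the trade-off concrete, the sketch below compares the energy of running a task at maximum frequency and then idling against running it at a lower frequency for the whole window. It assumes the comparison window equals the runtime at the lower frequency, and the power values are made-up numbers rather than measurements.

```python
def runtime_at(f, f_max, runtime_fmax, beta):
    """Beta-model: runtime grows as the frequency drops."""
    return (beta * (f_max / f - 1.0) + 1.0) * runtime_fmax

def energy_at_fmax(p_fmax, p_idle, runtime_fmax, window):
    """Run at f_max, then sit idle for the rest of the window."""
    return p_fmax * runtime_fmax + p_idle * (window - runtime_fmax)

def energy_scaled(p_f, runtime_f):
    """Run at the lower frequency for the whole window."""
    return p_f * runtime_f

def scaling_saves_energy(p_fmax, p_idle, p_f, f, f_max, runtime_fmax, beta):
    runtime_f = runtime_at(f, f_max, runtime_fmax, beta)
    return energy_scaled(p_f, runtime_f) < energy_at_fmax(p_fmax, p_idle, runtime_fmax, runtime_f)

# Assumed powers in watts: 100 W at f_max, 60 W idle, 80 or 90 W at the lower frequency.
print(scaling_saves_energy(100.0, 60.0, 80.0, 1.0, 2.0, 100.0, beta=0.5))  # moderately CPU-bound
print(scaling_saves_energy(100.0, 60.0, 90.0, 1.0, 2.0, 100.0, beta=1.0))  # fully CPU-bound
```

The second case shows why idle power matters: when the task slows down as much as the frequency does, the longer busy period can cost more than finishing early and idling.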

[1] C.H. Hsu, U. Kremer. The design, implementation, and evaluation of a compiler algorithm for CPU energy reduction. ACM SIGPLAN Notices, 38(5):38–48, 2003.
[2] M. Etinski, J. Corbalan, J. Labarta, and M. Valero, “Understanding the future of energy-performance trade-off via DVFS in HPC environments,” JPDC, vol. 72, no. 4, pp. 579–590, 2012.

SLIDE 19
  • Workflows may consist of heterogeneous tasks

– Different energy vs frequency behaviour
– Reducing the frequency of a task may impact the overall schedule

  • Idea

– Start with an initial schedule, e.g. HEFT [1]
– Apply frequency scaling iteratively

  • Based on the energy gain of each task
  • Assessing the impact on overall energy consumption

Idea

[1] H. Topcuoglu, S. Hariri, and M.-Y. Wu, “Performance-effective and low-complexity task scheduling for heterogeneous computing,” IEEE TPDS, vol. 13, no. 3, pp. 260–274, 2002

SLIDE 20

ESFS (Energy-aware Stepwise Frequency Scaling)

  • Stage 1. Start with an initial schedule at maximum frequency.
  • Stage 2. Using the next available frequency (bound):

− 1. Identify the most profitable tasks in terms of energy gain (beyond a threshold).
− 2. Update the plan using the lower frequencies for these tasks.
− 3. Assess the impact on overall energy consumption (for the whole workflow).
− 4. Continue with the procedure as long as energy savings are achieved (repeat stage 2).

Energy-Aware Workflow Scheduling Using Frequency Scaling
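
The stages can be sketched on a deliberately simplified setting: a chain of tasks on one resource with a deadline. This is not the authors' implementation. The cubic power model, the parameter values, and all names are assumptions introduced for this example.

```python
def runtime_at(f, f_max, runtime_fmax, beta):
    return (beta * (f_max / f - 1.0) + 1.0) * runtime_fmax

def power_at(f, p_idle=50.0, k=10.0):
    # Assumed power model (not from the slides): idle power plus a cubic dynamic term.
    return p_idle + k * f ** 3

def total_energy(tasks, freqs, f_max, deadline, p_idle=50.0):
    """Energy of the whole chain up to the deadline; infinite if the deadline is missed."""
    busy, energy = 0.0, 0.0
    for name, (runtime_fmax, beta) in tasks.items():
        t = runtime_at(freqs[name], f_max, runtime_fmax, beta)
        busy += t
        energy += power_at(freqs[name]) * t
    if busy > deadline:
        return float("inf")
    return energy + p_idle * (deadline - busy)

def esfs_sketch(tasks, frequencies, deadline, threshold=0.0):
    levels = sorted(frequencies, reverse=True)
    f_max = levels[0]
    plan = {name: f_max for name in tasks}            # Stage 1: everything at f_max
    best = total_energy(tasks, plan, f_max, deadline)
    for f in levels[1:]:                              # Stage 2: next lower frequency
        gains = []
        for name in tasks:                            # 1. per-task energy gain
            trial = dict(plan, **{name: f})
            gain = best - total_energy(tasks, trial, f_max, deadline)
            if gain > threshold:
                gains.append((gain, name))
        candidate = dict(plan)
        for _, name in sorted(gains, reverse=True):   # 2. update the plan greedily
            trial = dict(candidate, **{name: f})
            if total_energy(tasks, trial, f_max, deadline) < float("inf"):
                candidate = trial
        energy = total_energy(tasks, candidate, f_max, deadline)
        if energy >= best:                            # 3./4. stop when no overall saving
            break
        plan, best = candidate, energy
    return plan, best

# Hypothetical tasks: name -> (runtime at f_max in seconds, beta).
tasks = {"cpu": (100.0, 1.0), "io": (100.0, 0.0)}
print(esfs_sketch(tasks, [2.0, 1.0], deadline=250.0))
```

In this run only the frequency-insensitive task is slowed down, because lowering the CPU-bound task would miss the deadline.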

  • I. Pietri, R. Sakellariou. “Energy-Aware Workflow Scheduling Using Frequency Scaling”. ICPP Workshops (PASA), 2014.

SLIDE 21

SIPHT with 1000 tasks

[1] I. Pietri, R. Sakellariou. “Energy-Aware Workflow Scheduling Using Frequency Scaling”. ICPP Workshops (PASA), 2014.
[2] Q. Huang et al. “Enhanced energy-efficient scheduling for parallel applications in cloud,” in Proceedings of the 12th IEEE/ACM CCGrid. IEEE, 2012, pp. 781–786.

SLIDE 22
  • We proposed an energy-efficient approach that selects the CPU frequency for each task.
  • In practice, we may need to choose a CPU frequency per resource (or core).

– CPU provisioning charged based on frequency

  • Linear relation between frequency and price, e.g. CloudSigma [1] and ElasticHosts [2]

– User: minimize the cost within a deadline

Selecting CPU Frequencies Per Resource

[1] https://www.cloudsigma.com/
[2] http://www.elastichosts.com/

SLIDE 23

CSFS (Cost-based Stepwise Frequency Selection)

  • Stage 1. Start with a schedule at maximum frequency.
  • Stage 2. Using the next available frequency:

− 1. Select the resources with cost savings for the new frequency.
− 2. Update the plan using the chosen frequency for these resources.
− 3. Accept the new plan if it saves cost and continue with the same procedure (go to 1).
− 4. Otherwise terminate.

Cost-efficient Provisioning of Cloud Resources Priced by CPU Frequency
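
A simplified sketch of the stepwise selection, assuming a purely linear price (price per GHz-hour times frequency), independent resources, and a deadline that applies to each resource's busy time. These assumptions and all names are illustrative; the sketch is not the published algorithm.

```python
def resource_cost(f, f_max, busy_fmax, beta, price_per_ghz_hour):
    """Return (cost, hours) for one resource running at frequency f."""
    hours = (beta * (f_max / f - 1.0) + 1.0) * busy_fmax
    return price_per_ghz_hour * f * hours, hours

def csfs_sketch(resources, frequencies, deadline, price_per_ghz_hour=1.0):
    levels = sorted(frequencies, reverse=True)
    f_max = levels[0]

    def plan_cost(plan):
        return sum(resource_cost(plan[r], f_max, busy, beta, price_per_ghz_hour)[0]
                   for r, (busy, beta) in resources.items())

    plan = {r: f_max for r in resources}              # Stage 1: all at f_max
    best = plan_cost(plan)
    for f in levels[1:]:                              # Stage 2: next lower frequency
        candidate = dict(plan)
        for r, (busy, beta) in resources.items():     # 1. resources that get cheaper
            new_cost, hours = resource_cost(f, f_max, busy, beta, price_per_ghz_hour)
            old_cost, _ = resource_cost(plan[r], f_max, busy, beta, price_per_ghz_hour)
            if hours <= deadline and new_cost < old_cost:
                candidate[r] = f                      # 2. update the plan
        new_total = plan_cost(candidate)
        if new_total >= best:                         # 4. otherwise terminate
            break
        plan, best = candidate, new_total             # 3. accept and continue
    return plan, best

# Hypothetical resources: name -> (busy hours at f_max, beta of the work placed on it).
resources = {"vm0": (10.0, 0.2), "vm1": (10.0, 1.0)}
print(csfs_sketch(resources, [2.0, 1.5, 1.0], deadline=12.5))
```

Under a linear price, a fully CPU-bound resource (β = 1) gains nothing from a lower frequency because the longer runtime cancels the lower rate, so only `vm0` is scaled down here.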

  • I. Pietri and R. Sakellariou. Cost-Efficient Provisioning of Cloud Resources Priced by CPU Frequency. Best poster award at UCC 2014, 2014.

SLIDE 24

Montage with 1000 tasks

SLIDE 25

Pricing in practice

  • How to use pricing to motivate users for energy-efficient scheduling in practice?

− Trade-off between

  • Energy savings for the provider
  • Minimization of user cost

  • D. Lucanin, I. Pietri, I. Brandic and R. Sakellariou. A Cloud Controller for Performance-Based Pricing. In IEEE Cloud 2015, to appear.

SLIDE 26

A Cloud Controller for Performance-based Pricing

  • D. Lucanin, I. Pietri, I. Brandic and R. Sakellariou. A Cloud Controller for Performance-based Pricing. In IEEE Cloud 2015, to appear.
  • Motivation

– The impact of frequency reduction on application performance may vary depending on the application characteristics

  • Perceived-performance pricing

– CPU provisioning charged based on impact on performance

  • CPU frequency used
  • VM CPU-boundedness

  • Two-stage cloud controller

– Allocation of VMs to PMs (physical machines)
– CPU frequency scaling when energy savings exceed the revenue losses
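
The second controller stage can be illustrated with a small decision rule. The pricing rule used here (price proportional to delivered performance under the β-model) and the per-hour figures are assumptions for this sketch; the slides do not give the exact pricing function.

```python
def slowdown(f, f_max, beta):
    """Slowdown factor of a VM with CPU-boundedness beta when run at frequency f."""
    return beta * (f_max / f - 1.0) + 1.0

def perceived_price(base_price, f, f_max, beta):
    # Assumed rule: the hourly price shrinks in proportion to the performance lost.
    return base_price / slowdown(f, f_max, beta)

def should_scale(base_price, f, f_max, beta, power_fmax_kw, power_f_kw, energy_price_kwh):
    """Scale down only if the hourly energy saving exceeds the hourly revenue loss."""
    revenue_loss = base_price - perceived_price(base_price, f, f_max, beta)
    energy_saving = (power_fmax_kw - power_f_kw) * energy_price_kwh
    return energy_saving > revenue_loss

# Hypothetical numbers: 0.10 per hour base price, 0.13 kW at f_max, 0.06 kW at f, 0.20 per kWh.
print(should_scale(0.10, 1.0, 2.0, 0.0, 0.13, 0.06, 0.20))  # VM not CPU-bound
print(should_scale(0.10, 1.0, 2.0, 1.0, 0.13, 0.06, 0.20))  # VM fully CPU-bound
```

A VM that is not CPU-bound keeps its perceived performance, so any energy saving is pure gain for the provider; for a fully CPU-bound VM the price cut outweighs the saving in this example.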

SLIDE 27

Thank you for your time!

Cost-Efficient Resource Management for Scientific Workflows on the Cloud