Quality-of-Service and Resource management support in Task-Centric Models



SLIDE 1

Quality-of-Service and Resource management support in Task-Centric Models Artur Podobas, Mats Brorsson, Vladimir Vlassov {podobas,matsbror,vladv}@kth.se

SLIDE 2

Contributions

  • Show that it is possible (with benefits) to achieve QoS in user space with task-centric programming models
  • Increase the resource awareness of task-centric runtime systems
  • Empower task-centric programming with timing-constrained tasks
  • Reduce power consumption

SLIDE 3

Outline

What is QoS?

Task-Centric scheduling and QoS-awareness

Timing as a QoS constraint

Current ideas and implementation

Preliminary Results

Conclusions

SLIDE 4

What is Quality-of-Service?

Maximize the user's perceived experience

QoS-needs exist in all system abstraction layers

  • Multimedia, Web browsers,...
  • Operating System,...
  • NoC interconnects,...

Often combined with Resource-management

  • Enough resources to satisfy application QoS
  • ...but not too many, to avoid degrading other applications and to limit power consumption
SLIDE 5

Task-Centric scheduling and QoS-awareness

The task-centric paradigm:

  • Exploiting dynamic parallelism within an application
  • The programmer exposes available parallelism encapsulated as tasks
  • A task can dynamically generate new tasks
  • The task-centric scheduler distributes work across the acquired resources

#pragma omp task in() out() inout()
merge(v1,v2,N);

#pragma omp task in() out() inout()
{ ... }

[Figure: Distributed Scheduler above the processors it feeds (labels in the source: P0 P0 P0 P4)]
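
The point that a task can dynamically generate new tasks can be illustrated with a small sketch. It uses standard OpenMP task and taskwait directives rather than the OmpSs in()/out()/inout() clauses shown on the slide, and the function name task_sum and the cutoff of 1024 elements are assumptions made for illustration only.

```c
#include <stddef.h>

/* Sum v[0..n) by recursive splitting. Each half is wrapped in a task,
 * and each task may in turn create further tasks (dynamic parallelism).
 * task_sum and the 1024-element cutoff are illustrative choices. */
long task_sum(const int *v, size_t n)
{
    if (n <= 1024) {                 /* small range: sum it directly */
        long s = 0;
        for (size_t i = 0; i < n; i++)
            s += v[i];
        return s;
    }

    long left = 0, right = 0;
    size_t half = n / 2;

    #pragma omp task shared(left)    /* child task for the lower half */
    left = task_sum(v, half);

    #pragma omp task shared(right)   /* child task for the upper half */
    right = task_sum(v + half, n - half);

    #pragma omp taskwait             /* wait for both children */
    return left + right;
}
```

To run the tasks in parallel, the first call would normally sit inside an omp parallel region under an omp single construct. Compiled without OpenMP support, the pragmas are ignored and the function computes the same result serially.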

SLIDE 6

Task-Centric scheduling and QoS-awareness

A great deal of research already exists on QoS and resource management

  • Is it not adaptable to the task-centric paradigm?

Not necessarily...

  • Existing solutions live in kernel, hypervisor or middleware space
  • They do not assume multiple layers of scheduling (OS plus user-level runtime for multiprogrammed workloads)

SLIDE 7

Task-Centric scheduling and QoS-awareness

However, a task-centric runtime system contains a scheduler:

  • A distributed scheduler that assigns tasks to cores and may also control resources (preferably in cooperation with the OS)
  • Middleware can get incorrect readings of an application's resource usage
  • The task-centric runtime system knows which tasks exist, which tasks will exist, and how earlier tasks behaved

SLIDE 8

Timing as a QoS constraint – Soft real-time systems

We chose to use timing to specify the QoS demands of the application:

  • Let the programmer specify the timing behavior of tasks
  • The timing should be specified so that violating the constraint results in a degraded experience for the user

The timing constraints guide the scheduler's decisions:

  • Tasks with the tightest timing constraints execute first
  • Allow the scheduler to drop tasks predicted to violate their timing constraints
  • Predict which resources can be turned off to save power and when additional resources are needed

SLIDE 9

Timing as a QoS constraint

Extensions to the existing OpenMP task directive to support timing behavior:

#pragma omp task deadline(time) release_after(time) ON_ERROR(OMP_SKIP | OMP_NO_SKIP)

  • deadline(): specifies the latest time at which a task should finish executing.
  • release_after(): specifies the earliest time at which a task can start executing.
  • ON_ERROR(): specifies whether this task may or may not be dropped.
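
How the three clauses could interact at scheduling time can be sketched in plain C. This is only an illustration under stated assumptions: the types task_timing_t and task_action_t, the function decide, and the idea of passing in a predicted execution time are all hypothetical and are not taken from Nanos++ or from the OpenMP specification.

```c
#include <stdbool.h>

/* Hypothetical mirror of the ON_ERROR() clause values. */
typedef enum { ON_ERROR_SKIP, ON_ERROR_NO_SKIP } on_error_t;

/* Hypothetical per-task record of the clause arguments (seconds). */
typedef struct {
    double release_after;  /* earliest time the task may start  */
    double deadline;       /* latest time the task should finish */
    on_error_t on_error;   /* may the runtime drop this task?    */
} task_timing_t;

typedef enum { TASK_WAIT, TASK_RUN, TASK_DROP } task_action_t;

/* Decide what to do with a task at time `now`, given a predicted
 * execution time `predicted_exec` (both in seconds). */
task_action_t decide(const task_timing_t *t, double now, double predicted_exec)
{
    if (now < t->release_after)
        return TASK_WAIT;                 /* not released yet */

    bool will_miss = (now + predicted_exec) > t->deadline;
    if (will_miss && t->on_error == ON_ERROR_SKIP)
        return TASK_DROP;                 /* droppable and predicted late */

    return TASK_RUN;                      /* run, even if it may finish late */
}
```

A task marked as not droppable is always run once released, which matches the clause description but leaves open what the runtime does when it finishes late.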

SLIDE 10

Timing as a QoS constraint

now = omp_get_wtime();
#pragma omp task deadline(now + 5 ms)
fft_array();

[Figure: timeline from the task creation point (now) to the task's timing constraint (now + 5 ms)]
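
Since omp_get_wtime() returns seconds as a double, the "now + 5 ms" notation above implies a unit conversion. A minimal sketch of that arithmetic follows; the helper names deadline_after_ms and slack_ms are assumptions made for illustration and are not part of any runtime API.

```c
/* Absolute deadline in seconds, given `now_s` in seconds (the unit
 * omp_get_wtime() uses) and a relative budget in milliseconds.
 * Hypothetical helper, for illustration only. */
double deadline_after_ms(double now_s, double budget_ms)
{
    return now_s + budget_ms / 1000.0;
}

/* Remaining slack in milliseconds at time `now_s`.
 * A negative value means the deadline has already passed. */
double slack_ms(double deadline_s, double now_s)
{
    return (deadline_s - now_s) * 1000.0;
}
```

With now at 10.0 s, a 5 ms budget gives a deadline of about 10.005 s, and the slack turns negative once the clock passes that point.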

SLIDE 11

Current ideas and implementation

The two global goals of the runtime scheduler are:

  • Strive to minimize the number of tasks violating their timing constraints
  • Reactively or proactively conserve resources according to the needs of the application (throughout execution) to reduce power consumption

SLIDE 12

Current ideas and implementation

Goal 1: "Strive to minimize the number of tasks violating their timing constraints"

Solution:

  • Integrate an Earliest-Deadline-First (EDF) queuing policy so that the scheduler always executes the task with the earliest deadline first
  • Integrate a critical queue that handles tasks that, according to their history and timing constraints, might miss their deadline

[Figure: tasks task1, task2 and task3 (labeled 5000, 10000, 15000), a Predictor, the EDF-queue and the Critical-queue]
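
The two queues described above can be sketched as follows. This is a simplified illustration, not the Nanos++ implementation: the fixed-capacity sorted array, the predictor rule (historical mean execution time compared with the time left until the deadline), the choice to serve the critical queue first, and all names (task_t, edf_queue_t, submit, next_task) are assumptions.

```c
#include <stddef.h>

#define QCAP 64  /* fixed capacity keeps the sketch short */

typedef struct {
    int id;
    double deadline;  /* absolute deadline, seconds */
    double avg_exec;  /* mean execution time from history, seconds */
} task_t;

typedef struct {
    task_t items[QCAP];
    size_t len;
} edf_queue_t;

/* Insert while keeping items sorted by ascending deadline.
 * Returns 0 on success, -1 if the queue is full. */
int edf_push(edf_queue_t *q, task_t t)
{
    if (q->len == QCAP)
        return -1;
    size_t i = q->len;
    while (i > 0 && q->items[i - 1].deadline > t.deadline) {
        q->items[i] = q->items[i - 1];  /* shift later deadlines right */
        i--;
    }
    q->items[i] = t;
    q->len++;
    return 0;
}

/* Remove the earliest-deadline task. Returns 0, or -1 if empty. */
int edf_pop(edf_queue_t *q, task_t *out)
{
    if (q->len == 0)
        return -1;
    *out = q->items[0];
    for (size_t i = 1; i < q->len; i++)
        q->items[i - 1] = q->items[i];
    q->len--;
    return 0;
}

/* Assumed predictor: a task is critical if, started now, its mean
 * execution time would carry it past its deadline. */
int is_critical(const task_t *t, double now)
{
    return now + t->avg_exec > t->deadline;
}

/* Route a new task to the critical queue or the regular EDF queue. */
int submit(edf_queue_t *edf, edf_queue_t *critical, task_t t, double now)
{
    return edf_push(is_critical(&t, now) ? critical : edf, t);
}

/* Assumed policy: serve critical tasks before regular EDF tasks. */
int next_task(edf_queue_t *edf, edf_queue_t *critical, task_t *out)
{
    if (edf_pop(critical, out) == 0)
        return 0;
    return edf_pop(edf, out);
}
```

A sorted array makes insertion linear in the queue length; a binary heap would be the usual choice if the queues grow large.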

SLIDE 13

Current Ideas and Implementation

Goal 2: "Reactively or proactively conserve resources according to the needs of the application (throughout execution)"

Current Solution:

  • A "fuzzy-logic" approach that monitors the critical-queue and the current timing violations to (de)activate resources

[Figure: resource regulator fed by the critical-queue, comparing the amount of timing violations against the target miss ratio and controlling processors P1, P2, ..., Pn]
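
The regulator's inputs and outputs can be illustrated with a much simpler rule than the fuzzy-logic approach named above. The sketch below is a plain threshold controller with a dead band, swapped in purely for illustration; the struct regulator_t, the function regulate, the one-core step size and the half-of-target lower threshold are all assumptions.

```c
/* Hypothetical regulator state. */
typedef struct {
    int active_cores;
    int min_cores;
    int max_cores;
    double target_miss_ratio;  /* e.g. 0.02 for a 2% target */
} regulator_t;

/* One regulation step over an observation window.
 *   misses       : tasks that violated their timing constraint
 *   completed    : tasks completed in the window
 *   critical_len : current length of the critical queue
 * Returns the new number of active cores. */
int regulate(regulator_t *r, unsigned misses, unsigned completed,
             unsigned critical_len)
{
    double ratio = completed ? (double)misses / (double)completed : 0.0;

    if (ratio > r->target_miss_ratio || critical_len > 0) {
        /* Too many violations, or tasks at risk: add a core. */
        if (r->active_cores < r->max_cores)
            r->active_cores++;
    } else if (ratio < 0.5 * r->target_miss_ratio) {
        /* Comfortably under target: release a core to save power. */
        if (r->active_cores > r->min_cores)
            r->active_cores--;
    }
    /* Between half the target and the target: leave cores unchanged. */
    return r->active_cores;
}
```

The dead band between half the target and the target keeps the core count from oscillating when the miss ratio hovers near the target.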

SLIDE 14

Current Ideas and Implementation

We are using the Nanos++ runtime library (under the OmpSs programming model)

  • Plug-in based customization
  • Compiler-assisted development using the Mercurium compiler
  • Existing debugging tool-chains: Paraver
SLIDE 15

Preliminary results

  • We ported Nanos++ to the TilePRO64 processor
  • The TilePRO64:
      • 64 small but energy-efficient cores
      • VLIW
      • 700 MHz clock frequency
  • We soldered and attached a National Instruments data-acquisition device (NI USB-6210) to the TilePRO64's power pins

[Figure: soldered header pins]

SLIDE 16

Preliminary results

[Figure: the TilePRO64 connected by wire to the data-acquisition device]

SLIDE 17

Preliminary results

We executed the H.264 video decoder on the TilePRO64

For the QoS-aware scheduler, we set a target of <2% timing violations

We compare the results against a timing-unaware scheduler, the Breadth-First scheduler

In these examples, only the deadline() clause was used

No task dropping

Three scenarios:

  1. Not enough resources to meet timing constraints
  2. Enough resources to meet timing constraints
  3. All resources available; letting the scheduler decide
SLIDE 18

Preliminary results

H.264 running an HD movie with two cores.

Timing constraint set towards a 10 fps execution

Overall power consumption increase: 0.7%

Needs investigation, but likely due to the complexity of the scheduler

[Figure: "H.264 with two cores", fraction of timing violations (misses) for the Breadth-First and QoS-aware schedulers; axis from 0% to 40%]
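
A 10 fps target leaves 100 ms per frame. The sketch below shows how per-frame deadlines and the fraction of misses plotted in these charts could be computed. It assumes one deadline per frame, measured from the start of playback, which the slides do not state; the function names frame_deadline and miss_fraction are hypothetical.

```c
#include <stddef.h>

/* Absolute deadline (seconds) of frame `index` (0-based) when playback
 * starts at `start_s` and must sustain `fps` frames per second.
 * Assumes one deadline per frame; illustrative only. */
double frame_deadline(double start_s, size_t index, double fps)
{
    return start_s + (double)(index + 1) / fps;
}

/* Fraction of frames whose finish time exceeded their deadline. */
double miss_fraction(const double *finish_s, size_t n,
                     double start_s, double fps)
{
    size_t misses = 0;
    for (size_t i = 0; i < n; i++)
        if (finish_s[i] > frame_deadline(start_s, i, fps))
            misses++;
    return n ? (double)misses / (double)n : 0.0;
}
```

The result can be compared directly against a target such as the <2% used in these experiments.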

SLIDE 19

Preliminary results

H.264 running an HD movie with 12 cores

Timing constraints set towards a 10 fps execution

Overall power consumption decrease: ~5%

[Figure: "H.264 with 12 cores", fraction of timing violations (misses) for the Breadth-First and QoS-aware schedulers; axis from 0% to 2%]

SLIDE 20

Preliminary results

H.264 decoder running a movie with 56 cores

Timing constraints set towards a 10 fps execution

Overall power consumption decrease: ~17%

[Figure: "H.264 with 56 cores", fraction of timing violations (misses) for the Breadth-First and QoS-aware schedulers; axis from 0% to 2.5%]

SLIDE 21

Conclusions

In the majority of cases, power consumption is decreased compared to a timing-unaware scheduler

A scheduler that guarantees that tasks with the earliest deadline are executed first

User-friendliness and portability increase; let the runtime system decide about resources

SLIDE 22

Future work

Future work includes

  • Refining the resource controlling model
  • Further decreasing the overhead of our scheduling policy
  • Evaluating on more benchmarks
SLIDE 23

Acknowledgments

Thanks to C.C. Chi and Prof. B. Juurlink of TU Berlin for the OmpSs version of H.264

Thanks to M. Själander and S. McKee's team at Chalmers for their support with the power measurements

SLIDE 24

Thank you