Efficient Wake-Up Scheduling for Multi-Core Systems



SLIDE 1

VLSI/CAD LAB, Dept. of CS, NTHU

Efficient Wake-Up Scheduling for Multi-Core Systems

Ming-Chao Lee*, Yiyu Shi**, Yu-Guan Chen*, Diana Marculescu***, Shih-Chieh Chang*

* Dept. of CS, National Tsing Hua University, Taiwan
** Dept. of ECE, Missouri University of Science and Technology
*** Dept. of ECE, Carnegie Mellon University

SLIDE 2

Outline

• Introduction
• Problem Formulation
• Efficient Wake-up Scheduling Algorithm
• Experimental Results
• Conclusions

SLIDE 3

Outline

• Introduction
• Problem Formulation
• Efficient Wake-up Scheduling Algorithm
• Experimental Results
• Conclusions

SLIDE 4

Multi-core System

• Multi-core architecture is widely adopted for high-performance computing.
• A serious problem for multi-core designs is their large power consumption.

SLIDE 5

Power Gating Technique

• Power gating
– is a well-known technique for suppressing leakage power;
– inserts sleep transistors between the logic devices and the ground rail.
• During the power mode transition from “off” to “on”, a sudden surge current may flow through the sleep transistors,
– thereby degrading signal and power integrity in the nearby operating devices.
• Wake-up scheduling designs a wake-up sequence for the sleep transistors such that
– the total wake-up time is minimized,
– without causing IR-drop violations.

SLIDE 6

Motivation

• For a multi-core design, the number of active cores and their locations vary with the application.
• An effective multi-core wake-up scheduler should decide the best wake-up order at runtime.
– This places high demands on the on-line scheduling algorithm.

SLIDE 7

Contributions

• We propose an on-line wake-up scheduling framework for multi-core architectures.
– We first use a heuristic algorithm to construct a multi-conflict graph (MCG) from the multi-core design off-line.
– Based on the MCG, we develop a fast linear-time on-line scheduling algorithm that determines the optimal wake-up order for a given set of cores from a task scheduler.
• Our approach achieves a 46.01% speedup on average over the industrial approach [4] in wake-up latency without violating the noise constraint.

SLIDE 8

Outline

• Introduction
• Problem Formulation
• Efficient Wake-up Scheduling Algorithm
• Experimental Results
• Conclusions

SLIDE 9

Problem Formulation

Find the optimal order for turning on a given set of cores at runtime, such that the total wake-up time is minimized under the given IR-drop bound.

[Figure: modules M1 to M6, each paired with WS1 to WS6, connected between VDD and GND. Legend: module in power mode transition; idle module.]
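
As a rough illustration of the objective, a wake-up schedule can be modeled as an ordered list of groups, where the cores in one group turn on together. The sketch below assumes (this is not stated on the slides) that groups are woken strictly one after another and that every group takes the same time; 18 ns is the per-core wake-up time quoted in the experimental setup. The names `total_wakeup_time` and `PER_GROUP_NS` are illustrative only.

```python
# Sketch only: a schedule is an ordered list of groups; the cores in one
# group are switched on simultaneously. Assumption (not from the slides):
# groups are woken one after another and each group takes the same time.
PER_GROUP_NS = 18  # per-core wake-up time quoted in the experimental setup


def total_wakeup_time(schedule, per_group_ns=PER_GROUP_NS):
    """Return the total wake-up time of a schedule in nanoseconds."""
    return len(schedule) * per_group_ns


schedule = [("M3", "M4"), ("M5",)]  # wake M3 and M4 together, then M5
print(total_wakeup_time(schedule))  # 2 groups * 18 ns = 36
```

Under this model, minimizing the total wake-up time amounts to minimizing the number of groups, subject to every group respecting the IR-drop bound.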

SLIDE 10

Problem Formulation

This model can be pessimistic. Determining the optimal wake-up scheduling at runtime is still not an easy task.

[Figure: modules M1 to M6 placed on the power mesh and ground mesh. Legend: current source; ground pad; power pad.]

SLIDE 11

Outline

• Introduction
• Problem Formulation
• Efficient Wake-up Scheduling Algorithm
– The Concept of Multi-Conflict Graph (MCG)
– Multi-Conflict Graph (MCG) Construction
– On-line Scheduling
• Experimental Results
• Conclusions

[Figure: off-line MCG construction produces the multi-conflict graph (MCG) over M1 to M6; on-line wake-up order scheduling receives M3, M4 and M5 from the task scheduler. The optimal wake-up schedule is (M3, M4), M5.]

SLIDE 12

Conflict Graph (CG)

• In a conflict graph (CG),
– each vertex represents a core, and
– each edge represents a conflict between its two vertices.
• We construct a CG by adding an edge between each pair of vertices/cores that should not be turned on simultaneously.

[Figure: example conflict graph over M1 to M6.]
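
A minimal sketch of the CG construction described above, storing the graph as an adjacency map. The edges of the example figure are not recoverable from the slide text, so the conflict pairs below are hypothetical; `build_conflict_graph` is an illustrative name.

```python
def build_conflict_graph(cores, conflicts):
    """Return an adjacency map {core: set of cores it conflicts with}."""
    adj = {core: set() for core in cores}
    for a, b in conflicts:
        # An edge means the two cores should not be turned on together.
        adj[a].add(b)
        adj[b].add(a)
    return adj


cores = ["M1", "M2", "M3", "M4", "M5", "M6"]
# Hypothetical conflict pairs, chosen only for illustration.
conflicts = [("M2", "M3"), ("M2", "M4"), ("M2", "M5"),
             ("M3", "M5"), ("M4", "M5")]
cg = build_conflict_graph(cores, conflicts)
print(sorted(cg["M5"]))  # ['M2', 'M3', 'M4']
```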

SLIDE 13

Multi-Conflict Graph (MCG)

• A multi-conflict graph (MCG) is an undirected graph in which
– a set of vertices with no edge between any two of them forms an independent set (IS), and
– the modules corresponding to an IS can be turned on simultaneously.

[Figure: example MCG over M1 to M6.]

M1, M2 and M6 form an IS. → M1, M2 and M6 can be turned on simultaneously.
M2, M4 and M5 do not form an IS. → M2, M4 and M5 cannot be turned on simultaneously.
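
The independent-set test can be sketched as a check over all pairs in a group. The adjacency map `mcg` below is hypothetical (the figure's edges are not in the slide text); it is chosen so that the two statements above hold.

```python
from itertools import combinations


def is_independent_set(adj, group):
    """True if no two members of `group` are joined by an edge in `adj`."""
    return all(b not in adj[a] for a, b in combinations(group, 2))


# Hypothetical MCG as an adjacency map; edges are illustrative only.
mcg = {
    "M1": set(),
    "M2": {"M3", "M4", "M5"},
    "M3": {"M2", "M5"},
    "M4": {"M2", "M5"},
    "M5": {"M2", "M3", "M4"},
    "M6": set(),
}

print(is_independent_set(mcg, ["M1", "M2", "M6"]))  # True
print(is_independent_set(mcg, ["M2", "M4", "M5"]))  # False
```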

SLIDE 14

MCG Construction

[Figure: MCG construction flow, Steps 1 to 6, from the conflict graph (CG) to the multi-conflict graph (MCG) over M1 to M6. Labels recovered from the figure:
– the sets {M1, M2, M6}, {M1, M5, M6} and {M1, M3, M4, M6};
– modules M1, M3, M4 and M6 on the power mesh and ground mesh between VDD and GND, with a plot of V against T showing the VDD bound;
– a split of {M1, M3, M4, M6} into {M1, M3, M6} and {M3, M4, M6};
– the label "empty" next to Step 6.]
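
The split shown in the figure can be mimicked with a toy model. This is only a sketch under loud assumptions: the slides do not give the authors' heuristic, so the code simply drops one core at a time and keeps the subsets that pass a check. The per-core noise values, the bound, and the names `violates_bound` and `split_group` are all hypothetical stand-ins for a real IR-drop analysis.

```python
from itertools import combinations

# Hypothetical noise contribution of each core (mV) and IR-drop bound (mV).
NOISE_MV = {"M1": 35, "M3": 10, "M4": 35, "M6": 10}
BOUND_MV = 75


def violates_bound(group):
    """Toy stand-in for an IR-drop analysis of cores waking up together."""
    return sum(NOISE_MV[core] for core in group) > BOUND_MV


def split_group(group):
    """Drop one core at a time; keep the subsets that satisfy the bound."""
    members = sorted(group)
    if not violates_bound(members):
        return [tuple(members)]
    return [
        subset
        for subset in combinations(members, len(members) - 1)
        if not violates_bound(subset)
    ]


print(split_group({"M1", "M3", "M4", "M6"}))
# [('M1', 'M3', 'M6'), ('M3', 'M4', 'M6')]
```

In this toy data every pair of cores passes the check on its own (M1 and M4 together give 70 mV), so only the group-level check exposes the violation. A fuller flow would have to split again whenever the resulting subsets still violate the bound.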

SLIDE 15

On-Line Scheduling Algorithm

[Figure: on-line scheduling example, Steps 1 to 6, on the sub-MCG formed by M3, M4 and M5 within the MCG over M1 to M6. Saturation degrees shown: M3 = 1 and M4 = 1, then M4 = 1.]

* The saturation degree of a vertex is the number of different colors used by the vertices connected to it.
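
Saturation degree is the selection rule of DSATUR-style greedy graph coloring, so the on-line step can be sketched as coloring the sub-MCG induced by the requested cores and waking each color class as one group. The code below is a plain, unoptimized version of that idea and is not claimed to be the authors' linear-time algorithm; the adjacency map and the name `dsatur_groups` are hypothetical.

```python
def dsatur_groups(adj, requested):
    """Group the requested cores by DSATUR-style greedy coloring."""
    requested = list(requested)
    wanted = set(requested)
    # Induced sub-graph: keep only edges between requested cores.
    sub = {v: adj[v] & wanted for v in requested}
    color = {}

    def priority(v):
        # Saturation degree first, plain degree as the tie-breaker.
        saturation = len({color[u] for u in sub[v] if u in color})
        return (saturation, len(sub[v]))

    while len(color) < len(sub):
        v = max((u for u in requested if u not in color), key=priority)
        used = {color[u] for u in sub[v] if u in color}
        # Smallest color not used by any already-colored neighbor.
        color[v] = next(c for c in range(len(sub)) if c not in used)

    groups = {}
    for v in requested:
        groups.setdefault(color[v], []).append(v)
    return [groups[c] for c in sorted(groups)]


# Hypothetical MCG; edges are illustrative only.
mcg = {
    "M1": set(), "M2": {"M3", "M4", "M5"}, "M3": {"M2", "M5"},
    "M4": {"M2", "M5"}, "M5": {"M2", "M3", "M4"}, "M6": set(),
}
print(dsatur_groups(mcg, ["M3", "M4", "M5"]))  # [['M5'], ['M3', 'M4']]
```

For the request M3, M4 and M5 this yields the same two groups as the schedule (M3, M4), M5 quoted earlier; the order in which the groups are listed here simply follows the color index.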

SLIDE 16

Outline

• Introduction
• Problem Formulation
• Efficient Wake-up Scheduling Algorithm
• Experimental Results
• Conclusions

SLIDE 17

Experimental Setup

• We use the TSMC 90 nm CMOS technology.
– Each core has a gate count of 55,794.
– The size and the wake-up scheduling of the sleep transistors within each core are designed appropriately.
– The maximum surge current during wake-up and the wake-up time of a core are 240 mA and 18 ns, respectively.
– We build three multi-core systems with 16, 64 and 256 cores; for each, the topology of the cores and the number and locations of the power/ground pads are properly designed.

SLIDE 18

Experimental Results

Columns: mc = Monte Carlo search, ours = proposed approach, og = off-line grouping, [4] = industrial approach. "-" marks cases where the design has fewer modules than the number to turn on. The Average row is normalized to mc.

# of modules    # of modules to turn on
in multi-       8                           16                          32
module design   mc     ours   og     [4]    mc     ours   og     [4]    mc     ours   og     [4]
16              72     72     72     144    144    144    144    288    -      -      -      -
64              72     72     72     144    144    144    162    288    306    306    360    576
256             108    108    144    144    216    216    288    288    450    450    576    576
Average         1.00   1.00   1.11   1.78   1.00   1.00   1.15   1.78   1.00   1.00   1.23   1.58

# of modules    # of modules to turn on
in multi-       64                          128                         256
module design   mc     ours   og     [4]    mc     ours   og     [4]    mc     ours   og     [4]
16              -      -      -      -      -      -      -      -      -      -      -      -
64              720    720    864    1152   -      -      -      -      -      -      -      -
256             954    972    1134   1152   1998   2016   2268   2304   4140   4176   4608   4608
Average         1.00   1.01   1.19   1.40   1.00   1.01   1.14   1.15   1.00   1.01   1.11   1.11
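
The Average row is consistent with taking, for each method, the mean of its value divided by the mc value over the designs that have an entry. The short check below reproduces the averages for the case of 8 modules to turn on; the function name `normalized_average` is illustrative.

```python
# Values for "8 modules to turn on", keyed by the number of modules in the
# design. Column order: mc, ours, og, [4].
rows = {
    16: (72, 72, 72, 144),
    64: (72, 72, 72, 144),
    256: (108, 108, 144, 144),
}


def normalized_average(rows):
    """Mean of each column after dividing every row by its mc value."""
    count = len(rows)
    return tuple(
        round(sum(r[i] / r[0] for r in rows.values()) / count, 2)
        for i in range(4)
    )


print(normalized_average(rows))  # (1.0, 1.0, 1.11, 1.78)
```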

SLIDE 19

Outline

• Introduction
• Problem Formulation
• Efficient Wake-up Scheduling Algorithm
• Experimental Results
• Conclusions

SLIDE 20

Conclusions

• In this paper, we discuss wake-up scheduling for multi-core systems.
• We propose
– an off-line algorithm to characterize the multi-conflict graph (MCG) for the multi-core design, and
– an on-line scheduling algorithm to efficiently decide the order in which cores are turned on.
• For a 256-module system, our approach reduces the wake-up latency by 23.06% and 23.67% on average compared with [16] and [4], respectively.

SLIDE 21

Thanks for your attention