Efficient Wake-Up Scheduling for Multi-Core Systems

  1. Efficient Wake-Up Scheduling for Multi-Core Systems
     Ming-Chao Lee*, Yiyu Shi**, Yu-Guan Chen*, Diana Marculescu***, Shih-Chieh Chang*
     * Dept. of CS, National Tsing Hua University, Taiwan
     ** Dept. of ECE, Missouri University of Science and Technology
     *** Dept. of ECE, Carnegie Mellon University
     VLSI/CAD LAB, Dept. of CS, NTHU

  2. Outline
     - Introduction
     - Problem Formulation
     - Efficient Wake-up Scheduling Algorithm
     - Experimental Results
     - Conclusions

  4. Multi-core System
     - Multi-core architecture is widely adopted for high-performance computing.
     - A serious problem for multi-core designs is their large power consumption.

  5. Power Gating Technique
     - Power gating
       - is a well-known technique for suppressing leakage power;
       - inserts sleep transistors between the logic devices and the ground rail.
     - During the power-mode transition from "off" to "on", a sudden surge current may flow through the sleep transistors,
       - thereby degrading signal and power integrity in nearby operating devices.
     - Wake-up scheduling designs a wake-up sequence for the sleep transistors such that
       - the total wake-up time is minimized, and
       - no IR-drop violation is caused.

  6. Motivation
     - In a multi-core design, the number of active cores and their locations vary with the application.
     - An effective multi-core wake-up scheduler should decide the wake-up order at runtime.
       - This places high demands on the on-line scheduling algorithm.

  7. Contributions
     - We propose an on-line wake-up scheduling framework for multi-core architectures.
       - Off-line, a heuristic algorithm first constructs a multi-conflict graph (MCG) from the multi-core design.
       - On-line, a fast linear-time scheduling algorithm based on the MCG determines the optimal wake-up order for the set of cores requested by the task scheduler.
     - On average, our approach achieves a 46.01% speedup in wake-up latency over the industrial approach [4] without violating the noise constraint.

  9. Problem Formulation
     [Figure: modules M1-M6 between the VDD and GND rails, with labels WS1-WS6; the legend marks each module as either in power-mode transition or idle.]
     - Find the optimal order for turning on a given set of cores at runtime, such that the total wake-up time is minimized under the given IR-drop bound.

  10. Problem Formulation
     [Figure: modules M1-M6 modeled as current sources between a power mesh and a ground mesh; the legend marks power pads, ground pads, and current sources.]
     - This model can be pessimistic.
     - Determining the optimal wake-up schedule at runtime is still not an easy task.
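
One way to write the problem statement down, as a sketch only. All symbols are notation assumed here, not taken from the slides: S is the set of cores requested at runtime, G_1, ..., G_k are the groups woken together, t_wake is the wake-up time of one group, Delta V(G_j) is the worst IR drop seen when group G_j turns on simultaneously, and V_bound is the IR-drop bound. The sketch also assumes that groups are woken strictly one after another, which the slides do not state explicitly.

```latex
\begin{aligned}
\min_{k,\;G_1,\dots,G_k}\quad & T = k \cdot t_{\text{wake}} \\
\text{s.t.}\quad & G_1 \cup \dots \cup G_k = S, \qquad G_i \cap G_j = \emptyset \quad (i \neq j), \\
& \Delta V(G_j) \le V_{\text{bound}} \quad \text{for } j = 1, \dots, k .
\end{aligned}
```

Under these assumptions, minimizing the total wake-up time amounts to covering S with as few admissible groups as possible.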

  11. Outline
     - Introduction
     - Problem Formulation
     - Efficient Wake-up Scheduling Algorithm
       - The Concept of Multi-Conflict Graph (MCG)
       - Multi-Conflict Graph (MCG) Construction
       - On-line Scheduling
     - Experimental Results
     - Conclusions
     [Figure: two-phase flow. Off-line: MCG construction produces the multi-conflict graph (MCG) over M1-M6. On-line: the task scheduler requests M3, M4 and M5, and wake-up order scheduling uses the MCG to order them.]
     The optimal wake-up schedule is (M3, M4), M5.

  12. Conflict Graph (CG)
     - In a conflict graph (CG),
       - each vertex represents a core, and
       - each edge represents a conflict between its two vertices.
     [Figure: example conflict graph over M1-M6.]
     - We construct a CG by adding an edge between every pair of vertices (cores) that should not be turned on simultaneously.
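
The construction rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the edges of the slide's figure are not recoverable, so `HYPOTHETICAL_CONFLICTS` and the predicate `must_not_overlap` are made-up stand-ins for whatever check decides that two cores must not wake up together.

```python
from itertools import combinations

# Hypothetical conflict pairs for six cores; the edges drawn on the slide
# are not recoverable, so these are illustrative only.
HYPOTHETICAL_CONFLICTS = {
    frozenset(pair)
    for pair in [("M2", "M3"), ("M2", "M4"), ("M2", "M5"), ("M3", "M5"), ("M4", "M5")]
}

def must_not_overlap(a, b):
    """Stand-in for the real check that two cores cannot wake up together."""
    return frozenset((a, b)) in HYPOTHETICAL_CONFLICTS

def build_conflict_graph(cores, conflict_test):
    """Return an adjacency map: core -> set of cores it conflicts with."""
    graph = {core: set() for core in cores}
    for a, b in combinations(cores, 2):
        if conflict_test(a, b):
            graph[a].add(b)  # undirected edge, stored in both directions
            graph[b].add(a)
    return graph

cores = ["M1", "M2", "M3", "M4", "M5", "M6"]
cg = build_conflict_graph(cores, must_not_overlap)
print(sorted(cg["M5"]))  # ['M2', 'M3', 'M4']
```

An adjacency map of sets keeps the later "is there an edge?" lookups cheap; any other graph representation would do as well.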

  13. Multi-Conflict Graph (MCG)
     - A multi-conflict graph (MCG) is an undirected graph in which
       - if a set of vertices forms an independent set (IS), i.e. a set of vertices with no edge between any two of them,
       - then the corresponding group of modules can be turned on simultaneously.
     [Figure: example MCG over M1-M6.]
     - M1, M2 and M6 form an IS → M1, M2 and M6 can be turned on simultaneously.
     - M2, M4 and M5 do not form an IS → M2, M4 and M5 cannot be turned on simultaneously.
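
The independent-set test is simple to state in code. The sketch below assumes a hypothetical edge set (`MCG`), chosen only so that it agrees with the two examples on this slide; the real edges of the figure are not recoverable.

```python
from itertools import combinations

# Hypothetical MCG as an adjacency map (core -> conflicting cores).
# The edge set is an assumption made to match the slide's two examples.
MCG = {
    "M1": set(),
    "M2": {"M3", "M4", "M5"},
    "M3": {"M2", "M5"},
    "M4": {"M2", "M5"},
    "M5": {"M2", "M3", "M4"},
    "M6": set(),
}

def is_independent_set(graph, vertices):
    """True if no two of the given vertices are joined by an edge."""
    return all(b not in graph[a] for a, b in combinations(vertices, 2))

print(is_independent_set(MCG, ["M1", "M2", "M6"]))  # True
print(is_independent_set(MCG, ["M2", "M4", "M5"]))  # False
```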

  14. MCG Construction
     [Figure: six-step construction flow from the conflict graph (CG) to the multi-conflict graph (MCG). Recoverable labels: the independent sets {M1, M3, M4, M6}, {M1, M2, M6} and {M1, M5, M6}; a "split" of {M1, M3, M4, M6} into {M1, M3, M6} and {M3, M4, M6}; a voltage-versus-time check against a bound on the power and ground meshes; and termination when the list of sets is empty.]

  15. On-Line Scheduling Algorithm
     [Figure: six-step example. The sub-MCG induced by the requested cores M3, M4 and M5 is extracted from the MCG and its vertices are colored step by step; the annotations show saturation degree 1 for M3 and M4, and later saturation degree 1 for M4.]
     * The saturation degree of a vertex is the number of different colors among the vertices it is connected to.
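
The saturation-degree annotation suggests a DSATUR-style greedy coloring of the sub-MCG, where each color becomes one wake-up group. The sketch below is a plain textbook version written under that assumption; it is not claimed to reproduce the authors' linear-time algorithm or its optimality. The edge set in `MCG` and the function name `schedule_wake_up` are hypothetical.

```python
# Hypothetical MCG (core -> conflicting cores); the edges are an assumption.
MCG = {
    "M1": set(),
    "M2": {"M3", "M4", "M5"},
    "M3": {"M2", "M5"},
    "M4": {"M2", "M5"},
    "M5": {"M2", "M3", "M4"},
    "M6": set(),
}

def schedule_wake_up(mcg, requested):
    """Group the requested cores so that no group contains an MCG edge."""
    requested = set(requested)
    # Sub-MCG: only the requested cores and the edges between them.
    sub = {v: mcg[v] & requested for v in requested}
    color = {}

    def neighbor_colors(v):
        # The size of this set is the saturation degree of v.
        return {color[u] for u in sub[v] if u in color}

    while len(color) < len(sub):
        # Pick the uncolored vertex with the highest saturation degree,
        # breaking ties by degree in the sub-MCG, then by name.
        v = max(
            (u for u in sub if u not in color),
            key=lambda u: (len(neighbor_colors(u)), len(sub[u]), u),
        )
        used = neighbor_colors(v)
        slot = 0
        while slot in used:  # smallest color not used by a neighbor
            slot += 1
        color[v] = slot

    groups = {}
    for v, slot in color.items():
        groups.setdefault(slot, []).append(v)
    # The order of the groups is arbitrary; larger groups are listed first.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))

print(schedule_wake_up(MCG, ["M3", "M4", "M5"]))  # [['M3', 'M4'], ['M5']]
```

With this hypothetical edge set, the request for M3, M4 and M5 yields the same grouping as the slide deck's example, (M3, M4), M5.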

  17. Experimental Setup
     - We use the TSMC 90 nm CMOS technology.
       - Each core has a gate count of 55,794.
       - The size and the wake-up scheduling of the sleep transistors within a core are designed appropriately.
       - For a single core, the maximum surge current during wake-up is 240 mA and the wake-up time is 18 ns.
       - Three multi-core systems are used, with 16, 64 and 256 cores; for each, the topology of the cores and the number and locations of the power/ground pads are properly designed.

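  18. Experimental Results
     Columns are grouped by the number of modules to turn on (mc: Monte Carlo search, og: off-line grouping). A dash marks cases where the design has fewer modules than the number to turn on.

     | Modules in design | 8: mc | 8: ours | 8: og | 8: [4] | 16: mc | 16: ours | 16: og | 16: [4] | 32: mc | 32: ours | 32: og | 32: [4] |
     |---|---|---|---|---|---|---|---|---|---|---|---|---|
     | 16 | 72 | 72 | 72 | 144 | 144 | 144 | 144 | 288 | - | - | - | - |
     | 64 | 72 | 72 | 72 | 144 | 144 | 144 | 162 | 288 | 306 | 306 | 360 | 576 |
     | 256 | 108 | 108 | 144 | 144 | 216 | 216 | 288 | 288 | 450 | 450 | 576 | 576 |
     | Average | 1.00 | 1.00 | 1.11 | 1.78 | 1.00 | 1.00 | 1.15 | 1.78 | 1.00 | 1.00 | 1.23 | 1.58 |

     | Modules in design | 64: mc | 64: ours | 64: og | 64: [4] | 128: mc | 128: ours | 128: og | 128: [4] | 256: mc | 256: ours | 256: og | 256: [4] |
     |---|---|---|---|---|---|---|---|---|---|---|---|---|
     | 16 | - | - | - | - | - | - | - | - | - | - | - | - |
     | 64 | 720 | 720 | 864 | 1152 | - | - | - | - | - | - | - | - |
     | 256 | 954 | 972 | 1134 | 1152 | 1998 | 2016 | 2268 | 2304 | 4140 | 4176 | 4608 | 4608 |
     | Average | 1.00 | 1.01 | 1.19 | 1.40 | 1.00 | 1.01 | 1.14 | 1.15 | 1.00 | 1.01 | 1.11 | 1.11 |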
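
The Average row can be reproduced from the other rows by dividing each entry by the mc entry of the same cell group and averaging over the designs that have data. The short check below does exactly that; the dictionary layout and the function name `normalized_average` are choices made here, while the numbers are copied from the table.

```python
# modules in design -> {modules to turn on: (mc, ours, og, [4])}
RESULTS = {
    16: {8: (72, 72, 72, 144), 16: (144, 144, 144, 288)},
    64: {
        8: (72, 72, 72, 144),
        16: (144, 144, 162, 288),
        32: (306, 306, 360, 576),
        64: (720, 720, 864, 1152),
    },
    256: {
        8: (108, 108, 144, 144),
        16: (216, 216, 288, 288),
        32: (450, 450, 576, 576),
        64: (954, 972, 1134, 1152),
        128: (1998, 2016, 2268, 2304),
        256: (4140, 4176, 4608, 4608),
    },
}

def normalized_average(turn_on):
    """Average of each column relative to mc, over designs that have data."""
    rows = [cells[turn_on] for cells in RESULTS.values() if turn_on in cells]
    return [round(sum(row[i] / row[0] for row in rows) / len(rows), 2) for i in range(4)]

for turn_on in (8, 16, 32, 64, 128, 256):
    print(turn_on, normalized_average(turn_on))
```

Two further observations follow from the numbers alone and are not stated on the slides: every entry is a multiple of 18, the per-core wake-up time in ns given in the experimental setup, and every filled [4] entry equals 18 times the number of modules to turn on.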

  20. Conclusions
     - In this paper, we discuss wake-up scheduling for multi-core systems.
     - We propose
       - an off-line algorithm that characterizes the multi-conflict graph (MCG) of the multi-core design, and
       - an on-line scheduling algorithm that efficiently decides the order in which cores are turned on.
     - For a 256-module system, our approach reduces the wake-up latency by 23.06% and 23.67% on average compared with [16] and [4], respectively.

  21. Thanks for your attention
