Dr. Michelle Strout OUTLINE Introduction and Preview of - - PowerPoint PPT Presentation



slide-1
SLIDE 1

  Yi Xiang

Committee members:

  • Dr. Sudeep Pasricha (Academic Advisor)
  • Dr. Anura Jayasumana
  • Dr. H. J. Siegel
  • Dr. Michelle Strout
slide-2
SLIDE 2

OUTLINE

  • Introduction and Preview of Contributions
  • Contribution I: Semi-Dynamic Scheduling for Independent Tasks: Hybrid Energy Storage, Process Variation, and Thermal Management
  • Contribution II: Template-Based Scheduling for Task Graphs: Slack Reclamation, Soft Errors, and Hard Failures
  • Contribution III: Mixed-Criticality Scheduling on Heterogeneous Cores: Soft Deadline, Near-Threshold/Super-Threshold Computing
  • Conclusion

slide-3
SLIDE 3
  • We have been doing this for a very long time:


Introduction and Preview of Contributions

What is Energy Harvesting?

wind + mill → windmill

water + wheel → waterwheel

solar + core → solarcore?

Chao Li et al., "SolarCore: Solar Energy Driven Multi-Core Architecture Power Management", IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 205-216, 2011.

slide-4
SLIDE 4
  • Collect energy from ambient sources
  • Including solar, radio frequency, magnetic, vibration, thermoelectric, etc.

Energy Harvesting for Electronic Systems

  • To support energy autonomy for electronic devices
  • Wearable electronics, wireless sensor networks, etc.
slide-5
SLIDE 5
  • Limited energy availability for electronic devices:
  • Electricity grid is not available everywhere


Application of Energy Harvesting: Project Loon

Project Loon

  • Global network-on-balloon
  • Provide internet access for rural and remote areas
  • Float in atmosphere
  • Powered by solar energy harvesting

For many applications energy is not readily available, motivating the use of energy harvesting

slide-6
SLIDE 6
  • Energy constraints for embedded devices:
  • Limited energy capacity of batteries
  • Replacing battery can be inconvenient, costly, or even impractical


Pervasive computing:

  • Billions of sensors
  • Scattered everywhere
  • Batteries:
  • costly or impossible maintenance
  • toxic to environment
  • Need energy harvesting to achieve energy autonomy

User Demands:

  • High performance
  • Large screen size
  • High resolution
  • GPS
  • Camera
  • Biometric sensors
  • 24x7 battery life?

External Battery Pack

Energy Harvesting and Batteries

slide-7
SLIDE 7
  • Solar energy harvesting as power supply
  • Photovoltaic (PV) panel to scavenge energy from solar radiation
  • Why choose solar energy harvesting?

Solar Energy Harvesting


slide-8
SLIDE 8
  • Advantages of solar energy harvesting as power supply
  • High power density
  • Varied scales of PV panels and systems

Why Solar Energy Harvesting?

Energy Source | Typical Power Density
thermal gradient | 60 μW/cm²
vibration | 4 μW/cm³
radio frequency | 1 μW/cm²
solar radiation | 100 mW/cm²

slide-9
SLIDE 9
  • Solar radiation can vary dramatically with environment change
  • Energy shortage at times
  • Hard to predict

As a result, it is hard to find a performance- and energy-optimal schedule for workloads running on energy harvesting-aware embedded systems

[Figure: Harvesting power trace of a PV array in one day. Data provided by National Renewable Energy Laboratory (NREL), Golden, Colorado]

Challenges with Solar Energy Harvesting for Embedded Systems

slide-10
SLIDE 10

Design a holistic framework for performance- and energy-optimal scheduling and allocation of workload (task, communication) on energy harvesting-aware multicore embedded system platforms

  • Solar energy harvesting as the only power source
  • Batteries/supercapacitors used for temporary energy storage
  • Multi-core processors with frequency scaling capability
  • Real-time periodic workloads with deadline constraints

Focus of this Dissertation


slide-11
SLIDE 11

Real-Time Workload with Different Timing Constraints

  • Hard deadline constraint:
  • Any task miss → total system failure
  • Firm deadline constraint:
  • Every task miss → inevitable performance penalty
  • Soft deadline constraint:
  • Each task miss → possible performance penalty

Main objective:

Minimizing miss rate/penalty for task set with firm/soft deadlines

slide-12
SLIDE 12

Overview of Proposed Framework

Semi-Dynamic Workload and Platform Management Framework:

  • task-to-core mapping
  • intra-core scheduling
  • communication mapping
  • voltage-frequency selection
  • dynamic power management

Real-Time Workloads: independent tasks, task graphs, multithreaded tasks

Multicore Platforms: homogeneous, heterogeneous, core variation

Constraints: timing (firm deadline, soft deadline), temperature, energy, soft error, hard error

Energy Harvesting Systems: photovoltaic panels; batteries, supercapacitors, hybrid

Objective: minimize miss rate/penalty

slide-13
SLIDE 13

OUTLINE

  • Introduction and Preview of Contributions
  • Contribution I: Semi-Dynamic Scheduling for Independent Tasks: Hybrid Energy Storage, Process Variation, and Thermal Management
  • Contribution II: Template-Based Scheduling for Task Graphs: Slack Reclamation, Soft Errors, and Hard Failures
  • Contribution III: Mixed-Criticality Scheduling on Heterogeneous Cores: Soft Deadline, Near-Threshold/Super-Threshold Computing
  • Conclusion

slide-14
SLIDE 14

Contribution I: Semi-Dynamic Scheduling for Independent Tasks

  • Main Objective
  • To reduce miss rate of independent tasks under varying and stringent energy harvesting conditions
  • Contributions
  • A semi-dynamic algorithm (SDA) that results in lower task miss rates compared to best known prior work
  • Efficient utilization of multicore systems
  • Management of battery/supercapacitor hybrid storage system
  • Awareness of discrete frequency levels, process variations, thermal issues

slide-15
SLIDE 15

Workload: Periodic Independent Task Set

  • Multiple independent periodic tasks with firm deadlines
  • A task miss: missing deadline of a task instance
  • Minimize miss rate → utilize energy as efficiently as possible

An example of periodic task set

slide-16
SLIDE 16
  • Task utilization = (execution time at fmax) / (task period)
  • fopt = fmax × U (U: sum of task utilizations)
  • The lowest frequency that is sufficient to meet all task deadlines, with Earliest Deadline First (EDF) scheduling
  • For energy efficiency: minimize frequency fluctuations
  • Main drawback: dynamic task dropping and slowing down on energy shortage
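As a rough illustration of the utilization formula above, the following Python sketch computes the total utilization of a task set and the resulting fopt. The function names and the example task set are hypothetical and not taken from UTB; the sketch assumes execution times are given at fmax and that execution time scales inversely with frequency.

```python
from fractions import Fraction


def total_utilization(tasks):
    """Sum of (execution time at fmax) / (period) over all tasks.

    tasks: list of (wcet_at_fmax, period) pairs in the same time unit.
    """
    return sum(Fraction(c) / Fraction(p) for c, p in tasks)


def optimal_frequency(tasks, f_max):
    """Lowest frequency that keeps an EDF-scheduled task set feasible.

    Assumes a single core; raises if the task set is infeasible even at f_max.
    """
    u = total_utilization(tasks)
    if u > 1:
        raise ValueError("task set is not schedulable even at f_max")
    return float(u * f_max)


# Hypothetical task set: (execution time at fmax, period)
example_tasks = [(1, 4), (2, 8), (1, 10)]
print(total_utilization(example_tasks))        # exact fraction of core time needed
print(optimal_frequency(example_tasks, 1000))  # in the same unit as f_max (MHz here)
```

Running below fopt would overload the core under EDF, while running above it spends more power than needed, which is why UTB treats fopt as the target speed.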

Related Work: Utilization-Based Algorithm (UTB)

  • J. Lu et al., "Scheduling and Mapping of Periodic Tasks on Multi-Core Embedded Systems with Energy Harvesting", IEEE International Green Computing Conference (IGCC), pp. 1-6, 2011.

slide-17
SLIDE 17
  • Proposed solution: a semi-dynamic window-based scheduling
  • Estimate/predict energy budget
  • Preset execution strategy with uniform speed
  • Divide execution process into schedule windows

Motivation: Address Limitation of UTB

[Figure: example schedules, 6 tasks finished vs. 9 tasks finished]

HOW? A spike/dip in harvesting power can make the prediction inaccurate.

Any misprediction can only affect the prediction accuracy of one schedule window

slide-18
SLIDE 18

Proposed Semi-Dynamic Framework


  • During system execution for a long duration of time
  • Slice execution time into time windows of k minutes
  • At reschedule point, predict/obtain energy budget for next time window
  • Reject tasks based on energy budget. Then allocate the rest
  • Execute accepted tasks with uniform optimal frequency
  • Semi-Dynamic: reschedule→execute→reschedule→execute …
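The reschedule/execute cycle described above can be sketched as follows. This is a minimal single-core illustration, not the SDA implementation: the cubic power model, the "reject the heaviest task first" policy, and all names are assumptions made for the example.

```python
def window_energy(utilization, window_s, p_max_w):
    """Energy (J) to run one window at uniform speed f = U * f_max.

    Assumes power scales with the cube of the normalized frequency.
    """
    return p_max_w * utilization ** 3 * window_s


def plan_window(tasks, energy_budget_j, window_s, p_max_w):
    """Reject tasks until the window fits the energy budget.

    tasks: list of (name, utilization) pairs.
    Returns (accepted task names, uniform normalized frequency).
    """
    accepted = sorted(tasks, key=lambda t: t[1])  # lightest first
    while accepted and window_energy(
        sum(u for _, u in accepted), window_s, p_max_w
    ) > energy_budget_j:
        accepted.pop()  # hypothetical policy: drop the heaviest task
    total_u = sum(u for _, u in accepted)
    return [name for name, _ in accepted], total_u


# Semi-dynamic loop: one planning step per schedule window.
tasks = [("a", 0.2), ("b", 0.3), ("c", 0.5)]
for budget in [100.0, 10.0]:  # hypothetical per-window energy budgets (J)
    names, speed = plan_window(tasks, budget, window_s=60.0, p_max_w=1.0)
    print(budget, names, speed)
```

Because the plan is fixed for a whole window, a misprediction only affects the window it was made for; the next reschedule point starts from a fresh budget.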


slide-19
SLIDE 19

Experiment Setup

  • Design a simulation environment to capture workload executing on multicore embedded system platform
  • Historical weather data (solar radiation intensity, temperature) provided by National Renewable Energy Laboratory (NREL)
  • System only operates from 6:00 AM to 6:30 PM
  • 50 periodic task sets are randomly generated for each comparison
  • Implementation of our proposed Semi-Dynamic Algorithm (SDA), together with Utilization-Based Algorithm (UTB)

UTB: Utilization-Based Algorithm

  • J. Lu et al., "Scheduling and Mapping of Periodic Tasks on Multi-Core Embedded Systems with Energy Harvesting", IGCC 2011.

slide-20
SLIDE 20
  • SDA outperforms UTB
  • Advantage expands with increasing number of cores
  • Up to 70% miss rate reduction compared to UTB


100% utilization: very intensive workload

Simulation Results with Heavy Workload

slide-21
SLIDE 21

Advantage of SDA: missUTB - missSDA

  • More miss rate reduction when power budget is low or fluctuating

slide-22
SLIDE 22

Related Topic I: Hybrid Energy Storage

  • Pros and cons of different storage medium types

Battery-only | Supercapacitor-only
high energy density | low energy density
low power density | high power density
fewer recharge cycles | more recharge cycles

Solution: Battery-supercapacitor hybrid storage system

slide-23
SLIDE 23

Simulation Results

  • CA-SDA outperforms BA-SDA
  • Proposed HY-SDA outperforms all other techniques

Technique | Description
UTB | best prior work
BA-SDA | battery-only
CA-SDA | supercap-only
HY-SDA | hybrid storage

[Figure label: Harvesting Power × 2]

slide-24
SLIDE 24

Related Topic II: Task Miss Penalty

  • Assume tasks have different miss penalties:
  • Utilize flexibility of reschedule points provided by SDA
  • During task rejection: give priority to tasks with higher miss penalty density (miss penalty/required execution time)
  • HY-SDA with penalty awareness outperforms MISS-SDA (HY-SDA without priority support)
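The penalty-density rule can be illustrated with a short greedy sketch. The function name, the tuple layout, and the idea of expressing the energy budget as an execution-time budget are assumptions for this example; the slides do not give the exact rejection procedure.

```python
def reject_by_penalty_density(tasks, exec_time_budget):
    """Keep tasks with the highest miss penalty density first.

    tasks: list of (name, miss_penalty, required_exec_time).
    exec_time_budget: execution time affordable in this window
    (a stand-in for the energy budget, assumed for illustration).
    Returns (accepted names, rejected names).
    """
    ranked = sorted(tasks, key=lambda t: t[1] / t[2], reverse=True)
    accepted, used = [], 0.0
    for name, _penalty, exec_time in ranked:
        if used + exec_time <= exec_time_budget:
            accepted.append(name)
            used += exec_time
    rejected = [name for name, _, _ in tasks if name not in accepted]
    return accepted, rejected


# Hypothetical tasks: densities are 10/5 = 2, 6/2 = 3, 4/4 = 1
tasks = [("t1", 10, 5), ("t2", 6, 2), ("t3", 4, 4)]
print(reject_by_penalty_density(tasks, exec_time_budget=7))
```

With a budget of 7 time units the sketch keeps t2 and t1 and rejects t3, the task whose miss costs the least per unit of execution time.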


slide-25
SLIDE 25
Related Topic III: Process Variation-Aware Workload Allocation

  • Process variation:
  • Variations in the attributes of transistors formed during fabrication
  • Effect: performance metrics deviate from their nominal values
  • Variations in gate delays
  • Solution: at reschedule point, distribute workload with awareness of each core's peak frequency

[Figure: per-core normalized peak frequencies 1.0, 1.0, 1.0, 0.8, 0.8, 0.8, 1.0, 0.6, 0.6]

→ different peak frequencies of cores
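One way to picture variation-aware allocation is a worst-fit placement that treats each core's normalized peak frequency as its capacity. This is only a sketch under that assumption; the heuristic, the names, and the example values are not taken from the slides.

```python
def allocate_variation_aware(task_utils, peak_freqs):
    """Worst-fit decreasing allocation bounded by per-core peak frequency.

    task_utils: utilization of each task at the nominal frequency (1.0).
    peak_freqs: normalized peak frequency of each core (its capacity).
    Returns (mapping task index -> core index or None, per-core loads).
    """
    loads = [0.0] * len(peak_freqs)
    mapping = {}
    # Place the heaviest tasks first.
    for idx, util in sorted(enumerate(task_utils), key=lambda x: -x[1]):
        # Pick the core with the most remaining capacity.
        core = max(range(len(peak_freqs)), key=lambda c: peak_freqs[c] - loads[c])
        if loads[core] + util > peak_freqs[core] + 1e-12:
            mapping[idx] = None  # would not finish in time on any core
            continue
        loads[core] += util
        mapping[idx] = core
    return mapping, loads


# Hypothetical example: one nominal core and one slow core
mapping, loads = allocate_variation_aware([0.5, 0.4, 0.3], [1.0, 0.6])
print(mapping, loads)
```

A variation-unaware allocator would treat both cores as capacity 1.0 and could load the slow core beyond what it can finish, which matches the wasted-energy failure mode described for the variation-unaware case.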

slide-26
SLIDE 26

Simulation Results


  • Variation-Unaware: faulty schedule with high task miss rate
  • Some assigned tasks consumed energy without finishing in time
  • Variation-aware: significantly lower (up to 49%) task miss rate
  • Peak frequencies of cores: normal distribution with average of 1000 MHz and variation of 33%

slide-27
SLIDE 27

Related Topic IV: Discrete Frequency Levels

  • In general, fexec selected by SDA is from a continuous range, which may not correspond to an available, discrete frequency level
  • Single: executing with discrete frequency level closest to and higher than fexec

Level | Frequency (MHz) | Power (mW)
- | idle | 40
1 | 150 | 80
2 | 400 | 170
3 | 600 | 400
4 | 800 | 900
5 | 1000 | 1600

  • Proposed dual-speed method
  • Intra: combine two neighboring discrete frequencies to approximate fexec
  • Switch frequency multiple times within each task instance
  • Inter: avoid frequent speed switches for less switching overhead
  • Only switch frequency once between every two task instances
  • Ends up with near-ideal result
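To make the benefit of the dual-speed method concrete, the sketch below compares average power for the single-level and dual-level approaches using the frequency/power table from this slide. It assumes that the single-level approach idles (at the idle power) once the work is done, that switching overhead is negligible, and that fexec does not exceed 1000 MHz; the function names are illustrative.

```python
LEVELS_MHZ = [150, 400, 600, 800, 1000]
POWER_MW = {150: 80, 400: 170, 600: 400, 800: 900, 1000: 1600}
IDLE_MW = 40


def single_speed_power(f_exec):
    """Average power when rounding f_exec up to the next discrete level."""
    level = min(f for f in LEVELS_MHZ if f >= f_exec)
    busy = f_exec / level  # fraction of time spent executing
    return busy * POWER_MW[level] + (1 - busy) * IDLE_MW


def dual_speed_power(f_exec):
    """Average power when time-sharing the two neighboring levels."""
    lower = max((f for f in LEVELS_MHZ if f <= f_exec), default=None)
    upper = min(f for f in LEVELS_MHZ if f >= f_exec)
    if lower is None or lower == upper:
        return single_speed_power(f_exec)
    # Time share at the upper level so the average frequency equals f_exec.
    share_upper = (f_exec - lower) / (upper - lower)
    return share_upper * POWER_MW[upper] + (1 - share_upper) * POWER_MW[lower]


print(single_speed_power(700))  # run at 800 MHz, then idle
print(dual_speed_power(700))    # split time between 600 and 800 MHz
```

For fexec = 700 MHz the dual-speed mix draws noticeably less average power than running at 800 MHz and idling, because power grows faster than linearly with frequency in this table.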

slide-28
SLIDE 28

Related Topic V: Run-Time Thermal Issues

  • Motivation
  • Only passive cooling applicable
  • Solar radiation → Tenvironment and fexecution
  • High risk of core overheating during the middle of the day
  • Solution:
  • Passive: enforce thermal throttling for system stability
  • Proactive: at reschedule point, allocate less workload to hotspot
  • Advantages of proactive thermal management over passive:
  • 22% fewer thermal throttling events (94 → 74)
  • Fewer throttling events → more stable speed → 2.7% miss rate reduction

slide-29
SLIDE 29

Summary of Contribution I: Semi-Dynamic Scheduling for Independent Tasks

  • Proposed a novel Semi-Dynamic Algorithm (SDA) that achieved up to 70% miss rate reduction compared to best known prior work
  • Unlike any other work, SDA provides flexibility to simultaneously achieve multiple goals:
  • Intelligent management of hybrid energy storage systems
  • Priority scheduling for tasks with different miss penalties
  • Process variation-aware workload allocation
  • More energy efficient utilization of discrete frequency levels
  • Proactive reaction to run-time thermal issues
  • Publications
  • Y. Xiang, S. Pasricha, "Run-Time Management for Multi-Core Embedded Systems with Energy Harvesting", IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 2014.
  • Y. Xiang, S. Pasricha, "Harvesting-Aware Energy Management for Multicore Platforms with Hybrid Energy Storage", ACM Great Lakes Symposium on VLSI (GLSVLSI), pp. 25-30, 2013.
  • Y. Xiang, S. Pasricha, "Thermal-Aware Semi-Dynamic Power Management for Multicore Systems with Energy Harvesting", IEEE International Symposium on Quality Electronic Design (ISQED), pp. 619-626, 2013.

slide-30
SLIDE 30

OUTLINE

  • Introduction and Preview of Contributions
  • Contribution I: Semi-Dynamic Scheduling for Independent Tasks: Hybrid Energy Storage, Process Variation, and Thermal Management
  • Contribution II: Template-Based Scheduling for Task Graphs: Slack Reclamation, Soft Errors, and Hard Failures
  • Contribution III: Mixed-Criticality Scheduling on Heterogeneous Cores: Soft Deadline, Near-Threshold/Super-Threshold Computing
  • Conclusion

slide-31
SLIDE 31

Contribution II: Template-Based Scheduling for Task Graphs

  • Objective
  • To reduce total task graph miss rate/penalty under varying and stringent energy harvesting conditions at run-time
  • To offload scheduling complexity of task graphs to design time
  • Contribution
  • A hybrid workload management framework (HyWM) using schedule templates to integrate design-time and run-time scheduling efforts
  • Two approaches for offline schedule template generation
  • Novel run-time slack reclamation and soft error handling heuristics
  • Aging-aware workload allocation on multicore processors

slide-32
SLIDE 32
  • Multiple applications modeled as periodic task graphs
  • Each periodic task graph has instances that arrive periodically
  • To avoid a miss: finish all task nodes before deadline

Workload: Periodic Task Graphs

[Figure: timeline over the 1st, 2nd, and 3rd periods with arrival times and deadlines; at each arrival a new TG instance is ready, at each deadline the instance is finished or missed]

slide-33
SLIDE 33
Related Works

  • Task scheduling on systems powered by energy harvesting
  • J. Lu and Q. Qiu, "Scheduling and Mapping of Periodic Tasks on Multi-Core Embedded Systems with Energy Harvesting", IGCC 2011. No awareness of task dependency
  • Energy-aware scheduling for task graphs
  • R. Watanabe et al., "Task scheduling under performance constraints for reducing the energy consumption of the GALS multi-processor SoC", DATE 2007. Does not target systems with energy harvesting

The first work to consider scheduling of multiple task graphs for systems powered by energy harvesting

slide-34
SLIDE 34

Hybrid Workload Management: Motivation and Overview

Template-based:

  • Design-time (offline) schedule template generation to offload complexity
  • Run-time (online) schedule template selection based on energy budget

With variations in energy harvesting, how to provide fixed energy budget so that we know which template to select?

Motivation for hybrid framework:

  • Task graph scheduling problem is complex
  • Limited energy and computing resources prohibit dynamic scheduling at run-time

slide-35
SLIDE 35

Semi-Dynamic Framework

  • Semi-dynamic: divide time of execution into schedule windows
  • Each window has an energy budget that is independent from other windows
  • Shifting: use energy harvested during the previous window
  • Each window has an energy budget that is decoupled from run-time variations in energy harvesting
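The window-shifting idea and the run-time template selection it enables can be sketched in a few lines. The helper names, the initial-storage parameter, and the rule "pick the largest budget level that does not exceed the available budget" are assumptions for illustration; the slides do not specify the selection rule.

```python
from bisect import bisect_right


def shifted_budgets(harvested_per_window, initial_energy):
    """Budget of window i is the energy harvested during window i-1.

    The first window is assumed to run from initial stored energy.
    """
    return [initial_energy] + list(harvested_per_window[:-1])


def select_template(budget_levels, budget):
    """Index of the largest template budget level <= budget, else None.

    budget_levels must be sorted in ascending order.
    """
    i = bisect_right(budget_levels, budget)
    return i - 1 if i else None


harvested = [5, 8, 3]      # hypothetical energy harvested per window (J)
levels = [2, 4, 6, 8]      # hypothetical template budget levels (J)
for budget in shifted_budgets(harvested, initial_energy=4):
    print(budget, select_template(levels, budget))
```

Because each budget is already in storage when its window starts, the selected template does not depend on what the panel harvests while that window runs.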


slide-36
SLIDE 36

Design Time Template Generation:

MILP-Based Template Generation (MILP: Mixed Integer Linear Programming)

  • Optimized schedule template
  • Time consuming
  • Design-time

slide-37
SLIDE 37

MILP for Design-Time Template Generation

  • Represent scheduling problem as mixed-integer linear programming (MILP) formulation
  • Objective:
  • Formulate MILP constraints based on workload, platform, and energy budget
  • Represent scheduling decisions as integer variables in MILP
  • Generated templates saved in system for run-time selection

[Figure: energy budget levels → MILP optimization → optimized schedule templates]

slide-38
SLIDE 38

Quality of Schedule Templates Generated by MILP

  • With low energy budget, schedule less workload for higher efficiency
  • With sufficient energy budget, schedule full workload even with stringent timing requirement (95% per core utilization)
  • 4-core embedded system
  • 9 task graph instances per schedule window (E3S benchmark)
  • With per-core average utilization of 95%

slide-39
SLIDE 39

Frequency Selection in Schedule Templates

Dealing with stringent energy constraint when energy budget is low

  • Lower energy budget → lower average execution frequency
  • Slow and inefficient frequency level (150MHz) is ignored automatically
  • Breakdown of frequencies selected for all task nodes
  • For schedule templates with energy budget levels from low to high

slide-40
SLIDE 40

Frequency Selection in Schedule Templates

Dealing with stringent timing constraint when energy budget is high

  • Higher workload intensity → higher average execution frequency
  • Breakdown of frequencies selected for all task nodes
  • For schedule templates with energy budget levels from low to high

slide-41
SLIDE 41
Comparison with Prior Work

  • Task scheduling on systems powered by energy harvesting
  • UTA: J. Lu and Q. Qiu, "Scheduling and Mapping of Periodic Tasks on Multi-Core Embedded Systems with Energy Harvesting", IGCC 2011
  • SDA: Y. Xiang and S. Pasricha, "Harvesting-Aware Energy Management for Multicore Platforms with Hybrid Energy Storage", GLSVLSI 2013
  • Energy-aware scheduling for task graphs
  • LP+SA (linear programming + heuristics): R. Watanabe et al., "Task scheduling under performance constraints for reducing the energy consumption of the GALS multi-processor SoC", DATE 2007
  • Embedded System Synthesis Benchmarks Suite (E3S)

Assume that scheduler can identify ready task nodes for them

slide-42
SLIDE 42

Comparison over Miss Rate

  • Proposed HyWM-LP: miss rate reduction of 9.7% compared to LP+SA, 15.2% compared to SDA, and 29.5% compared to UTA

  • Because of semi-dynamic framework: VS
  • Because of dependency awareness: VS
  • Because of schedule template optimality : VS

[Figure labels: window-shifting, dependency; UTA, SDA, LP+SA]

slide-43
SLIDE 43

Design Time Template Generation: Alternative Approach

MILP-Based Template Generation (MILP)

  • Optimized schedule template
  • Time consuming
  • Not scalable
  • For problem size of 10 tasks, 100 nodes: about 100 hours and 5 GB memory

Analysis-Based Template Generation (ATG)

  • Near-optimal schedule template
  • Much less time consuming
  • Better scalability
  • Fast schedule template generation

slide-44
SLIDE 44

Analysis-Based Template Generation (ATG)

Basic ideas:

  • With known workload and platform, we can simulate system execution
  • During simulation, we can monitor execution process to find energy inefficient events
  • We can get hindsight on how to avoid these events
  • Make informed update on execution schedule and rewind simulation
  • Save the best schedule we have


slide-45
SLIDE 45

Dynamic Critical Path Identification in ATG

  • Start from the ending task node with deadline equal to deadline of task graph
  • Recursively calculate implicit deadlines of precedent task nodes:

Purpose: to compare priorities of task nodes and to identify critical paths of task graphs

[Figure: example task graph with nodes τ1 to τ4, task graph deadline = 2000, edge label COMM = 50; calculation proceeds backward from the ending node; other predecessor nodes omitted]

  • Ending node: WCET = 100, implicit deadline = 2000
  • Node with WCET = 1100: implicit deadline = 2000 – 100 – 150 = 1750
  • Node with WCET = 800: implicit deadline = 2000 – 100 – 50 = 1850
  • Node with WCET = 300: since 1850 – 800 – 50 > 1750 – 1100 – 200, implicit deadline = 1750 – 1100 – 200 = 450

Initially only one deadline for entire task graph
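The backward calculation in the example can be written as a short recursion: a node's implicit deadline is the tightest value of (successor's implicit deadline − successor's WCET − communication delay) over its successors. The sketch below reproduces the numbers from the figure; the node names and the assignment of the subtracted values 150, 50, and 200 to specific edges as communication delays are assumptions, since the extracted figure does not label them.

```python
def implicit_deadlines(wcet, edges, graph_deadline):
    """Implicit deadline of every node in a task graph (a DAG).

    wcet: dict node -> worst-case execution time.
    edges: dict (pred, succ) -> communication delay.
    Nodes without successors inherit the task graph deadline.
    """
    successors = {node: [] for node in wcet}
    for (pred, succ), comm in edges.items():
        successors[pred].append((succ, comm))

    memo = {}

    def deadline(node):
        if node not in memo:
            if not successors[node]:
                memo[node] = graph_deadline
            else:
                memo[node] = min(
                    deadline(succ) - wcet[succ] - comm
                    for succ, comm in successors[node]
                )
        return memo[node]

    return {node: deadline(node) for node in wcet}


# Hypothetical node names; numbers taken from the example figure.
wcet = {"start": 300, "long": 1100, "short": 800, "end": 100}
edges = {
    ("start", "long"): 200,
    ("start", "short"): 50,
    ("long", "end"): 150,
    ("short", "end"): 50,
}
print(implicit_deadlines(wcet, edges, graph_deadline=2000))
```

The node with the smallest slack along these deadlines lies on the critical path, which is how the implicit deadlines can be used to rank task nodes.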


slide-46
SLIDE 46

Design Time Template Generation: Motivation for Alternative Approach

MILP-Based Template Generation (MILP)

  • Optimized schedule template
  • Time consuming
  • Not scalable
  • For problem size of 10 tasks, 100 nodes: about 100 hours and 5 GB memory

Analysis-Based Template Generation (ATG)

  • Near-optimal schedule template (up to 1.3% higher miss rate)
  • Much less time consuming
  • Better scalability
  • Fast schedule template generation: only uses about 1 hour and 50 MB memory

slide-47
SLIDE 47
Related Topic I: Soft Errors

  • Random transient errors (bit flips) in circuits, caused by
  • Alpha particles from package decay
  • Neutron strikes from cosmic rays
  • Not a permanent failure in circuitry
  • May lead to faulty/invalid execution result on a task node (also considered as task graph miss → waste of energy)

[Figure: particle strike on a transistor device (source, drain) at Earth's surface]

slide-48
SLIDE 48

Run-Time Soft Error Handling and Slack Reclamation

  • Goal: avoid invalidation of schedule templates
  • Never change allocation and execution order decisions
  • Never delay start times of task graph instances
  • Can drop task instances, adjust execution frequencies when necessary
  • Soft error handling heuristic
  • Triggered when a soft error is detected on a task node
  • Attempts to re-execute node with error to save previous execution effort
  • By boosting up frequency using surplus energy
  • By dropping upcoming task graphs that have not yet started
  • By reclaiming slack time
  • Slack reclamation heuristic
  • To utilize slack time due to task dropping and earlier than worst-case execution time
  • Triggered when a task node can start earlier than scheduled
  • Reclaim: slow down execution frequency without delaying scheduled finish time
  • Pass-on: if slack cannot be reclaimed instantly, start and finish task earlier
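The "reclaim" step can be illustrated as picking the lowest discrete frequency that still meets the scheduled finish time when a node becomes ready early. This is a simplified sketch: the function name, the units (work in mega-cycles, time in seconds, frequency in MHz), and the example numbers are assumptions, and it ignores switching overhead.

```python
def reclaim_slack(work_mcycles, now_s, scheduled_finish_s, levels_mhz):
    """Lowest frequency level (MHz) that finishes by the scheduled time.

    Returns None if no available level can meet the scheduled finish time.
    """
    available = scheduled_finish_s - now_s
    if available <= 0:
        return None
    for freq in sorted(levels_mhz):
        if work_mcycles / freq <= available:
            return freq
    return None


levels = [150, 400, 600, 800, 1000]
# Template planned 300 Mcycles at 600 MHz from t = 1.0 s to t = 1.5 s.
print(reclaim_slack(300, now_s=1.0, scheduled_finish_s=1.5, levels_mhz=levels))
# The node becomes ready early at t = 0.75 s, so a slower level suffices.
print(reclaim_slack(300, now_s=0.75, scheduled_finish_s=1.5, levels_mhz=levels))
```

Keeping the scheduled finish time fixed is what preserves the template: successors still see their inputs no later than planned, so allocation and execution order stay valid.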


slide-49
SLIDE 49

Benefit of Slack Reclamation and Error Handling

Setup:

  • Based on 8-core system
  • Execution time variation: uniform distribution from 50% to 100% of worst case
  • Error injection rate: 10^-5/s per core at maximum frequency

Experiment Results:

  • 1. Base case: no slack time awareness + no soft error awareness: high miss rate of 40.11%
  • 2. Slack reclamation + no soft error awareness: significantly lower miss rate of 29.19% (27.2% reduction)
  • 3. Slack reclamation + soft error handling: miss rate of 22.01% (45.2% reduction)

In all, 45.2% miss rate reduction achieved compared to the base case

slide-50
SLIDE 50
  • A major concern for electronic devices with technology scaling

Related Topic II: Aging Effect

[Figure: reliability over device lifetime, showing infant mortality and useful life (~10 years) regions]

slide-51
SLIDE 51

Per-Core Aging Effect Modeling


Reliability model based on Weibull distribution

  • Core survival rate at time t:
  • β, constant parameter related to core architecture
  • Scale parameter, α
  • Electromigration model:
  • Workload-related factors: supply voltage (vdd), execution frequency (f), core temperature (T)
  • Can be controlled
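For reference, the standard two-parameter Weibull reliability function is R(t) = exp(−(t/α)^β), with mean time to failure α·Γ(1 + 1/β). The sketch below evaluates both; it uses this textbook form as an assumption, since the exact equations on the slide were lost in extraction, and the parameter values are purely illustrative.

```python
import math


def survival(t, alpha, beta):
    """Weibull survival probability at time t (scale alpha, shape beta)."""
    return math.exp(-((t / alpha) ** beta))


def mttf(alpha, beta):
    """Mean time to failure of a Weibull-distributed lifetime."""
    return alpha * math.gamma(1.0 + 1.0 / beta)


# Illustrative parameters only: a heavier workload would shrink alpha.
for alpha in (12.0, 10.0):
    print(alpha, survival(5.0, alpha, beta=2.0), mttf(alpha, beta=2.0))
```

In this form the workload-dependent factors (supply voltage, frequency, temperature) would act through the scale parameter α, so balancing load across cores slows the drop of each core's survival curve.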


slide-52
SLIDE 52
Aging-Aware Allocation Scheme

  • Assumes 8-core system with per-core aging progress detection circuitry
  • Aging-Aware: to balance workload/aging effect among cores
  • Compare against:
  • Biased: always assign heavier workload to certain cores
  • Random: randomize workload allocation
  • Aging-Aware: slower reliability drop and improved MTTF (+19.7%)

MTTF: mean time to failure

slide-53
SLIDE 53
System Aging Model with Core Failure Tolerance

  • Experiment:
  • 8-core homogeneous processor
  • Processing capability: achievable task graph finish rate with remaining cores

Failure Threshold*: 1, 2, 3, 4, 5, 6, 7
MTTF (years): 10.06, 15.58, 20.22, 24.66, 29.27, 34.44, 40.94, 51.35
Processing Capability Before System Failure (%): 100, 92.3, 80.1, 68.8, 52.4, 34.9, 18.4, 7.0
Average Processing Capability during System Lifetime (%): 100, 96.5, 92.5, 87.7, 81.7, 75.0, 67.3, 60.3

  • Higher tolerance → longer MTTF → lower performance guarantee
  • Multicore system reliability model: with different levels of tolerance on number of failed cores (h)
  • Probability of exactly h cores surviving after tw schedule windows
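A system-level reliability figure of this kind can be computed from per-core survival probabilities with a small dynamic program. The sketch below assumes cores fail independently and that the system stays alive while the number of failed cores does not exceed a tolerated count; both the independence assumption and the function names are illustrative and not taken from the slides.

```python
def surviving_count_distribution(core_survival):
    """dist[h] = probability that exactly h cores are still alive.

    core_survival: per-core survival probabilities (independent cores).
    """
    dist = [1.0]
    for p in core_survival:
        nxt = [0.0] * (len(dist) + 1)
        for h, prob in enumerate(dist):
            nxt[h] += prob * (1.0 - p)   # this core has failed
            nxt[h + 1] += prob * p       # this core has survived
        dist = nxt
    return dist


def system_reliability(core_survival, tolerated_failures):
    """Probability that at most `tolerated_failures` cores have failed."""
    n = len(core_survival)
    dist = surviving_count_distribution(core_survival)
    return sum(dist[h] for h in range(max(0, n - tolerated_failures), n + 1))


cores = [0.9] * 8  # hypothetical per-core survival after some time
for tolerated in (0, 1, 2):
    print(tolerated, system_reliability(cores, tolerated))
```

Raising the tolerated number of failed cores increases the probability that the system is still alive, at the price of running on fewer cores near the end of its life.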

slide-54
SLIDE 54

Summary of Contribution II: Template-Based Scheduling for Task Graphs

  • The first work to consider task graph set scheduling for multiprocessors powered by energy harvesting
  • Hybrid Workload Management framework (HyWM) achieved up to 29.5% lower miss rate compared to prior works
  • Two offline schedule template generation approaches to offload scheduling complexity of task graphs to design-time
  • Comprehensive heuristic for slack reclamation and soft error handling with up to 45.2% miss rate reduction
  • Aging-aware workload allocation with 19.7% longer MTTF
  • Publication
  • Y. Xiang, S. Pasricha, "Fault-Aware Application Scheduling in Low Power Embedded Systems with Energy Harvesting", ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), article 32, 2014.
  • Y. Xiang, S. Pasricha, "A Hybrid Framework for Application Allocation and Scheduling in Multicore Systems with Energy Harvesting", ACM Great Lakes Symposium on VLSI (GLSVLSI), pp. 163-168, 2014.
  • Y. Xiang, S. Pasricha, "Soft and Hard Reliability-Aware Scheduling for Multicore Embedded Systems with Energy Harvesting", IEEE Transactions on Multi-Scale Computing Systems (TMSCS), under review.

slide-55
SLIDE 55

OUTLINE

  • Introduction and Preview of Contributions
  • Contribution I: Semi-Dynamic Scheduling for Independent Tasks: Hybrid Energy Storage, Process Variation, and Thermal Management
  • Contribution II: Template-Based Scheduling for Task Graphs: Slack Reclamation, Soft Errors, and Hard Failures
  • Contribution III: Mixed-Criticality Scheduling on Heterogeneous Cores: Soft Deadline, Near-Threshold/Super-Threshold Computing
  • Conclusion

slide-56
SLIDE 56

Contribution III: Mixed-Criticality Scheduling on Heterogeneous Systems

  • Main Objective
  • To reduce miss penalty of mixed-criticality workload under varying and stringent energy harvesting condition
  • Contributions
  • Modeled mixed-criticality workload with soft/firm deadline constraints
  • A single-ISA heterogeneous multicore platform for mixed-criticality scheduling
  • A novel timing intensity metric to estimate task instance importance in a mixed-criticality workload
  • Considered near-threshold computing for high energy efficiency

slide-57
SLIDE 57

Mixed-Criticality Workload

(m, k)-soft deadline: only need to finish m instances out of every k instances
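A small checker makes the (m, k) constraint concrete. The sketch interprets "every k instances" as every window of k consecutive instances, which is one common reading and an assumption here; the function name and the example histories are illustrative.

```python
def satisfies_mk(finished, m, k):
    """True if every k consecutive instances contain at least m finished.

    finished: list of booleans, one per task instance in arrival order.
    Histories shorter than k are treated as satisfying the constraint.
    """
    if not 0 <= m <= k:
        raise ValueError("require 0 <= m <= k")
    for start in range(len(finished) - k + 1):
        if sum(finished[start:start + k]) < m:
            return False
    return True


# (2, 3)-soft deadline: at least 2 of any 3 consecutive instances finish.
print(satisfies_mk([True, False, True, True, False, True], m=2, k=3))
print(satisfies_mk([True, False, False, True], m=2, k=3))
```

The first history skips isolated instances and still satisfies the constraint, while the second violates it because two consecutive misses fall inside one window of three.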

  • Compared to timing-centric workload, throughput-centric workload:
  • Has less critical timing requirements
  • Emphasizes energy efficiency even more
  • Our mixed-criticality scheduling problem:

Simultaneous scheduling of timing- and throughput-centric tasks with energy harvesting

Criticality Type | Timing-Centric | Throughput-Centric | Note on Throughput-Centric
Structure Model | task graphs | multithreaded applications | special case of task graph
Parallelism | highly customized | barrier-synchronized | simpler parallelism
Execution Time | few seconds | few minutes | longer execution time
Period | tens of seconds | tens of minutes | period > schedule window
Deadline Model | firm | (m, k)-soft | flexible timing
Schedule Method | template-based | dynamic scheduling | possible and necessary
Benchmark Suite | E3S | PARSEC |

Timing-centric: same as task graph model used in the previous section

slide-58
SLIDE 58

Mixed-Criticality Scheduling on Heterogeneous Platform

  • Big cores (high performance) for timing-centric task graphs
  • Small cores (high efficiency) for throughput-centric multithreaded applications: near-threshold computing
  • Exception: sequential phases of multithreaded applications executed on big cores
  • Interaction: split energy budget by comparing miss penalty density
  • Lower weight factor (timing intensity) for tasks with soft deadlines

slide-59
SLIDE 59

Near-Threshold Computing

Vth: CMOS gate-to-source threshold voltage

Vdd: supply voltage

  • Super-threshold region
  • High performance
  • Low efficiency
  • Near-threshold region
  • Lower performance
  • Highest efficiency
slide-60
SLIDE 60

Heterogeneous Platform: Big Core and Small Core

  • Simulated using Sniper (performance) and McPAT (power)
  • Small core (near-threshold supply voltage) to focus on energy efficiency
  • Big core to focus on timing performance


Architectural Parameters
Core Types | Big Cores | Small Cores
Execution | Out-of-Order | In-Order
Issue Width | 4 | 2
Reorder Buffer Size | 128 | N/A
Cache | 64 KB, 4-way | 16 KB, direct
Core Area | 15.7 mm² | 4 mm²

Cluster Parameters
Cluster Type | Big-Core-Cluster | Small-Core-Cluster
Core Count | 8 | 32
Frequency Control | Per-Core DVFS | Uniform Frequency
f, Vdd Range | 0.5~1.2 GHz, 0.4~1 V | f_nth, Vdd_nth

Technology Parameters
Technology Node | 22 nm
Vth | 0.289 V
Vdd_nth, f_nth | 0.4 V, 500 MHz
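To make the efficiency argument for the small cores concrete, the voltages and frequencies in the table can be plugged into the standard first-order model in which dynamic switching energy per operation scales with Vdd squared. This is a rough sketch under that assumption only: it ignores leakage energy (which becomes significant near threshold) and the microarchitectural differences between the two core types, and the function name is hypothetical.

```python
def dynamic_energy_ratio(vdd_low, vdd_high):
    """Ratio of dynamic switching energy per operation at vdd_low vs. vdd_high.

    Assumes the first-order CMOS model E_dyn ~ C * Vdd^2 with equal switched
    capacitance C at both operating points; leakage is not modelled.
    """
    return (vdd_low / vdd_high) ** 2


# Values taken from the table: small cores run at Vdd_nth = 0.4 V, f_nth = 500 MHz;
# big cores reach 1.2 GHz at the top of their 0.4~1 V range.
ratio = dynamic_energy_ratio(0.4, 1.0)
slowdown = 1.2e9 / 500e6

print(f"dynamic energy per operation: about {ratio:.2f}x of the 1 V value")
print(f"clock frequency: about {slowdown:.1f}x lower than 1.2 GHz")
```

Under these assumptions, near-threshold operation trades roughly a 2.4x lower clock rate for roughly a 6x reduction in dynamic energy per operation, which is why it suits the throughput-centric workload.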


slide-61
SLIDE 61

Experiment Results


  • Compared to (m, k)-unaware, 9.5% performance benefit from soft-deadline awareness
  • Compared to PIE (Performance Impact Estimation, ISCA 2012), 13.6% performance improvement from emphasis on energy efficiency
  • Compared to B8-B8, 23.2% performance benefit from heterogeneous computing

B8-S32: proposed mixed-criticality scheduling on heterogeneous platform with 8 big cores and 32 small cores


Van Craeynest et al., "Scheduling heterogeneous multi-cores through Performance Impact Estimation", ISCA 2012.

slide-62
SLIDE 62
  • First work to consider mixed-criticality scheduling and near-threshold computing for energy harvesting embedded systems

  • A single-ISA heterogeneous multicore platform for mixed-criticality scheduling
  • Estimate importance of tasks based on soft deadline constraints
  • Up to 23.2% performance benefit
  • Publication
  • Y. Xiang, S. Pasricha, "Mixed-Criticality Scheduling on Heterogeneous Multicore Systems Powered by Energy Harvesting", ACM Transactions on Embedded Computing Systems (TECS), under review.


Summary of Contribution III: Mixed-Criticality Scheduling on Heterogeneous Systems


slide-63
SLIDE 63

OUTLINE

Introduction and Preview of Contributions

Contribution I: Semi-Dynamic Scheduling for Independent Tasks: Hybrid Energy Storage, Process Variation, and Thermal Management

Contribution II: Template-Based Scheduling for Task Graphs: Slack Reclamation, Soft Errors, and Hard Failures

Contribution III: Mixed-Criticality Scheduling on Heterogeneous Cores: Soft Deadline, Near-Threshold/Super-Threshold Computing

Conclusion


slide-64
SLIDE 64
  • The proposed Semi-Dynamic Framework is a unified solution for energy harvesting-aware resource management with significant advantages in energy efficiency and flexibility over prior work

  • Issues addressed by our Semi-Dynamic Framework:
  1. Minimizing miss rate/miss penalty
  2. Run-time thermal control
  3. Mitigating impact of process variation
  4. Management of hybrid energy storage
  5. Scheduling of task graphs with inter-node dependencies
  6. Soft error handling and slack reclamation during execution
  7. Mitigating aging effects across the chip over time
  8. Mixed-criticality scheduling on heterogeneous processors


Conclusion

My Dissertation Summary

slide-65
SLIDE 65

List of Publications

Journal papers:

  • Y. Xiang, S. Pasricha, "Run-Time Management for Multi-Core Embedded Systems with Energy Harvesting", IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 2014.
  • B. Donohoo, C. Ohlsen, S. Pasricha, C. Anderson, Y. Xiang, "Context-Aware Energy Enhancements for Smart Mobile Devices", IEEE Transactions on Mobile Computing (TMC), vol. 13, no. 8, 2013.
  • Y. Zou, Y. Xiang, S. Pasricha, "Characterizing Vulnerability of Network Interfaces in Embedded Chip Multiprocessors", IEEE Embedded Systems Letters (ESL), vol. 4, no. 2, 2012.
  • Y. Xiang, S. Pasricha, "Mixed-Criticality Scheduling on Heterogeneous Multicore Systems Powered by Energy Harvesting", ACM Transactions on Embedded Computing Systems (TECS), under review.
  • Y. Xiang, S. Pasricha, "Soft and Hard Reliability-Aware Scheduling for Multicore Embedded Systems with Energy Harvesting", IEEE Transactions on Multi-Scale Computing Systems (TMSCS), under review.

Conference papers:

  • Y. Xiang, S. Pasricha, "Fault-Aware Application Scheduling in Low Power Embedded Systems with Energy Harvesting", ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), article 32, 2014.
  • Y. Xiang, S. Pasricha, "A Hybrid Framework for Application Allocation and Scheduling in Multicore Systems with Energy Harvesting", ACM Great Lakes Symposium on VLSI (GLSVLSI), pp. 163-168, 2014.
  • Y. Xiang, S. Pasricha, "Harvesting-Aware Energy Management for Multicore Platforms with Hybrid Energy Storage", ACM Great Lakes Symposium on VLSI (GLSVLSI), pp. 25-30, 2013.
  • Y. Xiang, S. Pasricha, "Thermal-Aware Semi-Dynamic Power Management for Multicore Systems with Energy Harvesting", IEEE International Symposium on Quality Electronic Design (ISQED), pp. 619-626, 2013.
  • Y. Zou, Y. Xiang, S. Pasricha, "Analysis of On-chip Interconnection Network Interface Reliability in Multicore Systems", IEEE International Conference on Computer Design (ICCD), pp. 427-428, 2011.


slide-66
SLIDE 66

Thank You!
