
slide-1
SLIDE 1

Enabling Realistic Fine-Grain Voltage Scaling with Reconfigurable Power Distribution Networks

Waclaw Godycki, Christopher Torng, Ivan Bukreyev, Alyssa Apsel, Christopher Batten

School of Electrical and Computer Engineering, Cornell University

47th Int’l Symp. on Microarchitecture, Dec 2014

slide-4
SLIDE 4
  • Motivation •

FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN Methodology and Evaluation

Motivation: Integrated Voltage Regulation (IVR)

Figure: four cores (Core 0 to Core 3), each with a cache bank, joined by an on-chip interconnect; off-chip discrete voltage regulators are contrasted with on-chip integrated voltage regulators.

Key Benefit of IVR

◮ Reduced System Cost

Challenges of IVR

◮ Integrated energy-storage elements have low energy densities

◮ Low switching speeds with high parasitic losses

A New Era of IVR

◮ Energy storage elements have slightly improved energy densities

◮ Faster switches with low parasitic losses

Cornell University Christopher Torng 2 / 22

slide-6
SLIDE 6
Motivation: Integrated Voltage Regulation (IVR)

Figure: four cores (Core 0 to Core 3), each with a cache bank, joined by an on-chip interconnect; off-chip discrete voltage regulators are contrasted with on-chip integrated voltage regulators.

"Fully" Integrated Voltage Regulator (FIVR)

Figure (from D. Kanter, MPR'13): Intel Haswell block diagram. Blocks: CPUs + Cache, Graphics Engine, Ring Interconnect, System Agent + PCIe + DMI, DRAM Cntl, FIVR. Supply labels: VCCIN, VCPU0, VCPU1, VRING, VGPU, VSA, VIOA, VLLC, VDDQ.

Intel Haswell integrates the voltage control loop circuitry on-die with inductors in-package.

Potential for Fine-Grain Voltage Scaling

slide-8
SLIDE 8
Motivation: Fine-Grain Voltage Scaling Opportunities

Figure: per-core activity timelines (cores 0 to 7 versus time, each interval marked Core Busy or Core Waiting) for three applications: Knuth-Morris-Pratt String Search, Breadth-First Search, and Radix Sort. Time-scale annotations on the timelines: 90μs, 2.3μs, 7μs.

slide-16
SLIDE 16
Enabling Realistic Fine-Grain Voltage Scaling

Figure: four cores (Core 0 to Core 3), each with a cache bank, on an on-chip interconnect, together with FG-SYNC+ (the fine-grain DVFS controller) and the RPDN (the power distribution network); arrows are labeled Stats, Power Modes, and Voltages.

FGVS Architecture: FG-SYNC+
Use lightweight software hints and lookup tables derived offline to enable fast multi-level voltage configuration

FGVS Circuits: RPDN
Enable sprinting cores to dynamically borrow energy storage from resting cores

Methodology and Evaluation

Architecture and Circuits Co-Design Approach

slide-22
SLIDE 22

Motivation

  • FGVS Architecture: FG-SYNC+ •

FGVS Circuits: RPDN Methodology and Evaluation

Fine-Grain Synchronization Controller (FG-SYNC+)

Figure: four cores (Core 0 to Core 3), each with a cache bank, on an on-chip interconnect; the cores report Stats to FG-SYNC+, which sits alongside the Power Distribution Network.

Software hints around a parallel loop:

parallel_for(int i=0; i<N; i++){ < loop body > }

Before Start: Work Left Hint
Loop Start: Activity Hint -- busy
Loop Ends: Activity Hint -- waiting

Lookup table (A = Active, w = Waiting; S, X = Sprint modes; N = Nominal; r = rest):

Activity Pattern    DVFS Mode Pattern
A A A A             N N N N
A A A w             S N N r
A A w w             S S r r
A w w w             X r r r

The selected Power Modes set the per-core Voltages.

Figure: Performance versus Energy Efficiency plot with an isopower line and per-core points.
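The activity-to-mode lookup on this slide can be sketched in a few lines of C++. This is only an illustration of the four table rows shown, not the controller's actual implementation: the function name `lookup_modes` is hypothetical, and the sketch assumes the table depends only on how many cores are active and that sprint slots go to the lowest-numbered active cores (the deck also mentions a work-left hint, which a real controller would presumably use to pick the sprinting cores).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: maps a 4-core activity pattern ('A' = active,
// any other character = waiting) to a DVFS mode pattern
// (r = rest, N = nominal, S = sprint, X = super-sprint).
// The per-count mode strings reproduce the four rows on the slide.
// Assumption: modes are handed to active cores in core order.
std::string lookup_modes(const std::string& activity) {
  if (activity.size() != 4) return std::string();  // only 4-core patterns

  // Modes given to the active cores, indexed by the number of active cores.
  static const char* const kActiveModes[] = {"", "X", "SS", "SNN", "NNNN"};

  std::size_t active = 0;
  for (char c : activity) {
    if (c == 'A') ++active;
  }

  const char* pool = kActiveModes[active];
  std::string modes;
  std::size_t next = 0;
  for (char c : activity) {
    // Active cores take the next mode from the pool; waiting cores rest.
    modes += (c == 'A') ? pool[next++] : 'r';
  }
  return modes;
}
```

For example, `lookup_modes("AAAw")` returns `"SNNr"`, matching the second table row. Keeping the table indexed by the active-core count keeps it tiny, which fits the slide's emphasis on fast lookup-based configuration.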

slide-26
SLIDE 26

Ideal Fine-Grain Voltage Scaling: Breadth-First Search

Figure: per-core timeline (Core 0 to Core 7 versus time) showing Busy and Waiting intervals at four voltage levels:

Rest = 0.7V
Nominal = 1.0V
Sprint = 1.15V
Super-Sprint = 1.33V

Execution Time Reduced by 24%

Ideal FGVS: Level, Space, Time

slide-31
SLIDE 31

Non-Ideal FGVS: Different Levels

Figure, panel 1: 2-Level Controllers. Axes: Speedup and Normalized Energy Efficiency, with an isopower line. Configurations: No FGVS, r / N, N / S, N / X.

Figure, panel 2: 3-Level vs 4-Level Controllers. Axes: Speedup and Normalized Energy Efficiency, with an isopower line. Configurations: No FGVS, r / N / S, r / N / X, r / N / S / X. Annotation: Increasing sprinting levels.

slide-32
SLIDE 32

Non-Ideal FGVS: Space and Time

Figure, panel 1: Different Numbers of Domains. Axes: Speedup and Normalized Energy Efficiency, with an isopower line. Configurations: No FGVS, 2 Domains, 4 Domains, 8 Domains. Annotation: Increasing number of domains.

Figure, panel 2: Different Response Times. Axes: Speedup and Normalized Energy Efficiency, with an isopower line. Configurations: No FGVS, 1000 ns, 100 ns, 0 ns. Annotation: Decreasing response time.

slide-35
SLIDE 35

FG-SYNC+: Summary of FGVS Potential

Exploiting fine-grain voltage scaling requires:

◮ FGVS in Level: at least three levels; four levels result in additional benefits

◮ FGVS in Space: per-core voltage control

◮ FGVS in Time: voltage settling response times of 100 ns or faster

How do we design a power distribution network that can enable fine-grain voltage scaling?

Architecture and Circuits Co-Design Approach

slide-36
SLIDE 36

Motivation FGVS Architecture: FG-SYNC+

  • FGVS Circuits: RPDN •

Methodology and Evaluation

Talk Outline

Figure: four cores (Core 0 to Core 3), each with a cache bank, on an on-chip interconnect, together with FG-SYNC+ and the RPDN; arrows are labeled Stats, Power Modes, and Voltages.

FGVS Architecture: FG-SYNC+
Use lightweight software hints and lookup tables derived offline to enable fast multi-level voltage configuration

FGVS Circuits: RPDN
Enable sprinting cores to dynamically borrow energy storage from resting cores

Methodology and Evaluation

Architecture and Circuits Co-Design Approach

slide-37
SLIDE 37

PDN: Basic Regulators

The three primary types of step-down voltage regulators are linear regulators, inductor-based switching regulators (buck), and capacitor-based switching regulators.

slide-43
SLIDE 43

A Simple On-Chip Power Distribution Network

Figure: a Single Fixed VR with one controller (CTRL) supplies all four cores (Core 0 to Core 3), their cache banks, and the on-chip interconnect.

Figure: switched-capacitor regulator circuit with a Flyback Cap between Vin and the Load, two sets of switches labeled S and P, and Vout = Vin/2; a waveform shows the S and P phases alternating over time.

How to design sophisticated control circuitry?
How to use multiple phases to reduce ripple?
How to size the energy storage?
How to choose the switch-to-cap area ratio?
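The Vout = Vin/2 relation and the energy-storage sizing question can be made concrete with a first-order model. The sketch below is not taken from the slides: it assumes the textbook slow-switching-limit approximation, in which a 2:1 stage with a single flying capacitor behaves like the ideal output behind a series resistance of 1 / (4 · C · f_sw). The function names and all parameter values are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Ideal no-load output of the 2:1 step-down stage: Vout = Vin / 2.
double ideal_vout(double vin) { return vin / 2.0; }

// Assumed slow-switching-limit output resistance of a 2:1 stage with one
// flying capacitor c_fly (farads) switched at f_sw (hertz).
double ssl_output_resistance(double c_fly, double f_sw) {
  return 1.0 / (4.0 * c_fly * f_sw);
}

// First-order loaded output: the ideal output minus the droop across the
// assumed output resistance at load current i_load (amperes).
double loaded_vout(double vin, double c_fly, double f_sw, double i_load) {
  return ideal_vout(vin) - i_load * ssl_output_resistance(c_fly, f_sw);
}
```

With illustrative values of Vin = 2.0 V, 1 nF of flying capacitance, 100 MHz switching, and a 40 mA load, the model gives an output resistance of 2.5 Ω and a loaded output of about 0.9 V. Under this model, more capacitance or faster switching reduces the droop, which is one way to see why the sizing and switch-to-cap questions on the slide matter.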

slide-47
SLIDE 47

MAVR: Multiple Adjustable Voltage Regulators

Figure: each of the four cores (Core 0 to Core 3, with cache banks on an on-chip interconnect) has its own Adjustable VR and controller (CTRL); the regulator circuit sits between Vin and the Load and uses Flyback Caps.

How to design sophisticated control circuitry?
How to use multiple phases to reduce ripple?
How to size the energy storage?
How to choose the switch-to-cap area ratio?

slide-54
SLIDE 54

MAVR: Per-Core Regulator Sizing Study

Figure: Power Efficiency (60% to 85%) versus Output Power (10 to 70 mW) for regulators built from 1 Cell through 7 Cells (Cell Area = 0.011 mm²), with operating points marked for Rest (0.7V), Nominal (1.0V), Sprint (1.15V), and Super-Sprint (1.33V).

slide-57
SLIDE 57

MAVR: Multiple Adjustable Voltage Regulators

Figure (MAVR): each of the four cores (Core 0 to Core 3, with cache banks on an on-chip interconnect) has its own Adjustable VR and controller (CTRL).

Key Observation

MAVR requires enough area for each regulator to independently support all power modes

Power limits mean FG-SYNC+ is designed such that only 1 or 2 cores are ever super-sprinting at once

slide-60
SLIDE 60

RPDN: Reconfigurable Power Distribution Networks

Figure (MAVR): each of the four cores (Core 0 to Core 3, with cache banks on an on-chip interconnect) has its own Adjustable VR and controller (CTRL).

Key Observation

MAVR requires enough area for each regulator to independently support all power modes

Power limits mean FG-SYNC+ is designed such that only 1 or 2 cores are ever super-sprinting at once

Figure (RPDN): the same four cores, each with a CTRL block and a 4b connection, attached to a 16:4 XBar; regulator cells are labeled with core names (Core 0 to Core 3).

40% Area Savings
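The key observation can be illustrated by counting regulator cells. The sketch below is a back-of-the-envelope model under loudly stated assumptions, not the authors' sizing method: the per-mode cell counts (1, 4, 4, 7) are illustrative values chosen within the 1-to-7-cell range of the sizing study, the 16-cell pool is inferred from the "16:4 XBar" label, and crossbar area is ignored. All function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Assumed unit-cell demand per power mode (illustrative, not from the slides).
// Returns -1 for an unknown mode character.
int cells_for_mode(char mode) {
  switch (mode) {
    case 'r': return 1;  // rest
    case 'N': return 4;  // nominal
    case 'S': return 4;  // sprint
    case 'X': return 7;  // super-sprint
    default:  return -1;
  }
}

// Cells a shared pool must supply for one mode pattern such as "SNNr".
// Returns -1 if the pattern contains an unknown mode.
int pooled_cells(const std::string& modes) {
  int total = 0;
  for (char m : modes) {
    const int cells = cells_for_mode(m);
    if (cells < 0) return -1;
    total += cells;
  }
  return total;
}

// Cells a per-core design provisions when every regulator must
// independently support the most demanding mode.
int per_core_cells(int num_cores) {
  return num_cores * cells_for_mode('X');
}

// True if the pattern's demand fits in a shared pool of pool_size cells.
bool fits_pool(const std::string& modes, int pool_size) {
  const int need = pooled_cells(modes);
  return need >= 0 && need <= pool_size;
}
```

Under these assumed counts, a per-core design provisions `per_core_cells(4)` = 28 cells, while the mode patterns FG-SYNC+ produces for four cores ("NNNN", "SNNr", "SSrr", "Xrrr") need at most 16. A pattern such as "XXXX" would not fit, which is acceptable only because power limits keep FG-SYNC+ from requesting it.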

slide-62
SLIDE 62

SPICE-Level Transient Response + Leakage Benefits

Figure: MAVR Transient Response and RPDN Transient Response, shown as two Voltage (V) versus Time panels. One panel spans 100 to 400 ns and 0.7 to 1.4 V, with transitions annotated 120 ns and 150 ns. The other spans 0.25 to 3.25 μs and 0.7 to 1.3 V, with transitions annotated 1390 ns, 960 ns, and 2900 ns.

Figure: Power Efficiency vs Output Power in a Leakier Process. Power Efficiency (50% to 80%) versus Output Power (5 to 40 mW) for 1 cell, 4 cells, and 7 cells, with Rest, Nominal, Sprint, and Super-Sprint operating points.

slide-63
SLIDE 63

Motivation FGVS Architecture: FG-SYNC+ FGVS Circuits: RPDN

  • Methodology and Evaluation •

Talk Outline

Figure: four cores (Core 0 to Core 3), each with a cache bank, on an on-chip interconnect, together with FG-SYNC+ and the RPDN; arrows are labeled Stats, Power Modes, and Voltages.

FGVS Architecture: FG-SYNC+
Use lightweight software hints and lookup tables derived offline to enable fast multi-level voltage configuration

FGVS Circuits: RPDN
Enable sprinting cores to dynamically borrow energy storage from resting cores

Methodology and Evaluation

Architecture and Circuits Co-Design Approach

slide-68
SLIDE 68

Methodology

[Flow diagram; labels grouped in the order they appear]

◮ Cross Compiler, RISC ISA Sim, RISC Binary, C++ Multithreaded App Using Hints, Cycle-Level Simulator, Cycle-Level Multicore with DVFS Controller

◮ Verilog RTL of Multicore RISC Processor, Gate-Level Model, Verilog Simulator (two instances), Switching Activity, Layout, Synthesis, P&R, Instruction-Based Energy Dictionary

◮ SPICE Circuit Models of PDNs, SPICE Simulator, DVFS Mode Transition Times

◮ Power Model, Per-Core Stats, Power Efficiencies for each PDN

◮ System Performance and Energy Results
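One way to picture how the instruction-based energy dictionary, the per-core stats, and the PDN power efficiencies could combine into an energy result is the Python sketch below. It is an assumption-laden illustration, not the power model used in this work: the instruction classes, the energy values, the reference voltage, and the function names are all hypothetical, leakage is ignored, and dynamic energy is scaled by the first-order CV² rule.

```python
# Hypothetical sketch: all numbers below are placeholders, not measured data.

# Energy per instruction class in picojoules at the reference voltage V_REF.
ENERGY_DICT_PJ = {"alu": 10.0, "load": 20.0, "store": 18.0, "branch": 8.0}
V_REF = 1.0


def core_energy_pj(instr_counts, vdd):
    """Energy delivered to one core, scaled by (vdd / V_REF)**2.

    instr_counts maps an instruction class to the number executed,
    as a cycle-level simulator might report per core.
    """
    scale = (vdd / V_REF) ** 2
    return scale * sum(ENERGY_DICT_PJ[k] * n for k, n in instr_counts.items())


def system_energy_pj(per_core_stats, pdn_efficiency):
    """Energy drawn from the supply: delivered energy divided by PDN efficiency.

    per_core_stats is a list of (instr_counts, vdd) pairs, one per core.
    """
    delivered = sum(core_energy_pj(counts, vdd) for counts, vdd in per_core_stats)
    return delivered / pdn_efficiency


if __name__ == "__main__":
    stats = [({"alu": 2, "load": 1}, 1.0), ({"alu": 2, "load": 1}, 0.5)]
    print(system_energy_pj(stats, pdn_efficiency=0.8))
```

Dividing by the efficiency, rather than multiplying, reflects that a less efficient PDN must draw more energy from the supply to deliver the same energy to the cores.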

slide-74
SLIDE 74

Evaluation

[Figure: Normalized Energy Efficiency and Speedup for each benchmark, with an isopower line; series: MAVR, RPDN, No FGVS]

2X Performance Boost with 40% Area Savings over MAVR

10-50% Performance and 10-70% Energy Efficiency over No FGVS

Benchmarks

◮ bfs ◮ bilateral ◮ dither ◮ kmeans ◮ mriq ◮ pbbs-dr ◮ pbbs-knn ◮ pbbs-mm ◮ rsort ◮ splash2-fft ◮ splash2-lu ◮ strsearch ◮ viterbi
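The two plot axes and the isopower line can be related with a short Python sketch. The definitions are assumptions, since the slides do not spell them out: speedup is taken as baseline time over design time, and normalized energy efficiency as baseline energy over design energy for the same work. Under those definitions, relative average power is speedup divided by normalized energy efficiency, so an isopower line would be the set of points where the two are equal. The function name and dictionary keys are hypothetical.

```python
def normalize(baseline, design):
    """Return (speedup, normalized energy efficiency, relative power).

    baseline and design are dicts with 'time_s' and 'energy_j' measured
    on the same workload. Average power is energy / time, so
    relative power = (E_d / t_d) / (E_b / t_b) = speedup / energy_eff.
    """
    speedup = baseline["time_s"] / design["time_s"]
    energy_eff = baseline["energy_j"] / design["energy_j"]
    rel_power = speedup / energy_eff
    return speedup, energy_eff, rel_power


if __name__ == "__main__":
    base = {"time_s": 2.0, "energy_j": 4.0}
    fgvs = {"time_s": 1.0, "energy_j": 2.0}
    # Relative power of 1.0 means the point sits on the isopower line.
    print(normalize(base, fgvs))
```

A design whose normalized energy efficiency exceeds its speedup draws less average power than the baseline under these definitions, and one whose speedup exceeds its efficiency draws more.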

slide-75
SLIDE 75

Take-Away Points

◮ Architecture and Mixed-Signal Circuit Co-Design can maximize the system-level benefit of the emerging trend towards integrated voltage regulation.

◮ Lightweight hints can provide an elegant solution to informing hardware of fine-grain activity imbalance.

◮ Reconfigurable Power Distribution Networks can enable realistic FGVS by significantly reducing regulator area overhead and improving voltage-settling response times by an order of magnitude.

This work was supported in part by the National Science Foundation (NSF), a Spork Fellowship, and donations from Intel Corporation and Synopsys, Inc.
