Clock Enable Timing Closure Methodology, Harish Dangat, Samsung

SLIDE 1

Clock Enable Timing Closure Methodology

Harish Dangat, Samsung Semiconductor

SLIDE 2

Agenda

  • Basics of Clock Gating
  • Fixing Clock Enable Timing in RTL-2-GDSII Flow
  • Results
  • Conclusion

SLIDE 3

Clock Gating Basics

  • Use an internal (or external) signal to disable the clock
  • This saves dynamic power
  • A must for low-power design
  • Creates new timing paths

SLIDE 4

Two Types of Clock Gating

  • Using an AND gate
  • Using an ICG (integrated clock gating) cell

The rest of the presentation is about ICG-type clock gating.

SLIDE 5

Register to Register Path

SLIDE 6

Register to Register Path with Clock Gating

[Figure: register-to-register path with a clock gating cell. Pin labels: EN, D, D. Path labels: CE path, CE clk path, clock-gated clk path. Delay annotations: 1ns, 1ns, 0.5ns.]

SLIDE 7

What Is Different About the CE Path

  • Not noticed at synthesis
  • Available time is less than the cycle time
  • ICG cells are not skew balanced with registers
  • Violations are seen only after clock tree synthesis
  • Mostly affects timing-critical blocks

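SLIDE 8

Effect of ICG Cell Location in the Clock Tree

[Figure: clock tree from CLK with architectural gaters and ICG cells at different depths. Location labels: good location, acceptable location, potentially bad location for CE timing. Clock latency annotations: 0ns, 0.25ns, 0.5ns, 0.75ns, 1ns.]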
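The effect of ICG location can be illustrated with simple arithmetic. The Python sketch below is a simplified model, not taken from the slides: it assumes the flop that launches CE sees the full clock tree latency (1 ns, matching the largest annotation in the figure), a hypothetical 1 ns clock period, and zero ICG setup time. The function name `ce_available_time` is illustrative.

```python
def ce_available_time(period_ns, flop_latency_ns, icg_latency_ns, icg_setup_ns=0.0):
    """Time left for the CE path when the launching flop and the ICG cell
    sit at different depths of the clock tree (simplified setup model)."""
    # The ICG clock pin sees the clock earlier than the launching flop by this much.
    skew_ns = flop_latency_ns - icg_latency_ns
    return period_ns - skew_ns - icg_setup_ns


period_ns = 1.0        # hypothetical clock period
flop_latency_ns = 1.0  # assumed latency at the flops (leaf of the tree)

# ICG cells closer to the clock root (smaller latency) leave less time for CE.
for icg_latency_ns in (1.0, 0.75, 0.5, 0.25, 0.0):
    budget = ce_available_time(period_ns, flop_latency_ns, icg_latency_ns)
    print(f"ICG latency {icg_latency_ns:.2f} ns: CE budget {budget:.2f} ns")
```

Under these assumptions, an ICG cell at the same depth as the flops keeps the full cycle for the CE path, while one at the clock root has no time left.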

SLIDE 9

Agenda

  • Basics of Clock Gating
  • Fixing Clock Enable Timing in RTL-2-GDSII Flow
  • Results
  • Conclusion

SLIDE 10

What to Do at RTL Level

  • The CE signal should be generated in the same module
  • Generate the CE signal from functionally related modules
  • Simplify the logic that generates the CE signal

SLIDE 11

CE Timing at Synthesis Step

  • Reduce cycle time to ICG cells
  • Set high setup time on ICG cells
  • Turn off bus sharing in Power Compiler

    # Reduce cycle time to ICG cells (cycle_time/2 is shorthand for half the clock period)
    set_clock_latency -(cycle_time/2) \
        [get_pin all_clock_gating_registers/CK]
    set_clock_latency 0 [get_pin all_clock_gating_registers/ECK]
    # Set high setup time on ICG cells
    set timing_scgc_override_library_setup_hold true
    set_clock_gating_style -setup 400ps clock_gate
    # Turn off sharing
    set_clock_gating_style -no_sharing
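To see how these two constraints shrink the CE budget before any real clock tree exists, the following Python sketch works through one case. It is a simplified model under stated assumptions: ideal clocks with zero latency at the launching flop, a hypothetical 2 ns clock period, and the 400 ps setup value from the slide. The function name `ce_budget_ns` is illustrative and not part of any tool.

```python
def ce_budget_ns(period_ns, icg_ck_latency_ns, launch_latency_ns=0.0, icg_setup_ns=0.0):
    """Setup budget for a CE path ending at an ICG enable pin.

    The capturing edge reaches the ICG at period + icg_ck_latency; the CE
    signal launches at launch_latency and must be stable icg_setup earlier.
    """
    return period_ns + icg_ck_latency_ns - launch_latency_ns - icg_setup_ns


period_ns = 2.0  # hypothetical clock period

# No extra constraints: the CE path gets the whole cycle.
relaxed = ce_budget_ns(period_ns, icg_ck_latency_ns=0.0)

# Negative half-cycle latency on the ICG clock pin plus a 400 ps setup override.
tightened = ce_budget_ns(period_ns, icg_ck_latency_ns=-period_ns / 2, icg_setup_ns=0.4)

print(f"relaxed budget:   {relaxed:.2f} ns")
print(f"tightened budget: {tightened:.2f} ns")
```

The tightened budget is what synthesis optimizes against, which leaves margin for the skew that appears between ICG cells and flops after clock tree synthesis.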

SLIDE 12

CE Timing at Floorplan Step

  • When placing modules, pay attention to CE signal connectivity
  • If CE signals are input pins, place them close to the modules that receive them

[Figure: two floorplans showing CE routing, one labeled "CE timing problem" and one labeled "Good CE timing".]

SLIDE 13

CE Timing at Placement Step

  • Tighten the available cycle time by changing the ICG setup time
  • Tighten the available cycle time by changing the ICG clock latency

    # Change the ICG setup time
    set timing_scgc_override_library_setup_hold true
    set_clock_gating_style -setup 400ps clock_gate
    # Change the ICG clock latency (cycle_time/2 is shorthand for half the clock period)
    set_clock_latency -(cycle_time/2) \
        [get_pin all_clock_gating_registers/CK]
    set_clock_latency 0 [get_pin all_clock_gating_registers/ECK]

SLIDE 14

CE Timing at Placement Step (cont.)

  • Create a group path and add extra weight
  • Place ICG cells close to flops

    group_path -weight 5 -name CLOCK_ENABLE \
        -to [get_cell */*GATE_LATCH]
    set placer_disable_auto_bound_for_gated_clock false

SLIDE 15

How to Select Latency?

  • Apply global latency
      – Easy, not very efficient
  • Apply based on ICG depth and fanout
      – Less depth: more latency
      – More fanout: more latency
  • Apply based on CTS results
      – More accurate
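The depth-and-fanout option can be sketched as a small heuristic. The Python function below is only one possible reading of "less depth, more latency; more fanout, more latency": the linear weighting, the equal split between the two terms, and all parameter names are assumptions made for illustration, not values from the presentation.

```python
def select_icg_latency(depth, fanout, max_latency_ns, max_depth, max_fanout):
    """Hypothetical heuristic: ICG cells closer to the clock root (less depth)
    and ICG cells driving more flops (more fanout) get a larger latency value."""
    depth_term = 1.0 - min(depth, max_depth) / max_depth   # shallower -> closer to 1
    fanout_term = min(fanout, max_fanout) / max_fanout     # larger fanout -> closer to 1
    return max_latency_ns * 0.5 * (depth_term + fanout_term)


# Example: three ICG cells in a design with at most 4 gating levels.
for name, depth, fanout in [("icg_root", 0, 2000), ("icg_mid", 2, 500), ("icg_leaf", 4, 16)]:
    latency = select_icg_latency(depth, fanout, max_latency_ns=1.0, max_depth=4, max_fanout=2000)
    print(f"{name}: depth={depth} fanout={fanout} latency={latency:.3f} ns")
```

Values obtained this way are estimates; as the slide notes, latencies taken from actual CTS results are more accurate.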

SLIDE 16

CE Timing at Clock Tree Synthesis

  • Clone ICG cells

    set icg_cells { icg_cell_1 icg_cell_2 }
    split_clock_net -objects [get_cells $icg_cells] \
        -split_intermediate_level_clock_gates -gate_sizing
    remove_ideal_network [all_fanout -flat -clock_tree]
    remove_propagated_clock *
    remove_clock_tree

SLIDE 17

ICG Cloning

SLIDE 18

CE Timing at Clock Tree Synthesis: Cloning Based on Fanout and Slack

    # Visit every ICG cell and report the enable pins that meet the cloning criteria
    foreach_in_collection CELLS [get_cells * -hier -filter "ref_name =~ *ICG*"] {
        set names [get_object_name $CELLS]
        set ckPins [get_object_name [get_pins -of_object [get_cells $CELLS] \
            -filter "full_name =~ */CLK"]]
        set eckPins [get_object_name [get_pins -of_object [get_cells $CELLS] \
            -filter "full_name =~ */ENABLE_CLK"]]
        # Number of objects in the fanout of the gated clock output
        set eckFanout [sizeof_collection [all_fanout -from [get_pins $eckPins] -flat]]
        # Setup slack at the enable pin
        set cgSlack [get_attribute [get_pins ${names}/ENABLE] max_slack]
        if {$cgSlack > -0.150 && $eckFanout > 100} {
            echo "${names}/E"
        }
    }
    remove_propagated_clock *
    remove_clock_tree
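The selection rule in the script above can be tried outside the tool on exported data. The Python sketch below assumes a hypothetical list of per-ICG records (name, enable-pin slack, gated-clock fanout) and applies the same thresholds and the same comparison directions as the slide; the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class IcgCell:
    name: str
    enable_slack_ns: float   # setup slack at the enable pin
    gated_clock_fanout: int  # objects in the fanout of the gated clock output


def select_enable_pins(cells, slack_threshold_ns=-0.150, fanout_threshold=100):
    """Return enable-pin names for cells that pass both thresholds,
    mirroring the condition written on the slide."""
    return [
        f"{cell.name}/E"
        for cell in cells
        if cell.enable_slack_ns > slack_threshold_ns
        and cell.gated_clock_fanout > fanout_threshold
    ]


# Hypothetical sample data.
cells = [
    IcgCell("u_core/icg_a", -0.080, 450),
    IcgCell("u_core/icg_b", -0.300, 900),
    IcgCell("u_io/icg_c", -0.050, 32),
]
print(select_enable_pins(cells))
```

Keeping the thresholds as parameters makes it easy to see how many ICG cells would be selected as the slack and fanout limits change.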

SLIDE 19

CE Timing at Clock Tree Synthesis: Two-Pass Flow

[Flow diagram with steps: Placement, Clock Tree Synthesis, Clone clock tree, Write Verilog, New Placement.]

SLIDE 20

Agenda

  • Basics of Clock Gating
  • Problems Created by Clock Gating
  • Fixing Clock Enable Timing in RTL-2-GDSII Flow
  • Results
  • Conclusion

SLIDE 21

Die Temperature Without and With Clock Gating

SLIDE 22

ICG Cells and Flops Autobound

SLIDE 23

Comparing Latency Schemes

[Plot: CE violation (ns, axis from -0.6 to -0.1) by path (axis from 100 to 900) for three series: baseline run, 1ns latency, selective latency.]

SLIDE 24

Results: Effect of Cloning on Latency

[Plot: ICG clock latency (ns, axis from 0.2 to 1.8) by path (sorted, low to high; axis from 200 to 1200) for two series: without cloning and with cloning.]

SLIDE 25

Clock Subtree After Cloning

SLIDE 26

Comparing Single-Pass and Two-Pass Flow

[Figure labels: place_opt, clock_opt (single pass); place_opt, clock_clone, new place_opt, clock_opt (two pass).]

SLIDE 27

Different schemes to minimize latency

SLIDE 28

Conclusion

  • Clock gating is a requirement for low-power design
  • Closing CE timing requires attention at all stages of design
  • By planning at every step, CE timing can be closed in high-speed, low-power designs

SLIDE 29

Thank You!

SLIDE 30

BACKUP SLIDES

SLIDE 31

Battery Life is Important

[Figure: Smartphone power for continuous web access]

Source: http://www.phonesreview.co.uk/2012/09/26/iphone-5-vs-samsung-galaxy-s3-battery-life-confrontation/

SLIDE 32

How to Minimize Power

  • Use a process designed for low power
  • Use a low-power architecture
  • Use power gating
  • Use clock gating

SLIDE 33

Power Saving Opportunity

Clock Gating

SLIDE 34

Few Facts About Clock Tree Power

  • 20% to 40% of dynamic power is consumed by the clock tree
  • About 80% of clock tree power is consumed in the last stages of the clock tree

Ref: ISLPED, 2008
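Combining the two figures gives the share of total dynamic power spent in the last clock tree stages, which is the portion that clock gating close to the flops can address. The short Python sketch below only multiplies the percentages quoted on the slide; the function name is illustrative.

```python
def last_stage_share(clock_tree_share, last_stage_fraction=0.80):
    """Fraction of total dynamic power consumed by the last clock tree stages."""
    return clock_tree_share * last_stage_fraction


low = last_stage_share(0.20)   # clock tree takes 20% of dynamic power
high = last_stage_share(0.40)  # clock tree takes 40% of dynamic power
print(f"Last clock tree stages: {low:.0%} to {high:.0%} of dynamic power")
```

With the quoted numbers, roughly 16% to 32% of dynamic power is spent in the last stages of the clock tree.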

SLIDE 35

Architectural/Coarse Grain Clock Gating

[Figure: control logic drives en_usb_0 and en_usb_1 into two Clock_EN cells fed by USB_CLOCK, gating the clocks of blocks USB-0 and USB-1.]

SLIDE 36

Automated/Fine Grain Clock Gating

SLIDE 37

Example of Automated/Fine Grain Clock Gating

SLIDE 38

What To Look For In ICG

  • Too many flops used for generating the CE signal
  • Large delay in the combinational path
  • Generating flops placed far from ICG cells
  • Flops used to generate the ICG signal placed far from each other
  • Too many flops receive the gated clock

[Figure legend: flops receiving gated clock; flops generating gated clock; combinational cells in clock gating path.]
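The checklist above can be turned into a simple screening script over per-ICG statistics. The Python sketch below is an illustration only: the record fields, the limit values, and the function name are all hypothetical, and real limits depend on the design and technology.

```python
from dataclasses import dataclass


@dataclass
class IcgStats:
    name: str
    generating_flops: int  # flops feeding the CE logic
    comb_delay_ns: float   # combinational delay on the CE path
    gen_to_icg_um: float   # largest distance from a generating flop to the ICG cell
    gen_spread_um: float   # largest distance between generating flops
    gated_flops: int       # flops receiving the gated clock


# Hypothetical limits; tune per design and technology.
LIMITS = {
    "generating_flops": 8,
    "comb_delay_ns": 0.3,
    "gen_to_icg_um": 100.0,
    "gen_spread_um": 100.0,
    "gated_flops": 64,
}

MESSAGES = {
    "generating_flops": "too many flops generate the CE signal",
    "comb_delay_ns": "large delay in the combinational path",
    "gen_to_icg_um": "generating flops placed far from the ICG cell",
    "gen_spread_um": "generating flops placed far from each other",
    "gated_flops": "too many flops receive the gated clock",
}


def review_icg(stats, limits=LIMITS):
    """Return the checklist items that this ICG cell violates."""
    return [msg for field, msg in MESSAGES.items() if getattr(stats, field) > limits[field]]


suspect = IcgStats("u_core/icg_a", 12, 0.25, 40.0, 180.0, 32)
for issue in review_icg(suspect):
    print(f"{suspect.name}: {issue}")
```

A screen like this only ranks ICG cells for closer inspection; it does not replace timing analysis of the CE paths.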
