An Exact Polynomial Time Algorithm for Clock Tree Sizing for Register Files
Alexander Berkovich, Lawrence D. Gonzales, Ataur Patwary Intel Corporation Rupesh S. Shelar Synopsys Inc.
Agenda
– Introduction
– Clock Trees for Register Files
[Figure: clock tree schematic for a register file. Labels: MCLK, RCB, CKRDM1N22, CKWRM2N33, Read Address Protection Latches/3:8 Decoder, Write Address Protection Latches/3:8 Decoder, RDDECC, WRDECC, ENTRY 0 to ENTRY 47; cell labels edgerdpdecc0, rddecsdlc0, vn7inc30ls0v1, vn7lap00ls0i1, edgewrckdrv, inv.]
convention (for this example only)
reverse topological order in which solutions are created
[Figure: read path with RDWL[N:0], precharge clocks L0PCH and L1PCH, bitlines rdbll0, memcell[0] to memcell[7], and clock cell CKRDM1N44G. Tree nodes: BR1, GR1, BP01, BR2, GR2, BP11, BP12, OP11, AP11, BP13, BP14, RCB. Annotated specs:]
– Relative spec 0: R L0PCH 0.05 BR RDWL[N:0]
– Relative spec 1: F L0PCH 5ps AF RDWL[N:0]
– Absolute spec 2: 27 ar (at RDWL[N])
– Relative spec 0: R L1PCH 0.05 BR INSTANCE[A:0]/RDBLL0
– Relative spec 1: F L1PCH 5ps AF INSTANCE[A:0]/RDBLL0
[Figure: timing diagram of RDWL[0] to RDWL[7] against the Level0 precharge clock L0PCH, showing the earliest and latest rise and fall of each signal. Earliest of earliest (latest) RDWL receivers; latest of latest (earliest) RDWL receivers. Labels: EFR_RDWL, LFF_RDWL, ENR_RDWL, LNF_RDWL, EFR_L0PCH, LFF_L0PCH, ENR_L0PCH, LNF_L0PCH.]
– Each clock buffer has 3 strengths
– Each solution is characterized by a 10-element vector (EFR, LFF, ENR, LNF, RR_spc, RF_spc, rise-cell-delay, fall-cell-delay, cell-power, total-power)
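As a rough illustration of the 10-element vector, the sketch below models it as a small immutable record. The class name `Solution` and the snake_case field names are hypothetical; only the field order and the starting value (27, 27, 27, 27, 27, 27, 0, 0, 0, 0) come from the slides.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Solution:
    """One candidate solution at a tree node (field order follows the slide)."""
    efr: float
    lff: float
    enr: float
    lnf: float
    rr_spc: float
    rf_spc: float
    rise_cell_delay: float
    fall_cell_delay: float
    cell_power: float
    total_power: float


# Starting vector used in the worked example: six timing fields set to the
# required arrival time of 27, with delay and power fields at zero.
initial = Solution(27, 27, 27, 27, 27, 27, 0, 0, 0, 0)
print(astuple(initial))
```

Keeping the record immutable means each sizing or wire step produces a new solution rather than modifying one shared between candidates.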
– required arrival time at the specified entry is 27 units
– L0PCH has to rise/fall before/after RDWL by ≥5 units
receivers of BR1[0:7] and BP01 to (27, 27, 27, 27, 27, 27, 0, 0, 0, 0)
– BR1, GR1, BP01, BR2, GR2, RCB
– BP12, OP11, AP11, BP13, BP14, RCB
delay to be 6 and 2 units
– (27-6, 27-6, 27-2, 27-2, 27-6, 27-6, 0, 0, 0, 0) = (21, 21, 25, 25, 21, 21, 0, 0, 0, 0)
– For 2nd and 3rd iteration, RDWL[i] may have different wire-delays, so use the appropriate
use actual values for later ones
choices
– Solution 1: (21-22, 21-22, 25-22, 25-22, 21-22, 21-22, 22, 22, 3, 3) = (-1, -1, 3, 3, -1, -1, 22, 22, 3, 3)
– Solution 2: (-3, -3, 1, 1, -3, -3, 24, 24, 2, 2)
– Solution 3: (-5, -5, -1, -1, -5, -5, 26, 26, 1, 1)
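The three solutions above follow one pattern: each buffer size subtracts its cell delay from the six timing fields and records its own delay and power. A minimal sketch of that step is below. The names `BUFFER_SIZES` and `size_buffer` are hypothetical, and adding cell power to the downstream total power is an assumption (the downstream total is 0 here, so the result is the same either way).

```python
# (cell delay, cell power) for the three buffer strengths in the example.
BUFFER_SIZES = [(22, 3), (24, 2), (26, 1)]


def size_buffer(downstream, delay, power):
    """Return the solution seen at the buffer input for one size choice."""
    # The six timing fields shrink by the cell delay.
    timing = tuple(t - delay for t in downstream[:6])
    # Rise and fall cell delay are equal in this example.
    # Assumption: total power accumulates on top of the downstream total.
    return timing + (delay, delay, power, downstream[9] + power)


downstream = (21, 21, 25, 25, 21, 21, 0, 0, 0, 0)
solutions = [size_buffer(downstream, d, p) for d, p in BUFFER_SIZES]
for s in solutions:
    print(s)
```

The faster sizes leave more timing margin and cost more power, which is why all three are kept as candidates at this point.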
RR_spc, RF_spc, but higher power
and 6 units, and for the specific entry the wire-delay is 4 units, we get the following solutions as the result of combine:
– Solution 1: (-1-6, -1-6, 3-2, 3-2, -1-4, -1-4, 22, 22, 3, 3*4) = (-7, -7, 1, 1, -5, -5, 22, 22, 3, 12)
– Solution 2: (-3-6, -3-6, 1-2, 1-2, -3-4, -3-4, 24, 24, 2, 2*4) = (-9, -9, -1, -1, -7, -7, 24, 24, 2, 8)
– Solution 3: (-5-6, -5-6, -1-2, -1-2, -5-4, -5-4, 26, 26, 1, 1*4) = (-11, -11, -3, -3, -9, -9, 26, 26, 1, 4)
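The combine arithmetic above can be checked with a short sketch. The function name `propagate` and its parameters are hypothetical. The per-field delays (6, 6, 2, 2, 4, 4) and the factor of 4 on total power are read directly off the slide's expressions; what the factor stands for is not stated there, so it is passed in as a plain multiplier.

```python
def propagate(solution, field_delays, power_multiplier):
    """Subtract per-field wire delays and scale the total power."""
    timing = tuple(t - d for t, d in zip(solution[:6], field_delays))
    # Cell delays and cell power are carried over unchanged.
    carried = tuple(solution[6:9])
    return timing + carried + (solution[9] * power_multiplier,)


sized = [
    (-1, -1, 3, 3, -1, -1, 22, 22, 3, 3),
    (-3, -3, 1, 1, -3, -3, 24, 24, 2, 2),
    (-5, -5, -1, -1, -5, -5, 26, 26, 1, 1),
]
combined = [propagate(s, (6, 6, 2, 2, 4, 4), 4) for s in sized]
for s in combined:
    print(s)
```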
follows:
21, -21, -27, -27, 22, 22, 3, 15)
23, -23, -29, -29, 24, 24, 2, 14)
25, -25, -31, -31, 26, 26, 1, 13)
√ √ √ X √ X √ X X
results in the following:
delay to be 6 and 2 units
– (27-6, 27-6, 27-2, 27-2, 27-6, 27-6, 0, 0, 0, 0) = (21, 21, 25, 25, 21, 21, 0, 0, 0, 0)
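The wire-delay step above is a field-by-field subtraction on the six timing entries. The sketch below reproduces it. The function name `apply_wire_delay` is hypothetical, and the mapping of the 6-unit and 2-unit delays onto individual fields is taken from the slide's arithmetic rather than from a stated rule.

```python
def apply_wire_delay(solution, field_delays):
    """Subtract a wire delay from each of the six timing fields."""
    timing = tuple(t - d for t, d in zip(solution[:6], field_delays))
    # Delay and power fields (positions 6 to 9) are untouched by the wire.
    return timing + tuple(solution[6:])


receiver = (27, 27, 27, 27, 27, 27, 0, 0, 0, 0)
after_wire = apply_wire_delay(receiver, (6, 6, 2, 2, 6, 6))
print(after_wire)  # (21, 21, 25, 25, 21, 21, 0, 0, 0, 0)
```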
assume fall is 18 units slower and that it has 3 sizes with rise/fall delays as follows
– (21-44, 21-62, 25-44, 25-62, 21-44, 21-62, 44, 62, 3, 3) = (-23, -41, -19, -37, -23, -41, 44, 62, 3, 3)
– (21-48, 21-66, 25-48, 25-66, 21-48, 21-66, 48, 66, 2, 2) = (-27, -45, -23, -41, -27, -45, 48, 66, 2, 2)
– (21-52, 21-70, 25-52, 25-70, 21-52, 21-70, 52, 70, 1, 1) = (-31, -49, -27, -45, -31, -49, 52, 70, 1, 1)
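When rise and fall delays differ, the results above are consistent with the rise delay being applied to the first, third and fifth timing fields and the fall delay to the second, fourth and sixth. The sketch below encodes that reading. It is an inference from the numbers, not a rule stated on the slides, and the names `SIZES` and `size_cell` are hypothetical.

```python
# (rise delay, fall delay, cell power); fall is 18 units slower than rise.
SIZES = [(44, 62, 3), (48, 66, 2), (52, 70, 1)]


def size_cell(downstream, rise, fall, power):
    """Apply an asymmetric cell delay to the six timing fields."""
    # Assumption: even positions take the rise delay, odd positions the fall.
    timing = tuple(
        t - (rise if i % 2 == 0 else fall)
        for i, t in enumerate(downstream[:6])
    )
    return timing + (rise, fall, power, downstream[9] + power)


downstream = (21, 21, 25, 25, 21, 21, 0, 0, 0, 0)
results = [size_cell(downstream, r, f, p) for r, f, p in SIZES]
for s in results:
    print(s)
```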
preserved in this case
from BP01 and GR1
So, the solutions from L0PCH path are:
So, the solutions from RDWL path are:
at disadvantage)
sizing GR2, RCB
time requirement, and by sizing RCB to the same delay from the L0PCH/RDWL sizing iteration (Why?)
performance applications