Synthesis Script for 8 x 8 Binary Multiplier



SLIDE 1

Synthesis Script for 8 x 8 Binary Multiplier

    load_library /linux_apps/ADK3.1/technology/leonardo/tsmc035_typ
    analyze "../src/adder.vhd" "../src/RegN.vhd" "../src/mctrl.vhd" "../src/mult_comp.vhd" "../src/multiplier.vhd"
    elaborate
    clock_cycle 1 CLOCK
    optimize -macro -hierarchy preserve
    optimize_timing
    write multiplier_4.vhd
    write multiplier_4.v
    write multiplier_4.sdf
    report_area mult_area.rpt -cell_usage -hierarchy
    report_delay -num_paths 1 -show_nets -clock_frequency -critical_paths mult_delay.rpt
    report_delay -num_paths 1 -show_nets -longest_path -to [list PRODUCT* DONE] mult_outdelay.rpt
    report_delay -num_paths 1 -show_nets -longest_path -from [list MCAND* MPLIER* START] mult_indelay.rpt

My multiplier has 17 primary outputs (the 16-bit PRODUCT and the DONE signal) and 17 primary inputs (the 8-bit MCAND and MPLIER, and the START signal). I want delay estimates from the primary inputs (MCAND, MPLIER, START) to the internal flip-flops, and from the CLOCK to the primary outputs (PRODUCT, DONE).

In Tcl, a “list” is written in the form [list item1 item2 …]. An asterisk matches any character string, so MCAND* matches MCAND(0), MCAND(1), etc. This allows all bits of a vector to be specified in a compact form.

The following pages show the three generated delay reports, with simulation results to verify the output and input delays.
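The wildcard expansion described above can be illustrated with a small Python sketch. This is only an analogy: Python's fnmatchcase stands in for the glob-style matching of the Tcl patterns, and the port lists are hypothetical, built from the vector widths stated in the text rather than read from the design.

```python
from fnmatch import fnmatchcase

# Hypothetical port lists, built from the widths given in the text
# (8-bit MCAND and MPLIER, 16-bit PRODUCT, plus START and DONE).
INPUTS = [f"MCAND({i})" for i in range(8)] + [f"MPLIER({i})" for i in range(8)] + ["START"]
OUTPUTS = [f"PRODUCT({i})" for i in range(16)] + ["DONE"]


def expand(patterns, ports):
    """Return the ports matched by any glob-style pattern, in port order."""
    return [p for p in ports if any(fnmatchcase(p, pat) for pat in patterns)]


from_ports = expand(["MCAND*", "MPLIER*", "START"], INPUTS)
to_ports = expand(["PRODUCT*", "DONE"], OUTPUTS)
print(len(from_ports), "start points;", len(to_ports), "end points")
```

Each pattern picks up every bit of its vector, so three patterns cover all primary inputs and two cover all primary outputs.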

SLIDE 2

Critical path delay report, produced by the command:

    report_delay -num_paths 1 -show_nets -clock_frequency -critical_paths mult_delay.rpt

CLOCK : 289.6 MHz

Critical Path Report

Critical path #1, (path slack = -1.0):

    NAME                       GATE        ARRIVAL               LOAD
    CLOCK (offset)                         0.00 (rising edge)
    delay thru clock network               0.00 (ideal)
    A/reg_Rint(2)/CLK          dffr        0.00 (rising edge)
    A/reg_Rint(2)/Q            dffr        0.00  0.49 up         0.06
    PRODUCT(10)                (net)       0.00  0.49 up (fan)   5.00
    ADR/ix207/Y                and02       0.40  0.88 up         0.05
    ADR/nx159                  (net)       0.00  0.88 up (fan)   4.00
    ADR/ix218/Y                inv02       0.07  0.95 dn         0.01
    ADR/nx170                  (net)       0.00  0.95 dn (fan)   1.00
    ADR/ix219/Y                aoi21       0.33  1.29 up         0.02
    ADR/nx171                  (net)       0.00  1.29 up (fan)   1.00
    ADR/ix220/Y                oai32       0.23  1.52 dn         0.04
    ADR/nx172                  (net)       0.00  1.52 dn (fan)   3.00
    ADR/ix227/Y                ao221       0.49  2.01 dn         0.01
    ADR/nx179                  (net)       0.00  2.01 dn (fan)   1.00
    ADR/ix238/Y                nand03_2x   0.13  2.14 up         0.01
    ADR/nx190                  (net)       0.00  2.14 up (fan)   1.00
    ADR/reg_Z(6)/Y             oai321      0.21  2.35 dn         0.01
    ADDout(6)                  (net)       0.00  2.35 dn (fan)   1.00
    A/ix33/Y                   mux21_ni    0.33  2.68 dn         0.01
    A/nx32                     (net)       0.00  2.68 dn (fan)   1.00
    A/ix259/Y                  mux21_ni    0.30  2.98 dn         0.02
    A/nx258                    (net)       0.00  2.98 dn (fan)   1.00
    A/reg_Rint(6)/D            dffr        0.00  2.98 dn         0.00
    data arrival time                            2.98

    CLOCK (offset)                         0.00 (rising edge)
    delay thru clock network               0.00 (ideal)
    A/reg_Rint(6)/CLK          dffr        0.00 (rising edge)
    clock cycle                                  2.50
    library setup time                          (0.47)
    data required time                           2.03

    data required time                           2.03
    data arrival time                            2.98
    slack                                       -0.95
The worst-case path is 2.98ns, from flip-flop reg_Rint(2) in the accumulator register (instance A), through the ALU (adder, instance ADR), back to flip-flop reg_Rint(6) in the accumulator. Since the flip-flop setup time is 0.47ns, the minimum clock period is 2.98 + 0.47 = 3.45ns. This misses my target clock period of 2.50ns, giving a negative slack of 0.95ns (rounded to -1.0 in the path header).
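The arithmetic behind this report can be checked with a short Python sketch. The numbers are copied from the report; the variable names are illustrative, and treating the 0.49ns arrival at the Q output as the clock-to-Q delay is an assumption about how to read the first data row.

```python
# All delays in ns, copied from the critical path report.
CLK_TO_Q = 0.49            # arrival at A/reg_Rint(2)/Q (assumed clock-to-Q delay)
GATE_DELAYS = [0.40, 0.07, 0.33, 0.23, 0.49, 0.13, 0.21, 0.33, 0.30]
SETUP = 0.47               # library setup time of the dffr flip-flop
TARGET_PERIOD = 2.50       # clock cycle shown in the report

# Arrival time is the launch delay plus every gate delay on the path.
arrival = round(CLK_TO_Q + sum(GATE_DELAYS), 2)
# Data must arrive one setup time before the next rising clock edge.
required = round(TARGET_PERIOD - SETUP, 2)
slack = round(required - arrival, 2)
# Shortest period that would give zero slack, and the matching frequency.
min_period = round(arrival + SETUP, 2)
max_freq_mhz = 1000.0 / min_period

print(f"arrival={arrival:.2f} required={required:.2f} slack={slack:.2f}")
print(f"min period={min_period:.2f} ns, about {max_freq_mhz:.1f} MHz")
```

The computed frequency comes out slightly above the 289.6 MHz printed in the report; the difference is presumably due to the two-decimal rounding of the printed delays, which also explains why some intermediate arrival times in the report differ by 0.01 from a running sum.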

SLIDE 3

The output delay report, for the worst-case path from the clock to a primary output, is produced by the command:

    report_delay -num_paths 1 -show_nets -longest_path -to [list PRODUCT* DONE] mult_outdelay.rpt

Critical path #1, (unconstrained path)

    NAME                       GATE        ARRIVAL               LOAD
    CLOCK (offset)                         0.00 (rising edge)
    delay thru clock network               0.00 (ideal)
    C/reg_State(0)/CLK         dffs_ni     0.00 (rising edge)
    C/reg_State(0)/Q           dffs_ni     0.00  0.56 dn         0.03
    C/State(0)                 (net)       0.00  0.56 dn (fan)   3.00
    C/ix87/Y                   and02       0.19  0.75 dn         0.00
    DONE                       (net)       0.00  0.75 dn (fan)   1.00
    DONE/                                  0.00  0.75 dn         0.00
    data arrival time                            0.75

    data required time                           not specified

    data required time                           not specified
    data arrival time                            0.75
    unconstrained path

The worst-case delay from a CLOCK transition to a primary output is 0.75ns, from controller (instance C) flip-flop reg_State(0), via net State(0), to primary output DONE, passing through one and02 gate. (DONE signals the completion of the multiply algorithm, with the controller in its HALT state.) This delay can be verified in simulation, with the List window showing CLOCK, the two state variables, and DONE.

Signals listed:

    /multiplier/CLOCK
    C/State_1         - Controller state var. 1
    C/State_0         - Controller state var. 0
    /multiplier/DONE  - DONE primary output

        ps  delta  CLOCK  State_1  State_0  DONE
         0     +0      0        X        X     X
      2000     +1      0        1        1     X
      2150     +0      0        1        1     1
      5000     +0      1        1        1     1
     10000     +0      0        1        1     1
     15000     +0      1        1        1     1   - CLOCK triggers state change
     15520     +0      1        1        0     1   - State variable 0 changes
     15540     +0      1        0        0     1   - State variable 1 changes
     15670     +0      1        0        0     0   - DONE=0 for INIT state (00)
     20000     +0      0        0        0     0
     25000     +0      1        0        0     0
     25520     +0      1        0        1     0
       . . .
    175540     +0      1        1        0     0
    180000     +0      0        1        0     0
    185000     +0      1        1        0     0   - CLOCK triggers state change
    185520     +0      1        1        1     0   - State variable 0 changes
    185670     +0      1        1        1     1   - DONE=1 for HALT state (11)
    190000     +0      0        1        1     1

In the first case, DONE changed 15670-15000 = 670ps (0.67ns) after the rising clock edge, when the controller state was changed from 11 (HALT) to 00 (INIT). The same delay can be observed at the end of the algorithm, when DONE changed to 1 after the controller state changed from SHIFT (10) to HALT (11). While not exactly 0.75ns, as Leonardo estimated, the 0.67ns delay is close.
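The measurement above (time of the DONE change minus the time of the preceding rising clock edge) can be automated. The following Python sketch assumes the list-window rows have already been parsed into tuples; the function name and column layout are illustrative and not part of any simulator API.

```python
# Columns: time in ps, CLOCK, State_1, State_0, DONE.
# Rows copied from the list window, starting at 10000 ps.
ROWS = [
    (10000, "0", "1", "1", "1"),
    (15000, "1", "1", "1", "1"),
    (15520, "1", "1", "0", "1"),
    (15540, "1", "0", "0", "1"),
    (15670, "1", "0", "0", "0"),
    (20000, "0", "0", "0", "0"),
    (25000, "1", "0", "0", "0"),
    (25520, "1", "0", "1", "0"),
    # ... rows elided in the original listing ...
    (175540, "1", "1", "0", "0"),
    (180000, "0", "1", "0", "0"),
    (185000, "1", "1", "0", "0"),
    (185520, "1", "1", "1", "0"),
    (185670, "1", "1", "1", "1"),
    (190000, "0", "1", "1", "1"),
]
CLOCK_COL, DONE_COL = 1, 4


def clock_to_output_delays(rows, clk_col=CLOCK_COL, out_col=DONE_COL):
    """Return (edge_time_ps, delay_ps) for each output change after a rising edge."""
    delays = []
    last_edge = None
    for prev, row in zip(rows, rows[1:]):
        if prev[clk_col] == "0" and row[clk_col] == "1":
            last_edge = row[0]          # remember the most recent rising edge
        if row[out_col] != prev[out_col] and last_edge is not None:
            delays.append((last_edge, row[0] - last_edge))
    return delays


print(clock_to_output_delays(ROWS))
```

Both DONE transitions in the listing, the fall at the start and the rise at the end of the algorithm, are measured against their own triggering clock edge.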

SLIDE 4

The maximum delay from an input pin to a flip-flop input is reported by the command:

    report_delay -num_paths 1 -show_nets -longest_path -from [list MCAND* MPLIER* START] mult_indelay.rpt

Report:

    NAME                       GATE        ARRIVAL               LOAD
    MCAND(3)/                              0.00  0.00 dn         0.01
    M/ix57/Y                   mux21_ni    0.24  0.24 dn         0.01
    M/nx56                     (net)       0.00  0.24 dn (fan)   1.00
    M/ix257/Y                  mux21_ni    0.30  0.54 dn         0.02
    M/nx256                    (net)       0.00  0.54 dn (fan)   1.00
    M/reg_Rint(3)/D            dffr        0.00  0.54 dn         0.00
    data arrival time                            0.54

    CLOCK (offset)                         0.00 (rising edge)
    delay thru clock network               0.00 (ideal)
    M/reg_Rint(3)/CLK          dffr        0.00 (rising edge)
    clock cycle                                  2.50
    library setup time                          (0.47)
    data required time                           2.03

    data required time                           2.03
    data arrival time                            0.54
    slack                                        1.49

The worst-case path is 0.54ns, from primary input MCAND(3), through two multiplexers, to the D input of flip-flop reg_Rint(3) of the multiplicand register (instance M). With a flip-flop setup time of 0.47ns, MCAND(3) must be stable 0.54 + 0.47 = 1.01ns before the CLOCK transition, which is 1.49ns less than the target clock period constraint of 2.50ns (positive slack of 1.49ns).

Simulation was performed to verify the above path from primary input MCAND(3) to flip-flop input reg_Rint(3)/D. The path goes through one mux21_ni gate, driving net nx56, and a second mux21_ni gate, driving net nx256, which connects to the flip-flop D input. I included only CLOCK, MCAND and these two nets in my simulation list window, to focus on this path.

Signals listed:

    /multiplier/CLOCK
    /multiplier/MCAND - primary input
    M/nx56            - 1st mux output
    M/nx256           - 2nd mux out/flip-flop D input

       ps  delta  CLOCK     MCAND  nx56  nx256
        0     +0      0  00000000     X      X
     5000     +0      1  00000000     X      X
    10000     +0      0  00000000     X      X
    15000     +0      1  00000000     X      X
    16670     +0      1  00000000     0      X
    16960     +0      1  00000000     0      0
    20000     +0      0  11111111     0      0   - MCAND changes from 0-1
    20240     +0      0  11111111     1      0   - nx56 changes from 0-1
    20530     +0      0  11111111     1      1   - nx256 changes from 0-1
    25000     +0      1  11111111     1      1
    30000     +0      0  11111111     1      1
    35000     +0      1  11111111     1      1

The simulated path delay = 20530 – 20000 = 530ps (0.53ns), which is within a “round off error” of the delay of 0.54ns estimated by Leonardo.
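To put the two verification results side by side, the following Python sketch compares each Leonardo estimate with the corresponding simulated delay. The values are taken from the reports and list windows; the helper name and the choice to express the gap as a percentage of the estimate are illustrative.

```python
# (Leonardo estimate, simulated delay) in ns, from the reports and list windows.
PATHS = {
    "CLOCK -> DONE": (0.75, 0.67),
    "MCAND(3) -> flip-flop D": (0.54, 0.53),
}


def compare(estimate_ns, simulated_ns):
    """Return (difference in ns, difference as a percentage of the estimate)."""
    diff = round(estimate_ns - simulated_ns, 2)
    percent = round(100.0 * diff / estimate_ns, 1)
    return diff, percent


for name, (estimate, simulated) in PATHS.items():
    diff, percent = compare(estimate, simulated)
    print(f"{name}: estimate {estimate:.2f} ns, simulated {simulated:.2f} ns, "
          f"difference {diff:.2f} ns ({percent:.1f}%)")
```

In both cases the estimate is the larger of the two numbers, so for these two paths the synthesis estimate errs on the pessimistic side.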