AMCHIP: Backend integration A. Stabile for the AMchip collaboration

SLIDE 1

AMCHIP: Backend integration

A. Stabile for the AMchip collaboration

FTK review, CERN 09 Nov. 2011

FTK review (CERN) Alberto Stabile 9 Nov. 2011 1 / 29

SLIDE 2

Project flow

The entire chip has been designed with a hybrid approach

[Figure: project flow diagram. Labels: FULL CUSTOM DESIGN, Virtuoso, Calibre (design rule check, layout versus schematic), geometries, timing, Verilog netlist, SDF, time back-annotate.]

More repetitive regions have been designed with a full custom approach.
More complex logic has been designed with a standard cell approach.
To place and route the standard cells, we have used the Foundation Flow of Cadence Encounter.

SLIDE 3

The full custom cell

In the middle of the design, we have placed the peripheral circuitry.

[Figure: structure of the full custom cell. Labels: 2 layers = 1/4 pattern; 128 layers + 1 dummy layer in the middle; 64 patterns vertically; 8 layers; STD CELLS; FULL CUSTOM.]

TOP2 is the largest full custom block in the design. It is composed of an array of 64×4 layers.

SLIDE 4

The full custom frame interface

A fringe interface in metal 1 has been designed with the same VSS-to-VDD pitch and widths

STD cells

SLIDE 5

The LEF file and the bblib file

LEF

The LEF file is used to describe the geometries of obstacles and pins of our full custom block

bblib

The bblib file gives a coarse description of the internal timing of the full custom block. In particular it describes:

timing arcs between:

INIT → ML OUT (3:0) (from the init signal to the registered match lines)
MLEN → ML OUT (3:0) (from the current-source enable signal to the registered match lines)

setup time between:

BL(3:0) → ML OUT (3:0) (from the bit lines to the registered match lines)

capacitance of each pin:

These values are important information for Encounter, which uses them to choose the correct buffer size for each bit line.
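
As a rough illustration of the kind of information listed above, the following Python sketch models a coarse timing view of a macro and a load-based buffer choice. It is neither the bblib format nor Encounter's algorithm: every numeric value, the buffer names (BUFX1, BUFX4, BUFX8), the simplified pin names, and the helper `pick_buffer` are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class MacroTimingView:
    """Coarse timing view of a full custom block (illustrative only)."""
    # (from_pin, to_pin) -> delay in ns
    timing_arcs: dict = field(default_factory=dict)
    # (data_pin, registered_pin) -> setup time in ns
    setup_times: dict = field(default_factory=dict)
    # pin -> input capacitance in pF
    pin_caps: dict = field(default_factory=dict)


def pick_buffer(load_pf, buffers):
    """Return the smallest buffer whose maximum load covers load_pf.

    buffers is a list of (name, max_load_pf) pairs; the names are hypothetical.
    """
    for name, max_load in sorted(buffers, key=lambda b: b[1]):
        if load_pf <= max_load:
            return name
    raise ValueError("no buffer can drive %.3f pF" % load_pf)


# Placeholder numbers, NOT taken from the AMchip design.
top2 = MacroTimingView(
    timing_arcs={("INIT", "ML_OUT"): 1.0, ("MLEN", "ML_OUT"): 1.0},
    setup_times={("BL", "ML_OUT"): 0.5},
    pin_caps={"BL[0]": 0.30, "BL[1]": 0.30, "INIT": 0.05},
)
BUFFERS = [("BUFX1", 0.10), ("BUFX4", 0.40), ("BUFX8", 0.80)]

for pin, cap in top2.pin_caps.items():
    print(pin, "->", pick_buffer(cap, BUFFERS))
```

The point of the sketch is only that a per-pin capacitance is enough for a tool to size the driver; a real place and route tool also weighs wire load, slew, and timing slack.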

SLIDE 6

First floorplan: free placing (without constraints)

All blocks (macro + standard cells) have been automatically placed by Encounter. The routing algorithm does not converge!

SLIDE 7

First floorplan: placed without fences

User-defined positions for the TOP2 blocks.
Automatic placement of the other blocks.
The placement has been done using a flat description of the logic.

SLIDE 8

First floorplan: placed without fences

User-defined positions for the TOP2 blocks.
Automatic placement of the other blocks.
The placement has been done using a flat description of the logic.

SLIDE 9

First floorplan: placed without fences

However, the congestion (red lines) is too high and the routing algorithm does not converge.

SLIDE 10

First floorplan: placed with fences

To decrease the routing congestion, we have designed fence areas containing the majority, the fisher tree, and 4 TOP2 macro blocks

SLIDE 11

First floorplan: placed with fences

To decrease the routing congestion, we have designed fence areas containing the majority, the fisher tree, and 4 TOP2 macro blocks

SLIDE 12

First floorplan: placed with fences

However, the congestion is still high and the routing algorithm does not converge!

Solution:

Analyze the congestion and resize the routing channels

SLIDE 13

Final floorplan

[Figure: final floorplan, with the full custom macro blocks labelled.]

size: 3510 µm × 3985.0 µm

SLIDE 14

Pin out configuration

SLIDE 15

Power routing

A pad ring of 208 pads has been placed.
We have left a space of 30 µm between the pad ring and the standard cells. This space has been used to place power rings for VDD and VSS just inside the pad frame.
The width of each power ring is 10 µm.
The power rings are connected to horizontal strips in metal 1 (width = 330 nm) and to vertical strips in metal 6 (width = 1300 nm), which distribute power inside the chip.

After some calculation, we found that two layers (metal 1 and metal 6) are not sufficient for good power distribution.

Solution: horizontal power strips in metal 5 (width = 3600 nm). These strips have been placed with a staggered/fringe approach to prevent routing
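
The ring geometry and the benefit of wider strips can be checked with a few lines of Python. The widths and the 30 µm gap come from the slide. The sketch assumes exactly two rings (one VDD, one VSS) in the gap. The sheet resistance and the strip length are hypothetical placeholders: real values depend on the technology and differ between metal layers. The uniform-strip formula R = R_sheet · L / W is a first-order approximation only.

```python
def strip_resistance(sheet_res_ohm_sq, length_um, width_um):
    """First-order resistance of a uniform metal strip: R = Rs * L / W."""
    return sheet_res_ohm_sq * length_um / width_um


# Values from the slide
GAP_UM = 30.0          # space between the pad ring and the standard cells
RING_WIDTH_UM = 10.0   # width of each power ring
M1_WIDTH_UM = 0.330    # horizontal metal 1 strips
M5_WIDTH_UM = 3.600    # added horizontal metal 5 strips

# Space left around the rings, assuming one VDD ring and one VSS ring
remaining_um = GAP_UM - 2 * RING_WIDTH_UM

# Hypothetical values, for illustration only
SHEET_RES = 0.1        # ohm/square, assumed equal for both layers
LENGTH_UM = 3000.0     # assumed strip length

r_m1 = strip_resistance(SHEET_RES, LENGTH_UM, M1_WIDTH_UM)
r_m5 = strip_resistance(SHEET_RES, LENGTH_UM, M5_WIDTH_UM)

print("space left beside the rings: %.1f um" % remaining_um)
print("M1 strip: %.1f ohm, M5 strip: %.1f ohm" % (r_m1, r_m5))
print("resistance ratio M1/M5: %.1f" % (r_m1 / r_m5))
```

With equal sheet resistance and length, the ratio reduces to the ratio of the widths, which is why a single wide strip in an upper metal layer relieves the narrow metal 1 rails.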

SLIDE 16

Routing (metal 1)

SLIDE 17

Routing (metal 2)

SLIDE 18

Routing (metal 3)

SLIDE 19

Routing (metal 4)

SLIDE 20

Routing (metal 5)

SLIDE 21

Power routing (metal 5)

SLIDE 22

Power routing (metal 6)

SLIDE 23

Number of power pads

We have placed a large number of power supply and ground pads to guarantee a correct power supply to the core.

[Figure: minimum number of power pads (Mmin) versus the current consumption of a nominal rating power supply VDD [A]. Curves: gnd, VDDcore, VDDio, VDDcore + VDDio, gnd + vdd. The estimated current consumption is marked.]
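
The relationship behind this plot can be sketched as a ceiling division: the minimum number of pads is the supply current divided by the current a single pad can carry, rounded up. The per-pad rating below is a hypothetical placeholder, not a value from the pad library, and the sketch ignores any safety margin the authors may have applied.

```python
def min_power_pads(current_ma, pad_rating_ma):
    """Minimum number of pads needed to carry current_ma.

    Both arguments are integers in mA, so the ceiling division is exact.
    """
    if pad_rating_ma <= 0:
        raise ValueError("pad rating must be positive")
    return -(-current_ma // pad_rating_ma)  # ceiling division on integers


PAD_RATING_MA = 50  # hypothetical maximum current per pad [mA]

# Current values matching the tick marks that appear on the plot's axis [A]
for current_ma in (500, 1000, 1500, 2000):
    pads = min_power_pads(current_ma, PAD_RATING_MA)
    print("%.1f A -> %d pads" % (current_ma / 1000.0, pads))
```

Working in integer milliamps avoids floating-point rounding in the ceiling step.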

SLIDE 24

Signal pad considerations

We have chosen bidirectional pads capable of driving a current of 2 mA or 4 mA.
Even with only 2 mA, the estimated pad-to-output time (including the line capacitance on the PCB for up to 3000 mils) is less than 4 ns.

[Figure: time [ns] versus number of wire. Curves: PDDW0204SCDG DS! embedded microstrip; PDDW0204SCDG DS embedded microstrip; PDDW0408SCDG DS! embedded microstrip; PDDW0408SCDG DS embedded microstrip.]

SLIDE 25

Power consumption

The power consumption has been estimated at about 70 mW for all the full custom blocks and 75 mW for all the standard cells.
Empty spaces have been filled with decoupling capacitors to filter the power supply.
The power consumption of the IO blocks has been estimated at 100 mW.

Blocks                     Power consumption   Voltage supply
Core standard cells        ∼ 75 mW             1.2 V
Core full custom blocks    ∼ 70 mW             1.2 V
IO cells                   ∼ 100 mW            3.3 V
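
The estimates above translate into average supply currents through I = P / V. The short Python sketch below does that arithmetic with the numbers from the slide. It gives average values only and says nothing about peak currents, which the slide does not quantify.

```python
# Estimated power (mW) and supply voltage (V), as given on the slide
BLOCKS = {
    "Core standard cells": (75.0, 1.2),
    "Core full custom blocks": (70.0, 1.2),
    "IO cells": (100.0, 3.3),
}


def supply_current_ma(power_mw, voltage_v):
    """Average supply current in mA for a given power and voltage."""
    return power_mw / voltage_v


total_power_mw = sum(power for power, _ in BLOCKS.values())

for name, (power_mw, voltage_v) in BLOCKS.items():
    current = supply_current_ma(power_mw, voltage_v)
    print("%-24s %6.1f mW at %.1f V -> %6.1f mA" % (name, power_mw, voltage_v, current))

print("total estimated power: %.0f mW" % total_power_mw)
```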

SLIDE 26

Place and route optimized

We have used the Foundation Flow (FF) of Cadence Encounter to perform place and route.
A lot of time was spent setting up the configuration file of the Foundation Flow, because it is important to select compatible libraries.
The flow contains some important timing optimization steps:

pre-CTS optimization
post-CTS optimization
post-route optimization

We have described the timing constraints in a .sdc file, which contains:

Setup time for input clocked registers: from 0.1 ns to 2.5 ns
Hold time after clock for all outputs: from 0.1 ns to 2.5 ns
Minimum clock period: 10 ns

Timing results

All the constraints are guaranteed in all optimization steps.

SLIDE 27

Clock tree to enhance the timing performance

A clock tree has been generated, and the results confirm that the clock distribution is good.

The maximum clock skew is equal to 400 ps
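
To put the skew figure in context, the sketch below relates it to the 10 ns minimum clock period stated in the timing constraints. Both numbers come from the slides; comparing skew to period as a simple fraction is an illustrative choice here, not a metric the authors report.

```python
CLOCK_PERIOD_NS = 10.0  # minimum clock period from the timing constraints
MAX_SKEW_NS = 0.4       # maximum clock skew (400 ps)

# Maximum clock frequency implied by the minimum period
max_frequency_mhz = 1e3 / CLOCK_PERIOD_NS

# Share of the clock period taken by the worst-case skew
skew_fraction = MAX_SKEW_NS / CLOCK_PERIOD_NS

print("maximum clock frequency: %.0f MHz" % max_frequency_mhz)
print("skew as a share of the period: %.1f %%" % (100.0 * skew_fraction))
```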

SLIDE 28

BackEnd status

Completed:
TOP2 (full custom macro block) designed to have an interface compatible with the STD cell world
Pad ring completed: 208 pads have been placed to be compatible with the LAMB routing

A study of the driving current in the pads has been performed
A study of the number of power pads has been performed

Power routing completed: power rings and strips have been placed to obtain a good power distribution in all regions
Floorplan redesigned several times to avoid routing congestion (stretching of the routing channels)

A script has been developed to automatically create a .fp file (floorplan file) with parametric values
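
A parametric floorplan of this kind can be sketched as follows. This is a hypothetical Python illustration, not the authors' script, and it does not emit the Encounter .fp syntax: it only computes macro origins on a regular grid from the routing channel widths, so that widening a channel stretches the whole floorplan. The function names and all dimensions are assumptions.

```python
def macro_origins(cols, rows, macro_w, macro_h, channel_x, channel_y, margin=0.0):
    """Return lower-left corners (x, y) in um for a cols x rows macro grid.

    channel_x and channel_y are the routing channel widths between
    neighbouring macros.
    """
    origins = []
    for r in range(rows):
        for c in range(cols):
            x = margin + c * (macro_w + channel_x)
            y = margin + r * (macro_h + channel_y)
            origins.append((x, y))
    return origins


def core_size(cols, rows, macro_w, macro_h, channel_x, channel_y, margin=0.0):
    """Return the (width, height) in um covered by the macro grid."""
    width = 2 * margin + cols * macro_w + (cols - 1) * channel_x
    height = 2 * margin + rows * macro_h + (rows - 1) * channel_y
    return width, height


# Hypothetical dimensions, NOT the AMchip values
for channel in (100.0, 200.0):  # stretch the channels and watch the core grow
    size = core_size(2, 2, 1000.0, 800.0, channel, channel)
    print("channel %.0f um -> core %.0f x %.0f um" % (channel, size[0], size[1]))
    print(macro_origins(2, 2, 1000.0, 800.0, channel, channel))
```

Keeping the channel widths as parameters means a congestion fix is a one-number change followed by regeneration, rather than manual repositioning of every macro.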

SLIDE 29

BackEnd status

In progress:
We are running place and route for the new version of the FrontEnd logic
We are refining the timing constraints
Future:
Solve the issues with signal integrity optimization
Run DRC and LVS on the final version of the layout

BackEnd simplification for the final user

With the Foundation Flow, we can simply run the command make postroute, which runs the script with all the place and route instructions.
