SoC Design
Lecture 5: Designing with FPGAs
Shaahin Hessabi
Department of Computer Engineering
Outline
- Designing for High Speed
- Designing for Signal Integrity
- Designing for Low Power
- Designing for Security
- Asynchronous Design Issues
Sharif University of Technology Page 2 Designing with FPGAs

Designing for High Speed
1. Provide high-level floor planning
- Intelligent pin assignment
  – Prevents routing congestion and poor performance
- Natural structure:
  – Data flows horizontally, control flows vertically
  – Vertical adders and counters, carry going upwards
- Pick the best I/O standard
- The place & route tool should not do all your work

Designing for High Speed (cont’d)
2. Design synchronously, use global clocks
- Up to 16 global clocks are available
  – Very low skew on these clock nets
- DLL (Delay-Locked Loop) eliminates clock distribution delay
  – Inside the chip, or even on the PC board
- Do not gate the clock, use CE instead
  – But you may need clock gating for lowest power
  – Virtex-II has glitch-free clock gate and clock mux
- Use carry for adders, counters and comparators
  – Superior speed, less logic, forces vertical orientation
- Use predefined cores
  – Have been tested and are guaranteed to work at speed
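
The "use CE instead of gating the clock" rule can be illustrated with a small behavioural model. This is only a Python sketch of the idea, not HDL and not a Xilinx primitive; the class and function names are hypothetical. The clock edge always reaches the register, and CE decides whether data is captured.

```python
class RegisterWithCE:
    """Behavioural model of a D flip-flop with a clock-enable (CE) input."""

    def __init__(self, initial=0):
        self.q = initial

    def rising_edge(self, d, ce):
        # The clock net is never gated: every edge arrives at the register.
        # CE only selects between capturing d and holding the old value.
        if ce:
            self.q = d
        return self.q


def run(samples):
    """Apply (d, ce) pairs on successive clock edges and return the q trace."""
    reg = RegisterWithCE()
    return [reg.rising_edge(d, ce) for d, ce in samples]


# q follows d only on the edges where ce is 1, otherwise it holds.
print(run([(1, 1), (0, 0), (0, 0), (0, 1), (1, 0)]))  # -> [1, 1, 1, 0, 0]
```

Because the clock itself is untouched, no gating logic can add skew or glitches to the clock net, which is the reason the slide prefers CE.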

Designing for High Speed (cont’d)
3. Use global buffers to reduce clock skew
- Global buffers are connected to dedicated routing
  – Global clock network is balanced to minimize skew
- All Xilinx FPGAs have global buffers
  – XC4000 and Spartan have 8
  – Virtex and Spartan-II have 4
  – Virtex-II has 16 BUFGs with glitch-free input mux
- You can always use a BUFG symbol and the software will choose an appropriate buffer type
  – All major synthesis tools can infer global buffers onto clock signals that come from off-chip

Designing for High Speed (cont’d)
4. Use timing constraints
- The implementation tools do NOT try to find the placement and routing that achieves the fastest speed
  – They just try to meet your performance expectations
- YOU must communicate your expectations
  – Through timing constraints
- Timing constraints improve performance
  – By placing logic closer together and shortening the routing
  – Timing constraints define your performance objectives
- Tight timing constraints increase compile time
- Unrealistic constraints cause the Flow Engine to stop
- The Logic Level Timing Report tells whether constraints are realistic
Timing constraints are the best high-level tool to achieve guaranteed performance
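
What "meeting a timing constraint" means can be shown with slack arithmetic on one register-to-register path. The following is a minimal Python sketch under simplifying assumptions (single clock, one path); the function names and all delay values are hypothetical and do not come from any Xilinx report.

```python
def setup_slack_ns(period_ns, clk_to_q_ns, logic_ns, routing_ns,
                   setup_ns, skew_ns=0.0):
    """Slack = required time - arrival time for one register-to-register path.

    A positive result means the period constraint is met on this path.
    """
    arrival = clk_to_q_ns + logic_ns + routing_ns
    required = period_ns + skew_ns - setup_ns
    return required - arrival


def max_frequency_mhz(clk_to_q_ns, logic_ns, routing_ns, setup_ns):
    """Highest clock rate this path supports (zero slack, zero skew)."""
    min_period_ns = clk_to_q_ns + logic_ns + routing_ns + setup_ns
    return 1000.0 / min_period_ns


# Illustrative numbers only: a 10 ns (100 MHz) period constraint.
slack = setup_slack_ns(period_ns=10.0, clk_to_q_ns=0.5,
                       logic_ns=4.0, routing_ns=3.5, setup_ns=0.5)
print(f"slack = {slack:.2f} ns")
print(f"fmax  = {max_frequency_mhz(0.5, 4.0, 3.5, 0.5):.1f} MHz")
```

In this model the routing term is the one that placement can shorten, which matches the slide's point that constraints improve performance by pulling logic closer together.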

Designing for Signal Integrity
1. Devices need good Vcc bypassing
- Bypass capacitor is the only source of dynamic current
2. User needs understanding of transmission line effects
- Characteristic impedance, reflections, dV/dt
- Series termination, parallel termination
3. Decouple power supply
- CMOS current is dynamic
  – Icc current spike on every active clock edge
- Peak current can be 5x the average current
  – Instantaneous current peaks only supplied by decoupling capacitors
- Use one 0.1 μF ceramic chip capacitor per Vcc pin
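
Two textbook relations sit behind these bullets: the reflection coefficient at a transmission-line load, and the voltage droop when a capacitor alone supplies a current spike (dV = I * dt / C). A rough Python sketch follows; the function names and every numeric value are assumptions for illustration, not figures from the slides or from a data sheet.

```python
def reflection_coefficient(z_load_ohms, z0_ohms):
    """Fraction of the incident voltage wave reflected at the load."""
    return (z_load_ohms - z0_ohms) / (z_load_ohms + z0_ohms)


def series_termination_ohms(z0_ohms, driver_ohms):
    """Series resistor that makes driver output impedance plus resistor equal Z0."""
    return max(z0_ohms - driver_ohms, 0.0)


def supply_droop_v(peak_current_a, duration_s, capacitance_f):
    """Voltage drop when the capacitor alone supplies the spike: dV = I*dt/C."""
    return peak_current_a * duration_s / capacitance_f


# A matched load reflects nothing; a mismatched load reflects part of the wave.
print(reflection_coefficient(50.0, 50.0))
print(reflection_coefficient(150.0, 50.0))

# Hypothetical driver with 15 ohm output impedance on a 50 ohm trace.
print(series_termination_ohms(50.0, 15.0))

# Hypothetical 1 A spike lasting 2 ns, served by a single 0.1 uF capacitor.
print(f"{supply_droop_v(1.0, 2e-9, 0.1e-6) * 1000:.0f} mV droop")
```

The droop relation ignores capacitor and trace inductance, which in practice often dominates at these edge rates, so it should be read as a lower bound.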

Designing for Signal Integrity (cont’d)
4. Use SLOW attribute where available
- Increases transition time
  – Especially when driving transmission lines
- Reduce fan-out and load capacitance
- Add virtual ground
  – Ground output pin inside and outside, give it max strength
5. Test for performance and reliability
- Manipulate circuit speed for testing purposes:
  – Hot and low Vcc = slow operation
  – Cold and high Vcc = fast operation
- If it fails hot: insufficient speed
  – Use a faster speed grade
  – Modify the design, add pipelining

Designing for Signal Integrity (cont’d)
- If it fails cold: signal integrity and hold time issues
  – Look for clock reflections
  – Look for excessive internal clock delays
  – Look for decoding spikes driving clocks
  – Look for “asynchronous tricks”

Designing for Low Power
- To extend battery life
- To reduce chip temperature and cooling requirements
  – Tjmax = 125 °C (150 °C in ceramic)
  – Delays increase 0.35% / °C above the guaranteed 85 °C junction temperature
- Use the free Xilinx Power Estimator
  – http://www.xilinx.com/cgi-bin/powerweb.pl
- Power is proportional to CV²f: minimize all three!
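
The two numeric rules on this slide are easy to turn into arithmetic. The Python sketch below assumes the 0.35% / °C figure applies linearly (not compounded) above 85 °C, and treats C as an effective switched capacitance; both are assumptions made here, and the function names and example values are hypothetical.

```python
def delay_derating(tj_c, guaranteed_tj_c=85.0, pct_per_c=0.35):
    """Delay multiplier relative to the guaranteed junction temperature.

    Assumes a linear increase of pct_per_c percent per degree C above
    guaranteed_tj_c and no derating at or below it.
    """
    excess_c = max(tj_c - guaranteed_tj_c, 0.0)
    return 1.0 + excess_c * pct_per_c / 100.0


def dynamic_power_w(c_farads, vcc_volts, f_hz):
    """Dynamic power from the slide's proportionality P ~ C * V^2 * f."""
    return c_farads * vcc_volts ** 2 * f_hz


# 20 degrees above the guaranteed 85 C junction temperature.
print(f"delay multiplier at 105 C: {delay_derating(105.0):.3f}")

# Hypothetical 1 nF effective capacitance, 1.5 V supply, 100 MHz clock.
base = dynamic_power_w(1e-9, 1.5, 100e6)
lower_vcc = dynamic_power_w(1e-9, 1.5 * 0.95, 100e6)
print(f"power: {base:.3f} W, with Vcc 5% lower: {lower_vcc:.3f} W")
```

Since voltage enters squared, a small reduction in Vcc saves proportionally more power than the same relative reduction in C or f.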

Designing for Low Power (cont’d)
Power: clock power + I/O power + logic power
- Clock power
  – Minimize # of high-speed clock nets
  – Use DLLs for phase-aligned sub-clocks
  – CE does not reduce clock power
- I/O power
  – Avoid wasted current in input buffers
  – Use fast, full-swing input signals
  – Use output registers to avoid output glitches
- Logic power
  – Control Vcc tightly
  – Power is proportional to Vcc²

Designing for Low Power (cont’d)
- Minimize logic transitions and glitches
- Optimize counters:
  – Gray and Johnson are best
  – Binary counters double the power
  – Linear Feedback Shift Registers are even worse
- Minimize internal node capacitance
  – Use aggressive timespecs
  – Design for the highest speed possible, even if not needed
- This assures lowest interconnect capacitance and provides the lowest power at the lower clock frequency
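
The counter ranking can be checked by counting how many flip-flop outputs change per clock, used here as a crude proxy for switching power (an assumption: real power also depends on the decode logic and routing). The Python sketch below compares 4-bit counters; the helper names and the LFSR taps (x^4 + x^3 + 1) are choices made for this example.

```python
def bit_toggles(a, b):
    """Number of bit positions that differ between two states."""
    return bin(a ^ b).count("1")


def average_toggles(sequence):
    """Average outputs changing per clock over one full cycle, with wrap-around."""
    n = len(sequence)
    total = sum(bit_toggles(sequence[i], sequence[(i + 1) % n]) for i in range(n))
    return total / n


def binary_sequence(bits):
    return list(range(1 << bits))


def gray_sequence(bits):
    return [i ^ (i >> 1) for i in range(1 << bits)]


def johnson_sequence(bits):
    """Twisted-ring counter: shift left, feed the inverted MSB into the LSB."""
    state, seq, mask = 0, [], (1 << bits) - 1
    for _ in range(2 * bits):
        seq.append(state)
        msb = (state >> (bits - 1)) & 1
        state = ((state << 1) | (msb ^ 1)) & mask
    return seq


def lfsr_sequence(bits=4, taps=(3, 2), seed=1):
    """Fibonacci LFSR: XOR of the tap bits is shifted into the LSB."""
    state, seq, mask = seed, [], (1 << bits) - 1
    while True:
        seq.append(state)
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask
        if state == seed:
            return seq


for name, seq in [("gray", gray_sequence(4)), ("johnson", johnson_sequence(4)),
                  ("binary", binary_sequence(4)), ("lfsr", lfsr_sequence())]:
    print(f"{name:8s} states={len(seq):2d} avg toggles/clock={average_toggles(seq):.3f}")
```

Gray and Johnson counters change exactly one output per clock, a binary counter approaches two as it gets wider, and the LFSR changes more still, which is the ordering the slide gives.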

Designing for Security
- Configuration bitstream can be intercepted
  – But not interpreted or reverse-engineered
- Some users are concerned about IP theft
- Virtex-II offers security through encryption
  – Triple-DES with 3 x 56 bits
[Figure: Bitstream Encryption]

Configuration Modes: Serial Modes
- Data is loaded one bit per CCLK
- Master serial
  – FPGA drives configuration clock (CCLK)
  – FPGA provides all control logic
- Slave serial
  – External control logic generates CCLK
    – Microprocessor
    – Xilinx download cable
    – Another FPGA

Configuration Modes
Byte-wide:
- Slave SelectMAP mode
  – CCLK is driven by external logic
  – Data is loaded one byte per CCLK
- Master SelectMAP mode
  – CCLK is driven by the Virtex-II FPGA
  – Data is loaded one byte per CCLK

Configuration Modes: Boundary Scan Mode
- External control logic required
- Control and data drive the boundary scan pins (TDI, TMS, TCK)
- Data is loaded bit-serially, one bit per TCK
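
The practical difference between the modes is load time: one bit per clock for the serial and boundary scan modes, one byte per clock for SelectMAP. A minimal Python sketch of that arithmetic follows; the bitstream size and clock rates are invented for illustration and are not taken from any device data sheet, and protocol overhead is ignored.

```python
def configuration_time_s(bitstream_bits, clock_hz, bits_per_clock):
    """Ideal load time: every clock cycle transfers bits_per_clock bits."""
    return bitstream_bits / (clock_hz * bits_per_clock)


BITSTREAM_BITS = 8_000_000  # hypothetical bitstream size

modes = {
    # name: (clock in Hz, bits per clock); clock rates are assumptions
    "serial (CCLK)": (50e6, 1),
    "SelectMAP (CCLK)": (50e6, 8),
    "boundary scan (TCK)": (10e6, 1),
}

for name, (clock_hz, width) in modes.items():
    t = configuration_time_s(BITSTREAM_BITS, clock_hz, width)
    print(f"{name:20s} {t * 1000:7.1f} ms")
```

At the same clock rate the byte-wide modes are eight times faster than the serial modes, which is usually the reason to spend the extra pins.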

Asynchronous Issues: Metastability
- Most systems operate synchronously inside
  – But asynchronous inputs are a fact of life
- Occasionally, an asynchronous input will cause a flip-flop to go metastable
  – This is a rare, but unavoidable, probabilistic event
- Violations occur when the flip-flop input changes too close to a clock edge
- Three possible results:
  – FF clocks in old data value
  – FF clocks in new data value
  – FF output becomes metastable

Metastability
- Caused by asynchronous data input
  – Violates set-up time requirement
- Usually gets synchronized in the flip-flop without problem
- But if data changes within a tiny set-up time window
  – Then the flip-flop can go metastable
  – Resulting in unpredictable delay to reach stable 1 or 0
- The 0 vs. 1 uncertainty is irrelevant
  – The slightest timing change would give a correct 1 or 0
- The unpredictable delay is the problem
  – It can violate set-up times in the system, causing erratic operation or even crashes
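
How rare the event is can be estimated with the commonly used mean-time-between-failures model, MTBF = exp(t_r / tau) / (T0 * f_clk * f_data). The slides do not give this formula; it is brought in here as a standard approximation, and every parameter value below is a made-up placeholder rather than a measured device constant.

```python
import math


def synchronizer_mtbf_s(t_resolve_s, tau_s, t0_s, f_clk_hz, f_data_hz):
    """Mean time between metastability failures, in seconds.

    t_resolve_s: time allowed for the flip-flop output to settle
    tau_s:       resolution time constant of the flip-flop
    t0_s:        width of the vulnerable window around the clock edge
    f_clk_hz:    sampling clock frequency
    f_data_hz:   average rate of asynchronous input transitions
    """
    return math.exp(t_resolve_s / tau_s) / (t0_s * f_clk_hz * f_data_hz)


# Placeholder parameters, for illustration only.
TAU, T0, F_CLK, F_DATA = 50e-12, 100e-12, 100e6, 1e6

for t_resolve in (1e-9, 2e-9, 3e-9):
    mtbf = synchronizer_mtbf_s(t_resolve, TAU, T0, F_CLK, F_DATA)
    print(f"t_resolve = {t_resolve * 1e9:.0f} ns -> MTBF = {mtbf:.3e} s")
```

The exponential term is the useful part: each additional tau of settling time multiplies the MTBF by e, so a little extra resolution time buys a very large reduction in failure rate.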

Synchronization Circuit
- Solution:
  – Faster flip-flops recover faster
  – Double-synchronization reduces the probability of a metastable output reaching the system
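
The structure of a double synchronizer is two flip-flops in series on the destination clock. The Python sketch below models only that structure and its latency; it does not simulate metastability itself, and the class name is hypothetical.

```python
class TwoFlopSynchronizer:
    """Two flip-flops in series, both clocked by the destination clock."""

    def __init__(self):
        self.ff1 = 0  # first stage: the only one that samples the async input
        self.ff2 = 0  # second stage: what the rest of the system sees

    def clock(self, async_in):
        # Both stages update on the same edge, so ff2 captures the value
        # ff1 held before this edge. That gives ff1 a full clock period
        # to settle before its output is used.
        self.ff2, self.ff1 = self.ff1, async_in
        return self.ff2


sync = TwoFlopSynchronizer()
print([sync.clock(x) for x in [1, 1, 0, 0, 1]])  # -> [0, 1, 1, 0, 0]
```

The cost is one extra clock of latency on the input; the benefit is the added settling time for the first stage before its value is consumed.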