A Network of Time-Division Multiplexed Wiring for FPGAs



SLIDE 1

A Network of Time-Division Multiplexed Wiring for FPGAs

Rosemary Francis, Simon Moore, Robert Mullins

SLIDE 2

Motivation

  • FPGAs are now home to complex Systems-on-Chip
  • ...but still optimised for single-core designs
  • FPGA global wiring is simple in comparison with ASIC Networks-on-Chip
  • Networks for FPGAs use lots of logic
  • Hard blocks are limited by the soft IP blocks

SLIDE 3

Goals

  • Use TDM components for effective soft NoC implementation
  • Funnel data to high-speed hard blocks
    – Hard NoC
    – Multipliers
    – Block RAM
  • Determine optimum TDM architecture
    – What are the costs?
    – Is it possible to design for global and local routing?

SLIDE 4

Hierarchy of interconnect

  • Clusters of logic elements with local interconnect
  • Time-division multiplexed wires in a fine-grain network
  • Coarse-grain packet-switched network

SLIDE 5

Architecture: Stratix vs TDM

[Figure: two tiles side by side. Stratix: cluster of logic elements (LUTs), switch box, global routing, local routing. TDM: cluster of logic elements with latched inputs (LUTs), SRAM, switch box, global routing, local routing.]

SLIDE 6

Wire Sharing

  • Many wires can be shared without a problem

[Figure: wire-sharing example with numeric labels 1 to 5 on the wires]

SLIDE 7

Wire Sharing

  • Many wires can be shared without a problem
  • Other configurations require a more intelligent approach

[Figure: wire-sharing example with numeric labels 1 and 2 and a conflict marked]

SLIDE 8

Wire Sharing

  • Many wires can be shared without a problem
  • Other configurations require a more intelligent approach
  • Signals can be delayed to allow more efficient wire use without rerouting

[Figure: wire-sharing example with numeric labels 1 to 5 on the wires]
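The idea of delaying a signal rather than rerouting it can be sketched as a small greedy slot assignment. This is an illustration only, not the scheduler described in the talk: the `schedule` function, the wire names, and the assumption that a signal advances by one wire per time slot are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def schedule(routes: Dict[str, List[str]], num_slots: int) -> Dict[str, int]:
    """Pick a start slot per signal so that no wire carries two signals in one slot.

    `routes` maps a signal name to the ordered wires it crosses. Assumption: a
    signal that starts in slot s occupies its k-th wire in slot (s + k) % num_slots.
    """
    busy: Set[Tuple[str, int]] = set()  # (wire, slot) pairs already claimed
    start: Dict[str, int] = {}
    for name, wires in routes.items():
        for delay in range(num_slots):
            needed = {(w, (delay + k) % num_slots) for k, w in enumerate(wires)}
            # Reject the delay if the signal collides with itself or with others.
            if len(needed) == len(wires) and not (needed & busy):
                busy |= needed
                start[name] = delay
                break
        else:
            # No delay works: this route cannot be fixed in time alone.
            raise ValueError(f"no free slot for {name}; rerouting would be needed")
    return start


routes = {"a": ["w1", "w2"], "b": ["w1", "w3"], "c": ["w2", "w3"]}
print(schedule(routes, num_slots=4))  # prints {'a': 0, 'b': 1, 'c': 0}
```

In this example signal `b` shares wire `w1` with `a`, so it is held back by one slot, while `c` fits alongside both without any delay.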

SLIDE 9

Our Scheduler

  • Our scheduler
    – maps benchmarks from a Stratix FPGA to a TDM FPGA
    – resolves TDM conflicts after place and route
  • Benchmarks
    – IP cores taken from the Altera University Suite
  • Aim
    – To reduce the amount of wiring as far as possible using TDM wiring with realistic characteristics

SLIDE 10

Parameter selection (1 of 3)

  • Assume infinite time slots to reduce wiring
    – Determine minimum number of TDM wires

SLIDE 11

Infinite number of time slots

[Figure: total number of wires needed plotted against number of TDM wires, with Stratix with static wiring shown for comparison]

SLIDE 12

Parameter selection (2 of 3)

  • Assume infinite time slots to reduce wiring
    – Determine minimum number of TDM wires
  • Vary number of time slots
    – Determine optimum number of time slots
    – Investigate the effect this has on latency

SLIDE 13

Determine number of time slots

[Figure: wires per switch box plotted against number of time slots (= number of configuration bits per mux), with Stratix with static wiring shown for comparison]

SLIDE 14

Number of time slots vs latency

[Figure: normalised latency of critical path plotted against number of time slots (= number of configuration bits per mux)]

SLIDE 15

Parameter selection (3 of 3)

  • Assume infinite time slots to reduce wiring
    – Determine minimum number of TDM wires
  • Vary number of time slots
    – Determine optimum number of time slots
    – Investigate the effect this has on latency
  • Using optimum number of time slots
    – Re-evaluate optimum number of TDM wires

SLIDE 16

Limited resources

[Figure: total number of wires needed plotted against number of TDM wires, with Stratix with static wiring shown for comparison]

All but two benchmarks map to 13 wires

  • A reduction of over 75%

SLIDE 17

Architectural drawbacks

  • Extra configuration SRAM
  • High-speed interconnect clock
  • Benchmarks run over three times slower
  • New CAD tools needed
    – Re-routing in space as well as time
    – Optimise for TDM wiring at every stage

SLIDE 18

Conclusions

  • Using TDM wiring we can reduce the number of wires whilst increasing the data rate within channels
    – 75% less wiring, 24 time slots and benchmarks running 3 times slower give 0.25 × 24 / 3 = 2 times the channel data rate
  • This will allow
    – the design of effective global interconnect
    – more efficient sharing of on-chip resources
    – simplification of multi-chip designs
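The channel data rate figure in the first conclusion can be checked with a few lines of arithmetic. This is one reading of the slide, assuming a channel keeps 25% of its wires, each wire carries 24 time slots per benchmark clock cycle, and the benchmark clock runs 3 times slower; the function and parameter names are illustrative.

```python
def relative_channel_data_rate(wire_fraction: float, time_slots: int, slowdown: float) -> float:
    """Data rate of a TDM channel relative to the static-wiring channel it replaces."""
    # Each remaining wire carries `time_slots` values per benchmark clock cycle,
    # and the benchmark clock itself runs `slowdown` times slower.
    return wire_fraction * time_slots / slowdown


print(relative_channel_data_rate(wire_fraction=0.25, time_slots=24, slowdown=3))  # prints 2.0
```

Under these assumptions the channel moves twice as much data per unit time even though it has a quarter of the wires.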

SLIDE 19

Future Work

  • Current scheduling algorithm gives
    – Large wire reduction, large latency penalty
  • We are investigating a better compromise
    – Small wiring reduction, small latency penalties?
    – Recent results show this is possible
  • Area and power
    – Is the wiring reduction enough to justify the extra area and power costs?

SLIDE 20

Thanks for listening...

Rosemary.Francis@cl.cam.ac.uk