A Network of Time Division Multiplexing for FPGAs, Rosemary Francis



slide-1
SLIDE 1

A Network of Time Division Multiplexing for FPGAs

Rosemary Francis

slide-2
SLIDE 2

Motivation

  • FPGAs are now home to complex Systems-on-Chip
  • Designs require the use of Network-on-Chip
  • FPGA global wiring is simple in comparison with ASIC Networks-on-Chip
  • Networks for FPGAs use lots of wires or lots of logic
  • Hard blocks are limited by the soft IP blocks
slide-3
SLIDE 3

Goals

  • Improve wiring density through TDM
  • Use TDM components for effective soft NoC implementation
  • Funnel data to high-speed hard blocks
    – Hard NoC
    – Multipliers
    – Block RAM

slide-4
SLIDE 4

Hierarchy of interconnect

  • Clusters of logic elements with local interconnect
  • Time-division multiplexed wires in a fine-grain network
  • Coarse-grain packet-switched network

slide-5
SLIDE 5

Architecture: Stratix vs TDM

[Figure: two block diagrams side by side.
Stratix: cluster of logic elements (LUT), local routing, switch box, global routing.
TDM: cluster of logic elements with latched inputs (LUT), local routing, switch box, SRAM, global routing.]

slide-6
SLIDE 6

Wire Sharing

  • Many wires can be shared without a problem

[Figure: shared wires with numeric labels 1 to 5]

slide-7
SLIDE 7

Wire Sharing

  • Many wires can be shared without a problem
  • Other configurations require a more intelligent approach

[Figure: wires labelled 1 and 2, marked "Conflict!!"]

slide-8
SLIDE 8

Wire Sharing

  • Many wires can be shared without a problem
  • Other configurations require a more intelligent approach
  • Signals can be delayed to allow more efficient wire use without rerouting

[Figure: shared wires with numeric labels 1 to 5]
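The idea of delaying a signal rather than rerouting it can be sketched as a greedy time-slot assignment. This is only an illustration, not the scheduling algorithm used in the talk: the function name `assign_slots`, the wire names and the model in which a signal occupies all of its wires for one slot are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical model: (signal name, earliest slot it is ready, wires it uses).
Signal = Tuple[str, int, Set[str]]


def assign_slots(signals: List[Signal], num_slots: int) -> Dict[str, Optional[int]]:
    """Greedily give each signal the earliest free slot on all of its wires.

    A signal whose wires are busy in its ready slot is delayed to a later
    slot instead of being rerouted. None means no slot was available.
    """
    busy: Dict[int, Set[str]] = {slot: set() for slot in range(num_slots)}
    schedule: Dict[str, Optional[int]] = {}
    for name, ready, wires in signals:
        schedule[name] = None
        for slot in range(ready, num_slots):
            if busy[slot].isdisjoint(wires):  # no wire is shared in this slot
                busy[slot].update(wires)
                schedule[name] = slot
                break
    return schedule


# "a" and "b" both need wire w2, so "b" is delayed by one slot;
# "c" then finds w3 taken in slot 1 and is delayed to slot 2.
example = [
    ("a", 0, {"w1", "w2"}),
    ("b", 0, {"w2", "w3"}),
    ("c", 1, {"w3"}),
]
print(assign_slots(example, num_slots=4))
```

With only two slots available, signal "c" cannot be placed at all, which is the kind of case where rerouting in space would also be needed.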

slide-9
SLIDE 9

Parameter selection

  • Assume infinite time slots to reduce wiring
    – Determine optimum number of TDM wires

slide-10
SLIDE 10

Infinite resources

[Plot: total number of wires needed against number of TDM wires]

slide-11
SLIDE 11

Parameter selection

  • Assume infinite time slots to reduce wiring
    – Determine optimum number of TDM wires
  • Vary number of time slots
    – Determine optimum number of time slots
    – Investigate the effect this has on latency

slide-12
SLIDE 12

Determine number of time slots

[Plot: wires per switch box against number of time slots (= number of configuration bits per mux)]
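The axis label equates the number of time slots with the number of configuration bits per mux. One way to read this is that each mux holds one configuration bit per slot and steps through them cyclically. The sketch below illustrates that reading only; the class `TdmSwitch`, the helper `wire_owner` and the one-bit-per-slot model are assumptions for the example, not details taken from the slides.

```python
from typing import Dict, List, Optional


class TdmSwitch:
    """Hypothetical switch with one configuration bit per time slot."""

    def __init__(self, config_bits: List[int]) -> None:
        self.config_bits = config_bits  # length equals the number of time slots

    def is_connected(self, cycle: int) -> bool:
        # The slot counter wraps around, so the configuration repeats.
        slot = cycle % len(self.config_bits)
        return self.config_bits[slot] == 1


def wire_owner(switches: Dict[str, TdmSwitch], cycle: int) -> Optional[str]:
    """Return the single signal driving the shared wire in this cycle, if any."""
    drivers = [name for name, sw in switches.items() if sw.is_connected(cycle)]
    if len(drivers) > 1:
        raise ValueError(f"conflict in cycle {cycle}: {drivers}")
    return drivers[0] if drivers else None


# Two signals share one wire over four time slots.
shared = {
    "a": TdmSwitch([1, 0, 0, 0]),
    "b": TdmSwitch([0, 1, 0, 1]),
}
owners = [wire_owner(shared, cycle) for cycle in range(8)]
print(owners)
```

In this model, adding time slots lengthens every `config_bits` list, which is one way to see why more slots cost extra configuration SRAM.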

slide-13
SLIDE 13

Number of time slots vs latency

[Plot: normalised latency of critical path against number of time slots (= number of configuration bits per mux)]

slide-14
SLIDE 14

Parameter selection

  • Assume infinite time slots to reduce wiring
    – Determine optimum number of TDM wires
  • Vary number of time slots
    – Determine optimum number of time slots
    – Investigate the effect this has on latency
  • Using optimum number of time slots
    – Re-evaluate optimum number of TDM wires

slide-15
SLIDE 15

Limited resources

[Plot: total number of wires needed against number of TDM wires]

slide-16
SLIDE 16

Architectural drawbacks

  • Extra configuration SRAM
  • High-speed interconnect clock
  • Benchmarks run over three times slower
  • New CAD tools needed
    – Re-routing in space as well as time
    – Optimise for TDM wiring at every stage

slide-17
SLIDE 17

Conclusions

  • Using TDM wiring we can reduce the number of wires whilst increasing the data rate within channels
    – 75% less wiring, 24 time slots and 3 times slower operation together give 2 times the channel data rate
  • This will allow
    – the design of effective global interconnect
    – more efficient sharing of on-chip resources
    – simplification of multi-chip designs
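The channel data rate figure can be checked with a short calculation. This follows one reading of the slide's arithmetic, which is an assumption: a quarter of the wires remain, each carries 24 time slots, and the whole design runs three times slower. The function name `relative_channel_data_rate` is made up for the example.

```python
def relative_channel_data_rate(wire_fraction: float, time_slots: int, slowdown: float) -> float:
    """Channel data rate relative to a channel without TDM.

    wire_fraction: share of the original wires that remain (0.25 for 75% less wiring)
    time_slots:    signals carried per wire per user clock cycle
    slowdown:      factor by which the design runs slower
    """
    return wire_fraction * time_slots / slowdown


rate = relative_channel_data_rate(wire_fraction=0.25, time_slots=24, slowdown=3.0)
print(rate)  # 0.25 * 24 / 3 = 2.0
```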

slide-18
SLIDE 18

Future Work

  • Current scheduling algorithm gives
    – Large wire reduction
    – Large latency penalty
  • Is there a better compromise?
    – Halve the wiring, small latency penalties
  • How can we reduce latency in other ways?
    – Better scheduling algorithms
    – Circuit redesign
slide-19
SLIDE 19

Thanks for listening...

Rosemary.Francis@cl.cam.ac.uk