Highly-dense Mixed Grained Reconfigurable Architecture with Via-switch

SLIDE 1

Highly‐dense Mixed Grained Reconfigurable Architecture with Via‐switch

Ryutaro Doi (1,6), Junshi Hotate (2,6), Takashi Kishimoto (2,6), Toshiki Higashi (2,6), Hiroyuki Ochi (2,6), Munehiro Tada (3,6), Tadahiko Sugibayashi (3,6), Kazutoshi Wakabayashi (3,6), Hidetoshi Onodera (4,6), Yukio Mitsuyama (5,6), Masanori Hashimoto (1,6)

(1) Osaka University, (2) Ritsumeikan University, (3) NEC, (4) Kyoto University, (5) Kochi University of Technology, (6) JST, CREST

nanocrest@gmail.com

SLIDE 2

Contribution

Conventional FPGA: Logic (LUT), with SRAM + MOS SW in FEOL.
Proposed architecture: Logic (Arithmetic/Memory Unit + MUX for LUT) in the FEOL layer, with "via-switch" in the BEOL layer.

[Figure: via-switch cell built from varistors and atom switches, with terminals T1, T2, C1, and C2.]

26X higher density
66% smaller interconnect delay at 0.5V

SLIDE 3

Via‐switch

[Figure: via-switch structure on metal layers M5 to M8, showing terminals T1, T2, C1, and C2, signal lines, program lines, varistors, and atom switches.]

Atom SW: electrochemical nonvolatile resistance-change device. On-resistance can be reduced to 200 Ω.
(Complementary Atom SW) [Banno, IEDM2015]
Access transistor is unnecessary for programming.

SLIDE 4

Why two program lines?

[Figure: crossbar programmed with a single program line; other lines floating. Legend: atom SW under intentional programming, on-state atom SW, off-state atom SW, atom SW under unintentional programming.]

With a single program line, unintentional programming will happen.

SLIDE 5

Why two program lines?

[Figure: crossbar programmed with two program lines; other lines floating. Legend: atom SW under programming, on-state atom SW, off-state atom SW.]

With two program lines, unintentional programming will not happen.
Multiple-ON in a column enables multiple fanouts.

SLIDE 6

Proposed crossbar structure

[Figure: crossbar made of a switch block and a connection block. Tracks: To/From North, East, South, and West, plus To/From North, East, South, and West long wires. Blocks: LUT0 and LUT1 (each with OUT, IN0, IN1, IN2, IN3) and a repeater or coarse‐grained block. Inset: 18F² via-switch on M5 to M8 with terminals T1, T2, C1, and C2.]

  • Bi-directional → higher usage → smaller crossbar
  • On-demand repeater insertion
  • Close-packed via-switch → higher density → smaller crossbar
  • Signals from 4 directions can be input/output due to multiple fanouts
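
For a rough sense of scale, the sketch below multiplies the 18F² via-switch footprint by the number of crosspoints in the 117x80 and 157x120 crossbars that the deck evaluates. It assumes F equals the 65 nm node named in that evaluation and counts switch cells only, ignoring wiring and peripheral overhead; the function names are illustrative and do not come from the slides.

```python
def via_switch_area_um2(feature_nm: float, cell_f2: float = 18.0) -> float:
    """Footprint of one via-switch in square micrometres (cell_f2 * F^2)."""
    feature_um = feature_nm / 1000.0  # nm -> um
    return cell_f2 * feature_um ** 2


def crossbar_switch_area_um2(rows: int, cols: int, feature_nm: float) -> float:
    """Total via-switch footprint of a rows x cols crossbar (switch cells only)."""
    return rows * cols * via_switch_area_um2(feature_nm)


if __name__ == "__main__":
    # Crossbar sizes quoted in the deck; F = 65 nm is an assumption.
    for rows, cols in [(117, 80), (157, 120)]:
        area = crossbar_switch_area_um2(rows, cols, feature_nm=65.0)
        print(f"{rows}x{cols}: {rows * cols} switches, about {area:.0f} um^2")
```

Under these assumptions the switch cells alone occupy roughly 712 µm² for the 117x80 crossbar and roughly 1433 µm² for the 157x120 crossbar.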

SLIDE 7

Interconnect Performance Evaluation (65nm)

[Figure: (a) delay [ns] and (b) energy [pJ] vs. distance (# of CLBs) for 80 tracks and 120 tracks; 117x80 or 157x120 crossbars, no repeaters, @1.0V. Annotated reductions: 33% and 29%.]

Smaller crossbar thanks to bidirectional signaling reduces delay and energy.

[Figure: (a) delay [ns] and (b) energy [pJ] vs. distance (# of CLBs) with no repeaters and with repeaters per 10, 15, and 20 CLBs; 117x80 crossbar @1.0V.]

Delay/energy can be optimized by flexible buffering.
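
Why periodic buffering helps can be illustrated with a first-order Elmore delay model, sketched below. This is a textbook RC-ladder approximation, not the simulation behind the plots: each CLB hop is assumed to add a fixed series resistance and a fixed capacitance, each repeater is assumed to add a fixed intrinsic delay, and every numeric value except the 200 Ω on-resistance quoted earlier in the deck is hypothetical.

```python
def ladder_elmore_delay(n_hops: int, r_hop: float, c_hop: float) -> float:
    """Elmore delay at the far end of a uniform RC ladder with n_hops sections.

    Section i sees i * r_hop of upstream resistance, so the sum is
    r_hop * c_hop * n_hops * (n_hops + 1) / 2 (quadratic in n_hops).
    """
    return r_hop * c_hop * n_hops * (n_hops + 1) / 2


def repeated_delay(n_hops: int, r_hop: float, c_hop: float,
                   spacing: int, t_repeater: float) -> float:
    """Delay when a repeater is inserted every `spacing` hops."""
    if spacing < 1:
        raise ValueError("spacing must be at least 1 hop")
    full, rest = divmod(n_hops, spacing)
    segments = [spacing] * full + ([rest] if rest else [])
    wire = sum(ladder_elmore_delay(s, r_hop, c_hop) for s in segments)
    repeaters = max(len(segments) - 1, 0)  # one repeater between segments
    return wire + repeaters * t_repeater


if __name__ == "__main__":
    R_HOP = 200.0     # ohms; on-resistance figure from the deck
    C_HOP = 10e-15    # farads per hop; hypothetical
    T_REP = 50e-12    # seconds per repeater; hypothetical
    print("no repeaters:", ladder_elmore_delay(80, R_HOP, C_HOP))
    for spacing in (10, 15, 20):
        print(f"per {spacing} CLBs:", repeated_delay(80, R_HOP, C_HOP, spacing, T_REP))
```

In this model the unbuffered delay grows quadratically with distance, while repeaters make it grow roughly linearly at the cost of a fixed delay per repeater, which is why the best spacing depends on the hop parameters.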

SLIDE 8

Comparison w/ SRAM‐based FPGA (TMG+SRAM crossbar)

[Figure, 1.0V: (a) delay [ns] and (b) energy [pJ] vs. distance (# of CLBs), conventional vs. proposed; 117x80 crossbar, repeater inserted. 35% delay reduction, 71% energy reduction.]

[Figure, 0.5V: (a) delay [ns] and (b) energy [pJ] vs. distance (# of CLBs), conventional vs. proposed; 117x80 crossbar, repeater inserted. 66% delay reduction, 82% energy reduction.]

On-R of via-switch is independent of supply voltage.

26X higher area density
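
The percentage reductions can be restated as improvement factors: a reduction of p percent leaves (1 - p/100) of the conventional value, so the conventional-to-proposed ratio is 1 / (1 - p/100). The sketch below applies this to the figures reported for the comparison; the helper name and the dictionary layout are illustrative.

```python
def improvement_factor(reduction_percent: float) -> float:
    """Conventional/proposed ratio implied by a percentage reduction."""
    if not 0.0 <= reduction_percent < 100.0:
        raise ValueError("reduction must be in [0, 100)")
    return 1.0 / (1.0 - reduction_percent / 100.0)


# Reductions reported in the deck, keyed by (metric, supply voltage).
REPORTED = {
    ("delay", "1.0V"): 35.0,
    ("delay", "0.5V"): 66.0,
    ("energy", "1.0V"): 71.0,
    ("energy", "0.5V"): 82.0,
}

if __name__ == "__main__":
    for (metric, vdd), pct in REPORTED.items():
        factor = improvement_factor(pct)
        print(f"{metric} @ {vdd}: {pct:.0f}% reduction -> {factor:.2f}x")
```

Read this way, the 66% delay reduction at 0.5V corresponds to roughly 2.9x and the 82% energy reduction to roughly 5.6x.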

SLIDE 9

Conclusion

  • Proposed a highly‐dense reconfigurable architecture that exploits via‐switch.
    – 26X higher density
    – Interconnection delay is reduced by 35% (1.0V) and 66% (0.5V)
    – Interconnection energy is reduced by 71% (1.0V) and 82% (0.5V)
  • Future work
    – Import long wire interconnection
    – Application mapping and performance evaluation