Finding a Needle in the Haystack of Hardened Interconnect Patterns



SLIDE 1

Finding a Needle in the Haystack of Hardened Interconnect Patterns

S. Nikolić, G. Zgheib*, and P. Ienne

FPL'19, Barcelona, 09.09.2019

École Polytechnique Fédérale de Lausanne
*Intel Corporation

SLIDE 2

Why harden connections?

[Figure: three LUTs connected through a crossbar]


SLIDE 9

What is the price?

[Figure: cluster architecture (three LUTs and a crossbar) and the circuit to be mapped]

SLIDE 10

XC4000 [1], Triptych [3], UTFPGA1 [2]

[1] H.-C. Hsieh, W. S. Carter, J. Ja, E. Cheung, S. Schreifels, C. Erickson, P. Freidin, L. Tinkey, and R. Kanazawa. Third-generation architecture boosts speed and density of field-programmable gate arrays, 1990.
[2] P. Chow, S. O. Seo, D. Au, B. Fallah, C. Li, and J. Rose. A 1.2µm CMOS FPGA using cascaded logic blocks and segmented routing, 1991.
[3] C. Ebeling, G. Borriello, S. A. Hauck, D. Song, and E. A. Walkup. TRIPTYCH: A New FPGA Architecture, 1991.

SLIDE 11

Challenges

How to design the patterns?

  • Intuition?
  • Enumeration

[Figure: five 5-LUTs, 108 patterns]

How to map on patterns? (CAD tool scalability)

[Figure: five LUTs with 12 external inputs]


SLIDE 16

Enumeration

SLIDE 17

Representation

[Figure: six LUTs and two shared inputs (I) drawn as a graph]

  • represent each LUT by a node (circles)
  • only represent shared inputs (triangles)
  • each edge is a hardened connection
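The representation above can be sketched as a small data structure. This is a minimal illustration in Python, not the authors' implementation: the class name `Pattern`, its fields, and its methods are all hypothetical, and it assumes that at most one hardened connection exists between any ordered pair of nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    """Hypothetical container for one hardened interconnect pattern."""

    num_luts: int                 # LUT nodes (circles), numbered 0..num_luts-1
    num_shared_inputs: int = 0    # shared-input nodes (triangles)
    lut_edges: set = field(default_factory=set)    # (driver LUT, sink LUT)
    input_edges: set = field(default_factory=set)  # (shared input, sink LUT)

    def connect_luts(self, driver: int, sink: int) -> None:
        """Harden the connection from one LUT's output to another LUT's input."""
        if not (0 <= driver < self.num_luts and 0 <= sink < self.num_luts):
            raise ValueError("LUT index out of range")
        self.lut_edges.add((driver, sink))

    def connect_input(self, shared_input: int, sink: int) -> None:
        """Harden the connection from a shared input to a LUT input."""
        if not (0 <= shared_input < self.num_shared_inputs
                and 0 <= sink < self.num_luts):
            raise ValueError("index out of range")
        self.input_edges.add((shared_input, sink))

    def hardened_connections(self) -> int:
        """Every edge of the graph is one hardened connection."""
        return len(self.lut_edges) + len(self.input_edges)


# Three LUTs in a chain, with one input shared by LUT 0 and LUT 1.
p = Pattern(num_luts=3, num_shared_inputs=1)
p.connect_luts(0, 1)
p.connect_luts(1, 2)
p.connect_input(0, 0)
p.connect_input(0, 1)
print(p.hardened_connections())  # prints 4
```

Inputs that feed a single LUT are deliberately absent, matching the "only represent shared inputs" rule.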


SLIDE 22

Enumeration (no input sharing for now)

    // V - vertex set
    G = (V, {})
    expandable = (G)
    while expandable {
        G = pop(expandable)
        for e in V x V {
            if keep(G + e) {
                push(G + e, expandable)
            }
        }
    }

[Figure: graphs over the nodes a, b, c generated by successive expansions, with a "keep" label]
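A runnable sketch of the worklist loop from the pseudocode, in Python. Two things here are assumptions rather than content of the slides: the `keep` predicate is taken to be an acyclicity check (the experiments restrict the search to acyclic patterns), and a `seen` set is added so that the same labelled graph is not expanded twice. Graphs that differ only by a relabelling of the nodes are still listed separately.

```python
from itertools import product


def is_acyclic(num_nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = [0] * num_nodes
    for _, dst in edges:
        indegree[dst] += 1
    ready = [v for v in range(num_nodes) if indegree[v] == 0]
    visited = 0
    while ready:
        v = ready.pop()
        visited += 1
        for src, dst in edges:
            if src == v:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return visited == num_nodes


def enumerate_patterns(num_nodes, keep):
    """Worklist enumeration: grow graphs one edge at a time."""
    empty = frozenset()                    # G = (V, {})
    expandable = [empty]
    seen = {empty}                         # assumption: skip repeated labelled graphs
    while expandable:
        graph = expandable.pop()
        for edge in product(range(num_nodes), repeat=2):  # e in V x V
            if edge in graph:
                continue
            candidate = graph | {edge}     # G + e
            if candidate not in seen and keep(num_nodes, candidate):
                seen.add(candidate)
                expandable.append(candidate)
    return seen


print(len(enumerate_patterns(3, is_acyclic)))  # prints 25
```

The count of 25 is the number of labelled directed acyclic graphs on three nodes, including the empty one. Self-loops are rejected by the acyclicity check.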

SLIDE 26

When to stop?

  • When area or delay stops decreasing?
  • When area or delay starts increasing?


SLIDE 28

When to stop?

[Figure: a circuit to be mapped onto clusters of three LUTs, each labelled 7, with no hardened connections and with hardened connections]

  • When area or delay starts increasing?
  • When area or delay stops decreasing?

SLIDE 33

When to stop?

SLIDE 34

Other issues: avoiding listing duplicates

[Figure: two graphs on the nodes A, B, C]
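The slide does not say how duplicates are detected. One simple stand-in, sketched here in Python, is to compute a canonical form by trying every relabelling of the nodes and keeping the smallest edge list. The function names are hypothetical, and the brute force is only practical for small node counts (five LUTs give 5! = 120 relabellings).

```python
from itertools import permutations


def canonical_form(num_nodes, edges):
    """Smallest sorted edge list over all relabellings of the nodes."""
    best = None
    for perm in permutations(range(num_nodes)):
        relabelled = tuple(sorted((perm[s], perm[d]) for s, d in edges))
        if best is None or relabelled < best:
            best = relabelled
    return best


def drop_isomorphic(num_nodes, graphs):
    """Keep one representative per isomorphism class."""
    representatives = {}
    for edges in graphs:
        representatives.setdefault(canonical_form(num_nodes, edges), edges)
    return list(representatives.values())


chain_abc = {(0, 1), (1, 2)}   # A -> B -> C
chain_cab = {(2, 0), (0, 1)}   # C -> A -> B: the same shape, other labels
fork = {(0, 1), (0, 2)}        # A drives both B and C
print(len(drop_isomorphic(3, [chain_abc, chain_cab, fork])))  # prints 2
```

The two chains collapse into one entry because they share a canonical form, while the fork stays separate.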

SLIDE 35

Other issues: maintaining subgraph relations

[Figure: a graph G and two graphs H1 and H2]

x y z
xx xy xz yy yz zz
xxx xxy xxz xyy xyz xzz yyy yyz yzz zzz
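One reading of the three rows on this slide is that each entry is produced from a unique parent by appending a symbol that is not smaller than its last one, so every combination appears exactly once and its parent is always known. The Python sketch below reproduces the rows under that assumption; it is an interpretation of the figure, not the authors' algorithm, and the function names are hypothetical.

```python
def extend(multiset, alphabet):
    """Children of a multiset: append any symbol not smaller than the last."""
    start = alphabet.index(multiset[-1]) if multiset else 0
    return [multiset + symbol for symbol in alphabet[start:]]


def levels(alphabet, depth):
    """Rows of the lattice, from single symbols up to the given size."""
    level = [""]
    rows = []
    for _ in range(depth):
        level = [child for parent in level for child in extend(parent, alphabet)]
        rows.append(level)
    return rows


for row in levels("xyz", 3):
    print(" ".join(row))
# x y z
# xx xy xz yy yz zz
# xxx xxy xxz xyy xyz xzz yyy yyz yzz zzz
```

Because a child only ever adds one symbol to its parent, every entry in a row contains its parent from the row above.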

SLIDE 36

Challenges

How to design the patterns?

  • Intuition?
  • Enumeration

[Figure: five 5-LUTs, 108 patterns]

How to map on patterns? (CAD tool scalability)

[Figure: five LUTs with 12 external inputs]

SLIDE 37

Experiments

SLIDE 38

Setup

  • Search space: acyclic five 5-LUT patterns (108 patterns)
  • Architecture = 4x the pattern with a shared crossbar (clusters of twenty 5-LUTs)


SLIDE 41

Results

Some examples

[Figure: four example patterns over the LUTs a, b, c, d, e, labelled 12]

Found 261 patterns with only 12 external inputs that achieve 80% packing density

SLIDE 42

Results

[Figure: utilization [%] per benchmark, axis from 65 to 95. Benchmarks: blob_merge, boundtop, ch_intrinsics, diffeq1, diffeq2, mkDelayWorker32B, mkPktMerge, mkSMAdapter4B, or1200, raygentop, sha, stereovision0, stereovision1, stereovision3]

SLIDE 43

Conclusions

Numerical results are not satisfactory (18-29% critical path delay increase)

But... we have an efficient way of searching for good patterns

  • searched the space (108 patterns) in < 12 h
  • search techniques are completely independent of the mapping algorithms

In the future, this should help us understand what makes a good pattern and profit from connection hardening to the fullest

SLIDE 44

Thank you for your attention

For questions, please see the poster