
Limited Address Range Architecture for Reducing Code Size in Embedded Processors

Qin Zhao, Bart Mesman, Henk Corporaal

Eindhoven University of Technology, The Netherlands
Philips Research Laboratories, The Netherlands

ICS/Eindhoven University of Technology

  • Introduction
  • The proposed architecture: LAR
  • Sequential code generation for LAR

– Annotated Conflict Graph (ACG)

  • The integrated approach

– Annotated Worst-Case Conflict Graph (AWCCG)

  • Experimental results
  • Conclusions and future work

  • Introduction

– Code size and power consumption of embedded cores must be small, since they are on chip
– Irregularities in architectures

  • Make efficient code generation difficult

– Clustered register file vs. central register file

  • Advantage: small code size, power consumption
  • Disadvantage: extra hardware, copy operations

– Phase coupling in code generation

  • Sequential phases may generate inefficient code
  • Integrated approach potentially offers better solutions
  • The proposed architecture: LAR

[Figure: LAR architecture. Labels: FU1, FU2, FU3, FU4; S1, S2; r1, r2]

  • Sequential code generation for LAR

[Figure: example for sequential code generation. Nodes n0 to n4; values a, b, c, d; time axis t with steps 1, 2]

– Annotated Conflict Graph (ACG)

[Figure: ACG over values a, b, c, d, annotated with (1,2), (1,2), (2,3), (2,3); assigned labels 1, 2, 3, 2; ranges S1, S2]
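The ACG annotations can be read as a per-value list of admissible registers, which turns register assignment into a list-coloring problem. The sketch below is only an illustration of that reading, not the authors' algorithm: the function `assign_registers`, the backtracking strategy, and in particular the conflict edges are assumptions, since the edges of the ACG figure are not recoverable from the slides. Only the annotations (1,2), (1,2), (2,3), (2,3) come from the source.

```python
from typing import Dict, Optional, Set, Tuple


def assign_registers(
    allowed: Dict[str, Tuple[int, ...]],
    conflicts: Set[frozenset],
) -> Optional[Dict[str, int]]:
    """Backtracking list coloring: pick one allowed register per value so
    that two conflicting values never share a register. Returns None if no
    such assignment exists."""
    # Try the most constrained values first (stable for equal list lengths).
    order = sorted(allowed, key=lambda v: len(allowed[v]))
    assignment: Dict[str, int] = {}

    def solve(i: int) -> bool:
        if i == len(order):
            return True
        v = order[i]
        for reg in allowed[v]:
            clash = any(
                assignment[w] == reg
                for w in assignment
                if frozenset((v, w)) in conflicts
            )
            if not clash:
                assignment[v] = reg
                if solve(i + 1):
                    return True
                del assignment[v]  # undo and try the next register
        return False

    return dict(assignment) if solve(0) else None


# Annotations taken from the ACG slide.
allowed = {"a": (1, 2), "b": (1, 2), "c": (2, 3), "d": (2, 3)}
# Hypothetical conflict edges, chosen only for illustration.
conflicts = {
    frozenset(("a", "b")),
    frozenset(("a", "c")),
    frozenset(("b", "c")),
    frozenset(("c", "d")),
}
result = assign_registers(allowed, conflicts)
print(result)
```

With these assumed edges the search assigns a, b, c, d the registers 1, 2, 3, 2. A value whose annotation excludes every free register makes the search backtrack, which is the situation the range annotations are meant to expose early.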

  • The integrated approach

[Figure: lifetimes of values u and v. Labels: Pu, Cu, Pv, Cv]

– Strong conflict: u and v overlap for sure
– No conflict: u and v can never overlap
– Weak conflict: neither of the above holds
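The three cases above can be sketched as a small classifier. This is a minimal illustration under stated assumptions, not the authors' definition: it assumes each value has a window of possible production cycles and a window of possible last-consumption cycles, that its lifetime is the half-open interval from production to last consumption, and that the two values' windows vary independently. The names `Window` and `lifetime_conflict` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical scheduling freedom of one value."""
    p_min: int  # earliest cycle the value can be produced
    p_max: int  # latest cycle the value can be produced
    c_min: int  # earliest cycle of its last consumption
    c_max: int  # latest cycle of its last consumption


def lifetime_conflict(u: Window, v: Window) -> str:
    """Classify the lifetime conflict between u and v.

    Lifetimes [p, c) overlap iff each value is produced before the other
    one is consumed.
    """
    # Overlap holds even for the latest productions and earliest consumptions.
    if u.p_max < v.c_min and v.p_max < u.c_min:
        return "strong"
    # One value is always dead before the other can be produced.
    if u.c_max <= v.p_min or v.c_max <= u.p_min:
        return "none"
    return "weak"


strong = lifetime_conflict(Window(0, 0, 2, 2), Window(1, 1, 3, 3))
none = lifetime_conflict(Window(0, 0, 1, 1), Window(1, 2, 3, 3))
weak = lifetime_conflict(Window(0, 1, 2, 3), Window(2, 3, 4, 5))
print(strong, none, weak)
```

Under these assumptions, keeping only the strong conflicts would give a best-case graph and keeping strong plus weak conflicts would give a worst-case graph.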

[Figure: Best-Case Conflict Graph (BCCG). Data-flow graph with operations ld, ld, +, +, *, +, + on nodes n0 to n6; values a to f; time axis t with steps 1, 2, 3; conflict graph over f, a, b, c, d, e]

[Figure: Worst-Case Conflict Graph (WCCG). Data-flow graph with operations ld, ld, +, +, *, +, + on nodes n0 to n6; values a to f; time axis t with steps 1, 2, 3; conflict graph over f, a, b, c, d, e]

  • Range assignment conflict:

– Strong conflict: u and r have a strong conflict if u can never reside in S_i, where r ∈ S_i
– No conflict: u and r have no conflict if u can always reside in S_i, where r ∈ S_i
– Weak conflict: u and r have a weak conflict if u can reside in S_j, where r ∉ S_i
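One possible set-based reading of these three cases is sketched below. It is an interpretation, not the authors' formulation: it assumes that each value u comes with the set of address ranges it may reside in, that a register may belong to more than one range, and that "always" means every range admissible for u contains r. The function `range_conflict` and the range layout in the example are hypothetical.

```python
from typing import Dict, FrozenSet, Set


def range_conflict(
    value_ranges: Set[str],
    register: str,
    ranges: Dict[str, FrozenSet[str]],
) -> str:
    """Classify the range assignment conflict between a value and a register.

    value_ranges: names of the ranges the value may reside in.
    ranges: maps each range name to the registers it contains.
    """
    # All ranges that contain the register.
    holding = {name for name, regs in ranges.items() if register in regs}
    if not (value_ranges & holding):
        return "strong"  # the value can never be in a range containing r
    if value_ranges <= holding:
        return "none"    # every admissible range of the value contains r
    return "weak"        # the value may also end up in a range without r


# Hypothetical layout: two overlapping ranges over four registers.
ranges = {
    "S1": frozenset({"r1", "r2", "r3"}),
    "S2": frozenset({"r2", "r3", "r4"}),
}
strong_case = range_conflict({"S1"}, "r4", ranges)
none_case = range_conflict({"S1"}, "r2", ranges)
weak_case = range_conflict({"S1", "S2"}, "r1", ranges)
print(strong_case, none_case, weak_case)
```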

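[Figure: range assignment conflict between value u (labels Pu, Cu) and register r, with ranges S_i, S_j]

[Figure: AWCCG and ABCCG. Registers r1 to r4; ranges S1, S2; operation + on S1, * on S2, ld on S1 or S2; values f, a, b, c, d, e with annotations (r1,r2,r3,r4), (r1,r2,r3,r4), (r1,r2,r3), (r1,r2,r3), (r1,r2,r3), (r2,r3,r4)]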

– Overview of the integrated approach

  • Constraint analysis
  • Address range assignment
  • Lifetime serialization
  • Bottleneck identification


  • Experimental results

Encoding (% = LAR / central):

DFG_fu,l          central   LAR   %
ar_filter_1,18    392       308   78.57
wdelf_1,27        476       398   83.61
fdct_2,20         714       588   82.35
fdct_4,11         714       588   82.35
loef_2,15         952       952   100
loef_4,11         952       952   100
chen_2,15         680       560   82.35
chen_4,8          680       560   82.35

Sequential vs. integrated approach:

                  sequential              integrated
DFG_fu,l          |S|   |S_o|   T(s)      |S|   |S_o|   T(s)
ar_filter_1,18    5     2       0.07      5     2       0.09
wdelf_1,27        6     2       0.07      6     2       0.12

Remaining rows, in source order. Their grouping under fdct_2,20, fdct_4,11, loef_2,15, loef_4,11, chen_2,15 and chen_4,8 is not recoverable from the extracted text:

sequential              integrated
|S|   |S_o|   T(s)      |S|   |S_o|   T(s)
9     5       inf       9     5       4.95
9     6       inf       9     6       0.35
9     7       0.26      9     7       0.37
12    7       0.95      12    7       no
12    8       inf       12    8       1.43
12    9       0.2       12    9       no
12    9       inf       12    9       0.89
8     4       inf       8     4       1.28
8     5       0.24      8     5       4.28
9     3       0.22      9     3       no
9     4       0.19      9     4       0.24
9     5       0.16      9     5       0.24
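As a quick check of the encoding columns, the % value matches the LAR figure divided by the central figure. The snippet below recomputes it from the numbers in the table; the names `code_size` and `lar_percentage` are illustrative only, and the unit of the two columns is not stated in the slides.

```python
# (central, LAR) pairs as listed in the results table.
code_size = {
    "ar_filter_1,18": (392, 308),
    "wdelf_1,27": (476, 398),
    "fdct_2,20": (714, 588),
    "fdct_4,11": (714, 588),
    "loef_2,15": (952, 952),
    "loef_4,11": (952, 952),
    "chen_2,15": (680, 560),
    "chen_4,8": (680, 560),
}


def lar_percentage(central: int, lar: int) -> float:
    """LAR figure as a percentage of the central figure, to two decimals."""
    return round(100.0 * lar / central, 2)


percentages = {
    name: lar_percentage(central, lar)
    for name, (central, lar) in code_size.items()
}
for name, pct in percentages.items():
    print(f"{name:16s} {pct:6.2f}")
```

The loef benchmarks show no reduction (100), while the others fall between roughly 78% and 84% of the central figure.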

  • Conclusions and future work

– Conclusions

  • New encoding style for reducing code size
  • No extra hardware, no extra move operations
  • Corresponding code generation techniques
  • ACG for range constraints
  • AWCCG solves phase coupling problem

– Future work

  • More versatile architectures
  • Combine with the operation assignment phase