slide-1
SLIDE 1

Optimizations for intensive signal processing applications on Systems-on-Chip

Calin Glitia September 6, 2010

PhD defense – Calin Glitia 1/34

slide-3
SLIDE 3

Intensive signal processing

Detection systems
Multimedia
Repetitive computations
Considerable amount of data ⇒ Multi-dimensional arrays

PhD defense – Calin Glitia 2/34

slide-5
SLIDE 5

Modular decomposition

Data-flow oriented modeling
Logical parallelism:

1 Task parallelism and pipeline
2 Data parallelism

Complexity:

Assembly of elementary functions
Complex accesses to the data structures

PhD defense – Calin Glitia 3/34

slide-7
SLIDE 7

Execution platforms

Systems-on-Chip:
Increase in the integration capacity
Multiprocessors
Architecture models: repetitive topologies, physical parallelism

[Figure: Multiprocessor SoC block diagram with XAUI, PCIe, and GbE interfaces (PHY/MAC, serialize/deserialize), flexible I/O (UART, HPI, I2C, JTAG, SPI), and four DDR controllers]

PhD defense – Calin Glitia 4/34

slide-10
SLIDE 10

Co-design

Application: logical parallelism
Architecture: physical parallelism
Mapping: distribution
Code generation: efficient execution

Optimizations

PhD defense – Calin Glitia 5/34

slide-16
SLIDE 16

Outline Application Architecture Mapping Code Generation

Array Oriented Language Modeling and Analysis of Real-Time Embedded Systems Inter repetition dependence High-level refactoring – data-parallel transformations – strategies Gaspard2 environment

PhD defense – Calin Glitia 6/34

slide-20
SLIDE 20

Some of the existing languages

Data-Flow

1 Synchronous Data Flow
2 Extensions:
  Cyclo-Static Data Flow
  Multi-dimensional Synchronous Data Flow, Windowed Synchronous Data Flow
  Boolean Data Flow, Dynamic Data Flow
3 Array-OL

Functional languages

Alpha: polyhedron, recurrence equations
Sisal, Single Assignment C

PhD defense – Calin Glitia Modeling intensive signal processing applications 7/34

slide-21
SLIDE 21

Multidimensional Synchronous DataFlow (MDSDF)

[Figure: MDSDF graph for the matrix multiplication. Input B (M, N): Transposition (M, N, 1) → (1, M, N), then Repetition (1, 1, 1) → (L, 1, 1). Input A (L, M): Repetition (1, 1, 1) → (1, 1, N). Downsample (1, M, 1) → (1, 1, 1), then Transposition (L, 1, N) → (L, N, 1), with annotation (0, 1, 0).]

PhD defense – Calin Glitia Modeling intensive signal processing applications 8/34

slide-23
SLIDE 23

Multidimensional Synchronous DataFlow (MDSDF)

[Figure: MDSDF graph for the matrix multiplication of A (L, M) and B (M, N); the two transposition actors (T) are labeled (3, 1, 2) and (1, 3, 2); token sizes (M, N, 1), (1, M, N), (1, 1, 1), (L, 1, 1), (1, 1, 1), (1, 1, N), Downsample (1, M, 1), (1, 1, 1), (L, 1, N), (L, N, 1), annotation (0, 1, 0)]

Data-flow graph: actors consuming/producing multidimensional tokens
Static analysis/schedule
Limitations:
Multiple data consumptions
Extensions

PhD defense – Calin Glitia Modeling intensive signal processing applications 8/34

slide-26
SLIDE 26

Array Oriented Language – principles

Similarities with other data-flow languages
Hierarchical decomposition – task parallelism
Data-flow oriented formalism – multi-dimensional data structures

[Figure: Matrix multiplication task with inputs A (L, M) and B (M, N) and output C (L, N); it repeats a Line×Column multiplication over the repetition space (L, N), with input patterns (M), (M) and output pattern (1). The Line×Column task is decomposed into a × task repeated over (M), with patterns (1), (1), (1), followed by a Sum from (M) to (1).]

Explicit data parallelism: data-parallel repetitions (“loops”)
Uniform paving by sub-arrays
Space and time mixed as dimensions of the data structures

PhD defense – Calin Glitia Modeling intensive signal processing applications 9/34

slide-27
SLIDE 27

Data-parallel repetitions

Matrix multiplication: N = 3, M = 4, L = 2

[Figure: Matrix multiplication task with arrays (3, 4), (4, 2), and (3, 2), repetition space (3, 2), and patterns (4), (4), (1)]

PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

slide-28
SLIDE 28

Data-parallel repetitions

Matrix multiplication: N = 3, M = 4, L = 2 Line×Column multiplications

PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

slide-29
SLIDE 29

Data-parallel repetitions

Matrix multiplication: N = 3, M = 4, L = 2 uniformly spaced input/output patterns

PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

slide-30
SLIDE 30

Data-parallel repetitions

Matrix multiplication: N = 3, M = 4, L = 2 DATA-PARALLEL instances

PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

slide-31
SLIDE 31

Data-parallel repetitions

Matrix multiplication: N = 3, M = 4, L = 2

[Figure: Matrix multiplication task with arrays (3, 4), (4, 2), and (3, 2), repetition space (3, 2), and patterns (4), (4), (1)]

Compact representation: repetition space

PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34

slide-32
SLIDE 32

Data-parallel repetitions

Matrix multiplication: N = 3, M = 4, L = 2

[Figure: Matrix multiplication task with arrays (3, 4), (4, 2), and (3, 2), repetition space (3, 2), and patterns (4), (4), (1); each of the three array-pattern links carries a tiler (F, o, P)]

Tilers – links between input and output patterns

PhD defense – Calin Glitia Modeling intensive signal processing applications 10/34
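The repetition above can be sketched in plain Python. This is only an illustration under assumptions: matrices are row-major nested lists, and the function names are hypothetical, not part of Array-OL or Gaspard2. Each point (l, n) of the repetition space runs one independent Line×Column task on a pattern of M elements from each input.

```python
from itertools import product

def line_times_column(line, column):
    """Elementary task: consumes two patterns of M elements, produces one value."""
    return sum(a * b for a, b in zip(line, column))

def matmul_by_repetition(A, B):
    """Run one Line x Column task per point (l, n) of the repetition space."""
    L, M, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(L)]
    # The instances are independent, so any iteration order (or a parallel one) is valid.
    for l, n in product(range(L), range(N)):
        line = A[l]                            # input tile: row l of A
        column = [B[m][n] for m in range(M)]   # input tile: column n of B
        C[l][n] = line_times_column(line, column)
    return C

# Sizes from the slides: L = 2, M = 4, N = 3 (the values themselves are made up)
A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = [[1, 0, 2], [0, 1, 0], [1, 0, 1], [0, 1, 0]]
C = matmul_by_repetition(A, B)
```

Because no instance reads what another one writes, the loop over the repetition space can be distributed freely.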

slide-37
SLIDE 37

Uniform sub-array accesses (patterns)

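Tiler

o: origin of the reference pattern
F: Fitting matrix – the shape of the tiles in the array
P: Paving matrix – uniform spacing of the tiles

[Figure: example array with tiles; axis labels 8 and 5]

Formal specification:

$$\left( \mathbf{o} + \begin{pmatrix} P & F \end{pmatrix} \cdot \begin{pmatrix} \mathbf{r} \\ \mathbf{i} \end{pmatrix} \right) \bmod \mathbf{s}_{\text{array}}$$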

PhD defense – Calin Glitia Modeling intensive signal processing applications 11/34
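A minimal Python sketch of the formal specification, reading (P F)·(r i) as P·r + F·i, with r the repetition index and i the index inside the pattern. The function name, argument layout, and the 1-D example values are assumptions made for illustration.

```python
from itertools import product

def tile_indices(origin, fitting, paving, r, pattern_shape, array_shape):
    """Array indices of the tile for repetition index r:
    (origin + P.r + F.i) mod array_shape, for every pattern index i."""
    dims = range(len(array_shape))

    def mat_vec(matrix, vec):
        return [sum(matrix[d][k] * vec[k] for k in range(len(vec))) for d in dims]

    # Reference element of this tile: origin shifted by the paving.
    ref = [origin[d] + shift for d, shift in zip(dims, mat_vec(paving, r))]
    tile = []
    for i in product(*(range(s) for s in pattern_shape)):
        fit = mat_vec(fitting, i)
        tile.append(tuple((ref[d] + fit[d]) % array_shape[d] for d in dims))
    return tile

# Hypothetical 1-D case: array of 8 elements, pattern of 3, tiles spaced by 2.
last_tile = tile_indices(origin=(0,), fitting=[[1]], paving=[[2]],
                         r=(3,), pattern_shape=(3,), array_shape=(8,))
```

The modulo makes the array cyclic: the last tile starts at index 6 and wraps around to index 0.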

slide-39
SLIDE 39

Common motif-based accesses

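Pattern examples

[Figure: example patterns; axis labels 3, 4, 5]

Paving example

[Figure: nine panels on an array with axis labels 9 and 4, one per repetition index r = (0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2)]

$F = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\mathbf{s}_{\text{pattern}} = (5, 3)$
$\mathbf{o} = (0, 0)$, $\mathbf{s}_{\text{array}} = (10, 5)$
$P = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$, $\mathbf{s}_{\text{repetition}} = (3, 3)$

PhD defense – Calin Glitia Modeling intensive signal processing applications 12/34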
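The paving example can be enumerated with a few lines of Python. The shapes (array 10 × 5, pattern 5 × 3, repetition 3 × 3) come from the slide; the origin (0, 0) and the diagonal fitting and paving matrices are assumptions, because the zero entries are not legible in the extracted slide.

```python
from itertools import product

S_ARRAY, S_PATTERN, S_REPETITION = (10, 5), (5, 3), (3, 3)
ORIGIN = (0, 0)            # assumed
F = ((1, 0), (0, 1))       # assumed: identity fitting (compact tiles)
P = ((3, 0), (0, 1))       # assumed: tiles spaced by 3 horizontally, 1 vertically

def tile(r):
    """Set of array cells covered by the tile of repetition index r."""
    cells = set()
    for i in product(range(S_PATTERN[0]), range(S_PATTERN[1])):
        cells.add(tuple(
            (ORIGIN[d] + P[d][0] * r[0] + P[d][1] * r[1]
             + F[d][0] * i[0] + F[d][1] * i[1]) % S_ARRAY[d]
            for d in range(2)))
    return cells

tiles = {r: tile(r) for r in product(range(S_REPETITION[0]), range(S_REPETITION[1]))}
# Columns touched by the tile r = (2, 0): it starts at column 6 and wraps to column 0.
wrapped_columns = sorted({x for x, _ in tiles[(2, 0)]})
```

Under these assumptions the tiles overlap (the pattern is wider than the paving step), which is the "multiple data consumption" that plain MDSDF does not express directly.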

slide-44
SLIDE 44

Summary ARRAY-OL

Specification
Data-flow oriented visual formalism
Express the regularity of computations/data accesses
Exploit the parallelism
Rules that allow static analysis

Limitations
Numerical values for the multidimensional spaces/accesses
Cycles not allowed in the dependence graph

Extension: inter-repetition dependences

PhD defense – Calin Glitia Modeling intensive signal processing applications 13/34

slide-48
SLIDE 48

Need for uniform dependences

State construction
Transfer data between different instances of the same repetition.
Examples: Sum, Integrate

[Figure: Integrate, flattened into a chain of additions +[0], +[1], +[2], ...]

Uniform data dependences between instances of a repetition

PhD defense – Calin Glitia Modeling intensive signal processing applications 14/34

slide-53
SLIDE 53

Uniform dependences

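[Figure: Integrate task on infinite arrays (∞); it repeats a + task over the repetition space (∞), with ports pin and pout connected by an inter-repetition dependence d = 1, and a default link of shape ()]

Inter-repetition dependence

1 Data dependence: pout → pin
2 Dependence vector inside the repetition space
3 Initial values: default link for dependences that exit the repetition space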

Calin Glitia, Philippe Dumont, and Pierre Boulet. Array-OL with delays, a domain specific specification language for multidimensional intensive signal processing. Multidimensional Systems and Signal Processing, 2009.

PhD defense – Calin Glitia Modeling intensive signal processing applications 15/34
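The three ingredients above can be sketched for the Integrate example. This is an illustration under assumptions: the infinite array is replaced by a finite list, and the function name and default value are hypothetical.

```python
def integrate(samples, default=0):
    """Running sum: repetition r reads p_in from p_out of repetition r - d, with d = 1."""
    d = 1
    p_out = []
    for r, x in enumerate(samples):
        # When r - d falls outside the repetition space, the default link supplies the value.
        p_in = p_out[r - d] if r - d >= 0 else default
        p_out.append(p_in + x)
    return p_out

result = integrate([1, 2, 3, 4])
shifted = integrate([1, 2, 3, 4], default=10)
```

The dependence serializes this 1-D repetition; in higher-dimensional repetition spaces, instances that are not ordered by the dependence vector stay parallel.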

slide-54
SLIDE 54

Initial values

Initial values – default link

[Figure: repetition space with dependence vector d = (2, 0)]

PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

slide-56
SLIDE 56

Initial values

Initial values – default link
Same initial value
Different values – Tiler

[Figure: default link carrying a tiler (F, o, P)]

PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

slide-58
SLIDE 58

Initial values

Initial values – default link
Same initial value
Different values – Tiler
Different default links – Exclusive tilers

[Figure: two default links with exclusive tilers: F = (1), o = (0), P = (4) and F = (1), o = (−4), P = (4)]

PhD defense – Calin Glitia Modeling intensive signal processing applications 16/34

slide-61
SLIDE 61

Complex dependences

Dependence constructions:
Multiple default links
Multiple dependences on a repetition space
Dependences connected through the hierarchy
Dependences on the complete repetition space

PhD defense – Calin Glitia Modeling intensive signal processing applications 17/34

slide-65
SLIDE 65

Impact on the parallelism

The repetition space is split into parallel hyper-planes
Pipeline execution following the distance vector
Scheduling uniform loops

Alain Darte and Yves Robert. Constructive methods for scheduling uniform loop nests. IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.

PhD defense – Calin Glitia Modeling intensive signal processing applications 18/34
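A small Python sketch of the hyper-plane idea, under assumptions: instance r depends on instance r − d for each dependence vector d, and the schedule vector pi is chosen by hand. The constructive methods cited above address how to find such a vector; this sketch only applies a given one, and all names are hypothetical.

```python
from itertools import product
from collections import defaultdict

def wavefronts(shape, dependences, pi):
    """Group repetition points by time step t = pi . r (a linear schedule)."""
    # pi is valid only if every dependence advances time by at least one step.
    assert all(sum(p * d for p, d in zip(pi, dep)) >= 1 for dep in dependences)
    fronts = defaultdict(list)
    for r in product(*(range(s) for s in shape)):
        fronts[sum(p * x for p, x in zip(pi, r))].append(r)
    return [fronts[t] for t in sorted(fronts)]

fronts = wavefronts(shape=(3, 3), dependences=[(1, 0), (0, 1)], pi=(1, 1))
```

All points of one front lie on the same hyper-plane and can run in parallel; successive fronts execute one after the other, like the stages of a pipeline.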

slide-67
SLIDE 67

Outline Application Architecture Mapping Code Generation

Array Oriented Language Inter repetition dependence Modeling and Analysis of Real-Time Embedded Systems

PhD defense – Calin Glitia Modeling intensive signal processing applications 19/34

slide-68
SLIDE 68

Modeling and Analysis of Real-Time Embedded Systems

UML profile – OMG standard
Model Driven Engineering
Co-design: application, architecture, mapping
Repetitive Structure Modeling
All the ARRAY-OL concepts are included
Proposed by the DaRT team

PhD defense – Calin Glitia Modeling intensive signal processing applications 20/34

slide-70
SLIDE 70

Model repeated inter-connected architecture topologies

Physical connections between architecture components
Compact expression
Cyclic uniform inter-connections

PhD defense – Calin Glitia Modeling intensive signal processing applications 21/34
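A cyclic uniform inter-connection can be written as a single vector applied modulo the shape of the repeated component. The sketch below assumes a hypothetical 4 × 4 grid of processing units (the size is not taken from the slides) and a hypothetical helper name.

```python
from itertools import product

def cyclic_links(shape, vector):
    """Connect each unit r of a repeated grid to the unit (r + vector) mod shape."""
    return {
        r: tuple((x + v) % s for x, v, s in zip(r, vector, shape))
        for r in product(*(range(s) for s in shape))
    }

# One vector describes all 16 "east" links of a torus-like 4 x 4 grid.
east = cyclic_links((4, 4), (1, 0))
```

The compactness is the point: the description stays the same size whether the grid holds 16 units or 1024.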

slide-74
SLIDE 74

Summary inter-repetition dependences

1 Expression of state constructions
2 Complex dependences through the hierarchy
3 Parallelism – pipeline
4 Repeated inter-connected architectures

PhD defense – Calin Glitia Modeling intensive signal processing applications 22/34

slide-76
SLIDE 76

Outline Application Architecture Mapping Code Generation

Array Oriented Language Modeling and Analysis of Real-Time Embedded Systems Inter repetition dependence High-level refactoring – data-parallel transformations – strategies

PhD defense – Calin Glitia From a high-level specification to the execution 23/34

slide-79
SLIDE 79

Execution

Logical space and time as mixed dimensions of multidimensional structures
Specification: expresses the data dependences between all the data elements that transit the system
and a partial execution order between all the executions of the tasks in the system

EFFICIENT execution
Optimized code generation
Projection of the specification into physical space and time

Adapt a specification to the execution
High-level refactoring
Execution that reflects the specification

PhD defense – Calin Glitia From a high-level specification to the execution 24/34

slide-83
SLIDE 83

Projection into space and time

Multi-dimensional structures
repetition spaces ⇓ in space
← → linked (trade-off)
data structures ⇓ in time

Take into account the execution constraints
Data dependences
Available resources

PhD defense – Calin Glitia From a high-level specification to the execution 25/34

slide-86
SLIDE 86

Projection example

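[Figure: Horizontal Filter from array (1920, 1080, ∞) to array (720, 1080, ∞), repetition space (240, 1080, ∞), patterns (13) → (3); Vertical Filter to array (720, 480, ∞), repetition space (720, 120, ∞), patterns (14) → (4)]

Maximal parallelism
Memory size
Infinite data structures – Blocking points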

PhD defense – Calin Glitia From a high-level specification to the execution 26/34

slide-87
SLIDE 87

Projection example

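[Figure: Horizontal Filter from array (1920, 1080, ∞) to array (720, 1080, ∞), repetition space (240, 1080, ∞), patterns (13) → (3); Vertical Filter to array (720, 480, ∞), repetition space (720, 120, ∞), patterns (14) → (4)]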

Pipeline Execution Order

PhD defense – Calin Glitia From a high-level specification to the execution 26/34

slide-88
SLIDE 88

Projection example

[Figure: fused task from array (1920, 1080, ∞) to array (720, 480, ∞), common repetition space (240, 120, ∞). Horizontal Filter: (14, 13), (3, 14), (14), (13), (3). Vertical Filter: (3, 4), (3), (14), (4).]

Fusion of successive repetitions
Minimize the arrays – macro-patterns
Distribution of the common repetition
Each processor keeps its macro-patterns in memory

PhD defense – Calin Glitia From a high-level specification to the execution 26/34

slide-89
SLIDE 89

Projection example

[Figure: fused task from array (1920, 1080, ∞) to array (720, 480, ∞), common repetition space (240, 120, ∞). Horizontal Filter: (14, 13), (3, 14), (14), (13), (3). Vertical Filter: (3, 4), (3), (14), (4).]

Re-computations
When intermediate values are consumed by multiple repetitions
Trade-off:
Recompute values
Keep in memory – increase of memory size

PhD defense – Calin Glitia From a high-level specification to the execution 26/34
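The trade-off can be made concrete with the sizes shown in the projection example. The arithmetic below is a sketch: it counts one frame only (the infinite time dimension is left out), counts array elements rather than bytes, and the variable names are hypothetical.

```python
from fractions import Fraction

# Sizes read from the slides, per frame.
H_REPETITIONS = (240, 1080)   # Horizontal Filter repetition space
H_OUT_PATTERN = 3             # pixels produced per Horizontal Filter task
V_REPETITIONS = (720, 120)    # Vertical Filter repetition space
V_IN_PATTERN = 14             # lines consumed per Vertical Filter task

# Before fusion: the whole intermediate array is stored, each value computed once.
intermediate_before = (H_REPETITIONS[0] * H_OUT_PATTERN) * H_REPETITIONS[1]
h_tasks_before = H_REPETITIONS[0] * H_REPETITIONS[1]

# After fusion: common repetition (240, 120); each instance keeps one
# (3, 14) macro-pattern and runs the Horizontal Filter 14 times to fill it.
common = (H_REPETITIONS[0], V_REPETITIONS[1])
intermediate_after = H_OUT_PATTERN * V_IN_PATTERN
h_tasks_after = common[0] * common[1] * V_IN_PATTERN

recomputation_factor = Fraction(h_tasks_after, h_tasks_before)
```

With these numbers the intermediate storage drops from 777,600 elements per frame to 42 per fused instance, while the Horizontal Filter runs 14/9 times as often, because consecutive Vertical Filter patterns overlap.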

slide-91
SLIDE 91

High-level transformations

Adapt a specification to execution
change the granularity of the repetitions
array size reductions

“High-level” loop transformations
repetition = visual representation of a data-parallel loop nest
fusion, change paving, tiling, collapse, ...

Calin Glitia and Pierre Boulet. High level loop transformations for multidimensional signal processing embedded applications. In International Symposium on Systems, Architectures, Modeling, and Simulation (SAMOS VIII), Samos, Greece, July 2008.

PhD defense – Calin Glitia From a high-level specification to the execution 27/34
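Reading a repetition as a loop nest, tiling changes the granularity without changing which elements each instance reads. The sketch below checks this on a 1-D case; the paving step 8 (1920 / 240), the block size 4, and the function names are assumptions for illustration, not the transformations as implemented in the refactoring tools.

```python
def accesses_flat(n_rep, paving, pattern, size):
    """Indices read by each repetition of a 1-D data-parallel loop."""
    return [[(r * paving + i) % size for i in range(pattern)] for r in range(n_rep)]

def accesses_tiled(n_rep, paving, pattern, size, block):
    """Same loop after tiling: an outer repetition over blocks, an inner one inside."""
    out = []
    for r_outer in range(n_rep // block):
        origin = r_outer * (block * paving)   # outer paving = block * paving
        for r_inner in range(block):          # inner paving is unchanged
            out.append([(origin + r_inner * paving + i) % size for i in range(pattern)])
    return out

flat = accesses_flat(240, 8, 13, 1920)
tiled = accesses_tiled(240, 8, 13, 1920, block=4)
same_accesses = flat == tiled
```

The outer repetition now has 60 instances instead of 240, each handling a block of 4, which is the kind of granularity change that helps match a repetition to the number of available processors.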

slide-94
SLIDE 94

Optimization strategies – memory size reduction

r1 r2 r3 r4 r5 r6 r7

MAXIMAL reduction of the intermediate arrays

Fusion of multiple repetitions
Minimizes only the last intermediate array
Re-computations!

Complete fusion?
Too many re-computations
Limited array reduction

PhD defense – Calin Glitia From a high-level specification to the execution 28/34

slide-95
SLIDE 95

Optimization strategies – memory size reduction

r1 r2 r3 r4 r5 r6 r7

MAXIMAL reduction of the intermediate arrays
Strategy that limits the re-computations

using the results of the complete fusion and of the two-by-two fusions: where re-computations are introduced, and the minimal achievable array reduction

PhD defense – Calin Glitia From a high-level specification to the execution 28/34

slide-96
SLIDE 96

Optimization strategies – memory size reduction

r1 r2 r3 r4 r5 r6 r7 r14 r67 r12

MAXIMAL reduction of the intermediate arrays

Columns: Repetitions before fusion | Repetitions after fusion | Re-computations (product) | Reduction factor of the output arrays

8 × 128 × 96 96 ×     119 ×

  • 10 × 8

1

  • 80 × 80

1     9.29 1228.8 119 × 96 1 96 80 × 80 × 96 1 96 96 1 1 128 × 96 × 80 128 × 96 × 80 1 1 119 × 128 × 96 128 × 96 ×

  • 119

1

  • 1

12288 128 × 96 1 1

PhD defense – Calin Glitia From a high-level specification to the execution 28/34

slide-97
SLIDE 97

Optimization strategies – memory size reduction

r1 r2 r3 r4 r5 r6 r7 r14 r67 r12

MAXIMAL reduction of the intermediate arrays

Calin Glitia, Pierre Boulet, Éric Lenormand, and Michel Barreteau. Repetitive model refactoring strategy for the design space exploration of intensive signal processing applications. Journal of Systems Architecture, Special Issue: Hardware/Software CoDesign.

PhD defense – Calin Glitia From a high-level specification to the execution 28/34

slide-101
SLIDE 101

And the inter-repetition dependences?

Why? To allow the use of the refactoring tools on models with uniform dependences
Typically: ⇒

Algorithm
The global accesses and dependences MUST remain unchanged
Automatically compute new dependences after a transformation

Calin Glitia and Pierre Boulet. Interaction between inter-repetition dependences and high-level transformations in array-ol. In Conference on Design and Architectures for Signal and Image Processing (DASIP 2009), Sophia Antipolis, France, September 2009.

PhD defense – Calin Glitia From a high-level specification to the execution 29/34
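As an illustration of "the global dependences must remain unchanged", consider a 1-D repetition with dependence d = 1 that is tiled into blocks. The sketch below is not the algorithm of the cited paper; it is a hand-worked case with hypothetical names, checking that the dependences after tiling designate the same predecessor as before.

```python
def predecessor_tiled(r_outer, r_inner, block):
    """Predecessor of a tiled instance for an original dependence d = 1."""
    if r_inner >= 1:
        return (r_outer, r_inner - 1)      # dependence vector (0, 1) inside a block
    if r_outer >= 1:
        return (r_outer - 1, block - 1)    # vector (1, -(block - 1)) across blocks
    return None                            # leaves the repetition space: default link

def dependences_unchanged(n, block):
    """Compare with the original dependence r -> r - 1, mapped through r = block * r_outer + r_inner."""
    for r in range(n):
        expected = None if r == 0 else ((r - 1) // block, (r - 1) % block)
        if predecessor_tiled(r // block, r % block, block) != expected:
            return False
    return True

unchanged = dependences_unchanged(n=12, block=4)
```

One original dependence becomes two after tiling: one inside the block and one connected through the hierarchy to the previous block, with the default link still serving the very first instance.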

slide-103
SLIDE 103

Outline Application Architecture Mapping Code Generation

Array Oriented Language Modeling and Analysis of Real-Time Embedded Systems Inter repetition dependence High-level refactoring – data-parallel transformations – strategies Gaspard2 environment

PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 30/34

slide-104
SLIDE 104

Gaspard2 environment

SoC visual co-design

1 Allows modeling, simulation and code generation of SoC
2 Model Driven Engineering approach
3 Subset of the MARTE UML profile

PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 31/34

slide-106
SLIDE 106

Gaspard2 design flow

high-level specification
  inter-repetition dependences
  refactoring tools: implementation and integration
  MDE contributions to Gaspard2
specializations
code generation

PhD defense – Calin Glitia Gaspard2 – co-design environment for SoC 32/34

slide-112
SLIDE 112

Conclusion Application Architecture Mapping Code Generation

Array Oriented Language Modeling and Analysis of Real-Time Embedded Systems Inter repetition dependence High-level refactoring – data-parallel transformations – strategies Gaspard2 environment

PhD defense – Calin Glitia Conclusion 33/34

slide-113
SLIDE 113

Perspectives

. . .

PhD defense – Calin Glitia Conclusion 34/34