Lecture 1: Overview, Hsie-Chia Chang (PowerPoint PPT Presentation)

SLIDE 1

Lecture 1: Overview

Hsie-Chia Chang 張錫嘉 E-mail : hcchang@mail.nctu.edu.tw

Fall 2006

SLIDE 2

Optimized Application-Specific Integrated Systems

Outline

Typical DSP Algorithm

– Convolution, Correlation, Digital Filter, Adaptive Filter, Decimator and Expander, Viterbi Algorithm, Motion Estimation, Discrete Cosine Transform, Vector Quantization, Wavelets and Filter Banks

Representations of DSP Algorithms

– Block Diagram
– Signal-Flow Graph
– Data-Flow Graph
– Dependence Graph

Iteration Bound

– Loop Bound and Iteration Bound
– Algorithms for Computing Iteration Bound
– Iteration Bound of Multi-rate Data-Flow Graphs

SLIDE 3

Typical DSP Algorithm (1/4)

DSP has advantages over analog signal processing

– Robust with respect to temperature, process variation, …
– Higher precision by increasing wordlength
– High signal-to-noise ratio
– Repeatability and flexibility by algorithms

Algorithm

– A set of rules for solving a problem in a finite number of steps
– DSP algorithms can easily be found in software packages and in the literature

Two features of DSP

– Real-time throughput requirement
  • There is no advantage in making the processing rate faster than the input sample rate
– Data-driven property

SLIDE 4

Typical DSP Algorithm (2/4)

Convolution

Correlation

– The correlation operation can be described as a convolution
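The statement that correlation can be described as a convolution can be checked with a small sketch. It assumes the usual finite-sequence definitions, convolution y(n) = sum over k of x(k) h(n-k) and correlation y(n) = sum over k of a(k) b(n+k); the function names are illustrative and not from the slides.

```python
def convolve(x, h):
    """Linear convolution y(n) = sum_k x(k) h(n - k) of two finite sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y


def correlate(a, b):
    """Correlation y(n) = sum_k a(k) b(n + k), computed as a convolution.

    Time-reversing a turns the correlation into a convolution with b.
    Output index i corresponds to lag n = i - (len(a) - 1).
    """
    return convolve(a[::-1], b)
```

For example, `correlate([1, 2, 3], [1, 2, 3])` peaks at the zero-lag position (index 2), where the value is the sum of squares of the sequence.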

SLIDE 5

Typical DSP Algorithm (3/4)

Digital Filters

– Modify the frequency properties of the input signal x(n) to meet specific design requirements in LTI systems
– FIR filter
– IIR filter
– Linear-phase FIR filters are attractive because their unit-sample responses are symmetric, so they require only half the number of multiplications
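The saving for linear-phase FIR filters comes from adding the two input samples that share a coefficient before multiplying. The sketch below is a minimal illustration, assuming a symmetric unit-sample response h[k] = h[N-1-k] and zero initial state; the function names are hypothetical.

```python
def fir_direct(h, x):
    """Direct-form FIR: one multiplication per tap per output sample."""
    n_taps = len(h)
    padded = [0] * (n_taps - 1) + list(x)   # zero initial state
    return [sum(h[k] * padded[n + n_taps - 1 - k] for k in range(n_taps))
            for n in range(len(x))]


def fir_symmetric(h, x):
    """Linear-phase FIR with h[k] == h[N-1-k]: add paired samples first,
    then multiply once per pair (about half the multiplications)."""
    n_taps = len(h)
    if any(h[k] != h[n_taps - 1 - k] for k in range(n_taps)):
        raise ValueError("h must be symmetric")
    padded = [0] * (n_taps - 1) + list(x)
    y = []
    for n in range(len(x)):
        window = padded[n:n + n_taps][::-1]   # window[k] = x(n - k)
        acc = 0
        for k in range(n_taps // 2):
            acc += h[k] * (window[k] + window[n_taps - 1 - k])
        if n_taps % 2:                        # unpaired middle tap
            acc += h[n_taps // 2] * window[n_taps // 2]
        y.append(acc)
    return y
```

With 5 taps the symmetric version uses 3 multiplications per output instead of 5, and both functions return the same sequence.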

Adaptive filters

– The coefficients are updated at each iteration in order to minimize the difference between the filter output and the desired signal
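One common way to perform such a per-iteration coefficient update is the least-mean-square (LMS) rule. The slides do not name a specific algorithm, so LMS, the step size `mu`, and the function name below are assumptions chosen only to illustrate the idea.

```python
def lms_filter(x, d, n_taps, mu):
    """Adapt FIR coefficients w so that the filter output tracks d(n).

    LMS is an assumed update rule here: w <- w + mu * e(n) * u(n).
    """
    w = [0.0] * n_taps
    errors = []
    for n in range(len(x)):
        # Regressor u = [x(n), x(n-1), ...], with zeros before the first sample.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))         # filter output
        e = d[n] - y                                     # difference from desired signal
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]   # coefficient update
        errors.append(e)
    return w, errors
```

If d(n) is produced by an unknown 2-tap FIR system driven by the same input, the adapted coefficients should approach that system's coefficients, provided `mu` is small enough for stability.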

SLIDE 6

Typical DSP Algorithm (4/4)

Decimator (compressor or downsampler)

– yD(n) = x(Mn), where M is a positive integer
– Output rate is M times slower than the input rate

Expander (interpolator or upsampler)

– yE(n) = x(n/L) if n is an integer multiple of L, and 0 otherwise
– L-1 zeros are inserted after every input sample

Decimators and expanders are linear but time-varying operations (they are not time-invariant)

Noble identities

– Describe how delay elements transfer across a decimator or an expander
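The two definitions yD(n) = x(Mn) and yE(n) = x(n/L) translate directly into list operations. This is a minimal sketch on finite sequences; the function names are illustrative.

```python
def decimate(x, M):
    """y_D(n) = x(M n): keep every M-th sample, starting with x(0)."""
    return list(x[::M])


def expand(x, L):
    """y_E(n) = x(n / L) when n is a multiple of L, and 0 otherwise.

    Equivalent to inserting L - 1 zeros after every input sample.
    """
    y = []
    for sample in x:
        y.append(sample)
        y.extend([0] * (L - 1))
    return y
```

Expanding by a factor and then decimating by the same factor returns the original sequence, for example `decimate(expand([1, 2, 3], 2), 2)` gives `[1, 2, 3]`.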

SLIDE 7

Representation of DSP Algorithms (1/3)

Iteration period

– The time required for execution of one iteration of the algorithm

Critical path

– The longest path between any two storage elements (delay elements)
– Its computation time is the minimum feasible clock period

Sampling rate (throughput)

– The number of samples processed per second

Latency

– The time difference between the generation of an output and the reception of its corresponding input by the system

The clock rate of a DSP system is not the same as its sampling rate

y[n] = h_0 x[n] + h_1 x[n-1] + h_2 x[n-2] + h_3 x[n-3]

SLIDE 8

Representation of DSP Algorithms (2/3)

DSP algorithms can be described by mathematical formulations

– Behavioral description
  • Applicative language, e.g. Silage
  • Prescriptive language, e.g. C
  • Descriptive language, e.g. Verilog
– Graphical description
  • Block diagram
  • Signal-Flow graph
  • Data-Flow graph
  • Dependence graph
  ⇒ least structure bias
– Graphical representations are efficient for investigating and analyzing the data-flow properties of DSP algorithms and for exploiting their inherent parallelism

SLIDE 9

Representation of DSP Algorithms (3/3)

4 possible paths

– Input node to delay element
– Input node to output node
– Delay element to delay element
– Delay element to output node

Example: 5-tap FIR filter, assuming TA = 4 ns and TM = 10 ns

– Critical path = 26 ns
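The 26 ns figure is consistent with a direct-form structure in which the longest delay-free path runs through one multiplier and a chain of four adders (10 + 4 × 4 = 26). The sketch below assumes that structure, which the transcript does not show, and computes the critical path as the longest path over zero-delay edges; node names and the graph encoding are hypothetical.

```python
from functools import lru_cache


def critical_path(exec_time, edges):
    """Longest total execution time over paths that contain no delay elements.

    exec_time: {node: time}; edges: iterable of (src, dst, n_delays).
    Assumes the zero-delay subgraph is acyclic.
    """
    succ = {node: [] for node in exec_time}
    for src, dst, n_delays in edges:
        if n_delays == 0:
            succ[src].append(dst)

    @lru_cache(maxsize=None)
    def longest_from(node):
        return exec_time[node] + max((longest_from(s) for s in succ[node]), default=0)

    return max(longest_from(node) for node in exec_time)


# Assumed direct-form 5-tap FIR: multipliers M0..M4 feed an adder chain A1..A4.
TA, TM = 4, 10
fir_times = {"x": 0}
fir_times.update({f"M{k}": TM for k in range(5)})
fir_times.update({f"A{k}": TA for k in range(1, 5)})
fir_edges = [("x", f"M{k}", k) for k in range(5)]          # k delays before tap k
fir_edges += [("M0", "A1", 0)] + [(f"M{k}", f"A{k}", 0) for k in range(1, 5)]
fir_edges += [(f"A{k}", f"A{k + 1}", 0) for k in range(1, 4)]
```

Calling `critical_path(fir_times, fir_edges)` on this assumed graph reproduces the 26 ns quoted on the slide.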

SLIDE 10

Block Diagram

A block diagram

– Consists of functional blocks connected with directed edges
– Can be constructed with different levels of abstraction

A system can be represented using various block diagrams

– Data-broadcast structure

y[n] = h_0 x[n] + h_1 x[n-1] + h_2 x[n-2] + h_3 x[n-3]

SLIDE 11

Signal-Flow Graph (SFG)

An SFG is a collection of nodes and directed edges

– Nodes
  • Source: no entering edge
  • Sink: only entering edges
  • Adder, multiplier, …
– Directed edge (j,k), from node j to node k
  • Constant gain multipliers
  • Delay elements

y[n] = h_0 x[n] + h_1 x[n-1] + h_2 x[n-2] + h_3 x[n-3]

SLIDE 12

Signal-Flow Graph (SFG)

Transposition of an SFG is applicable to linear SISO systems

– Reverse the direction of all edges
– Exchange input and output

Transpose operations are also applicable to MIMO systems described by symmetric transformation matrices

y[n] = h_0 x[n] + h_1 x[n-1] + h_2 x[n-2] + h_3 x[n-3]
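For the 4-tap FIR example, transposing the direct-form SFG gives a structure in which x(n) is broadcast to all multipliers and the delays sit between the adders. The sketch below simulates both forms sample by sample to show that they compute the same y[n]; it assumes zero initial state, and the function names are illustrative.

```python
def fir_direct_form(h, x):
    """Direct form: delay line on the input, then a weighted sum."""
    delay_line = [0] * (len(h) - 1)        # holds x(n-1), x(n-2), ...
    y = []
    for sample in x:
        taps = [sample] + delay_line
        y.append(sum(hk * tk for hk, tk in zip(h, taps)))
        delay_line = taps[:-1]
    return y


def fir_transposed_form(h, x):
    """Transposed (data-broadcast) form: x(n) feeds every multiplier at once,
    and the delays store partial sums between the adders."""
    state = [0] * (len(h) - 1)
    y = []
    for sample in x:
        products = [hk * sample for hk in h]
        y.append(products[0] + (state[0] if state else 0))
        # Each delay receives its product plus the next delay's old content.
        state = [products[k + 1] + (state[k + 1] if k + 1 < len(state) else 0)
                 for k in range(len(state))]
    return y
```

Feeding a unit impulse into either function returns the coefficients h_0 … h_3 in order, which is a quick check that the two structures realize the same transfer function.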

SLIDE 13

Data-Flow Graph (DFG)

In DFG representations,

– Each node has an associated execution time
  • Nodes represent computations, functions, or tasks
– Each edge may have a nonnegative number of delays

SLIDE 14

Data-Flow Graph (DFG)

The data-driven property can be captured by the DFG

– A node can fire (execute) whenever all of its input data are available
– Many nodes can be fired simultaneously
  ⇒ Concurrency
– Each directed edge
  ⇒ Precedence constraint

Intra-iteration precedence constraint

– The edge has zero delays

Inter-iteration precedence constraint

– The edge has one or more delays
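The distinction between the two kinds of precedence constraint depends only on the delay count of each edge. A minimal sketch, using a hypothetical `Edge` record and a two-node example (A to B with no delay, B back to A through one delay):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    delays: int   # nonnegative number of delay elements on the edge


def precedence_constraints(edges):
    """Split edges into intra-iteration (zero delays) and inter-iteration
    (one or more delays) precedence constraints."""
    intra = [e for e in edges if e.delays == 0]
    inter = [e for e in edges if e.delays > 0]
    return intra, inter


example_edges = [Edge("A", "B", 0), Edge("B", "A", 1)]
intra, inter = precedence_constraints(example_edges)
```

Here the edge A to B constrains the same iteration (B in iteration k waits for A in iteration k), while the edge B to A constrains the next iteration.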

SLIDE 15

Synchronous Data-Flow Graph (SDFG)

– A special case of DFG in which the number of data samples produced or consumed by each node in each execution is specified a priori

[Figure: single-rate system; multi-rate SDFG with 3 f_A = 5 f_B and 2 f_B = 3 f_C; single-rate DFG]
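The rate equations 3 f_A = 5 f_B and 2 f_B = 3 f_C can be solved for the smallest integer firing counts. The sketch below assumes they arise from an edge A to B on which A produces 3 samples and B consumes 5 per execution, and an edge B to C on which B produces 2 and C consumes 3; that reading of the figure, the edge encoding, and the function name are assumptions.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_counts(nodes, edges):
    """Smallest positive integer firing counts k such that, for every edge
    (src, dst, produced, consumed): produced * k[src] == consumed * k[dst].

    Assumes a connected graph; raises ValueError on inconsistent rates.
    """
    rate = {nodes[0]: Fraction(1)}
    pending = list(edges)
    while pending:
        progressed = False
        for edge in list(pending):
            src, dst, produced, consumed = edge
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
            elif src in rate and dst in rate:
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent sample rates")
            else:
                continue          # neither endpoint known yet; retry later
            pending.remove(edge)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    # Scale the rational rates to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    return {node: int(r * scale) for node, r in rate.items()}
```

Under these assumptions, one period of the schedule fires A five times, B three times, and C twice, which is also the number of copies of each node an equivalent single-rate graph would need.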

SLIDE 16

Dependence Graph (DG)

A DG is a directed graph that shows the dependence of the computations in an algorithm

– Nodes: computations
– Edges: precedence constraints

DGs are widely used in systolic array designs

– SFGs can be derived from DGs

SLIDE 17

Summary

Block diagram

SFG

– Provides an abstract flow-graph representation of linear networks and has been used extensively in digital filter structure design and in the analysis of finite-wordlength effects

DFG

– Generally used in high-level synthesis to derive concurrent implementations of DSP applications on parallel hardware, where subtask scheduling and resource allocation are of major concern

DG

– Widely used in systolic array designs

SLIDE 18

Iteration Period

Iteration

– For a node, it is the execution of the node exactly once
– For a DFG, it is the execution of each node in the DFG exactly once

Iteration period

– The time required for execution of one iteration

Iteration rate

– The number of iterations executed per second

[Figure: example with A_k ⇒ B_k and B_k ⇒ A_{k+1}, labeled T_M + T_A]

SLIDE 19

Loop Bound

Loop (cycle)

– A directed path that begins and ends at the same node

Loop bound of loop j: Tj / Wj

– Tj is the loop computation time
– Wj is the number of delays in the loop
– The critical loop is the loop with the maximum loop bound

Example:

– The loop bound = 3

y(n) = a y(n-2) + x(n)
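The loop bound Tj / Wj is a single division once the node times in the loop are known. The transcript does not give the execution times behind the value 3, so the sketch assumes an adder time of 2 and a multiplier time of 4 (in arbitrary time units), chosen only because they agree with the stated result for a loop with two delays.

```python
from fractions import Fraction


def loop_bound(node_times, n_delays):
    """Loop bound T_j / W_j: total computation time divided by delay count."""
    if n_delays <= 0:
        raise ValueError("a loop must contain at least one delay")
    return Fraction(sum(node_times), n_delays)


# Assumed execution times, not given on the slide.
T_ADD, T_MULT = 2, 4
# Loop of y(n) = a y(n-2) + x(n): one adder, one multiplier, two delays.
bound = loop_bound([T_ADD, T_MULT], n_delays=2)
```

With these assumed times, `bound` equals 3.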

SLIDE 20

Iteration Bound

Many DSP algorithms contain feedback loops

Iteration bound

– An inherent lower bound on the iteration period (or sample period)
  • It is not possible to achieve an iteration period lower than the iteration bound, even with infinite processing elements
– Equal to the loop bound of the critical loop
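Since the iteration bound is the loop bound of the critical loop, a small DFG can be handled by enumerating every loop and taking the maximum of Tj / Wj. The sketch below does this by depth-first search; the graph encoding and the three-node example are hypothetical, and integer execution times are assumed.

```python
from fractions import Fraction


def iteration_bound(exec_time, edges):
    """Maximum loop bound over all simple loops of a DFG.

    exec_time: {node: integer time}; edges: list of (src, dst, n_delays).
    Returns None if the graph has no loops. Enumerates loops by depth-first
    search, so it is only practical for small graphs.
    """
    succ = {node: [] for node in exec_time}
    for src, dst, n_delays in edges:
        succ[src].append((dst, n_delays))

    best = None

    def walk(start, node, time, delays, visited):
        nonlocal best
        for nxt, d in succ[node]:
            if nxt == start:
                if delays + d == 0:
                    raise ValueError("delay-free loop: DFG is not computable")
                bound = Fraction(time, delays + d)
                best = bound if best is None else max(best, bound)
            elif nxt not in visited:
                walk(start, nxt, time + exec_time[nxt], delays + d, visited | {nxt})

    for start in exec_time:
        walk(start, start, exec_time[start], 0, {start})
    return best


# Hypothetical DFG with two loops: A-B-A (2 delays) and A-B-C-A (3 delays).
demo_times = {"A": 2, "B": 4, "C": 5}
demo_edges = [("A", "B", 0), ("B", "A", 2), ("B", "C", 0), ("C", "A", 3)]
```

In this example the loop bounds are 6/2 = 3 and 11/3, so the second loop is critical and the iteration bound is 11/3.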

SLIDE 21

Remarks

Retiming

A_N ⇒ B_{N+1} ⇒ A_{N+2} ⇒ B_{N+3} …

SLIDE 22

Algorithms for Computing Iteration Bound

Finding the iteration bound by examining every loop can take a long execution time

– The number of loops in a DFG can grow exponentially with the number of nodes

Two algorithms for computing T∞

– Longest Path Matrix (LPM) Algorithm
– Minimum Cycle Mean (MCM) Algorithm
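The transcript names the LPM algorithm without showing its steps, so the following is a sketch of the commonly described formulation rather than the lecture's own presentation. It assumes a DFG with d delays and a given first matrix L1, where L1[i][j] is the longest computation time over paths from delay i to delay j that pass through no other delay, or -1 if no such path exists. Higher-order matrices are built by a max-plus product, and T∞ is the maximum of L(m)[i][i] / m for m = 1 … d. The example matrix is illustrative and not taken from the slides.

```python
from fractions import Fraction


def iteration_bound_lpm(L1):
    """Iteration bound from the first longest path matrix L1 (integer entries)."""
    d = len(L1)

    def compose(A, B):
        # Max-plus product that skips missing (-1) paths.
        out = [[-1] * d for _ in range(d)]
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    if A[i][k] != -1 and B[k][j] != -1:
                        out[i][j] = max(out[i][j], A[i][k] + B[k][j])
        return out

    best = None
    Lm = L1
    for m in range(1, d + 1):
        # A diagonal entry of L(m) is the time of a loop containing m delays.
        for i in range(d):
            if Lm[i][i] != -1:
                bound = Fraction(Lm[i][i], m)
                best = bound if best is None else max(best, bound)
        Lm = compose(L1, Lm)
    return best


# Illustrative matrix for a DFG with four delays.
L1_demo = [
    [-1, 0, -1, -1],
    [4, -1, 0, -1],
    [5, -1, -1, 0],
    [5, -1, -1, -1],
]
```

For this matrix the diagonals give candidates 4/2, 5/3, 8/4 and 5/4, so the computed bound is 2. The cost is polynomial in the number of delays, which avoids enumerating loops.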

SLIDE 23

LPM Algorithm (1/2)

SLIDE 24

LPM Algorithm (2/2)

SLIDE 25

MCM Algorithm

SLIDE 26

T∞ of Multirate DFGs (MRDFGs)

Construct the equivalent single-rate DFG (SRDFG)

– Compute the iteration bound of the equivalent SRDFG
– The iteration bound of the MRDFG is the same as the iteration bound of the equivalent SRDFG

kx: the number of nodes related to "x"

Note: same number of delay elements