Chapter 18: Programmable DSPs
Keshab K. Parhi and Viktor Owall
- Chap. 18
2
DSP Applications
DSP applications are often real time but with a wide variety of sample rates
- High rates
  – Radar
  – Video
- Medium rates
  – Audio
  – Speech
- Low rates
  – Weather
  – Finance
...with different demands on
- numeric representation
– float or fixed – and number of bits
- Throughput/speed
- Power/energy dissipation
- Cost
DSP features
[Figure: 4-tap FIR filter with input x(n), three delay elements D, coefficients h0 to h3, and output y(n)]
Fast Multiply/Accumulate (MAC)
- FIR
- FFT
- etc.
- Multiple Access Memories
- Specialized addressing modes
- Specialized execution control (loops)
- Specialized interfaces, e.g. AD/DA
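The FIR structure named above reduces to one multiply/accumulate per tap and per output sample. A minimal Python sketch, assuming a direct-form filter with zero initial state; the function name `fir_mac` and the sample values are illustrative and not taken from the slides:

```python
def fir_mac(x, h):
    """Direct-form FIR: y(n) = sum over k of h[k] * x(n - k), zero initial state."""
    delay = [0] * len(h)                 # delay[k] holds x(n - k)
    y = []
    for sample in x:
        delay = [sample] + delay[:-1]    # shift the new sample into the delay line
        acc = 0
        for coeff, tap in zip(h, delay):
            acc += coeff * tap           # one MAC per tap
        y.append(acc)
    return y


# Feeding a unit impulse returns the coefficients, followed by zeros.
impulse_response = fir_mac([1, 0, 0, 0, 0], [1, 2, 3, 4])
```

The inner loop is the part a DSP would execute as a hardware loop of single-cycle MAC instructions; the Python list shift stands in for the specialized addressing a real device would use.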
Addressing Modes
- Implied addressing
P=X*Y;
- Operation sets location
- Immediate data
AX0=1234
- Memory direct
R1=Mem[101]
- Register direct
sub R1, R2
- Register indirect
A0=A0+ *R5
- Register indirect with increment/decrement
A0=A0+ *R5++
A0=A0+ *R5--
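Register indirect addressing with post-increment can be illustrated with a small model of the machine state. This is only a sketch: the register names follow the examples above, while the memory contents and the helper `acc_indirect_postinc` are hypothetical.

```python
# Hypothetical machine state; register names mirror the slide's examples.
mem = [0] * 8
mem[2:6] = [10, 20, 30, 40]       # data block starting at address 2
regs = {"A0": 0, "R5": 2}         # R5 holds an address, A0 accumulates


def acc_indirect_postinc(regs, mem):
    """Model of A0 = A0 + *R5++ : read through R5, then advance R5."""
    regs["A0"] += mem[regs["R5"]]
    regs["R5"] += 1


for _ in range(4):
    acc_indirect_postinc(regs, mem)
```

After the loop, `A0` holds the sum of the four values and `R5` points one past the block, so no separate address-update instruction was needed inside the loop.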
Standard DSP Alternatives
PCs or Workstations
- Non-real time
- low requirements
General purpose microprocessors
- slower for DSP applications
- a µprocessor might be in the system anyway
Custom
- performance
- low cost at volume
- High development cost
Standard Processors vs. Special Purpose
[Figure: an algorithm can be mapped to a standard processor, to a special purpose design, or to something in between (processor cores, domain specific processors, etc.)]
Standard Processor
- Programmable
- Low Design cost
- Standard Interface
- Good supply of tools
Special Purpose
- High Calculation Capacity
- Low Power
- User defined Interface
- Variable Wordlength
- Low Price at Volume
Architectural Partitioning
Conflicting requirements
- Throughput
- Flexibility
- Power Consumption
- Time to market
[Figure: a processor core with main memory, next to a partitioned design with a processor core, two ASIC blocks, main memory, and distributed memory]
- Local busses and distributed memory to decrease data transfers
- MIPS intensive algorithms in dedicated HW to increase throughput and save power
- Flexibility by using programmable processor core
Fixed point DSP
Motorola DSP56000x
[Figure: data path with 24-bit operand registers X0, X1, Y0, Y1, a shifter, an ALU, 56-bit accumulators A and B, and a shifter/limiter]
- A DSP usually has a single cycle multiplier, which may be pipelined
- Double wordlength output plus guard bits
- Scaling
- Alternative is a multiplier with reduced wordlength output, e.g. 24 bits
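The widths in the figure can be checked with a short sketch. It assumes signed integer operands of 24 bits, so that a product needs at most 48 bits and a 56-bit accumulator leaves 8 guard bits. The helpers `wrap`, `saturate` and `mac` are hypothetical names, and the limiter is modeled simply as clamping to 48 bits, which may differ in detail from a real device.

```python
def wrap(value, bits):
    """Interpret value as a two's-complement number of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saturate(value, bits):
    """Clamp value to the signed range of the given width (limiter model)."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, value))


def mac(acc, x, y):
    """acc += x * y with 24-bit signed operands and a 56-bit accumulator."""
    product = wrap(x, 24) * wrap(y, 24)   # always fits in 48 bits
    return wrap(acc + product, 56)


MAX24 = (1 << 23) - 1
acc = 0
for _ in range(4):
    acc = mac(acc, MAX24, MAX24)          # four worst-case positive products

exceeds_48_bits = acc > (1 << 47) - 1     # the sum no longer fits in 48 bits
limited = saturate(acc, 48)               # clamp when leaving the accumulator
```

The guard bits let the running sum exceed the double wordlength without wrapping; overflow is handled once, at the output, instead of after every MAC.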
Memory Structures, von Neumann
[Figure: processor core connected to a single memory by one address bus and one data bus]
Memory Structures, Harvard
[Figure: processor core connected to two memories, Memory A and Memory B, each with its own address bus and data bus]
Original Harvard
- one data memory
- one program memory
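A rough way to see the benefit of separate program and data memories is to count memory cycles. The sketch below is a deliberately simplified model, assuming every instruction needs one instruction fetch and one data read, and that each memory serves one access per cycle; `bus_cycles` is a hypothetical helper.

```python
def bus_cycles(n_instructions, harvard):
    """Memory cycles for n instructions that each read one data operand.

    Assumes one access per memory per cycle (a simplification).
    """
    if harvard:
        # Instruction fetch and data read use different buses, so they overlap.
        return n_instructions
    # A single shared bus serializes the fetch and the data read.
    return 2 * n_instructions


shared = bus_cycles(100, harvard=False)
separate = bus_cycles(100, harvard=True)
```

Under these assumptions the separate memories halve the memory cycles; real processors add caches, pipelines and further memory banks, so the actual ratio varies.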
TI Processors, high speed
TI Processors, low power
TI, C64
TI, C55
Processor Architectures
SIMD – Single Instruction Multiple Data
[Figure: one program driving four processors]
Processor Architectures
MIMD – Multiple Instruction Multiple Data
[Figure: four processors, each with its own program]
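The difference between SIMD and MIMD can be sketched as a difference in how programs are paired with data. The functions `simd_step` and `mimd_step` are hypothetical, and the Python code runs sequentially, so it models only the data flow, not the parallel execution.

```python
def simd_step(program, data_lanes):
    """SIMD: one program (instruction) is applied to every data lane."""
    return [program(d) for d in data_lanes]


def mimd_step(programs, data_lanes):
    """MIMD: each processor runs its own program on its own data."""
    return [p(d) for p, d in zip(programs, data_lanes)]


data = [1, 2, 3, 4]
simd_out = simd_step(lambda v: v * 2, data)
mimd_out = mimd_step(
    [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * v],
    data,
)
```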
Processor Architectures
VLIW – Very Long Instruction Words
[Figure: a control unit issues one VLIW instruction to four functional units]
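A VLIW instruction can be pictured as a bundle of operations, one per functional unit, issued in the same cycle. The sketch below assumes that every slot reads the register values from the start of the cycle and that all results are written back together; the register names and `execute_vliw` are hypothetical.

```python
import operator


def execute_vliw(word, regs):
    """Execute one VLIW word: all slots read the old register state,
    and all results are written back together at the end of the cycle."""
    snapshot = dict(regs)
    results = {}
    for dest, op, src1, src2 in word:       # one slot per functional unit
        results[dest] = op(snapshot[src1], snapshot[src2])
    regs.update(results)
    return regs


regs = {"r0": 3, "r1": 4, "r2": 0, "r3": 0, "r4": 10, "r5": 0}
word = [
    ("r2", operator.mul, "r0", "r1"),   # multiplier slot
    ("r3", operator.add, "r0", "r1"),   # ALU slot
    ("r5", operator.sub, "r4", "r2"),   # reads the old r2, not the new product
]
execute_vliw(word, regs)
```

Because the third slot sees the old value of `r2`, dependent operations have to be placed in different instruction words, typically by the compiler.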
Split Processors
Functional units can be split into submodules, e.g. for images (8 bits)
- TI TMS320C80: 1 RISC processor and 4 x 32-bit DSPs, which can be split into 8-bit modules
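Splitting a 32-bit unit into 8-bit modules amounts to stopping the carry at each lane boundary. The following sketch emulates that in software for four packed 8-bit values; it is a generic illustration, not a description of the TI hardware, and `packed_add8` is a hypothetical name.

```python
def packed_add8(a, b):
    """Add four 8-bit lanes packed in 32-bit words, with no carry between
    lanes (each lane wraps modulo 256)."""
    low7 = 0x7F7F7F7F
    msb = 0x80808080
    # Add the low 7 bits of each lane, then fold in the top bits with XOR
    # so that a carry out of one lane never reaches the next lane.
    partial = (a & low7) + (b & low7)
    return (partial ^ ((a ^ b) & msb)) & 0xFFFFFFFF


pixels_a = 0x10FF8001
pixels_b = 0x01017FFF
result = packed_add8(pixels_a, pixels_b)
```

Each byte of `result` is the sum of the corresponding bytes of the inputs, so one 32-bit operation processes four 8-bit pixels at once.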
Vector Processors
Low Power MMAC
Multiplier Multiple Accumulator
- V. Sundararajan and K.K. Parhi, "A Novel Multiply Multiple Accumulator Component for Low Power PDSP Design", Proc. of 2000 IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Vol. 6, pp. 3247-3250, Istanbul, June 2000