Section 4: Arithmetic Units



SLIDE 1

Section 4: Arithmetic Units

SLIDE 2

ALU

SLIDE 3

Arithmetic Logic Unit (ALU)

[Figure: Data Arithmetic Unit block diagram. Labels: register file R0 to R7, each split into 16-bit halves Rn.H and Rn.L; 32-bit load buses LD0 and LD1; 32-bit store bus SD; two 16-bit units and four 8-bit units; barrel shifter; 40-bit accumulators A0 and A1.]

SLIDE 4

Arithmetic Logic Unit (ALU)

  • Two 40-bit ALUs operate on 16-bit, 32-bit, and 40-bit input operands and output 16-bit, 32-bit, and 40-bit results.
  • Functions
    − Fixed-point addition and subtraction
    − Addition and subtraction of immediate values
    − Accumulation and subtraction of multiplier results
    − Logical AND, OR, NOT, XOR, bitwise XOR (LFSR), Negate
    − Functions: ABS, MAX, MIN, Round, division primitives
    − Supports conditional instructions
  • Four 8-bit video ALUs
    − Explained in more detail as part of the Advanced Instructions section

SLIDE 5

40-bit ALU Operations

  • The 40-bit ALUs support the following operations:
    − Single 16-Bit Operations
    − Dual 16-Bit Operations
    − Quad 16-Bit Operations
    − Single 32-Bit Operations
    − Dual 32-Bit Operations

SLIDE 6

ALU Operations: Single 16-Bit Operations

  • Single 16-bit Addition, Subtraction Operations
    − Any two 16-bit register halves may be used as inputs.
    − One 16-bit result is deposited in the designated 16-bit register half.
    − The saturation option, (s) or (ns), must be specified.
  • General Form:
      Dreg_lo_hi = Dreg_lo_hi + Dreg_lo_hi (sat_flag);
  • Example:
      R6.H = R3.H + R2.L (s);

[Figure: Single 16-bit addition. Half-registers of R2 and R3 are added and the 16-bit sum is written to a half of R6.]
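Note (illustrative sketch, not part of the original slides): the difference between the (s) and (ns) options can be modelled in a few lines of Python. The helper names `to_signed16` and `add16` are hypothetical, and the model assumes that (s) clamps the signed result to the 16-bit range while (ns) lets it wrap.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def add16(a, b, saturate):
    """Model of a single 16-bit add on two register halves.

    saturate=True models the (s) option: clamp to [-0x8000, 0x7FFF].
    saturate=False models the (ns) option: wrap around modulo 2**16.
    """
    total = to_signed16(a) + to_signed16(b)
    if saturate:
        total = max(-0x8000, min(0x7FFF, total))
    return total & 0xFFFF
```

For example, adding 0x7000 and 0x2000 overflows the positive range, so the saturating form returns 0x7FFF while the non-saturating form wraps to 0x9000.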

SLIDE 7

ALU Operations: Dual 16-Bit Operations

  • Dual 16-bit Addition, Subtraction Operations
    − Any two 32-bit registers may be used as inputs.
    − Two 16-bit results are deposited in the designated 32-bit register.
  • General Form:
      Dreg = Dreg +|+ Dreg [(opt_mode_0)];
      Dreg = Dreg -|- Dreg [(opt_mode_0)];
      Dreg = Dreg +|- Dreg [(opt_mode_0)];
      Dreg = Dreg -|+ Dreg [(opt_mode_0)];
  • Example:
      R6 = R2 +|- R3;

[Figure: Dual 16-bit addition. The 16-bit halves of R2 and R3 are combined pairwise and both results are written to R6.]
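Note (illustrative sketch, not part of the original slides): a dual 16-bit operation can be modelled as two independent 16-bit operations packed into one 32-bit word. The function name `dual16` is hypothetical. The model assumes the operator written to the left of the bar applies to the high halves and the one to the right applies to the low halves, and it ignores the optional modes (results simply wrap).

```python
def dual16(x, y, op_hi, op_lo):
    """Model of Dreg = Dreg op_hi|op_lo Dreg with no saturation.

    x and y are 32-bit register values; op_hi and op_lo are "+" or "-".
    """
    def apply(op, a, b):
        # 16-bit wrap-around add or subtract
        return (a + b if op == "+" else a - b) & 0xFFFF

    hi = apply(op_hi, (x >> 16) & 0xFFFF, (y >> 16) & 0xFFFF)
    lo = apply(op_lo, x & 0xFFFF, y & 0xFFFF)
    return (hi << 16) | lo
```

With R2 = 0x00050003 and R3 = 0x00010002, `dual16(R2, R3, "+", "-")` gives 0x00060001: the high halves are added (5 + 1) and the low halves are subtracted (3 - 2).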

SLIDE 8

ALU Operations: Quad 16-Bit Operations

  • Quad 16-bit Addition, Subtraction Operations
    − Any two 32-bit registers may be used as inputs.
    − Four 16-bit results are deposited in two designated 32-bit registers.
  • General Form:
      Dreg = Dreg +|+ Dreg, Dreg = Dreg -|- Dreg [(opt_mode_0, opt_mode_2)];
      Dreg = Dreg +|- Dreg, Dreg = Dreg -|+ Dreg [(opt_mode_0, opt_mode_2)];
  • Example:
      R3 = R0 +|+ R1, R2 = R0 -|- R1;

[Figure: Quad 16-bit addition. The halves of R0 and R1 are added pairwise into R3 and subtracted pairwise into R2.]

SLIDE 9

ALU Operations: Single 32-Bit Operations

  • Single 32-bit Addition, Subtraction Operations
    − Any two 32-bit registers may be used as inputs.
    − One 32-bit result is deposited in the designated 32-bit register.
    − Optional saturation flag
  • General Form:
      Dreg = Dreg + Dreg [(sat_flag)];
      Dreg = Dreg - Dreg [(sat_flag)];
  • Example:
      R6 = R2 + R3;

[Figure: 32-bit addition. R2 plus R3 is written to R6.]

SLIDE 10

ALU Operations: Dual 32-Bit Operations

  • Dual 32-bit Addition, Subtraction Operations
    − Any two 32-bit registers may be used as inputs.
    − Two 32-bit results are deposited in two designated 32-bit registers.
  • General Form:
      Dreg = Dreg + Dreg, Dreg = Dreg - Dreg [(opt_mode_1)];
  • Example:
      R3 = R1 + R2, R4 = R1 - R2;

[Figure: Dual 32-bit operation. R1 plus R2 is written to R3, and R1 minus R2 is written to R4.]

SLIDE 11

ALU Operation Options and Examples

Examples:
    R6 = R0 -|+ R1 (s);
    R7 = R3 -|- R6 (SCO);

The Vector Add / Subtract instruction provides three option modes.

  • opt_mode_0 supports the Dual and Quad 16-Bit Operations versions of this instruction.
  • opt_mode_1 supports the Dual 32-bit and 40-bit operations.
  • opt_mode_2 supports the Quad 16-Bit Operations versions of this instruction.

SLIDE 12

ALU Operations: Dual 16-Bit Cross Options

  • High result is placed in the low half of the designated result register.
  • Low result is placed in the high half of the designated result register.
  • Example:
      R0 = R2 +|- R1 (CO);

[Figure: Dual 16-bit operation on R2 and R1 with the two 16-bit results crossed over before being written to R0.]
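Note (illustrative sketch, not part of the original slides): the cross option only changes where the two 16-bit results land. The function name `dual16_add_sub_cross` is hypothetical; it models `R0 = R2 +|- R1 (CO)` under the assumption that the high halves are added, the low halves are subtracted, and no saturation is applied.

```python
def dual16_add_sub_cross(x, y):
    """Model of Dreg = x +|- y (CO) without saturation.

    The high-half sum is stored in the LOW half of the result and the
    low-half difference is stored in the HIGH half.
    """
    hi_result = (((x >> 16) & 0xFFFF) + ((y >> 16) & 0xFFFF)) & 0xFFFF
    lo_result = ((x & 0xFFFF) - (y & 0xFFFF)) & 0xFFFF
    return (lo_result << 16) | hi_result
```

With inputs 0x00050003 and 0x00010002 the uncrossed result would be 0x00060001; with (CO) the halves swap and the result is 0x00010006.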

SLIDE 13

Rounding Instructions

Example
    /* If r6 = 0xFFFC FFFF, then rounding to 16-bits with . . . */
    r1.l = r6 (rnd) ;   // . . . produces r1.l = 0xFFFD
    // If r7 = 0x0001 8000, then rounding . . .
    r1.h = r7 (rnd) ;   // . . . produces r1.h = 0x0002

General Form
    dest_reg = src_reg (RND)

Syntax
    Dreg_lo_hi = Dreg (RND) ;   /* round and saturate the source to 16 bits. (b) */

The Round to Half-Word instruction rounds a 32-bit, normalized-fraction number into a 16-bit, normalized-fraction number by extracting and saturating bits 31–16, then discarding bits 15–0. The instruction supports only biased rounding, which adds a half LSB (in this case, bit 15) before truncating bits 15–0. The ALU performs the rounding. The RND_MOD bit in the ASTAT register has no bearing on the rounding behavior of this instruction. Fractional data types such as the operands used in this instruction are always signed.
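Note (illustrative sketch, not part of the original slides): the biased rounding described above, adding half an LSB at bit 15 and then truncating bits 15 to 0, can be checked against the two examples on this slide. The function name `round_to_half` is hypothetical, and saturation is modelled as a clamp of the intermediate sum to the signed 32-bit maximum.

```python
def round_to_half(x):
    """Model of Dreg_lo_hi = Dreg (RND): biased round of 32 bits to 16 bits."""
    value = x & 0xFFFFFFFF
    if value & 0x80000000:          # reinterpret as signed 32-bit
        value -= 1 << 32
    value += 0x8000                 # add half an LSB (bit 15)
    value = min(value, 0x7FFFFFFF)  # saturate on positive overflow
    return (value >> 16) & 0xFFFF   # keep bits 31..16, discard bits 15..0
```

Both slide examples are reproduced: 0xFFFCFFFF rounds to 0xFFFD and 0x00018000 rounds to 0x0002. The largest positive input, 0x7FFFFFFF, saturates to 0x7FFF instead of wrapping to a negative value.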

SLIDE 14

Other ALU Operations: Pointer Register Example Instructions

  • P5 = P3 + P0;   // add two 32-bit pointer registers
  • P5 += -4;       // add immediate value to P register

SLIDE 15

32-bit ALU Logical Operations

  • AND
      General Form: Dreg = Dreg & Dreg;
      Example: R4 = R4 & R3;
  • NOT
      General Form: Dreg = ~Dreg;
      Example: R3 = ~R4;
  • OR
      General Form: Dreg = Dreg | Dreg;
      Example: R4 = R4 | R3;
  • XOR
      General Form: Dreg = Dreg ^ Dreg;
      Example: R4 = R4 ^ R3;

SLIDE 16

ASTAT Register

SLIDE 17

ALU Instruction Summary

SLIDE 18

ALU Instruction Summary

SLIDE 19

Conditional Code (CC) Bit in ASTAT

  • CC bit is used in several instructions
    − Action taken in the instruction depends on the value of CC
    − if cc jump here;   // if cc = 1, jump to label "here"
    − if cc r3 = r0;     // perform move if cc = 1
  • CC bit value is based on a comparison of two registers, pointers, or accumulators
  • CC bit can be moved to and from a data register or ASTAT bit
  • CC bit can be negated

SLIDE 20

CC Bit Instructions

General Syntax for Data/Pointer Register Compare Operations
    CC = operand_1 == operand_2
    CC = operand_1 < operand_2
    CC = operand_1 <= operand_2
    CC = operand_1 < operand_2 (IU)
    CC = operand_1 <= operand_2 (IU)

Examples
    CC = Dreg == Dreg ;   /* equal, register, signed (a) */
    CC = Dreg == imm3 ;   /* equal, immediate, signed (a) */
    CC = Preg == Preg ;   /* equal, register, signed (a) */
    CC = Preg == imm3 ;   /* equal, immediate, signed (a) */

General Syntax for Accumulator Compare Operations
    CC = A0 == A1
    CC = A0 < A1
    CC = A0 <= A1
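Note (illustrative sketch, not part of the original slides): the same 32-bit pattern can compare differently depending on whether it is read as signed or unsigned. The helper names `s32` and `cc_less` are hypothetical, and the model assumes that the (IU) option selects an unsigned comparison while the default is signed.

```python
def s32(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def cc_less(a, b, unsigned=False):
    """Model of CC = a < b, optionally with the (IU) option.

    Returns the resulting CC bit as 0 or 1.
    """
    if unsigned:
        return int((a & 0xFFFFFFFF) < (b & 0xFFFFFFFF))
    return int(s32(a) < s32(b))
```

For instance, 0xFFFFFFFF is -1 when signed, so `cc_less(0xFFFFFFFF, 1)` sets CC to 1, whereas `cc_less(0xFFFFFFFF, 1, unsigned=True)` clears it.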

SLIDE 21

ALU Exercise

LAB 3

SLIDE 22

Multiply-Accumulators (MAC)

SLIDE 23

Multiply-Accumulators (MAC)

[Figure: Data Arithmetic Unit block diagram. Labels: register file R0 to R7 with halves Rn.H and Rn.L; 32-bit load buses LD0 and LD1; 32-bit store bus SD; two 16-bit units and four 8-bit units; barrel shifter; 40-bit accumulators acc0 and acc1.]

SLIDE 24

Multiply-Accumulators (MAC)

  • Two identical MACs
    − Each can perform fixed-point multiplication and multiply-and-accumulate operations on 16-bit fixed-point input data, and outputs 32-bit or 40-bit results depending on the destination.
  • Functions
    − Multiplication
    − Multiply-and-accumulate with addition (optional rounding)
    − Multiply-and-accumulate with subtraction (optional rounding)
    − Dual versions of the above
  • Features
    − Saturation of accumulator results
    − Optional rounding of multiplier results

SLIDE 25

Placement of Binary Point in Multiplication

  • Binary Integer Multiplication
      M bits x P bits => (M+P) bits
      Example: 16.0 x 16.0 => 32.0
  • Mixed/Fractional Multiplication
      M.N bits x P.Q bits => (M+P).(N+Q) bits
      Examples: 1.15 x 1.15 => 2.30 **
                4.12 x 1.15 => 5.27

** In fractional mode, the result of a multiplication is automatically shifted left by 1 bit, resulting in a 1.31 format.
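Note (illustrative sketch, not part of the original slides): the format rule above is plain bookkeeping on the number of integer and fractional bits. The function names `product_format` and `after_left_shift` are hypothetical.

```python
def product_format(m, n, p, q):
    """Format of the product of an M.N number and a P.Q number."""
    return (m + p, n + q)


def after_left_shift(fmt, shift=1):
    """Shifting left moves the binary point: fewer integer bits, more fraction bits."""
    integer_bits, fraction_bits = fmt
    return (integer_bits - shift, fraction_bits + shift)
```

Applying it to the slide examples: `product_format(1, 15, 1, 15)` gives (2, 30), and the automatic 1-bit left shift turns that into (1, 31), the 1.31 format mentioned in the footnote.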

SLIDE 26

Multiplier Results

[Figure: multiplier result formats, shown for fractional mode and integer mode.]

SLIDE 27

Placement of Binary Point in A0

[Figure: layout of the 40-bit accumulator A0 (MSB is bit 39) in fractional mode and integer mode. In both cases the fields are A0.X (overflow bits), A0.H (most significant 16 bits), and A0.L (least significant 16 bits), with the sign bit (S) marked. The two panels differ in where the binary point is drawn.]

SLIDE 28

Multiplication Modes -- Fractional Mode

Mode 1: fractional mode

  • Multiplier assumes all numbers are in 1.15 format
  • Multiplier automatically shifts the product 1 bit left before accumulation (result forced to 1.31 format)
  • Example: A0 = R0.L * R1.L;

[Figure: R0.L = 0x4000 (= 0.5) times R1.L = 0x4000 (= 0.5) gives A0 = 0x00 2000 0000 (= 0.25), with A0.H = 0x2000. The diagram also carries "overflow" and "underflow" labels on the accumulator fields.]
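Note (illustrative sketch, not part of the original slides): the fractional-mode example can be reproduced by multiplying the two signed 16-bit operands and shifting the product left by one bit. The helper names are hypothetical. The model returns a 40-bit accumulator image and deliberately does not cover the corner case where both operands are 0x8000.

```python
def s16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mult_fractional(a, b):
    """Model of A0 = a * b in fractional (1.15 x 1.15) mode.

    The 2.30 product is shifted left by 1 to give 1.31, then
    sign-extended into a 40-bit accumulator image.
    """
    product = (s16(a) * s16(b)) << 1
    return product & 0xFFFFFFFFFF


def acc_to_float(acc40):
    """Read a 40-bit accumulator image as a signed value with 31 fraction bits."""
    value = acc40 - (1 << 40) if acc40 & (1 << 39) else acc40
    return value / 2 ** 31
```

`mult_fractional(0x4000, 0x4000)` returns 0x0020000000, which `acc_to_float` reads as 0.25, matching 0.5 x 0.5 on the slide.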

SLIDE 29

Multiplication Modes -- Integer Mode

Mode 2: integer mode

  • Multiplier assumes all numbers are in 16.0 format
  • No automatic left shift is necessary
  • Example: A0 = R0.L * R1.L (IS);

[Figure: R0.L = 0x4000 (= 2^14) times R1.L = 0x4000 (= 2^14) gives A0 = 0x00 1000 0000 (= 2^28), with A0.L = 0x0000. The diagram also carries "overflow" labels on the accumulator fields.]
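Note (illustrative sketch, not part of the original slides): in integer mode the product is used as is, with no left shift. The helper names are hypothetical, and the model treats both operands as signed 16-bit integers, which is an assumption about what the (IS) option means.

```python
def s16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mult_integer(a, b):
    """Model of A0 = a * b (IS): signed 16.0 x 16.0 product, no shift.

    Returns the product sign-extended into a 40-bit accumulator image.
    """
    return (s16(a) * s16(b)) & 0xFFFFFFFFFF
```

`mult_integer(0x4000, 0x4000)` returns 0x0010000000, that is 2^14 x 2^14 = 2^28, exactly half of the fractional-mode accumulator value for the same inputs.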

SLIDE 30

Multiply Operations: Example Instructions

  • Example input operand combinations
  • An accumulator, data register, or half-register can be the destination

    A0 = R2.L * R3.L;
    R0 = R2.L * R3.H;
    R0.H = R2.H * R3.L;
    A1 = R2.H * R3.H;

[Figure: for each instruction, the selected halves of R2 and R3 feed a multiplier (X) whose product goes to the destination (A0, R0, R0.H, or A1).]

SLIDE 31

Multiply and Accumulate Operations: Example Instructions

  • Example input operand combinations

    A0 += R2.L * R3.L;
    A0 -= R2.L * R3.H;
    A1 += R2.H * R3.L;
    A1 += R2.H * R3.H;

[Figure: for each instruction, the selected halves of R2 and R3 feed a multiplier (X) whose product is added to or subtracted from the accumulator (A0 or A1).]

SLIDE 32

Multiply and MAC Operations: When Result is Transferred From the Accumulator to a 16-bit Data Register

    R4.L = (A0 += R2.L * R3.L);
    R4.H = (A1 += R2.L * R3.L);

[Figure: halves of R2 and R3 feed a multiply-accumulate (X+) into A0 or A1, and the result is also written to a half of R4.]

SLIDE 33

Multiply and MAC Operations: When Result is Transferred From the Accumulator to a 32-bit Data Register

    R1 = (A1 += R2.L * R3.H);
    R0 = (A0 += R2.L * R3.H);

[Figure: halves of R2 and R3 feed a multiply-accumulate (X+) into A1 or A0, and the result is also written to R1 or R0.]

  • When A0 is used, the destination must be an even Data Register, e.g. R0, R2, R4, R6.
  • When A1 is used, the destination must be an odd Data Register, e.g. R1, R3, R5, R7.
  • In both cases, the accumulate can be removed or replaced by a subtraction.

SLIDE 34

Dual Multiply Operations: Example Instruction

  • Both multipliers can be used in the same operation to double the throughput. The same two 32-bit input registers must be used.

    A1 = R2.H * R3.H, A0 = R2.L * R3.L;

[Figure: R2 and R3 feed two multipliers (X); one product goes to A1 and the other to A0.]

SLIDE 35

Dual MAC Operations: Example Instruction

  • Both MACs can be used in the same operation to double the MAC throughput. The same two 32-bit input registers must be used (R2 and R3 in this example).

    A1 -= R2.H * R3.H, A0 += R2.L * R3.L;

[Figure: R2 and R3 feed two multipliers; one product is subtracted from A1 and the other is added to A0.]

In both cases, the accumulate and subtraction are interchangeable.
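Note (illustrative sketch, not part of the original slides): a dual MAC instruction performs two independent multiply-accumulates on halves of the same two input registers. The helper names are hypothetical. The model assumes fractional mode (the product is shifted left by 1) and assumes the accumulators saturate at the signed 40-bit limits; accumulators are held as ordinary signed Python integers.

```python
ACC_MAX = (1 << 39) - 1   # largest signed 40-bit value
ACC_MIN = -(1 << 39)      # smallest signed 40-bit value


def s16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def frac_product(a, b):
    """Signed 1.15 x 1.15 product, shifted left by 1 to 1.31 format."""
    return (s16(a) * s16(b)) << 1


def sat40(value):
    """Clamp to the signed 40-bit accumulator range."""
    return max(ACC_MIN, min(ACC_MAX, value))


def dual_mac(a1, a0, r2, r3):
    """Model of: A1 -= R2.H * R3.H, A0 += R2.L * R3.L;"""
    a1 = sat40(a1 - frac_product(r2 >> 16, r3 >> 16))
    a0 = sat40(a0 + frac_product(r2, r3))
    return a1, a0
```

Starting from cleared accumulators with R2 = R3 = 0x40004000 (0.5 in both halves), one call leaves A1 at -0x20000000 (-0.25) and A0 at 0x20000000 (+0.25).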

SLIDE 36

Dual MAC Operations: With Destination of Two 16-bit Data Registers

  • Both MACs can be used in the same operation to double the MAC throughput. The same two 32-bit input registers must be used (R6 and R7 in this example).

    R2.H = (A1 += R7.H * R6.H), R2.L = (A0 += R7.L * R6.L);

[Figure: R6 and R7 feed two multiply-accumulates into A1 and A0, and the two results are also written to the halves of R2.]

SLIDE 37

Dual MAC Operations: With Destination of Two 32-bit Data Registers

  • Both MACs can be used in the same operation to double the MAC throughput. The same two 32-bit input registers must be used.

    R3 = (A1 += R7.H * R6.H), R2 = (A0 += R7.L * R6.L);

  • 32-bit Data Register destinations must be used in pairs, e.g. R0:R1 or R2:R3 or R4:R5 or R6:R7.

[Figure: R6 and R7 feed two multiply-accumulates into A1 and A0, and the two results are also written to R3 and R2.]

SLIDE 38

Dual Multiply Operations: With Destination of Two 32-bit Data Registers

  • Both multipliers can be used in the same operation to double the throughput. The same two 32-bit input registers must be used.

    R0 = R2.H * R3.H, R1 = R2.L * R3.L;

  • 32-bit Data Register destinations must be used in pairs, e.g. R0:R1 or R2:R3 or R4:R5 or R6:R7.

[Figure: R2 and R3 feed two multipliers (X); the two products are written to R0 and R1.]

SLIDE 39

16-bit Multiplier Options

SLIDE 40

Unbiased and Biased Rounding

  • Unbiased Rounding: returns the number closest to the original number
    − When it lies exactly halfway between 2 numbers, the nearest even number is returned
  • Biased Rounding: returns the number closest to the original number
    − When it lies exactly halfway between 2 numbers, the larger of the numbers is returned
  • RND_MOD (bit 8 of ASTAT register) set to "1" enables biased rounding
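Note (illustrative sketch, not part of the original slides): the two rounding modes differ only when the discarded bits are exactly one half. The function name `round16` is hypothetical; it rounds a signed integer to a multiple of 2^16 and returns the quotient, without modelling saturation.

```python
def round16(x, biased):
    """Round x / 2**16 to an integer.

    biased=True: ties go to the larger neighbour (round half up).
    biased=False: ties go to the even neighbour (round half to even).
    """
    quotient, remainder = divmod(x, 1 << 16)  # remainder is always >= 0
    half = 1 << 15
    if remainder > half:
        quotient += 1
    elif remainder == half and (biased or quotient & 1):
        quotient += 1
    return quotient
```

The value 0x00028000 (2.5 in units of 2^16) rounds to 3 with biased rounding and to 2 with unbiased rounding, while 0x00018000 (1.5) rounds to 2 in both modes.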

SLIDE 41

ASTAT Register

SLIDE 42

Multiplier Instruction Summary

SLIDE 43

Multiplier Exercise

LAB 4

SLIDE 44

Barrel-Shifter (Shifter)

SLIDE 45

Barrel-Shifter (Shifter)

[Figure: Data Arithmetic Unit block diagram. Labels: register file R0 to R7 with halves Rn.H and Rn.L; 32-bit load buses LD0 and LD1; 32-bit store bus SD; two 16-bit units and four 8-bit units; barrel shifter; 40-bit accumulators acc0 and acc1.]

SLIDE 46

Barrel-Shifter (Shifter)

  • The shifter performs bitwise shifting for 16-bit, 32-bit, or 40-bit inputs and yields 16-bit, 32-bit, or 40-bit outputs.
  • Functions
    − Arithmetic Shift: The Arithmetic Shift instruction shifts a registered number a specified distance and direction while preserving the sign of the original number. The sign bit value back-fills the left-most bit positions vacated by the arithmetic right shift.
    − Logical Shift: The Logical Shift instruction logically shifts a registered number a specified distance and direction. Logical shifts discard any bits shifted out of the register and backfill vacated bits with zeros.
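Note (illustrative sketch, not part of the original slides): the difference between the two right shifts is what fills the vacated high bits. The function names `ashr16` and `lshr16` are hypothetical and model 16-bit right shifts only.

```python
def ashr16(x, n):
    """Arithmetic right shift of a 16-bit value: the sign bit back-fills."""
    x &= 0xFFFF
    signed = x - 0x10000 if x & 0x8000 else x
    return (signed >> n) & 0xFFFF


def lshr16(x, n):
    """Logical right shift of a 16-bit value: zeros back-fill."""
    return (x & 0xFFFF) >> n
```

Shifting 0x8000 right by 7 gives 0xFF00 arithmetically (the sign is preserved) and 0x0100 logically. For a non-negative input such as 0x4000 the two shifts agree.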

SLIDE 47

Barrel-Shifter (Shifter)

  • Functions
    − Rotate: The Rotate instruction rotates a registered number through the CC bit a specified distance and direction.
    − Bit Operations
    − Field Extract and Deposit

SLIDE 48

Shifter Instructions

[Figure: illustrations of Arithmetic Shift and Logical Shift.]

SLIDE 49

Arithmetic Shift: Example Instructions

  • Immediate Shift Magnitude
      R3.L = R0.H >>> 7;            /* arithmetic right shift, half word */
      R5 = R2 << 24 (S);            /* arithmetic left shift */
  • Registered Shift Magnitude
      R3.L = ashift R0.H by R7.L;   /* arithmetic shift, half-word */
      A0 = ashift A0 by R7.L;       /* arithmetic shift, accumulator */

SLIDE 50

Logical Shift: Example Instructions

  • Pointer shift, fixed magnitude
      P3 = P2 >> 1;                 /* pointer right shift by 1 */
      P0 = P1 << 2;                 /* pointer left shift by 2 */
  • Data shift, immediate shift magnitude
      R3.L = R0.L >> 4;             /* data right shift, half word register */
      R3 = R0 << 12;                /* data left shift, 32-bit word */
      A0 = A0 << 7;                 /* accumulator left shift */
  • Data shift, registered shift magnitude
      R3.H = lshift R0.L by R2.L;   /* logical shift, half word register */
      A1 = lshift A1 by R7.L;       /* logical shift, accumulator */

SLIDE 51

Rotate: Example Instructions

  • Immediate Rotate Magnitude
      R4 = rot R1 by 8;      /* rotate left by 8 */
      A0 = rot A0 by -5;     /* rotate right by 5 */
  • Registered Rotate Magnitude
      R4 = rot R1 by R2.L;   /* rotate by value in R2.L */
      A1 = rot A1 by R7.L;   /* rotate by value in R7.L */
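Note (illustrative sketch, not part of the original slides): rotating "through the CC bit" means CC acts as one extra bit in the ring, so a 32-bit register rotates as a 33-bit quantity. The function name `rot32` is hypothetical. The model assumes CC sits above the most significant bit, and follows the slide in treating positive magnitudes as left rotates and negative magnitudes as right rotates.

```python
def rot32(reg, cc, n):
    """Model of Dreg = rot Dreg by n, rotating through the CC bit.

    Returns (new_register, new_cc).
    """
    width = 33                                  # 32 register bits + CC
    value = ((cc & 1) << 32) | (reg & 0xFFFFFFFF)
    n %= width                                  # negative n becomes a right rotate
    value = ((value << n) | (value >> (width - n))) & ((1 << width) - 1)
    return value & 0xFFFFFFFF, value >> 32
```

Rotating 0x80000000 left by 1 with CC = 0 moves the top bit into CC and the old CC into bit 0, giving (0x00000000, 1). Rotating by 33 returns the original register and CC.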

SLIDE 52

Bit Operations: Example Instructions

  • Bit Clear: BITCLR(Dreg, uimm5);
      bitclr(R2, 3);
  • Bit Set: BITSET(Dreg, uimm5);
      bitset(R2, 7);
  • Bit Toggle: BITTGL(Dreg, uimm5);
      bittgl(R2, 24);
  • Bit Test: CC = BITTST (Dreg, uimm5);
      cc = bittst(r7, 15);
  • Bit Test: CC = !BITTST (Dreg, uimm5);
      cc = !bittst(r3, 0);
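Note (illustrative sketch, not part of the original slides): each bit operation is a single mask operation on a 32-bit value. The Python function names mirror the mnemonics but are hypothetical models, not an API; `not_bittst` stands in for the `!BITTST` form.

```python
MASK32 = 0xFFFFFFFF


def bitclr(reg, bit):
    """Clear one bit of a 32-bit value."""
    return reg & ~(1 << bit) & MASK32


def bitset(reg, bit):
    """Set one bit of a 32-bit value."""
    return (reg | (1 << bit)) & MASK32


def bittgl(reg, bit):
    """Toggle one bit of a 32-bit value."""
    return (reg ^ (1 << bit)) & MASK32


def bittst(reg, bit):
    """Return the CC value for CC = BITTST(reg, bit)."""
    return (reg >> bit) & 1


def not_bittst(reg, bit):
    """Return the CC value for CC = !BITTST(reg, bit)."""
    return bittst(reg, bit) ^ 1
```

For example, `bitset(0, 7)` gives 0x80, `bittgl(0, 24)` gives 0x01000000, and `bittst(0x8000, 15)` sets CC to 1.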

SLIDE 53

Field Extract and Deposit: Example Instructions

  • Bit Field Extraction
      R7 = extract (R4, R3.L) (z);   // zero-extended
      R7 = extract (R4, R3.L) (x);   // sign-extended
  • Bit Field Deposit
      R7 = deposit (R4, R3);
      R7 = deposit (R4, R3) (x);     // sign-extended
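Note (illustrative sketch, not part of the original slides): the slides do not say how the field position and length are encoded in the second operand, so this model takes them as explicit arguments. The function name `extract_field` and its parameters `pos` and `length` are hypothetical; only the difference between zero extension (z) and sign extension (x) is shown.

```python
def extract_field(src, pos, length, sign_extend=False):
    """Extract `length` bits of src starting at bit `pos` into a 32-bit result.

    sign_extend=False models the (z) option, True models the (x) option.
    """
    field = (src >> pos) & ((1 << length) - 1)
    if sign_extend and length > 0 and (field >> (length - 1)) & 1:
        field -= 1 << length          # top bit of the field is set
    return field & 0xFFFFFFFF
```

Extracting 8 bits at position 4 from 0x0000ABCD yields the field 0xBC. Zero-extended, the result is 0x000000BC; sign-extended, it is 0xFFFFFFBC because the field's top bit is set.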

SLIDE 54

Shifter Instruction Summary

SLIDE 55

Shifter Instruction Summary

SLIDE 56

Shifter Instruction Summary

SLIDE 57

Barrel Shifter Exercise

LAB 5

SLIDE 58

Reference Material

Arithmetic Units

SLIDE 59

Rounding Instructions

SLIDE 60

Bitwise XOR Instructions

  • These instructions are used to implement Linear Feedback Shift Registers (LFSRs)
  • Applications include CRC (Cyclic Redundancy Check) calculations and PRN (Pseudo Random Number) generators
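Note (illustrative sketch, not part of the original slides): an LFSR step XORs together a selected set of state bits (the parity of state AND mask) and shifts that feedback bit into the register. This is a generic model of the technique, not of the specific instructions. The names `parity`, `lfsr_step`, and `lfsr_period`, and the 4-bit tap mask used in the example, are assumptions chosen for illustration.

```python
def parity(x):
    """XOR of all bits of x (1 if an odd number of bits are set)."""
    return bin(x).count("1") & 1


def lfsr_step(state, mask, width):
    """Shift left by one and insert the XOR of the tapped bits at bit 0."""
    feedback = parity(state & mask)
    return ((state << 1) | feedback) & ((1 << width) - 1)


def lfsr_period(seed, mask, width):
    """Count steps until the state returns to the seed.

    Assumes the mask taps the top bit, so every state has a unique
    predecessor and the sequence is guaranteed to return to the seed.
    """
    state, steps = lfsr_step(seed, mask, width), 1
    while state != seed:
        state = lfsr_step(state, mask, width)
        steps += 1
    return steps
```

With a 4-bit register and taps on bits 3 and 2 (mask 0b1100), a seed of 0b0001 cycles through all 15 non-zero states before repeating, which is the maximal length for 4 bits.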

SLIDE 61

Bitwise XOR Instructions

[Figure: two bitwise XOR examples, one without feedback and one with feedback.]