CPE 335 Computer Organization: MIPS Arithmetic, Part II



SLIDE 1

CPE 335 Computer Organization
MIPS Arithmetic – Part II

  • Dr. Iyad Jafar

Adapted from Dr. Gheith Abandah's slides: http://www.abandah.com/gheith/Courses/CPE335_S08/index.html

CPE 232 MIPS Arithmetic 1

SLIDE 2

Shift Operations

Shifts move all the bits in a word left or right

  sll $t2, $s0, 8    # $t2 = $s0 << 8 bits
  srl $t2, $s0, 8    # $t2 = $s0 >> 8 bits

  Op code   rs       rt       rd       shamt    funct
  6 bits    5 bits   5 bits   5 bits   5 bits   6 bits

Notice that a 5-bit shamt field is enough to shift a 32-bit value by up to 2^5 – 1 = 31 bit positions

Such shifts are logical because they fill with zeros

SLIDE 3

Shift Operations, cont'd

An arithmetic shift (sra) maintains the arithmetic correctness of the shifted value (i.e., a number shifted right one bit should be ½ of its original value; a number shifted left one bit should be 2 times its original value)

  • so sra uses the most significant bit (sign bit) as the bit shifted in
  • note that there is no need for a sla when using two's complement number representation

  sra $t2, $s0, 8    # $t2 = $s0 >> 8 bits (sign bit shifted in)

The shift operation is implemented by hardware separate from the ALU

  • Using a barrel shifter: a digital circuit that can shift a data word by a specified number of bits in one clock cycle
  • Simply a set of multiplexers!
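A minimal Python sketch of the logical and arithmetic shifts on 32-bit words. It assumes registers are modeled as unsigned integers masked to 32 bits; the helper names sll, srl, and sra only mirror the MIPS mnemonics and are illustrative.

```python
MASK32 = 0xFFFFFFFF  # keep every result within a 32-bit word


def sll(x, shamt):
    """Shift left logical: zeros enter from the right."""
    return (x << shamt) & MASK32


def srl(x, shamt):
    """Shift right logical: zeros enter from the left."""
    return (x & MASK32) >> shamt


def sra(x, shamt):
    """Shift right arithmetic: copies of the sign bit enter from the left."""
    x &= MASK32
    if x & 0x80000000:          # negative in two's complement
        x -= 1 << 32            # reinterpret as a signed Python integer
    return (x >> shamt) & MASK32


print(hex(srl(0xFF000000, 8)))  # zero fill
print(hex(sra(0xFF000000, 8)))  # sign fill
```

For a word with the sign bit set, srl and sra differ only in the bits shifted in from the left, which is why sra preserves the "divide by two" meaning for negative values.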

SLIDE 4

Shift Operations – Barrel Shifter

Example: 4-bit barrel shifter (rotate to the left)

[Figure: 4-bit barrel shifter with data inputs D3–D0, select inputs S1 S0, and outputs Y3–Y0]

  Shift Value    Output
  S1  S0         Y3  Y2  Y1  Y0
  0   0          D3  D2  D1  D0
  0   1          D2  D1  D0  D3
  1   0          D1  D0  D3  D2
  1   1          D0  D3  D2  D1

S1 and S0 determine the shift amount: 0, 1, 2, or 3
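The rotate-left table can be modeled with one 4-to-1 multiplexer per output bit. The sketch below is a behavioral illustration only; the function names mux4 and barrel_rotate_left4 are hypothetical, and the wiring of each multiplexer is read directly off the table rows.

```python
def mux4(inputs, s1, s0):
    """4-to-1 multiplexer: selects inputs[0..3] by the 2-bit value S1 S0."""
    return inputs[2 * s1 + s0]


def barrel_rotate_left4(d, s1, s0):
    """d is (D3, D2, D1, D0); returns (Y3, Y2, Y1, Y0) rotated left by S1 S0."""
    d3, d2, d1, d0 = d
    # Each output has its own mux; input k of the mux is the bit that
    # reaches this output when the rotate amount is k.
    y3 = mux4((d3, d2, d1, d0), s1, s0)
    y2 = mux4((d2, d1, d0, d3), s1, s0)
    y1 = mux4((d1, d0, d3, d2), s1, s0)
    y0 = mux4((d0, d3, d2, d1), s1, s0)
    return (y3, y2, y1, y0)


for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, barrel_rotate_left4(("D3", "D2", "D1", "D0"), s1, s0))
```

All four outputs are selected in parallel, which is how the shift completes in a single step regardless of the shift amount.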

SLIDE 5

Multiply

Binary multiplication is just a bunch of left shifts of the multiplicand and adds

For n-bit operands, the size of the product is 2n bits

[Figure: worked example with a 4-bit multiplicand, a 4-bit multiplier, the partial products, and an 8-bit product]

SLIDE 6

Multiplication Hardware

Hardware implementation of the multiplication algorithm

Operation

  • Initialize the lower 32 bits of the multiplicand register with the multiplicand
  • Initialize the product register to 0
  • If the multiplier LSB is 1, add the multiplicand to the product
  • If the multiplier LSB is 0, don't add
  • Shift the multiplicand left and the multiplier right by one bit
  • Repeat 32 times
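The operation steps listed for this hardware can be sketched in Python as follows. This is an illustrative model rather than the hardware itself: the function name multiply_shift_add and the word-size parameter n are assumptions, with n = 32 matching the slide.

```python
def multiply_shift_add(multiplicand, multiplier, n=32):
    """Unsigned shift-and-add multiply with a 2n-bit multiplicand register."""
    mask_n = (1 << n) - 1
    mask_2n = (1 << (2 * n)) - 1

    mcand = multiplicand & mask_n   # lower n bits of the 2n-bit register
    mplier = multiplier & mask_n
    product = 0                     # 2n-bit product register

    for _ in range(n):              # repeat n times
        if mplier & 1:              # multiplier LSB is 1: add
            product = (product + mcand) & mask_2n
        mcand = (mcand << 1) & mask_2n   # shift multiplicand left
        mplier >>= 1                     # shift multiplier right
    return product


print(multiply_shift_add(5, 3, n=4))
```

Passing n=4 reproduces a 4-bit by 4-bit multiply with an 8-bit product.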

SLIDE 7

Multiplication Hardware

Flowchart for the multiplication algorithm

If each step takes one cycle, we need almost 100 cycles for a 32-bit multiplication (three steps per iteration, 32 iterations)

Check the multiplication example on page 180 for better understanding!

SLIDE 8

Optimized Multiplication Hardware

ALU and multiplicand register are both 32 bits wide

Multiplier register is omitted and the multiplier is placed in the lower 32 bits of the product register

The product register is shifted to the right, carrying the multiplier along with it, until we have 32 repetitions
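A Python sketch of the optimized arrangement, assuming the add is applied to the upper half of the product register and its carry-out is kept for the following right shift. The function name multiply_optimized and the parameter n are hypothetical.

```python
def multiply_optimized(multiplicand, multiplier, n=32):
    """Unsigned multiply with an n-bit adder and a shared product register."""
    mask_n = (1 << n) - 1
    mcand = multiplicand & mask_n
    product = multiplier & mask_n   # multiplier sits in the lower half

    for _ in range(n):
        if product & 1:             # current multiplier bit is the register LSB
            upper = (product >> n) + mcand           # n-bit add, carry kept
            product = (upper << n) | (product & mask_n)
        product >>= 1               # product and multiplier shift right together
    return product


print(multiply_optimized(5, 3, n=4))
```

Each right shift discards one used multiplier bit and frees one more position for the growing product, which is why no separate multiplier register is needed.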

SLIDE 9

Fast Multiplication Units

Use 31 32-bit adders to compute the partial products

One input is the multiplicand ANDed with a multiplier bit, and the other is the partial product from the previous step.

Example: show the multiplication tree to compute 5 × 3. Assume unsigned numbers represented using 3 bits and that we have a 4-bit ALU.

SLIDE 10

Multiplication - Notes

Multiplies are done by fast, dedicated hardware that is much more complex and slower than an adder

Multiplication by a power of two can be performed by simple left shifts in hardware. It is the compiler's responsibility to choose when to use left shifts for multiplication by a power of 2 in order to reduce the execution time

Signed multiplication can be performed in a similar manner. Convert the multiplicand and the multiplier to positive numbers (if necessary), then determine the product sign from their signs. What is the logic required to compute the sign of the product?
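One possible answer to the sign question, sketched in Python: the product is negative exactly when the operand signs differ, which is an XOR of the two sign bits. The function names are hypothetical, and Python's built-in multiply on the magnitudes merely stands in for the unsigned hardware multiplier.

```python
def product_sign(sign_a, sign_b):
    """Sign bit of the product (1 = negative): XOR of the operand sign bits."""
    return sign_a ^ sign_b


def signed_multiply(a, b):
    """Multiply magnitudes, then attach the sign computed separately."""
    sign_a = 1 if a < 0 else 0
    sign_b = 1 if b < 0 else 0
    magnitude = abs(a) * abs(b)     # stand-in for the unsigned multiplier
    return -magnitude if product_sign(sign_a, sign_b) else magnitude


def times_power_of_two(x, k):
    """Multiply by 2**k with a left shift, as a compiler might emit sll."""
    return x << k


print(signed_multiply(-5, 3), times_power_of_two(7, 3))
```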

SLIDE 11

MIPS Multiply Instructions

Multiply produces a double precision product

  mult  $s0, $s1    # hi||lo = $s0 * $s1
  multu $s0, $s1    # hi||lo = $s0 * $s1

  Op code   rs       rt       rd       shamt    funct
  6 bits    5 bits   5 bits   5 bits   5 bits   6 bits

Low-order word of the product is left in processor register lo and the high-order word is left in register hi

Instructions mfhi rd and mflo rd are provided to move the product to (user accessible) registers in the register file

Both instructions ignore overflow; it is the responsibility of the software to check if the result fits into 32 bits!

  • For multu, there is no overflow if hi is 0
  • For mult, there is no overflow if hi is the replicated sign of lo
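The two overflow rules can be checked with a small Python model of the hi and lo registers. This is a sketch under the assumption that register values are passed as 32-bit patterns; the function names other than the mult and multu mnemonics are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def multu(rs, rt):
    """Unsigned multiply: returns (hi, lo) of the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def mult(rs, rt):
    """Signed multiply: returns (hi, lo) of the 64-bit two's complement product."""
    product = (to_signed32(rs) * to_signed32(rt)) & ((1 << 64) - 1)
    return (product >> 32) & MASK32, product & MASK32


def multu_fits_32(hi, lo):
    """multu: the result fits in 32 bits when hi is 0."""
    return hi == 0


def mult_fits_32(hi, lo):
    """mult: the result fits when hi is the sign of lo replicated 32 times."""
    sign_fill = MASK32 if lo & 0x80000000 else 0
    return hi == sign_fill


hi, lo = mult(0xFFFFFFFF, 2)        # -1 * 2
print(hex(hi), hex(lo), mult_fits_32(hi, lo))
```

In MIPS code the same checks would be done after mfhi and mflo, for example by comparing hi against lo shifted right arithmetically by 31.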

SLIDE 12

Division

Division is just a bunch of quotient digit guesses and left shifts and subtracts

Dividend = Quotient × Divisor + Remainder

SLIDE 13

Division Hardware

Division algorithm

  • The divisor is placed in the upper 32 bits of the divisor register and the dividend is placed in the lower 32 bits of the remainder register
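A rough Python sketch of shift-and-subtract division. It assumes the divisor starts in the upper half of a 2n-bit register, the dividend starts in the lower half of the remainder register, and the restoring scheme with n + 1 iterations found in common textbook presentations is used. The function name and the parameter n are hypothetical.

```python
def divide_restoring(dividend, divisor, n=32):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("software must check the divisor first")

    mask_n = (1 << n) - 1
    remainder = dividend & mask_n          # lower half of the 2n-bit register
    div = (divisor & mask_n) << n          # divisor in the upper half
    quotient = 0

    for _ in range(n + 1):
        remainder -= div                   # trial subtraction
        if remainder >= 0:
            quotient = ((quotient << 1) | 1) & mask_n   # guess was right
        else:
            remainder += div               # restore the old remainder
            quotient = (quotient << 1) & mask_n
        div >>= 1                          # shift divisor right
    return quotient, remainder


print(divide_restoring(7, 2, n=4))
```

The result satisfies Dividend = Quotient × Divisor + Remainder for unsigned operands.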

SLIDE 14

MIPS Divide Instruction

Divide generates the remainder in hi and the quotient in lo

  div  $s0, $s1    # lo = $s0 / $s1
                   # hi = $s0 mod $s1
  divu $s0, $s1

  Op code   rs       rt       rd       shamt    funct
  6 bits    5 bits   5 bits   5 bits   5 bits   6 bits

Instructions mfhi rd and mflo rd are provided to move the quotient and remainder to (user accessible) registers in the register file

As with multiply, divide ignores overflow so software must determine if the quotient is too large.

  • Software must also check the divisor to avoid division by 0.

SLIDE 15

Divide - Notes

Signed division

  • Remember the signs of the dividend and divisor and use them to determine the sign of the quotient
  • The sign of the remainder is always the same as the dividend
  • (Check by yourself the division of 5/2 using different combinations of the signs of the dividend and the divisor)

Fast division algorithms use look-up tables to guess several quotient bits per step. The algorithms rely on subsequent steps to correct wrong guesses

  • The Pentium bug in 1994
  • Cost for recall was about $500M
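The 5/2 exercise can be checked with a short Python sketch of the sign rules. The function name signed_divide is hypothetical. Note that Python's own // and % operators round toward negative infinity, so the sketch divides magnitudes and attaches the signs explicitly.

```python
def signed_divide(dividend, divisor):
    """Divide magnitudes, then apply the sign rules for quotient and remainder."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    q_mag, r_mag = divmod(abs(dividend), abs(divisor))
    # Quotient is negative when the operand signs differ.
    quotient = -q_mag if (dividend < 0) != (divisor < 0) else q_mag
    # Remainder takes the sign of the dividend.
    remainder = -r_mag if dividend < 0 else r_mag
    return quotient, remainder


for dividend in (5, -5):
    for divisor in (2, -2):
        q, r = signed_divide(dividend, divisor)
        # Dividend = Quotient x Divisor + Remainder holds in every case.
        print(dividend, divisor, q, r, q * divisor + r == dividend)
```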

SLIDE 16

Representing Big (and Small) Numbers

How to encode real numbers?

  • 4,600,000,000 or 4.6 x 10^9
  • 0.0000000000000000000000000166 or 1.6 x 10^-27
  • There is no way we can encode either of the above in a 32-bit integer.

Floating point representation: (-1)^sign x F x 2^E

Still have to fit everything in 32 bits (single precision)

  S       E (exponent)   Fraction
  1 bit   8 bits         23 bits

Normalized representation (no leading zeros and one bit to the left of the binary point)

More bits in the fraction (F) or the exponent (E) is a trade-off between precision (accuracy of the number) and range (size of the number)

Smallest number is about 2.0 x 10^-38 and largest is about 2.0 x 10^38

SLIDE 17

Representing Big (and Small) Numbers

Overflow (the exponent is too large to fit in the exponent field) and underflow (the negative exponent is too large to fit)!

Double precision format (use 64 bits instead of 32)

  S       E         F
  1 bit   11 bits   52 bits

Smallest number is about 2.0 x 10^-308 and largest is about 2.0 x 10^308

Most computers these days conform to the IEEE 754 floating point standard

To pack more bits into the significand, one bit of the normalized binary numbers is implicitly assumed to be 1

Since 0 has no leading 1, it has a reserved exponent value of 0 so that hardware won't attach 1 to it

SLIDE 18

Representing Big (and Small) Numbers

Special numbers in the IEEE standard

  Single Precision        Double Precision        Object Represented
  E (8)     F (23)        E (11)     F (52)
  0         0             0          0            true zero (0)
  0         nonzero       0          nonzero      ± denormalized number
  1-254     anything      1-2046     anything     ± floating point number
  255       0             2047       0            ± infinity
  255       nonzero       2047       nonzero      not a number (NaN)
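The single precision column of the table can be turned into a small classifier. This Python sketch uses the standard struct module to obtain bit patterns; the function names classify_single and float_to_bits are hypothetical.

```python
import struct


def float_to_bits(x):
    """Return the 32-bit IEEE 754 single precision pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def classify_single(bits):
    """Classify a 32-bit pattern by its exponent (E) and fraction (F) fields."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"


for value in (0.0, 1.0, float("inf"), float("nan")):
    print(value, classify_single(float_to_bits(value)))
```

The sign bit plays no part in the classification, which is why every row except NaN carries a ± in the table.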

SLIDE 19

IEEE 754 FP Standard Encoding

This representation is intended to simplify sorting of floating point numbers using integer comparison

Separate sign bit (sign and magnitude notation)

Placing the exponent before the significand

Use of biased exponent notation; add a constant value to represent all exponents with positive numbers

  • In single precision, the bias is 127
      – Exponent -3 is represented as -3 + 127 = 124
      – Exponent 5 is represented as 5 + 127 = 132
  • In double precision, the bias is 1023

So in biased notation, the decimal value represented by the normalized floating-point number is

  (-1)^S x (1 + Fraction) x 2^(Exponent – Bias)
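A short Python check of the biased exponent and of the sorting property, using the standard struct module. The helper names and the sample values are assumptions chosen for illustration; the comparison is shown only for positive numbers, since negative values in sign and magnitude form need the sign bit handled separately.

```python
import struct


def float_to_bits(x):
    """Return the 32-bit IEEE 754 single precision pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def biased_exponent(x):
    """Extract the 8-bit exponent field (true exponent + 127)."""
    return (float_to_bits(x) >> 23) & 0xFF


print(biased_exponent(0.125))   # 2^-3: exponent field holds -3 + 127
print(biased_exponent(32.0))    # 2^5:  exponent field holds 5 + 127

# Positive floats listed in increasing order...
values = [0.125, 0.75, 1.0, 5.0, 32.0, 1.0e10]
bit_patterns = [float_to_bits(v) for v in values]
# ...have bit patterns that are also in increasing integer order.
print(bit_patterns == sorted(bit_patterns))
```

Because the exponent sits above the fraction and is stored as a non-negative biased value, a larger exponent always gives a larger integer pattern.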

SLIDE 20

Floating-point Example

Example 1: Show the IEEE 754 representation of -0.75 using single and double precision formats

  • (0.75)ten = (0.11)two
  • (-0.75)ten = (-0.11)two (we use sign and magnitude)
  • in binary scientific notation: -0.11two x 2^0
  • in normalized binary scientific notation: -1.1two x 2^-1
  • add the bias to the exponent
      – In single precision add 127: -1.1two x 2^126
      – In double precision add 1023: -1.1two x 2^1022
  • convert the exponent into binary
      – 126 = (01111110)two
      – 1022 = (01111111110)two
  • drop the 1 on the left of the binary point and fill the corresponding fields

SLIDE 21

Floating-point Example

Example 1: Show the IEEE 754 representation of -0.75 using single and double precision formats

Single precision

  S = 1   E = 01111110      F = 100...0 (23 bits)

Double precision

  S = 1   E = 01111111110   F = 100...0 (52 bits)
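The field values for -0.75 can be confirmed with Python's standard struct module, which packs floats in IEEE 754 format. The function names single_fields and double_fields are hypothetical helpers written for this check.

```python
import struct


def single_fields(x):
    """Return (sign, exponent, fraction) of the 32-bit encoding of x."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def double_fields(x):
    """Return (sign, exponent, fraction) of the 64-bit encoding of x."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


s, e, f = single_fields(-0.75)
print(s, format(e, "08b"), format(f, "023b"))

s, e, f = double_fields(-0.75)
print(s, format(e, "011b"), format(f, "052b"))
```

In both formats the fraction field has only its leftmost bit set, which is the ".1" left after the leading 1 of 1.1two is dropped.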

SLIDE 22

Floating-point Example

Example 2: What decimal number N is represented by the following float?

  S = 1   E = 10000001   F = 0100...0 (23 bits)

  N = (-1)^S x (1 + Fraction) x 2^(Exponent – Bias)
    = (-1)^1 x (1 + 0.25) x 2^(129 – 127)
    = -1 x 1.25 x 2^2
    = -1.25 x 4
    = -5
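The decoding formula can be applied mechanically to a 32-bit pattern. In this Python sketch the function name decode_single is hypothetical, and the pattern 0xC0A00000 is the encoding with S = 1, Exponent = 129, and Fraction = 0.25 used in the calculation; only normalized numbers are handled.

```python
def decode_single(bits):
    """Evaluate (-1)^S x (1 + Fraction) x 2^(Exponent - 127) for a normalized float."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    fraction = (bits & 0x7FFFFF) / (1 << 23)   # fraction field as a value in [0, 1)
    if not 1 <= exponent <= 254:
        raise ValueError("not a normalized number")
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)


print(decode_single(0xC0A00000))
```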

SLIDE 23

Floating Point Addition

Analogy to adding floating decimals (Example: 9.999 x 10^1 + 1.610 x 10^-1 using four digits)

Steps to perform (±F1 × 2^E1) + (±F2 × 2^E2) = ±F3 × 2^E3

  • Step 0: Restore the hidden bit in F1 and in F2
  • Step 1: Align fractions by right shifting F2 by E1 - E2 positions (assuming E1 ≥ E2)
  • Step 2: Add the resulting F2 to F1 to form F3
  • Step 3: Normalize F3 (so it is in the form 1.XXXXX …) and check for overflow/underflow in the exponent
  • Step 4: Round F3 and possibly normalize F3 again
  • Step 5: Rehide the most significant bit of F3 before storing the result

SLIDE 24

Floating-Point Adder Hardware

[Figure: block diagram of the floating-point adder hardware]

SLIDE 25

Floating Point Addition

Example: show how to add 0.625 and -0.125 using floating point binary representation

In normalized scientific notation this is equivalent to

  1.010 x 2^-1 + -1.000 x 2^-3

Align exponents

  1.010 x 2^-1 + -0.010 x 2^-1

Add significands

  1.000 x 2^-1

Normalize the sum (if necessary) and check for overflow/underflow

Round the sum and normalize again
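The addition of 0.625 and -0.125 can be traced with a toy Python model that follows the align, add, and normalize steps. This is a simplified sketch under stated assumptions: operands are (sign, significand, exponent) triples with the hidden bit already restored, the significand is an integer holding 1.xxx with three fraction bits, bits shifted out during alignment are truncated (no guard or round bits), and exponent overflow checks are omitted. The names fp_add and to_value are hypothetical.

```python
def to_value(sign, significand, exponent, frac_bits=3):
    """Numeric value of a (sign, significand, exponent) triple."""
    return (-1) ** sign * (significand / (1 << frac_bits)) * 2.0 ** exponent


def fp_add(a, b, frac_bits=3):
    """Add two toy floating point numbers; returns a normalized triple."""
    (s1, f1, e1), (s2, f2, e2) = a, b
    if e2 > e1:                                  # make operand 1 the larger exponent
        (s1, f1, e1), (s2, f2, e2) = (s2, f2, e2), (s1, f1, e1)

    f2 >>= (e1 - e2)                             # align: right shift the smaller operand
    total = (-f1 if s1 else f1) + (-f2 if s2 else f2)   # add signed significands
    if total == 0:
        return (0, 0, 0)

    sign = 1 if total < 0 else 0
    f3, e3 = abs(total), e1
    one = 1 << frac_bits                         # the pattern 1.000
    while f3 >= 2 * one:                         # normalize: sum of the form 1x.xxx
        f3 >>= 1
        e3 += 1
    while f3 < one:                              # normalize: sum of the form 0.xxx
        f3 <<= 1
        e3 -= 1
    return (sign, f3, e3)


x = (0, 0b1010, -1)    # 0.625  = 1.010 x 2^-1
y = (1, 0b1000, -3)    # -0.125 = -1.000 x 2^-3
result = fp_add(x, y)
print(result, to_value(*result))
```

Storing the result would then drop the leading 1 of the significand again, as the hidden bit is never written to memory.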

SLIDE 26

Accurate Arithmetic

In arithmetic we are restricted by the number of bits. Thus we may need to truncate the operand with the smallest power to fit into the available bits

IEEE 754 arithmetic keeps two extra bits to the right of the numbers during intermediate calculations: the guard and round bits.

Decimal example: 2.56 x 10^0 + 2.34 x 10^2

Assume the significand is represented in 3 digits only

Without guard and round digits (truncation occurs for two digits)

  (2.34 + 0.02) x 10^2 = 2.36 x 10^2

With guard and round digits, we don't have to truncate the small number when it is shifted to the right to match the large number

  (2.3400 + 0.0256) x 10^2 = 2.3656 x 10^2 = 2.37 x 10^2 (after rounding)

Sticky bit: set whenever there are nonzero bits to the right of the round bit!
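The decimal example can be replayed with Python's standard decimal module. This is only an illustration of the effect of the two extra digits; the variable names are assumptions, and quantize with an explicit rounding mode stands in for the hardware's truncation and final rounding.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

big = Decimal("2.34")             # significand of 2.34 x 10^2
small = Decimal("2.56")           # significand of 2.56 x 10^0
shifted = small.scaleb(-2)        # aligned to 10^2: 0.0256

# Without guard and round digits: the shifted operand is cut to 3 digits.
truncated = shifted.quantize(Decimal("0.01"), rounding=ROUND_DOWN)
sum_no_guard = big + truncated

# With guard and round digits: add at full width, round once at the end.
sum_exact = big + shifted
sum_guard = sum_exact.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

print(sum_no_guard, sum_exact, sum_guard)
```

The two results differ in the last kept digit, which is the error the extra digits are meant to avoid.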

SLIDE 27

MIPS Floating Point Instructions

MIPS has a separate Floating Point Register File ($f0, $f1, …, $f31) (those registers are used in pairs for double precision values) with special instructions to load to and store from them

  lwc1 $f1, 54($s2)    # $f1 = Memory[$s2+54]
  swc1 $f1, 58($s4)    # Memory[$s4+58] = $f1

And supports IEEE 754 single precision operations

  add.s $f2, $f4, $f6    # $f2 = $f4 + $f6

and double precision operations

  add.d $f2, $f4, $f6    # $f2||$f3 = $f4||$f5 + $f6||$f7

similarly for sub.s, sub.d, mul.s, mul.d, div.s, div.d

SLIDE 28

MIPS Floating Point Instructions, cont'd

And floating point single precision comparison operations

  c.x.s $f2, $f4    # if ($f2 x $f4) cond = 1; else cond = 0

where x may be eq, neq, lt, le, gt, ge

and branch operations

  bc1t 25    # if (cond == 1) go to PC+4+100
  bc1f 25    # if (cond == 0) go to PC+4+100

And double precision comparison operations

  c.x.d $f2, $f4    # if ($f2||$f3 x $f4||$f5) cond = 1; else cond = 0