Designing memory-reference instructions: lw & sw, - - PowerPoint PPT Presentation


08/01/2005

Designing MIPS Processor

(Single-Cycle)

Presentation G CSE 675.02: Introduction to Computer Architecture

Gojko Babić

Introduction

  • We're now ready to look at an implementation of the system that includes a MIPS processor and memory.
  • The design will include support for execution of only:
    – memory-reference instructions: lw & sw,
    – arithmetic-logical instructions: add, sub, and, or, slt & nor,
    – control flow instructions: beq & j,
    – exception handling: illegal instruction & overflow.
  • But that design will provide us with the principles, so many more instructions could easily be added, such as: addu, lb, lbu, lui, addi, addiu, sltu, slti, andi, ori, xor, xori, jal, jr, jalr, bne, beqz, bgtz, bltz, nop, mfhi, mflo, mfepc, mfc0, lwc1, swc1, etc.

Single Cycle Design

  • We shall first design a simpler processor that executes each instruction in only one clock cycle.
  • This is not efficient from a performance point of view, since:
    – the clock cycle time (and hence the clock rate) must be chosen such that the longest instruction can be executed in one clock cycle, and
    – that makes shorter instructions execute in one unnecessarily long cycle.
  • Additionally, no resource in the design may be used more than once per instruction, thus some resources will be duplicated.
  • Because of that, the single-cycle design will require:
    – two memories (instruction and data),
    – two additional adders.

Elements for Datapath Design

  • a. Program counter (PC): 32-bit input, 32-bit output
  • b. Register file: three 5-bit register numbers (Read register 1, Read register 2, Write register), 32-bit Write data input, two 32-bit outputs (Read data 1, Read data 2), RegWrite control
  • c. ALU: two 32-bit inputs, 4-bit ALU control, 32-bit ALU result and Zero outputs
  • d. Adder: two 32-bit inputs, 32-bit Sum output
  • e. Data memory unit: 32-bit Address and Write data inputs, 32-bit Read data output, MemRead and MemWrite controls
  • f. Instruction memory: 32-bit Instruction address input, 32-bit Instruction output (MemRead=1, MemWrite=0)
  • g. Sign-extension unit: 16-bit input, 32-bit output
  • h. Shift left 2: 32-bit input, 32-bit output

Abstract/Simplified View (1st look)

  • Generic implementation:
    – use the program counter (PC) to supply the instruction address,
    – get the instruction from memory,
    – read registers,
    – use the instruction to decide exactly what to do.

[Figure: abstract view of the datapath with PC, instruction memory, registers, ALU and data memory]

Abstract/Simplified View (2nd look)

Figure 5.1

  • PC is incremented by 4 by most instructions, and by 4 + 4×offset by branch instructions.
  • Jump instructions change PC differently (not shown).
Our Implementation

  • An edge-triggered methodology
  • Typical execution:
    – read contents of some state elements at the beginning of the clock cycle,
    – send values through some combinational logic,
    – write results to one or more state elements at the end of the clock cycle.

[Figure 5.5: State element 1, combinational logic and State element 2 within one clock cycle]

  • An edge-triggered methodology allows a state element to be read and written in the same clock cycle without creating a race that could lead to indeterminate data.

Incrementing PC & Fetching Instruction

[Figure 5.6, with addition in red: the PC drives the Read address of the instruction memory, which outputs the Instruction; an adder (Add) computes PC + 4; the PC is written on the Clock]
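The fetch step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `instruction_memory`, the helper `fetch` and the three example instruction words are hypothetical and do not come from the slides. The sketch reads the current PC, computes the instruction and PC + 4 combinationally, and updates the PC only at the end of each simulated clock cycle.

```python
# Hypothetical instruction memory: byte address -> 32-bit instruction word.
# The three words are arbitrary examples chosen for this sketch.
instruction_memory = {
    0x00: 0x8C890004,
    0x04: 0x012A4020,
    0x08: 0xAC880008,
}


def fetch(pc):
    """Return (instruction, next_pc) for the current PC value."""
    instruction = instruction_memory[pc]   # Read address -> Instruction
    next_pc = (pc + 4) & 0xFFFFFFFF        # dedicated adder computes PC + 4
    return instruction, next_pc


pc = 0x00
trace = []
for _ in range(3):                         # three simulated clock cycles
    instruction, next_pc = fetch(pc)       # combinational part of the cycle
    trace.append((pc, instruction))
    pc = next_pc                           # PC is written at the clock edge
```

Because `next_pc` is computed from the old `pc` before the assignment, the PC is effectively read and written in the same cycle without a race, which mirrors the edge-triggered methodology.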

Datapath for R-type Instructions

R-type instruction format:

  bits    31-26    25-21   20-16   15-11   10-6    5-0
  field   000000   rs      rt      rd      00000   funct

funct values: add = 32, sub = 34, slt = 42, and = 36, or = 37, nor = 39

[Figure: instruction bits I25-21, I20-16 and I15-11 select Read register 1, Read register 2 and Write register; Read data 1 and Read data 2 feed the ALU (4-bit ALU control, Zero output); the ALU result returns to Write data; RegWrite and Clock control the register write]
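As a minimal sketch of how the R-type fields are separated, the following Python extracts each field by shifting and masking. The function name `decode_r_type`, the table `FUNCT_NAMES` and the example instruction word are assumptions made for this illustration; the bit positions and funct values are the ones listed on the slide.

```python
# funct values from the slide, mapped to mnemonic names.
FUNCT_NAMES = {32: "add", 34: "sub", 36: "and", 37: "or", 39: "nor", 42: "slt"}


def decode_r_type(instruction):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "opcode": (instruction >> 26) & 0x3F,  # bits 31-26
        "rs": (instruction >> 21) & 0x1F,      # bits 25-21 -> Read register 1
        "rt": (instruction >> 16) & 0x1F,      # bits 20-16 -> Read register 2
        "rd": (instruction >> 11) & 0x1F,      # bits 15-11 -> Write register
        "shamt": (instruction >> 6) & 0x1F,    # bits 10-6
        "funct": instruction & 0x3F,           # bits 5-0
    }


# Example word (hypothetical): opcode 0, rs 9, rt 10, rd 8, funct 32.
fields = decode_r_type(0x012A4020)
operation = FUNCT_NAMES[fields["funct"]]
```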

Complete Datapath for R-type Instructions

Based on the contents of the op-code and funct fields, the Control Unit sets ALU control appropriately and asserts RegWrite, i.e. RegWrite = 1.

[Figure: the R-type datapath (register file addressed by I25-21, I20-16 and I15-11, ALU with 4-bit ALU control) combined with the instruction fetch path (PC, instruction memory, PC + 4 adder), driven by Clock]

Datapath for LW and SW Instructions

sw or lw instruction format:

  bits    31-26     25-21   20-16   15-0
  field   op-code   rs      rt      offset

Control Unit sets:

  • ALU control = 0010 (add) for address calculation for both lw and sw
  • MemRead=0, MemWrite=1 and RegWrite=0 for sw
  • MemRead=1, MemWrite=0 and RegWrite=1 for lw

[Figure: I25-21 selects Read register 1; I20-16 selects Read register 2 (sw) or Write register (lw); I15-0 is sign-extended from 16 to 32 bits and added to Read data 1 by the ALU to form the data memory Address; data memory has Write data and Read data ports and the MemRead and MemWrite controls]
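The address calculation shared by lw and sw can be sketched as follows. The helper names `sign_extend16` and `effective_address` and the dictionary `MEM_CONTROL` are hypothetical names chosen for this example; the control values are the ones the slide lists for lw and sw.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit field to a Python integer (the 16 -> 32 unit)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, offset16):
    """ALU add (ALU control = 0010): base register + sign-extended offset."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


# Control settings from the slide for the two memory-reference instructions.
MEM_CONTROL = {
    "lw": {"MemRead": 1, "MemWrite": 0, "RegWrite": 1},
    "sw": {"MemRead": 0, "MemWrite": 1, "RegWrite": 0},
}

address_forward = effective_address(0x1000, 0x0004)   # offset +4
address_backward = effective_address(0x1000, 0xFFFC)  # offset -4
```

A negative offset such as 0xFFFC shows why the 16-bit field must be sign-extended rather than zero-extended before the addition.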

Datapath for R-type, LW & SW Instructions

Let us determine the setting of the control lines for R-type, lw & sw instructions.

[Figure: combined datapath for R-type, lw and sw, with multiplexers controlled by RegDst, ALUSrc and MemtoReg; instruction fields rs, rt, rd and offset; control lines RegWrite, MemRead, MemWrite and 4-bit ALU control; instruction memory with MemRead=1, MemWrite=0]

Datapath for BEQ Instruction

beq instruction format:

  bits    31-26   25-21   20-16   15-0
  field   beq     rs      rt      offset

Branch target = [PC] + 4 + 4×offset

[Figure 5.9, with additions in red: rs and rt are read from the register file and compared in the ALU, whose Zero output goes to the branch control logic; the 16-bit offset is sign-extended to 32 bits, shifted left by 2 and added to PC + 4 (from the instruction datapath) to form the branch target]
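The branch target formula, [PC] + 4 + 4×offset, can be checked with a short Python sketch. The function names `branch_target` and `next_pc_beq` are hypothetical; the comparison is modelled as an ALU subtraction whose Zero output decides whether the branch is taken, which is an assumption about how the figure's branch control logic is used.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """[PC] + 4 + 4 x offset: sign-extend, shift left 2, add to PC + 4."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def next_pc_beq(pc, rs_value, rt_value, offset16):
    """Select the branch target when rs == rt, otherwise fall through to PC + 4."""
    zero = ((rs_value - rt_value) & 0xFFFFFFFF) == 0   # ALU subtract, Zero output
    return branch_target(pc, offset16) if zero else (pc + 4) & 0xFFFFFFFF


taken = next_pc_beq(0x100, 5, 5, 3)        # equal registers: branch taken
not_taken = next_pc_beq(0x100, 5, 6, 3)    # different registers: PC + 4
```

An offset of -1 (0xFFFF) gives a target equal to the address of the branch itself, since PC + 4 - 4 = PC.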
Datapath for R-type, LW, SW & BEQ

[Figure 5.15, with additions in red: combined datapath with PC, instruction memory (MemRead=1, MemWrite=0), register file addressed by Instruction [25-21] (rs), Instruction [20-16] (rt) and Instruction [15-11] (rd), sign extension of Instruction [15-0] (offset) from 16 to 32 bits, ALU with 4-bit ALU control, data memory, the PC + 4 adder and the branch adder with Shift left 2; multiplexers are controlled by RegDst, ALUSrc, MemtoReg and PCSrc; further control lines are RegWrite, MemRead and MemWrite]

Control Unit and Datapath

[Figure 5.17, with additions in red: the datapath of Figure 5.15 with the Control unit added. Control takes the op-code, Instruction [31-26], and produces RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite; ALU control takes ALUOp and funct, Instruction [5-0]; the instruction memory has MemRead=1 and MemWrite=0; the register file and data memory write signals are ANDed with Clock]
Truth Table for (Main) Control Unit

            Input     Output
            Op-code   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
  R-type    000000    1       0       0         1         d        0         0       1       0
  lw        100011    0       1       1         1         1        0         0       0       0
  sw        101011    d       1       d         0         0        1         0       0       0
  beq       000100    d       0       d         0         d        0         1       0       1

  (d = don't care)

  • ALUOp[1-0] = 00 signals the ALU Control unit to have the ALU perform the add function, i.e. set Ainvert=0, Binvert=0 and Operation=10
  • ALUOp[1-0] = 01 signals the ALU Control unit to have the ALU perform the subtract function, i.e. set Ainvert=0, Binvert=1 and Operation=10
  • ALUOp[1-0] = 10 signals the ALU Control unit to look at bits I[5-0] and, based on their pattern, to set Ainvert, Binvert and Operation so that the ALU performs the appropriate function, i.e. add, sub, slt, and, or & nor
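The main control unit is a pure lookup from op-code to control signals, so it can be sketched as a table in Python. This is a sketch under assumptions: blank cells of the truth table are read as 0, don't-care entries (d) are represented by `None`, and the names `CONTROL_TABLE` and `main_control` are hypothetical. Raising an error for an unknown op-code stands in for the illegal-instruction exception, which the slides do not detail here.

```python
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemRead",
           "MemWrite", "Branch", "ALUOp1", "ALUOp0")

D = None  # don't care (d in the truth table)

# Assumption: blank cells in the slide's table are taken to be 0.
CONTROL_TABLE = {
    0b000000: (1, 0, 0, 1, D, 0, 0, 1, 0),  # R-type
    0b100011: (0, 1, 1, 1, 1, 0, 0, 0, 0),  # lw
    0b101011: (D, 1, D, 0, 0, 1, 0, 0, 0),  # sw
    0b000100: (D, 0, D, 0, D, 0, 1, 0, 1),  # beq
}


def main_control(opcode):
    """Return the control signals for a 6-bit op-code as a dictionary."""
    if opcode not in CONTROL_TABLE:
        raise ValueError(f"illegal instruction: op-code {opcode:06b}")
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))


lw_signals = main_control(0b100011)
sw_signals = main_control(0b101011)
```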

Truth Table of ALU Control Unit

          Input                                Output
          ALUOp            Funct field         ALU Control
          ALUOp1  ALUOp0   F5 F4 F3 F2 F1 F0   Ainvert  Binvert  Operation
  add     0       0        d  d  d  d  d  d    0        0        10
  sub     0       1        d  d  d  d  d  d    0        1        10
  add     1       0        1  0  0  0  0  0    0        0        10
  sub     1       0        1  0  0  0  1  0    0        1        10
  and     1       0        1  0  0  1  0  0    0        0        00
  or      1       0        1  0  0  1  0  1    0        0        01
  slt     1       0        1  0  1  0  1  0    0        1        11
  nor     1       0        1  0  0  1  1  1    1        1        00
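To make the Ainvert / Binvert / Operation encoding concrete, here is a small Python sketch of the ALU control unit together with a behavioural 32-bit ALU. The names `alu_control` and `alu` are hypothetical, the 4-bit control word is assumed to be ordered Ainvert, Binvert, Operation[1:0], and the slt result ignores overflow, which is a simplification made for this example.

```python
MASK = 0xFFFFFFFF

# funct field -> 4-bit ALU control (Ainvert, Binvert, Operation[1:0]).
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # sub
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # slt
    0b100111: 0b1100,  # nor
}


def alu_control(alu_op, funct=0):
    """Map ALUOp[1-0] (and funct for R-type) to the 4-bit ALU control."""
    if alu_op == 0b00:
        return 0b0010              # lw / sw: add
    if alu_op == 0b01:
        return 0b0110              # beq: subtract
    if alu_op == 0b10:
        return FUNCT_TO_ALU[funct]  # R-type: decode funct
    raise ValueError("unsupported ALUOp")


def alu(control, a, b):
    """Behavioural 32-bit ALU driven by Ainvert, Binvert and Operation."""
    ainvert = (control >> 3) & 1
    binvert = (control >> 2) & 1
    operation = control & 0b11
    x = (~a & MASK) if ainvert else (a & MASK)
    y = (~b & MASK) if binvert else (b & MASK)
    if operation == 0b00:
        return x & y               # and; nor when both inputs are inverted
    if operation == 0b01:
        return x | y
    total = (x + y + binvert) & MASK   # Binvert also supplies the carry-in
    if operation == 0b10:
        return total               # add or subtract
    return (total >> 31) & 1       # slt: sign of a - b (overflow ignored)


nor_result = alu(alu_control(0b10, 0b100111), 0b1100, 0b1010)
```

The nor row shows why Ainvert exists: inverting both inputs and selecting the AND operation yields NOR by De Morgan's law.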

Design of (Main) Control Unit

            Op-code bits         Outputs
            5  4  3  2  1  0     RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
  R-format  0  0  0  0  0  0     1       0       0         1         d        0         0       1       0
  lw        1  0  0  0  1  1     0       1       1         1         1        0         0       0       0
  sw        1  0  1  0  1  1     d       1       d         0         0        1         0       0       0
  beq       0  0  0  1  0  0     d       0       d         0         d        0         1       0       1

$\text{RegDst} = \overline{Op5}\,\overline{Op4}\,\overline{Op3}\,\overline{Op2}\,\overline{Op1}\,\overline{Op0}$

$\text{ALUSrc} = Op5\,\overline{Op4}\,\overline{Op3}\,\overline{Op2}\,Op1\,Op0 + Op5\,\overline{Op4}\,Op3\,\overline{Op2}\,Op1\,Op0$

…

[Figure C.2.5: PLA implementation with inputs Op5-Op0, product terms R-format, lw, sw and beq, and outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1 and ALUOp0]
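The two control equations can be evaluated directly as sum-of-products logic. In the sketch below, the function names `reg_dst` and `alu_src` are hypothetical, and the placement of the complemented literals is an assumption derived from the op-codes of the R-format (000000), lw (100011) and sw (101011) instructions.

```python
def op_bits(opcode):
    """Return [Op0, Op1, ..., Op5] for a 6-bit op-code."""
    return [(opcode >> i) & 1 for i in range(6)]


def reg_dst(opcode):
    """RegDst: single product term, all six op-code bits complemented."""
    return int(not any(op_bits(opcode)))


def alu_src(opcode):
    """ALUSrc: OR of the lw and sw product terms."""
    op0, op1, op2, op3, op4, op5 = op_bits(opcode)
    lw_term = op5 and not op4 and not op3 and not op2 and op1 and op0
    sw_term = op5 and not op4 and op3 and not op2 and op1 and op0
    return int(bool(lw_term or sw_term))


results = {name: (reg_dst(code), alu_src(code))
           for name, code in [("R-format", 0b000000), ("lw", 0b100011),
                              ("sw", 0b101011), ("beq", 0b000100)]}
```

Each product term corresponds to one row of the PLA, and each output ORs the rows in which that output is 1.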

Datapath for R-type, LW, SW, BEQ & J

j instruction format:

  bits    31-26   25-0
  field   j       jump_target

PC ← PC[31-28] || jump_target || 00, where the upper four bits are taken from PC+4 [31-28]

[Figure 5.24, with correction in red: the datapath of Figure 5.17 extended for j. Instruction [25-0] (26 bits) is shifted left 2, i.e. two zeros are added, to give 28 bits, which are concatenated with PC+4 [31-28] to form Jump address [31-0]; an additional multiplexer controlled by the new Jump signal selects the jump address as the next PC]
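The jump address concatenation can be written out with masks and shifts. The function name `jump_address` and the example instruction words are hypothetical; the upper four bits are taken from PC + 4, as labelled in the figure.

```python
def jump_address(pc, instruction):
    """Form PC+4[31-28] || jump_target || 00 for a j instruction."""
    jump_target = instruction & 0x03FFFFFF       # Instruction [25-0], 26 bits
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper = pc_plus_4 & 0xF0000000               # PC+4 [31-28]
    return upper | (jump_target << 2)            # shift left 2 adds two zeros


# Hypothetical example: op-code 000010 (j) with jump_target = 0x10.
target = jump_address(0x40000000, 0x08000010)

# The upper bits come from PC + 4, which matters at a 256 MB boundary.
boundary_target = jump_address(0x0FFFFFFC, 0x08000001)
```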

Design of Control Unit (J included)

            Op-code bits         Outputs
            5  4  3  2  1  0     RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
  R-format  0  0  0  0  0  0     1       0       0         1         d        0         0       1       0       0
  lw        1  0  0  0  1  1     0       1       1         1         1        0         0       0       0       0
  sw        1  0  1  0  1  1     d       1       d         0         0        1         0       0       0       0
  beq       0  0  0  1  0  0     d       0       d         0         d        0         1       0       1       0
  j         0  0  0  0  1  0     d       d       d         0         d        0         d       d       d       1

$\text{Jump} = \overline{Op5}\,\overline{Op4}\,\overline{Op3}\,\overline{Op2}\,Op1\,\overline{Op0}$

No changes in ALU Control unit

[Figure: PLA of the main control unit with an added product term and output for Jump]

Design of 7-Function ALU Control Unit

          Input                                Output
          ALUOp            Funct field         ALU Control Lines
          ALUOp1  ALUOp0   F5 F4 F3 F2 F1 F0   (Binvert & Operation)
  add     0       0        d  d  d  d  d  d    010
  sub     0       1        d  d  d  d  d  d    110
  add     1       0        1  0  0  0  0  0    010
  sub     1       0        1  0  0  0  1  0    110
  and     1       0        1  0  0  1  0  0    000
  or      1       0        1  0  0  1  0  1    001
  slt     1       0        1  0  1  0  1  0    111

[Figure C.2.3, with improvements: logic with inputs ALUOp1, ALUOp0 and F5-F0 producing Binvert and Operation]

Cycle Time Calculation

  • Let us assume that the only delays introduced are by the following tasks:
    – memory access (read and write time = 3 nsec),
    – register file access (read and write time = 1 nsec),
    – ALU to perform function (= 2 nsec).
  • Under those assumptions, here are the instruction execution times (in nsec):

              Instr   Reg    ALU    Data     Reg
              fetch   read   oper   memory   write   Total
    R-type    3       1      2               1       7
    lw        3       1      2      3        1       10
    sw        3       1      2      3                9
    branch    3       1      2                       6
    jump      3                                      3

  • Thus the clock cycle time has to be 10 nsec, and clock rate = 1/(10 nsec) = 100 MHz.
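The cycle time calculation can be reproduced with a few lines of Python. The dictionary names `DELAY_NS` and `STEPS` are hypothetical; the delays and the sequence of functional units per instruction class are the ones assumed on the slide.

```python
# Delays assumed on the slide, in nanoseconds.
DELAY_NS = {"memory": 3, "register": 1, "alu": 2}

# Functional units each instruction class uses, in order.
STEPS = {
    "R-type": ["memory", "register", "alu", "register"],
    "lw":     ["memory", "register", "alu", "memory", "register"],
    "sw":     ["memory", "register", "alu", "memory"],
    "branch": ["memory", "register", "alu"],
    "jump":   ["memory"],
}

# Execution time of each instruction class is the sum of its unit delays.
times = {name: sum(DELAY_NS[unit] for unit in units)
         for name, units in STEPS.items()}

# The single clock cycle must accommodate the slowest instruction.
cycle_time_ns = max(times.values())
clock_rate_mhz = 1000 / cycle_time_ns   # 1 / (10 ns) = 100 MHz
```

Since lw is the longest instruction, every other instruction is stretched to the same 10 nsec cycle.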

Single Cycle Processor: Conclusion

  • Single Cycle Problems:
    – what if we had a more complicated instruction, like floating point?
    – the clock cycle would be much longer,
    – thus wasteful of time for shorter and more often used instructions, such as add & lw.
  • One Solution:
    – use a “smaller” cycle time, and
    – have different instructions take different numbers of cycles.
  • And that is a “multi-cycle” processor.
