 
CPE 335 Computer Organization
Basic MIPS Architecture – Part II
Dr. Iyad Jafar
Adapted from Dr. Gheith Abandah's slides
http://www.abandah.com/gheith/Courses/CPE335_S08/index.html
CPE232 Basic MIPS Architecture
Multicycle Datapath Approach
- Let an instruction take more than one clock cycle to complete.
- Break up instructions into steps where
  - each step takes one cycle, while trying to balance the amount of work done in each step
  - each cycle is restricted to use only one major functional unit, unless units are used in parallel
- Not every instruction takes the same number of clock cycles.
- In addition to faster clock rates, multicycle allows a functional unit to be used more than once per instruction, as long as it is used on different clock cycles. As a result:
  - Only one memory is needed, but only one memory access is possible per cycle.
  - Only one ALU/adder is needed, but only one ALU operation is possible per cycle.
Multicycle Datapath Approach, cont'd
- At the end of a cycle
  - Store values needed in a later cycle by the current instruction in internal registers (A, B, IR, MDR, and ALUOut). These registers are invisible to the programmer.
  - All of these registers except IR hold data only between a pair of adjacent clock cycles, so they do not need a write control signal. IR must hold the instruction until it finishes executing, so it is written only when IRWrite is asserted.
[Figure: multicycle datapath with PC, Memory (Instr. or Data), IR, MDR, Register File, A, B, ALU, and ALUOut]
  IR – Instruction Register
  MDR – Memory Data Register
  A, B – register file read data registers
  ALUOut – ALU output register
- Data used by subsequent instructions are stored in programmer-visible state (i.e., register file, PC, or memory).
Multicycle Datapath Approach, cont'd
- As in the single-cycle design, shared functional units need multiplexers at their inputs.
- There is only one ALU (adder). It is used to update the PC, perform ALU operations, do the comparison for beq, compute memory addresses, and compute branch addresses.
Multicycle Datapath Approach: Control Signals
The Multicycle Datapath with Control Signals
[Figure: multicycle datapath with the control unit. Control outputs: PCWriteCond, PCWrite, PCSource, IorD, MemRead, MemWrite, MemtoReg, IRWrite, ALUOp, ALUSrcA, ALUSrcB, RegWrite, RegDst. The control unit decodes Instr[31-26]; the ALU control decodes Instr[5-0]; Instr[25-0] shifted left 2 and concatenated with PC[31-28] forms the jump address; Instr[15-0] is sign-extended and optionally shifted left 2 for the ALU's second input.]
Multicycle Machine: 1-bit Control Signals
RegDst
  deasserted: The destination register number comes from the rt field.
  asserted: The destination register number comes from the rd field.
RegWrite
  deasserted: None.
  asserted: Write is enabled to the selected destination register.
ALUSrcA
  deasserted: The first ALU operand is the PC.
  asserted: The first ALU operand is register A.
MemRead
  deasserted: None.
  asserted: The content of the addressed memory location is placed on the memory data output.
MemWrite
  deasserted: None.
  asserted: The memory location specified by the address is replaced by the value on the Write data input.
MemtoReg
  deasserted: The value fed to the register file comes from ALUOut.
  asserted: The value fed to the register file comes from memory (MDR).
IorD
  deasserted: The PC supplies the address to the memory unit.
  asserted: ALUOut supplies the address to the memory unit.
IRWrite
  deasserted: None.
  asserted: The output of memory is written into IR.
PCWrite
  deasserted: None.
  asserted: The PC is written; the source is controlled by PCSource.
PCWriteCond
  deasserted: None.
  asserted: The PC is written if the Zero output of the ALU is also active.
Multicycle Machine: 2-bit Control Signals
ALUOp
  00: The ALU performs an add operation.
  01: The ALU performs a subtract operation.
  10: The funct field of the instruction determines the ALU operation.
ALUSrcB
  00: The second input to the ALU comes from register B.
  01: The second input to the ALU is 4 (to increment the PC).
  10: The second input to the ALU is the sign-extended lower 16 bits of IR (the offset).
  11: The second input to the ALU is the sign-extended lower 16 bits of IR, shifted left by two bits.
PCSource
  00: The output of the ALU (PC + 4) is sent to the PC for writing.
  01: The contents of ALUOut (the branch address) are sent to the PC for writing.
  10: The jump address is sent to the PC for writing.
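The ALUSrcB encoding can be read as a four-way multiplexer. The following is a minimal behavioral sketch in Python, not a hardware description; the function names `sign_extend16` and `alu_src_b` are hypothetical and chosen only for illustration, and values are modeled as 32-bit unsigned bit patterns.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit value to a 32-bit two's-complement bit pattern."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if (value & 0x8000) else value


def alu_src_b(sel, b_reg, ir):
    """Model the ALUSrcB multiplexer: pick the second ALU operand.

    sel   -- the 2-bit ALUSrcB control value (0..3)
    b_reg -- contents of internal register B
    ir    -- the 32-bit instruction held in IR
    """
    imm = sign_extend16(ir)
    if sel == 0b00:
        return b_reg                      # register B
    if sel == 0b01:
        return 4                          # constant 4 for PC + 4
    if sel == 0b10:
        return imm                        # sign-extended offset
    if sel == 0b11:
        return (imm << 2) & 0xFFFFFFFF    # offset shifted left by two bits
    raise ValueError("ALUSrcB is a 2-bit signal")
```

With this model, an offset field of 0xFFFF (that is, -1) selected with ALUSrcB = 11 yields the bit pattern for -4, which is what the branch address computation needs.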
Breaking Instruction Execution into Clock Cycles
Cycle 1: IFetch | Cycle 2: Dec | Cycle 3: Exec | Cycle 4: Mem | Cycle 5: WB
1. IFetch: instruction fetch and PC update (same for all instructions)
- Operations
  1.1 Instruction fetch: IR <= Memory[PC]
  1.2 Update PC: PC <= PC + 4
- Control signal values
  - IorD = 0, MemRead = 1, IRWrite = 1
  - ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCWrite = 1
  - PCSource = 00
Breaking Instruction Execution into Clock Cycles
2. Decode: instruction decode and register fetch (same for all instructions)
We do not know the instruction type yet, so perform only harmless operations.
- Operations
  2.1 Read the two source registers rs and rt and place them in registers A and B, respectively.
      A <= Reg[IR[25:21]]
      B <= Reg[IR[20:16]]
  2.2 Compute the branch address (PC already holds PC + 4 from the fetch cycle).
      ALUOut <= PC + (sign-extend(IR[15:0]) << 2)
- Control signal values
  - ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
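The decode-cycle operations can be sketched in a few lines of Python. This is only a behavioral illustration under assumed representations: the register file is a 32-entry list, `pc` is the already-incremented PC, and the function name `decode` is hypothetical.

```python
def decode(ir, pc, regs):
    """Cycle 2: read rs and rt into A and B, and precompute the branch address.

    ir   -- 32-bit instruction held in IR
    pc   -- current PC value (already PC + 4 after the fetch cycle)
    regs -- register file modeled as a list of 32 integers
    """
    rs = (ir >> 21) & 0x1F          # IR[25:21]
    rt = (ir >> 16) & 0x1F          # IR[20:16]
    a = regs[rs]                    # A <= Reg[IR[25:21]]
    b = regs[rt]                    # B <= Reg[IR[20:16]]

    imm = ir & 0xFFFF               # IR[15:0]
    if imm & 0x8000:                # sign-extend the 16-bit offset
        imm -= 0x10000
    alu_out = (pc + (imm << 2)) & 0xFFFFFFFF   # ALUOut <= PC + (offset << 2)
    return a, b, alu_out
```

For example, the word 0x1022FFFF encodes `beq $1, $2` with an offset of -1; with `pc = 8` the precomputed branch address is 4. The result is simply ignored if the instruction turns out not to be a branch.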
Breaking Instruction Execution into Clock Cycles
3. Execution, memory address computation, or branch completion
The operation in this cycle depends on the instruction type.
- Operations
  * If memory reference, compute the address:
      ALUOut <= A + sign-extend(IR[15:0])
      ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  * If arithmetic-logic instruction, perform the operation:
      ALUOut <= A op B
      ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
Breaking Instruction Execution into Clock Cycles
3. Execution, memory address computation, or branch completion (continued)
The operation in this cycle depends on the instruction type.
- Operations
  * If branch instruction:
      if (A == B) PC <= ALUOut
      ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond = 1, PCSource = 01
  * If jump instruction:
      PC <= {PC[31:28], IR[25:0], 2'b00}
      PCSource = 10, PCWrite = 1
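Branch and jump completion both come down to choosing the next PC value. A minimal Python sketch of the two cases follows; the function names `branch_complete` and `jump_target` are hypothetical, and the equality test is modeled as the ALU subtracting and checking its Zero output.

```python
def jump_target(pc, ir):
    """PC <= {PC[31:28], IR[25:0], 2'b00}."""
    upper = pc & 0xF0000000                 # PC[31:28]
    lower = (ir & 0x03FFFFFF) << 2          # IR[25:0] followed by two zero bits
    return upper | lower


def branch_complete(pc, a, b, alu_out):
    """Return the PC after the branch cycle.

    The ALU computes A - B (ALUOp = 01); the PC is overwritten with
    ALUOut only when the Zero output is active (PCWriteCond = 1).
    """
    zero = ((a - b) & 0xFFFFFFFF) == 0
    return alu_out if zero else pc
```

Because the branch address was already placed in ALUOut during the decode cycle, the single ALU is free in this cycle to do the comparison.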
Breaking Instruction Execution into Clock Cycles
4. Memory access or R-type completion
The operation in this cycle depends on the instruction type.
- Operations
  * If load instruction, read the value from memory into MDR:
      MDR <= Memory[ALUOut]
      MemRead = 1, IorD = 1
  * If store instruction, store rt into memory:
      Memory[ALUOut] <= B
      MemWrite = 1, IorD = 1
  * If arithmetic-logical instruction, write the ALU result into rd:
      Reg[IR[15:11]] <= ALUOut
      MemtoReg = 0, RegDst = 1, RegWrite = 1
Breaking Instruction Execution into Clock Cycles
5. Memory read completion
Needed for the load instruction only.
- Operations
  5.1 Store the loaded value in MDR into rt.
      Reg[IR[20:16]] <= MDR
      RegWrite = 1, MemtoReg = 1, RegDst = 0
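To see how the five cycles fit together, the load instruction can be traced end to end. The sketch below is a behavioral model only, under assumptions that are not in the slides: memory is a Python dict keyed by byte address and holding 32-bit words, the register file is a 32-entry list, and `run_lw` is a hypothetical name.

```python
def run_lw(pc, regs, memory):
    """Trace one lw instruction through the five multicycle steps.

    Returns the updated PC and the number of cycles used.
    """
    # Cycle 1 (IFetch): IR <= Memory[PC]; PC <= PC + 4
    ir = memory[pc]
    pc = (pc + 4) & 0xFFFFFFFF

    # Cycle 2 (Decode): A <= Reg[rs]; B <= Reg[rt]
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]     # read anyway; lw does not use B

    # Cycle 3 (Exec): ALUOut <= A + sign-extend(IR[15:0])
    imm = ir & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    alu_out = (a + imm) & 0xFFFFFFFF

    # Cycle 4 (Mem): MDR <= Memory[ALUOut]
    mdr = memory[alu_out]

    # Cycle 5 (WB): Reg[IR[20:16]] <= MDR
    regs[(ir >> 16) & 0x1F] = mdr
    return pc, 5
```

As a usage example, the word 0x8D090004 encodes `lw $9, 4($8)`. With register 8 holding 0x100 and the word at address 0x104 equal to 42, the call leaves 42 in register 9 and advances the PC by 4.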
Breaking Instruction Execution into Clock Cycles
- In this implementation, not all instructions take 5 cycles.
  Instruction class      Clock cycles required
  Load                   5
  Store                  4
  Branch                 3
  Arithmetic-logical     4
  Jump                   3
Multicycle Performance
- Compute the average CPI of the multicycle implementation for a SPECINT2000 program with the following instruction mix: 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU. Assume the CPI for each instruction class given in the previous table.
- CPI = Σ (CPI_i x IC_i) / IC
      = 0.25 x 5 + 0.10 x 4 + 0.11 x 3 + 0.02 x 3 + 0.52 x 4
      = 4.12
- Compare this to CPI = 1 for the single-cycle implementation: the multicycle CPI is higher, but its clock cycle is shorter.
- Assume CC_M = (1/5) CC_S, where CC_M and CC_S are the multicycle and single-cycle clock cycle times.
- Then Performance_M / Performance_S = (IC x 1 x CC_S) / (IC x 4.12 x (1/5) CC_S) = 5 / 4.12 = 1.21
- Multicycle is also cost-effective in terms of hardware.
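The CPI and speedup arithmetic can be checked with a short Python sketch. The cycle counts and instruction mix are the ones given in the slides; the function names `average_cpi` and `speedup_over_single_cycle` and the dictionary keys are hypothetical.

```python
# Clock cycles per instruction class in the multicycle implementation.
CYCLES = {"load": 5, "store": 4, "branch": 3, "jump": 3, "alu": 4}

# Instruction mix for the example program, as fractions of the total count.
MIX = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "alu": 0.52}


def average_cpi(cycles, mix):
    """Weighted average: sum of CPI_i times the fraction of class i."""
    return sum(cycles[name] * mix[name] for name in mix)


def speedup_over_single_cycle(cpi_multi, clock_ratio=1 / 5):
    """Performance_M / Performance_S for equal instruction counts.

    clock_ratio is CC_M / CC_S; the single-cycle design has CPI = 1.
    """
    return 1.0 / (cpi_multi * clock_ratio)


cpi = average_cpi(CYCLES, MIX)
print(round(cpi, 2))                               # 4.12
print(round(speedup_over_single_cycle(cpi), 2))    # 1.21
```

Changing `clock_ratio` shows how sensitive the result is to the assumed cycle time: if the multicycle clock were only 4 times faster instead of 5, the ratio would drop below 1.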