
The Big Picture: The Performance Perspective



  1. Computer System Architecture: Processor Part I
     Chalermek Intanagonwiwat
     Slides courtesy of John Hennessy and David Patterson

     The Big Picture: The Performance Perspective
     • Performance of a machine is determined by:
       – Instruction count
       – Clock cycle time
       – Clock cycles per instruction (CPI)

     The Big Picture (cont.)
     • Processor design (datapath and control) will determine:
       – Clock cycle time
       – Clock cycles per instruction
     • Today:
       – Single cycle processor:
         • Advantage: One clock cycle per instruction
         • Disadvantage: long cycle time

     How to Design a Processor: step-by-step
     1. Analyze instruction set => datapath requirements
       – the meaning of each instruction is given by the Register Transfer Language (RTL)
       – datapath must include storage element for ISA registers
         • possibly more
       – datapath must support each register transfer
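The three factors on the performance slide combine in the standard relation: execution time = instruction count × CPI × clock cycle time. A minimal sketch of that arithmetic follows. The function name `cpu_time` and all numbers are hypothetical, chosen only to illustrate the single-cycle trade-off (CPI of 1, but a long cycle).

```python
def cpu_time(instruction_count: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution time in ns: instructions x cycles per instruction x ns per cycle."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical machines, not measurements from the slides.
single_cycle = cpu_time(1_000_000, cpi=1.0, cycle_time_ns=8.0)  # one long cycle per instruction
multi_cycle = cpu_time(1_000_000, cpi=3.5, cycle_time_ns=2.0)   # several short cycles per instruction

print(single_cycle, multi_cycle)
```

With these made-up numbers the lower CPI does not win, because the cycle time is four times longer. All three factors have to be considered together.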

  2. How to Design a Processor (cont.)
     2. Select set of datapath components and establish clocking methodology
     3. Assemble datapath meeting the requirements
     4. Analyze implementation of each instruction to determine setting of control points that effect the register transfer
     5. Assemble the control logic

     The MIPS Instruction Formats
     • All MIPS instructions are 32 bits long. The three instruction formats:
       – R-type
           bits:   31-26  25-21  20-16  15-11  10-6   5-0
           field:  op     rs     rt     rd     shamt  funct
           width:  6      5      5      5      5      6
       – I-type
           bits:   31-26  25-21  20-16  15-0
           field:  op     rs     rt     immediate
           width:  6      5      5      16
       – J-type
           bits:   31-26  25-0
           field:  op     target address
           width:  6      26

     The MIPS Instruction Formats (cont.)
     • The different fields are:
       – op: operation of the instruction
       – rs, rt, rd: the source and destination register specifiers
       – shamt: shift amount
       – funct: selects the variant of the operation in the “op” field
       – address / immediate: address offset or immediate value
       – target address: target address of the jump instruction

     Step 1a: The MIPS-lite Subset
     • ADD and SUB (R-type format)
       – addU rd, rs, rt
       – subU rd, rs, rt
     • OR Immediate (I-type format):
       – ori rt, rs, imm16
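The field boundaries in the three formats translate directly into shifts and masks. The sketch below extracts every field from a 32-bit word. The function name `decode` and the dictionary layout are illustrative assumptions, not part of the slides.

```python
def decode(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into the fields of all three formats."""
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26
        "rs": (word >> 21) & 0x1F,     # bits 25-21
        "rt": (word >> 16) & 0x1F,     # bits 20-16
        "rd": (word >> 11) & 0x1F,     # bits 15-11 (R-type only)
        "shamt": (word >> 6) & 0x1F,   # bits 10-6  (R-type only)
        "funct": word & 0x3F,          # bits 5-0   (R-type only)
        "imm16": word & 0xFFFF,        # bits 15-0  (I-type only)
        "target": word & 0x3FFFFFF,    # bits 25-0  (J-type only)
    }

# addu with rd=3, rs=1, rt=2 encodes as op=0, shamt=0, funct=0x21.
fields = decode(0x00221821)
print(fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
```

All fields are extracted unconditionally, and the caller uses `op` to decide which ones are meaningful. Hardware does the same: the bits are always wired out, and control logic decides which to use.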

  3. Step 1a: The MIPS-lite Subset (cont.)
     • LOAD and STORE Word (I-type format)
       – lw rt, rs, imm16
       – sw rt, rs, imm16
     • BRANCH (I-type format):
       – beq rs, rt, imm16

     Logical Register Transfers
     • RTL gives the meaning of the instructions
     • All start by fetching the instruction

     Logical Register Transfers (cont.)
       op | rs | rt | rd | shamt | funct = MEM[ PC ]
       op | rs | rt | Imm16 = MEM[ PC ]

       inst     Register Transfers
       ADDU     R[rd] <- R[rs] + R[rt]; PC <- PC + 4
       SUBU     R[rd] <- R[rs] – R[rt]; PC <- PC + 4
       ORi      R[rt] <- R[rs] | zero_ext(Imm16); PC <- PC + 4
       LOAD     R[rt] <- MEM[ R[rs] + sign_ext(Imm16) ]; PC <- PC + 4
       STORE    MEM[ R[rs] + sign_ext(Imm16) ] <- R[rt]; PC <- PC + 4
       BEQ      if ( R[rs] == R[rt] ) then PC <- PC + ( sign_ext(Imm16) || 00 )
                else PC <- PC + 4

     Step 1: Requirements of the Instruction Set
     • Memory
       – instruction & data
     • Registers (32 x 32)
       – read RS
       – read RT
       – Write RT or RD
     • PC
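The register transfers in the table can be read as a tiny interpreter. The following sketch applies one transfer per call. The function name `step`, the dictionary-based memory, and the choice to read unwritten memory as 0 are assumptions for illustration. The BEQ line follows the RTL as written on the slide; real MIPS hardware adds the offset to PC + 4.

```python
M32 = 0xFFFFFFFF

def zero_ext(imm16: int) -> int:
    return imm16 & 0xFFFF

def sign_ext(imm16: int) -> int:
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def step(name, rs, rt, rd, imm16, R, MEM, pc):
    """Apply one register transfer to R and MEM; return the next PC."""
    if name == "ADDU":
        R[rd] = (R[rs] + R[rt]) & M32
    elif name == "SUBU":
        R[rd] = (R[rs] - R[rt]) & M32
    elif name == "ORI":
        R[rt] = R[rs] | zero_ext(imm16)
    elif name == "LOAD":
        R[rt] = MEM.get((R[rs] + sign_ext(imm16)) & M32, 0)  # assumption: unwritten words read 0
    elif name == "STORE":
        MEM[(R[rs] + sign_ext(imm16)) & M32] = R[rt]
    elif name == "BEQ":
        if R[rs] == R[rt]:
            # "|| 00" appends two zero bits, i.e. multiplies the offset by 4.
            return (pc + (sign_ext(imm16) << 2)) & M32
    else:
        raise ValueError(f"not in the MIPS-lite subset: {name}")
    return (pc + 4) & M32

R, MEM, pc = [0] * 32, {}, 0
pc = step("ORI", 0, 1, 0, 0x00FF, R, MEM, pc)    # R[1] = 0xFF
pc = step("STORE", 0, 1, 0, 0x0010, R, MEM, pc)  # MEM[16] = R[1]
pc = step("LOAD", 0, 2, 0, 0x0010, R, MEM, pc)   # R[2] = MEM[16]
pc = step("BEQ", 1, 2, 0, 0xFFFF, R, MEM, pc)    # taken, offset of -1 word
```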

  4. Step 1: Requirements of the Instruction Set (cont.)
     • Extender
     • Add and Sub register or extended immediate
     • Add 4 or extended immediate to PC

     Step 2: Components of the Datapath
     • Combinational Elements
     • Storage Elements
       – Clocking methodology

     Combinational Logic Elements
     • Adder
       – inputs: A (32 bits), B (32 bits), CarryIn
       – outputs: Sum (32 bits), Carry
     • MUX
       – inputs: A (32 bits), B (32 bits), Select
       – output: Y (32 bits)
     • ALU
       – inputs: A (32 bits), B (32 bits), OP
       – output: Result (32 bits)

     Storage Element: Register
     • Similar to the D Flip Flop except
       – N-bit input and output
       – Write Enable input
     • Write Enable:
       – negated (0): Data Out will not change
       – asserted (1): Data Out will become Data In
     [Figure: register with Write Enable, Data In (N bits), Data Out (N bits), and Clk]
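The building blocks on these slides can be modeled as small functions and one class. This is a behavioral sketch only. The mux polarity (Select = 0 picks A) and the string names for the ALU operation are assumptions, since the slides show only the block symbols.

```python
MASK32 = 0xFFFFFFFF

def adder(a: int, b: int, carry_in: int = 0):
    """32-bit adder: returns (Sum, Carry)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32

def mux(select: int, a: int, b: int) -> int:
    """2-to-1 multiplexer. Assumption: select 0 passes A, select 1 passes B."""
    return b if select else a

def alu(op: str, a: int, b: int) -> int:
    """ALU with hypothetical operation names 'add', 'sub', 'or'."""
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "or":
        return (a | b) & MASK32
    raise ValueError(f"unknown ALU operation: {op}")

class Register:
    """N-bit register: Data Out changes only on a clock edge with Write Enable asserted."""
    def __init__(self):
        self.data_out = 0

    def clock_edge(self, data_in: int, write_enable: int) -> None:
        if write_enable:
            self.data_out = data_in

reg = Register()
reg.clock_edge(9, write_enable=0)   # negated: Data Out keeps its old value
held = reg.data_out
reg.clock_edge(9, write_enable=1)   # asserted: Data Out becomes Data In
```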

  5. Register File
     • Register File consists of 32 registers:
       – Two 32-bit output busses: busA and busB
       – One 32-bit input bus: busW
     [Figure: 32 32-bit Registers with 5-bit RW, RA, RB inputs, Write Enable, Clk, 32-bit busW in, 32-bit busA and busB out]

     Register File (cont.)
     • Register is selected by:
       – RA (number) selects the register to put on busA (data)
       – RB (number) selects the register to put on busB (data)
       – RW (number) selects the register to be written via busW (data) when Write Enable is 1

     Register File (cont.)
     • Clock input (CLK)
       – The CLK input is a factor ONLY during write operation
       – During read operation, behaves as a combinational logic block:
         • RA or RB valid => busA or busB valid after “access time.”

     Register File (cont.)
     • Built using D flip-flops
     [Figure: read ports. Read register number 1 and Read register number 2 each drive a mux that selects among Register 0 ... Register n to produce Read data 1 and Read data 2]
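The read and write behavior described for the register file can be sketched as a class with a combinational `read` and a clocked write. The class and method names are hypothetical. The model does not hard-wire register 0 to zero, which the slides in this section do not discuss.

```python
class RegisterFile:
    """32 x 32-bit register file with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra: int, rb: int):
        """Combinational read: RA and RB select what appears on (busA, busB)."""
        return self.regs[ra & 0x1F], self.regs[rb & 0x1F]

    def clock_edge(self, rw: int, bus_w: int, write_enable: int) -> None:
        """The clock matters only here: write busW into register RW if enabled."""
        if write_enable:
            self.regs[rw & 0x1F] = bus_w & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=0x1234, write_enable=1)
rf.clock_edge(rw=6, bus_w=0x9999, write_enable=0)  # ignored: Write Enable is 0
bus_a, bus_b = rf.read(ra=5, rb=6)
```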

  6. Register File (cont.)
     • Note: we still use the real clock to determine when to write
     [Figure: write port. Register number feeds an n-to-1 decoder; each decoder output, gated with Write, drives the C input of Register 0 ... Register n; Register data drives every D input]

     Storage Element: Idealized Memory
     • Memory (idealized)
       – One input bus: Data In
       – One output bus: Data Out
     [Figure: memory with Write Enable, Address, Clk, Data In (32 bits), and DataOut (32 bits)]

     Storage Element: Idealized Memory (cont.)
     • Memory word is selected by:
       – Address selects the word to put on Data Out
       – Write Enable = 1: address selects the memory word to be written via the Data In bus

     Storage Element: Idealized Memory (cont.)
     • Clock input (CLK)
       – The CLK input is a factor ONLY during write operation
       – During read operation, behaves as a combinational logic block:
         • Address valid => Data Out valid after “access time.”
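The idealized memory behaves like the register file with a single address port. A minimal sketch follows, assuming a dictionary keyed by address and assuming that unwritten words read as 0. Neither choice comes from the slides.

```python
class IdealMemory:
    """Idealized memory: combinational read, clocked write."""

    def __init__(self):
        self.words = {}  # address -> 32-bit word (assumption: sparse storage)

    def read(self, address: int) -> int:
        """Address valid => Data Out valid (access time is not modeled)."""
        return self.words.get(address, 0)

    def clock_edge(self, address: int, data_in: int, write_enable: int) -> None:
        """Write Data In to the addressed word only when Write Enable is 1."""
        if write_enable:
            self.words[address] = data_in & 0xFFFFFFFF

mem = IdealMemory()
mem.clock_edge(address=0x40, data_in=0xDEADBEEF, write_enable=1)
mem.clock_edge(address=0x44, data_in=0x1, write_enable=0)  # no write
data_out = mem.read(0x40)
```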

  7. Step 3
     • Register Transfer Requirements –> Datapath Assembly
     • Instruction Fetch
     • Read Operands and Execute Operation

     3a: Overview of the Instruction Fetch Unit
     • The common RTL operations
       – Fetch the Instruction: mem[PC]
       – Update the program counter:
         • Sequential Code: PC <- PC + 4
         • Branch and Jump: PC <- “something else”

     3a: Overview of the Instruction Fetch Unit (cont.)
     [Figure: Next Address Logic feeds PC (clocked by Clk); PC drives the Address input of the Instruction Memory, which outputs the 32-bit Instruction Word]

     3b: Add & Subtract
     • R[rd] <- R[rs] op R[rt]    Example: addU rd, rs, rt
       – Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields
       – ALUctr and RegWr: control logic after decoding the instruction
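The fetch unit repeats two operations: read mem[PC], then update PC. The sketch below models the sequential case and leaves the branch target as an input, since the slides only describe it as "something else" at this point. The names `next_pc` and `run_fetch` and the instruction words in the example are hypothetical.

```python
def next_pc(pc: int, take_branch: bool = False, target: int = 0) -> int:
    """Next Address Logic: PC + 4 for sequential code, otherwise the supplied target."""
    return target & 0xFFFFFFFF if take_branch else (pc + 4) & 0xFFFFFFFF

def run_fetch(instruction_memory: dict, pc: int, steps: int) -> list:
    """Fetch `steps` sequential instruction words starting at `pc`."""
    fetched = []
    for _ in range(steps):
        fetched.append(instruction_memory[pc])  # Fetch the instruction: mem[PC]
        pc = next_pc(pc)                        # Sequential code: PC <- PC + 4
    return fetched

# Hypothetical instruction memory keyed by byte address.
imem = {0: 0x34010005, 4: 0x34020003, 8: 0x00221821}
words = run_fetch(imem, pc=0, steps=3)
```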

  8. 3b: Add & Subtract (cont.)
     Format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
     [Figure: datapath. Rd, Rs, Rt (5 bits each) drive Rw, Ra, Rb of the 32 32-bit Registers; busA and busB (32 bits) feed the ALU, controlled by ALUctr; the ALU Result returns on busW (32 bits); RegWr and Clk control the write]

     Register-Register Timing
     [Timing diagram: each signal goes from Old Value to New Value after the listed delay]
       – PC: Clk-to-Q
       – Rs, Rt, Rd, Op, Func: Instruction Memory Access Time
       – ALUctr, RegWr: Delay through Control Logic
       – busA, busB: Register File Access Time
       – busW: ALU Delay
       – Register Write Occurs Here

     3c: Logical Operations with Immediate
     • R[rt] <- R[rs] op ZeroExt[imm16]
     Format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
     Zero extension: bits 31-16 are 0000 0000 0000 0000, bits 15-0 are the immediate
     [The format figure marks bit 11 with “rd?”]

     3c: Logical Operations with Immediate (cont.)
     [Figure: datapath. A Mux controlled by RegDst selects Rd or Rt as Rw; imm16 (16 bits) passes through ZeroExt to 32 bits; a Mux controlled by ALUSrc selects busB or the extended immediate as the second ALU input; ALUctr, RegWr, and Clk as before]
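The two muxes added in 3c let one datapath serve both register-register and immediate instructions. A behavioral sketch of one cycle follows. The signal polarities (RegDst = 1 selects rd, ALUSrc = 1 selects the immediate) and the string values for ALUctr are assumptions, because this section of the slides shows the muxes without stating their encodings.

```python
M32 = 0xFFFFFFFF

def zero_ext(imm16: int) -> int:
    return imm16 & 0xFFFF

def alu_cycle(regs, rs, rt, rd, imm16, reg_dst, alu_src, alu_ctr, reg_wr):
    """One cycle of the 3b/3c datapath; returns the value driven onto busW."""
    bus_a, bus_b = regs[rs], regs[rt]                   # register file read ports
    rw = rd if reg_dst else rt                          # RegDst mux (assumed polarity)
    operand_b = zero_ext(imm16) if alu_src else bus_b   # ALUSrc mux (assumed polarity)
    if alu_ctr == "add":
        bus_w = (bus_a + operand_b) & M32
    elif alu_ctr == "sub":
        bus_w = (bus_a - operand_b) & M32
    elif alu_ctr == "or":
        bus_w = bus_a | operand_b
    else:
        raise ValueError(f"unknown ALUctr value: {alu_ctr}")
    if reg_wr:
        regs[rw] = bus_w                                # register write at the clock edge
    return bus_w

regs = [0] * 32
regs[1], regs[2] = 5, 3
# addU rd=3, rs=1, rt=2: destination is rd, second operand is busB.
alu_cycle(regs, rs=1, rt=2, rd=3, imm16=0, reg_dst=1, alu_src=0, alu_ctr="add", reg_wr=1)
# ori rt=4, rs=1, imm16=0xF0: destination is rt, second operand is the immediate.
alu_cycle(regs, rs=1, rt=4, rd=0, imm16=0xF0, reg_dst=0, alu_src=1, alu_ctr="or", reg_wr=1)
```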

  9. 3d: Load Operations
     • R[rt] <- Mem[R[rs] + SignExt[imm16]]    Example: lw rt, rs, imm16
     Format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
     [The format figure marks bit 11 with “rd”]

     3d: Load Operations (cont.)
     [Figure: datapath. RegDst Mux selects Rd or Rt as Rw; imm16 passes through the Extender (controlled by ExtOp) to 32 bits; ALUSrc Mux selects busB or the extended immediate; the ALU output drives Adr of the Data Memory (WrEn driven by MemWr, Data In, Clk); a Mux controlled by W_Src selects the ALU output or the memory output for busW; ALUctr and RegWr as before]

     3e: Store Operations
     • Mem[ R[rs] + SignExt[imm16] ] <- R[rt]    Example: sw rt, rs, imm16
     Format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

     3e: Store Operations (cont.)
     [Figure: same datapath as for load, with busB also routed to Data In of the Data Memory; control signals RegDst, RegWr, ALUctr, MemWr, W_Src, ALUSrc, ExtOp]
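Load and store use the same address path and differ only in their control settings. The sketch below captures that with a small control table. The signal polarities (ExtOp = 1 means sign-extend, W_Src = 1 selects memory data, MemWr = 1 writes) and the table itself are assumptions for illustration, as the slides here show the signals without their values.

```python
M32 = 0xFFFFFFFF

def extender(imm16: int, ext_op: int) -> int:
    """Extend imm16 to 32 bits: sign-extend if ext_op is 1 (assumed), else zero-extend."""
    imm16 &= 0xFFFF
    if ext_op and imm16 & 0x8000:
        return imm16 | 0xFFFF0000
    return imm16

# Assumed control settings; w_src is a don't-care for sw because RegWr is 0.
CONTROL = {
    "lw": {"ext_op": 1, "mem_wr": 0, "w_src": 1, "reg_wr": 1},
    "sw": {"ext_op": 1, "mem_wr": 1, "w_src": 0, "reg_wr": 0},
}

def memory_cycle(name, regs, mem, rs, rt, imm16):
    """One cycle of the 3d/3e datapath; returns the effective address."""
    ctrl = CONTROL[name]
    bus_a, bus_b = regs[rs], regs[rt]
    address = (bus_a + extender(imm16, ctrl["ext_op"])) & M32  # ALU adds base and offset
    if ctrl["mem_wr"]:
        mem[address] = bus_b                   # Data In comes from busB
    data_out = mem.get(address, 0)             # assumption: unwritten words read 0
    bus_w = data_out if ctrl["w_src"] else address
    if ctrl["reg_wr"]:
        regs[rt] = bus_w                       # RegDst selects rt for a load
    return address

regs, mem = [0] * 32, {}
regs[1], regs[2] = 0x100, 0xABCD
store_addr = memory_cycle("sw", regs, mem, rs=1, rt=2, imm16=0xFFFC)  # offset -4
load_addr = memory_cycle("lw", regs, mem, rs=1, rt=3, imm16=0xFFFC)
```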
