CS 35101 Computer Architecture Spring 2008 Week 10: Chapter 5.1-5.3
Materials adapted from Mary Jane Irwin (www.cse.psu.edu/~mji) and Kevin Schaffer [adapted from D. Patterson slides]
CS 35101 Ch 5.1 Steinfadt, SP08 KSU

Heads Up
- Last course week's material
  - Understanding performance, Ch. 4.1-4.6
- This week's material
  - Designing a MIPS single cycle datapath
    - Reading assignment: PH 5.1-5.3
- Next week's material
  - More on single and multi-cycle datapath design
    - Reading assignment: PH 5.4-5.6
- Reminders
  - HW 3 is due Thursday, 3/27 by the start of class
  - Project 2 is posted and due on 4/22
  - Exam #2 is Tuesday, April 15

"Datapath design tended to just work … Control paths are where the system complexity lives. Bugs spawned from control path design errors reside in the microcode flow, the finite-state machines, and all the special exceptions that inevitably spring up in a machine design like thistles in a flower garden."
The Pentium Chronicles, Colwell, pg. 64

Review: Design Principles
- Simplicity favors regularity
  - fixed size instructions: 32 bits
  - only three instruction formats
- Good design demands good compromises
  - three instruction formats
- Smaller is faster
  - limited instruction set
  - limited number of registers in register file
  - limited number of addressing modes
- Make the common case fast
  - arithmetic operands from the register file (load-store machine)
  - allow instructions to contain immediate operands

The Processor: Datapath & Control
- We're ready to look at an implementation of the MIPS
- Simplified to contain only:
  - memory-reference instructions: lw, sw
  - arithmetic-logical instructions: add, addu, sub, subu, and, or, xor, nor, slt, sltu
  - arithmetic-logical immediate instructions: addi, addiu, andi, ori, xori, slti, sltiu
  - control flow instructions: beq, j
- Generic implementation:
  - use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
  - decode the instruction (and read registers)
  - execute the instruction
[Figure: instruction cycle with stages Fetch (PC = PC+4), Decode, Exec]

Abstract Implementation View
- Two types of functional units:
  - elements that operate on data values (combinational)
  - elements that contain state (sequential)
[Figure: abstract datapath with PC, Instruction Memory (Address, Instruction), Register File (Reg Addr, Read Data, Write Data), ALU, and Data Memory (Address, Read Data, Write Data)]
- Single cycle operation
- Split memory model: one memory for instructions and one for data

Clocking Methodologies
- Clocking methodology defines when signals can be read and when they can be written
[Figure: clock waveform labeling the falling (negative) edge, the rising (positive) edge, and one clock cycle]
- clock rate = 1/(clock cycle)
  - e.g., 10 nsec clock cycle = 100 MHz clock rate
  - 1 nsec clock cycle = 1 GHz clock rate
- State element design choices
  - level sensitive latch
  - master-slave and edge-triggered flipflops
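The clock rate relation above can be checked with a few lines of Python. This is only a small sketch; the function name `clock_rate_hz` is made up for illustration.

```python
def clock_rate_hz(cycle_time_s):
    """Clock rate (Hz) is the reciprocal of the clock cycle time (seconds)."""
    return 1.0 / cycle_time_s

# The two examples from the slide, converted to MHz and GHz for printing.
print(clock_rate_hz(10e-9) / 1e6, "MHz")  # 10 nsec clock cycle
print(clock_rate_hz(1e-9) / 1e9, "GHz")   # 1 nsec clock cycle
```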

State Elements
- Set-reset latch
  R  S | Q(t+1)  !Q(t+1)
  1  0 | 0       1
  0  1 | 1       0
  0  0 | Q(t)    !Q(t)
  1  1 | 0       0
  [Figure: set-reset latch with inputs R, S and outputs Q, !Q]
- Level sensitive D latch
  [Figure: D latch symbol (inputs D, clock; outputs Q, !Q) and timing diagram of clock, D, Q]
  - latch is transparent when clock is high (copies input to output)
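A behavioral sketch of the two latches, written as next-state functions rather than gates. The function names are hypothetical, and the model ignores gate delays; it only reproduces the truth table and the "transparent when clock is high" rule.

```python
def sr_latch(r, s, q):
    """Return (Q, !Q) at time t+1 for the set-reset latch truth table."""
    if r and s:
        return 0, 0        # R = S = 1: both outputs forced to 0
    if r:
        return 0, 1        # reset
    if s:
        return 1, 0        # set
    return q, 1 - q        # R = S = 0: hold the stored value

def d_latch(clock, d, q):
    """Level sensitive D latch: copy D while clock is high, else hold Q."""
    return d if clock else q

# Walk through the four rows of the truth table with Q(t) = 1.
for r, s in [(1, 0), (0, 1), (0, 0), (1, 1)]:
    print(r, s, sr_latch(r, s, 1))
```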

Two-Sided Clock Constraint
- Race problem with latch based design …
[Figure: D-latch0 and D-latch1 (inputs D, clock; outputs Q, !Q) driven by a common clock]
- Consider the case when D-latch0 holds a 0 and D-latch1 holds a 1 and you want to transfer the contents of D-latch0 to D-latch1 and vice versa
  - must have the clock high long enough for the transfer to take place
  - must not leave the clock high so long that the transferred data is copied back into the original latch
- Two-sided clock constraint
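The race can be illustrated with a deliberately crude model. It assumes the two latches feed each other and that each loop iteration stands for one latch propagation delay while the clock is high. Both the function `swap_with_latches` and the unit-delay abstraction are assumptions made for this sketch, not part of the slides.

```python
def swap_with_latches(latch0, latch1, steps_high):
    """Cross-connected transparent latches while the clock is high.

    Each step models one propagation delay in which both latches copy
    their inputs (the other latch's current output).
    """
    for _ in range(steps_high):
        latch0, latch1 = latch1, latch0
    return latch0, latch1

print(swap_with_latches(0, 1, 0))  # clock high too briefly: nothing transferred
print(swap_with_latches(0, 1, 1))  # clock high just long enough: contents swapped
print(swap_with_latches(0, 1, 2))  # clock high too long: data copied back
```

The clock-high time is bounded from below and from above, which is the two-sided constraint.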

State Elements, cont'd
- Solution is to use flipflops that change state (Q) only on clock edge (master-slave)
[Figure: master-slave flipflop built from two D-latches (signals D, clock, Q, !Q)]
  - master (first D-latch) copies the input when the clock is high (the slave (second D-latch) is locked in its memory state and the output does not change)
  - slave copies the master when the clock goes low (the master is now locked in its memory state so changes at the input are not loaded into the master D-latch)
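The master/slave behavior described above can be sketched as a small class. `MasterSlaveFF` and its `step` method are hypothetical names, and the model is purely behavioral (no gate delays).

```python
class MasterSlaveFF:
    """Behavioral master-slave D flipflop; the output Q is the slave value."""

    def __init__(self, q=0):
        self.master = q
        self.slave = q

    def step(self, clock, d):
        if clock:
            # Clock high: master copies the input, slave holds (Q unchanged).
            self.master = d
        else:
            # Clock low: slave copies the master, master ignores the input.
            self.slave = self.master
        return self.slave

ff = MasterSlaveFF(0)
print(ff.step(1, 1))  # clock high: D is captured by the master, Q still old
print(ff.step(0, 0))  # clock low: Q takes the captured value, new D ignored
```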

One-Sided Clock Constraint
- Master-slave (edge-triggered) flipflops remove one of the clock constraints
[Figure: MS-ff0 and MS-ff1 (inputs D, clock; outputs Q, !Q) driven by a common clock]
- Consider the case when MS-ff0 holds a 0 and MS-ff1 holds a 1 and you want to transfer the contents of MS-ff0 to MS-ff1 and vice versa
  - must have the clock cycle time long enough to accommodate the worst case delay path
- One-sided clock constraint
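A sketch of the same swap using master-slave flipflops, again assuming the two elements feed each other and using a unit-delay abstraction. The function `ms_ff_cycle` is hypothetical. The point it illustrates is that holding the clock high longer no longer causes the data to be copied back.

```python
def ms_ff_cycle(q0, q1, steps_high):
    """One clock cycle for two master-slave flipflops wired output-to-input.

    q0 and q1 are the current outputs (slave values). steps_high models
    how long the clock stays high; it must be at least 1 so the masters
    have time to sample their inputs.
    """
    master0, master1 = q0, q1
    for _ in range(steps_high):
        # Clock high: each master samples the other flipflop's output,
        # which cannot change because both slaves are holding.
        master0, master1 = q1, q0
    # Clock falls: slaves copy their masters, so the outputs update once.
    return master0, master1

for n in (1, 2, 5):
    print(n, ms_ff_cycle(0, 1, n))
```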

Latches vs Flipflops
- Output is equal to the stored value inside the element
- Change of state (value) is based on the clock
  - Latches: output changes whenever the inputs change and the clock is asserted (level sensitive methodology)
    - Two-sided timing constraint
  - Flip-flop: output changes only on a clock edge (edge-triggered methodology)
    - One-sided timing constraint
A clocking methodology defines when signals can be read and written; we wouldn't want to read a signal at the same time it was being written

Our Implementation
- An edge-triggered methodology, typical execution
  - read contents of some state elements (combinational activity, so no clock control signal needed)
  - send values through some combinational logic
  - write results to one or more state elements on clock edge
[Figure: State element 1 -> Combinational logic -> State element 2, both clocked, spanning one clock cycle]
- Assumes state elements are written on every clock cycle; if not, need explicit write control signal
  - write occurs only when both the write control is asserted and the clock edge occurs
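The read, compute, write-on-edge pattern can be sketched as follows. `StateElement` and `one_cycle` are hypothetical names chosen for this illustration; the write happens only when the edge occurs and the write control is asserted.

```python
class StateElement:
    """Edge-triggered state element with an optional write control."""

    def __init__(self, value=0):
        self.value = value

    def read(self):
        # Reading is combinational: no clock or control signal involved.
        return self.value

    def clock_edge(self, data_in, write_enable=True):
        # Called once per clock edge; updates only if write is asserted.
        if write_enable:
            self.value = data_in

def one_cycle(src, dst, logic, write_enable=True):
    """Read src, pass the value through combinational logic, write dst on the edge."""
    result = logic(src.read())
    dst.clock_edge(result, write_enable)

a, b = StateElement(3), StateElement(0)
one_cycle(a, b, lambda x: x + 1)                        # b is written
one_cycle(a, b, lambda x: x * 10, write_enable=False)   # b keeps its value
print(b.read())
```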

Fetching Instructions
- Fetching instructions involves
  - reading the instruction from the Instruction Memory
  - updating the PC value to be the address of the next (sequential) instruction
[Figure: PC drives the Instruction Memory Read Address and an adder (Add 4) whose result is written back to the PC on the clock; Instruction Memory outputs the Instruction]
- PC is updated every clock cycle, so it does not need an explicit write control signal, just a clock signal
- Reading from the Instruction Memory is a combinational activity, so it doesn't need an explicit read control signal
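A minimal sketch of the fetch step. The instruction memory is modeled as a dictionary keyed by byte address, and the addresses and instruction words in `imem` are illustrative values, not taken from the slides.

```python
def fetch(pc, instruction_memory):
    """Return (instruction, next_pc) for one fetch.

    The memory read is modeled as a plain lookup (combinational); the
    PC + 4 value is what gets written into the PC on the clock edge.
    """
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & 0xFFFFFFFF   # keep the PC within 32 bits
    return instruction, next_pc

# Hypothetical program: two 32-bit instruction words at consecutive addresses.
imem = {0x00400000: 0x012A4020, 0x00400004: 0x012A4022}

pc = 0x00400000
instruction, pc = fetch(pc, imem)
print(hex(instruction), hex(pc))
```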

Instruction Formats Review
R format:  op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
I format:  op [31:26] | rs [25:21] | rt [20:16] | immed [15:0]
J format:  op [31:26] | address [25:0]
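The bit ranges above translate directly into shifts and masks. The sketch below extracts every field from a 32-bit word regardless of format; which fields are meaningful depends on the opcode. The function name `decode_fields` is hypothetical, and the sample word is the encoding of add $t0, $t1, $t2.

```python
def decode_fields(instr):
    """Slice a 32-bit MIPS instruction word into all possible fields."""
    return {
        "op":      (instr >> 26) & 0x3F,     # bits 31:26
        "rs":      (instr >> 21) & 0x1F,     # bits 25:21
        "rt":      (instr >> 16) & 0x1F,     # bits 20:16
        "rd":      (instr >> 11) & 0x1F,     # bits 15:11 (R format)
        "shamt":   (instr >> 6) & 0x1F,      # bits 10:6  (R format)
        "funct":   instr & 0x3F,             # bits 5:0   (R format)
        "immed":   instr & 0xFFFF,           # bits 15:0  (I format)
        "address": instr & 0x3FFFFFF,        # bits 25:0  (J format)
    }

fields = decode_fields(0x012A4020)  # add $t0, $t1, $t2
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
```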

Decoding Instructions
- Decoding instructions involves
  - sending the fetched instruction's opcode and function field bits to the control unit
  - and reading two values from the Register File
    - Register File addresses are contained in the instruction
[Figure: Instruction feeds the Control Unit and the Register File (inputs Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2)]
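A sketch of the decode step, assuming the Register File is a list of 32 values. The function `decode_stage` is a hypothetical name; it returns the bits that would go to the control unit and the two values read using the rs and rt fields.

```python
def decode_stage(instr, regs):
    """Return ((op, funct), (read_data_1, read_data_2)) for one instruction."""
    op = (instr >> 26) & 0x3F      # opcode, sent to the control unit
    funct = instr & 0x3F           # function field, sent to the control unit
    rs = (instr >> 21) & 0x1F      # Read Addr 1
    rt = (instr >> 16) & 0x1F      # Read Addr 2
    # Both read ports are used for every instruction; no read control needed.
    return (op, funct), (regs[rs], regs[rt])

regs = [0] * 32
regs[9], regs[10] = 5, 7           # illustrative contents of $t1 and $t2
print(decode_stage(0x012A4020, regs))  # add $t0, $t1, $t2
```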

Reading Registers "Just in Case"
- Note that both RegFile read ports are active for all instructions during the Decode cycle using the rs and rt instruction field addresses
  - Since we haven't decoded the instruction yet, we don't know what the instruction is!
  - Just in case the instruction uses values from the RegFile, do "work ahead" by reading the two source operands
  - Which instructions do make use of the RegFile values?
- Also, all instructions (except j) use the ALU after reading the registers
  - Why? memory-reference? arithmetic? control flow?

Executing R Format Operations
- R format operations (add, sub, slt, and, or)
  R-type:  op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
  - perform operation (op and funct) on values in rs and rt
  - store the result back into the Register File (into location rd)
[Figure: Instruction drives Register File Read Addr 1, Read Addr 2, Write Addr; Read Data 1 and Read Data 2 feed the ALU (ALU control input; overflow and zero outputs); the ALU result goes to Register File Write Data, with RegWrite as the write control]
- Note that Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File
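A behavioral sketch of R format execution for the five operations listed. The helper names (`to_signed`, `alu`, `execute_r_format`) are hypothetical; the funct values are the standard MIPS encodings. The sketch only reports overflow and does not model the exception a real add or sub would raise.

```python
MASK = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def alu(funct, a, b):
    """Return (result, zero, overflow) for add, sub, and, or, slt."""
    sa, sb = to_signed(a), to_signed(b)
    if funct == 0x20:                      # add
        full = sa + sb
    elif funct == 0x22:                    # sub
        full = sa - sb
    elif funct == 0x24:                    # and
        full = a & b
    elif funct == 0x25:                    # or
        full = a | b
    elif funct == 0x2A:                    # slt
        full = 1 if sa < sb else 0
    else:
        raise ValueError("funct not modeled in this sketch")
    # Signed overflow is only meaningful for add and sub.
    overflow = funct in (0x20, 0x22) and not (-(1 << 31) <= full < (1 << 31))
    result = full & MASK
    return result, result == 0, overflow

def execute_r_format(regs, rs, rt, rd, funct, reg_write=True):
    """Compute on regs[rs], regs[rt]; write regs[rd] only if RegWrite is set."""
    result, zero, overflow = alu(funct, regs[rs], regs[rt])
    if reg_write and rd != 0:              # register 0 is hard-wired to zero
        regs[rd] = result
    return zero, overflow

regs = [0] * 32
regs[9], regs[10] = 5, 7
execute_r_format(regs, 9, 10, 8, 0x20)     # add $t0, $t1, $t2
print(regs[8])
```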

Consider slt Instruction
- R format operations (add, sub, slt, and, or)
  R-type:  op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
  - perform operation (op and funct) on values in rs and rt
  - store the result back into the Register File (into location rd)
[Figure: Instruction drives Register File Read Addr 1, Read Addr 2, Write Addr; Read Data 1 and Read Data 2 feed the ALU (ALU control input; overflow and zero outputs); the ALU result goes to Register File Write Data, with RegWrite as the write control]
- Note that Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File
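For slt specifically, the value written to rd is 1 when rs is less than rt as signed numbers, else 0. The sketch below uses a direct comparison; hardware commonly derives the same answer from a subtraction, which is not modeled here. `slt`, `sltu`, and `to_signed` are hypothetical helper names, and sltu is included only to contrast signed and unsigned comparison.

```python
def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def slt(rs_value, rt_value):
    """Set on less than, signed comparison."""
    return 1 if to_signed(rs_value) < to_signed(rt_value) else 0

def sltu(rs_value, rt_value):
    """Set on less than, unsigned comparison."""
    return 1 if (rs_value & 0xFFFFFFFF) < (rt_value & 0xFFFFFFFF) else 0

# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
print(slt(0xFFFFFFFF, 1), sltu(0xFFFFFFFF, 1))
```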
