Chapter 12 CPU Structure and Function Contents Processor - PowerPoint PPT Presentation

Chapter 12 CPU Structure and Function

Contents • Processor organization • Register organization • Instruction cycle • Instruction pipelining • Pentium processor • PowerPC processor

12.1 Processor Organization • Requirements on CPU —Fetch instructions —Interpret instructions —Fetch data —Process data —Write data • CPU consists of —ALU —Control unit —Registers —Internal bus

CPU With Systems Bus

CPU Internal Structure

12.2 Register Organization • Design issues — Completely general-purpose registers (GPRs) vs. specialized registers – Specialized registers are dedicated to particular operands + In 16-bit 80x86 addressing, only BX, BP, SI, and DI can hold an offset address + Fewer instruction bits are needed to identify the register – Specialization limits the programmer's flexibility — Number of registers – For CISC, between 8 and 32 is regarded as optimum + Fewer registers result in more memory references + More registers do not noticeably reduce memory references – Some RISC designs use hundreds of registers — Register length – An address register must be long enough to hold the largest address – A data register must be long enough to hold values of most data types + Some machines allow two consecutive registers to hold double-length values

User-Visible Registers • General-purpose registers (GPRs) • Data registers • Address registers —Segment pointers —Index registers —Stack pointer • Condition codes (flags) —Set by the hardware according to the result of operations —Used for checking certain conditions —Can be read (implicitly) by programs – e.g. Jump if zero —Cannot (usually) be set directly by programs
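The way condition codes are set by an operation and then read implicitly by a conditional jump can be sketched as follows. This is a minimal illustration, not a model of any particular processor: the 8-bit word size and the names `Flags`, `add8`, and `jump_if_zero` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Flags:
    """Hypothetical condition-code register with four common flags."""
    zero: bool = False
    sign: bool = False
    carry: bool = False
    overflow: bool = False


def add8(a: int, b: int) -> Tuple[int, Flags]:
    """Add two 8-bit values; the flags are a side effect of the result."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = Flags(
        zero=(result == 0),
        sign=bool(result & 0x80),
        carry=(raw > 0xFF),
        # Signed overflow: the operands share a sign bit that the result lacks.
        overflow=bool(~(a ^ b) & (a ^ result) & 0x80),
    )
    return result, flags


def jump_if_zero(flags: Flags, pc: int, target: int) -> int:
    """Return the next PC: the target if the zero flag is set, else pc + 1."""
    return target if flags.zero else pc + 1
```

The program never writes the flags itself; it only performs the addition and then lets the jump instruction consult the zero flag.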

Control & Status Registers • Program Counter — Incremented after each instruction fetch — Loaded with the target address when a branch is taken • Instruction Register • Memory Address Register — Connected directly to address bus • Memory Buffer Register — Connected directly to data bus • Program Status Word — Sign, zero, carry, equal, overflow, interrupt enable/disable, supervisor mode • Others — Pointer to PCB (Process Control Block), interrupt vector register — Stack-related registers, page table pointer

Supervisor Mode • Intel x86 processors have 4 privilege levels (rings) —Ring zero – Kernel functions —Ring one – Operating system functions —Ring two – May be used for DBMS —Ring three – User programs

Example Register Organizations • Motorola MC68000 (not including purely internal registers) —8 data registers – Used primarily for data manipulation + 8-, 16-, and 32-bit operations are possible – Also used as index registers —9 address registers – 32 bits wide – Include two stack pointers + One for user mode and one for system mode —PC and status register • There are no special-purpose registers in this CPU.

Example Register Organizations • Intel 8086 (every register is special purpose, although some are also usable as general purpose) —4 16-bit data registers (can be used as general-purpose registers in some instructions) – AX, BX, CX, DX —4 pointer and index registers – SP, BP, SI, DI —4 segment registers – CS, DS, SS, ES —Instruction pointer and flags • There is no universally accepted philosophy concerning the best way to organize CPU registers.

Example Register Organizations

12.3 Instruction Cycle • Subcycles of instruction cycle —Fetch —Execute —Interrupt —Indirect (newly added) • Indirect cycle —Indirect addressing requires an additional memory access —Can be thought of as an additional instruction subcycle

Instruction Cycle State Diagram

Data Flow • Fetch cycle —PC contains address of next instruction —Address moved to MAR —Address placed on address bus —Control unit requests memory read —Result placed on data bus, copied to MBR, then to IR —Meanwhile PC incremented by 1

Data Flow, Fetch Cycle
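The fetch-cycle data flow can be traced with a small register-transfer sketch. It assumes a word-addressable memory modelled as a Python list and a register set held in a dictionary; the function name `fetch_cycle` and the sample instruction words are illustrative only.

```python
def fetch_cycle(regs, memory):
    """Perform one fetch cycle on a register dict and a word-addressable memory."""
    regs["MAR"] = regs["PC"]         # address of the next instruction moved to MAR
    address_bus = regs["MAR"]        # MAR places the address on the address bus
    data_bus = memory[address_bus]   # control unit requests a memory read
    regs["MBR"] = data_bus           # result on the data bus is copied to MBR
    regs["PC"] += 1                  # meanwhile, PC is incremented by 1
    regs["IR"] = regs["MBR"]         # instruction moved from MBR to IR
    return regs["IR"]


# Hypothetical three-word program; the values are arbitrary instruction words.
memory = [0x1005, 0x2006, 0x3007]
regs = {"PC": 0, "MAR": 0, "MBR": 0, "IR": 0}
first_instruction = fetch_cycle(regs, memory)
```

After the call, `IR` holds the word that was at the old `PC` address, and `PC` already points at the following instruction.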

Data Flow • Indirect cycle —IR is examined —If indirect addressing is used, an indirect cycle is performed – Rightmost N bits of MBR transferred to MAR – Control unit requests memory read – Result (address of operand) moved to MBR

Data Flow, Indirect Cycle
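A sketch of the indirect cycle under stated assumptions: the address field is taken to be the rightmost 12 bits of a 16-bit word (the slides leave N unspecified), memory is a dictionary keyed by address, and the caller passes in whether the decoded instruction uses indirect addressing. The names `ADDRESS_BITS` and `indirect_cycle` are hypothetical.

```python
ADDRESS_BITS = 12                        # assumed value of N
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def indirect_cycle(regs, memory, uses_indirect):
    """Replace the address reference in MBR with the operand's real address."""
    if not uses_indirect:
        return                                  # direct addressing: nothing to do
    regs["MAR"] = regs["MBR"] & ADDRESS_MASK    # rightmost N bits of MBR to MAR
    regs["MBR"] = memory[regs["MAR"]]           # memory read; operand address to MBR


# Location 0x020 holds 0x345, the address of the operand.
memory = {0x020: 0x345}
regs = {"MAR": 0, "MBR": 0x1020}   # instruction word: opcode 0x1, address 0x020
indirect_cycle(regs, memory, uses_indirect=True)
```

The extra memory read is the cost of indirect addressing: one more access before the execute cycle can begin.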

Data Flow • Execute cycle —May take many forms depending on instructions —May include – Memory read/write – Input/Output – Register transfers – ALU operations

Data Flow • Interrupt cycle —Current PC saved to allow resumption after interrupt – Contents of PC copied to MBR – Address of a special memory location (e.g. taken from the stack pointer) loaded into MAR – MBR written to memory —PC loaded with address of interrupt handling routine —Next instruction (first of interrupt handler) can be fetched

Data Flow, Interrupt Cycle
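The interrupt-cycle transfers can be sketched in the same register-transfer style. The sketch assumes the return address is saved on a stack that grows toward lower addresses and that the handler address is supplied by the caller; both are illustrative choices, and `interrupt_cycle` is a hypothetical name.

```python
def interrupt_cycle(regs, memory, handler_address):
    """Save the current PC and redirect execution to the interrupt handler."""
    regs["MBR"] = regs["PC"]             # contents of PC copied to MBR
    regs["SP"] -= 1                      # assumption: stack grows downward
    regs["MAR"] = regs["SP"]             # save location loaded into MAR
    memory[regs["MAR"]] = regs["MBR"]    # MBR written to memory
    regs["PC"] = handler_address         # PC loaded with handler address


memory = [0] * 16
regs = {"PC": 5, "SP": 16, "MAR": 0, "MBR": 0}
interrupt_cycle(regs, memory, handler_address=12)
```

The next fetch cycle then reads the first instruction of the handler, and the saved value at the top of the stack allows the interrupted program to resume later.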

12.4 Instruction Pipelining • Pipelining strategy —Similar to an assembly line in an automobile factory – An instruction passes through a number of stages – Different stages can work on different instructions simultaneously • Simple two-stage pipelining —Fetch and execute stages —If the two stages were of equal duration, instruction cycle time would be halved —But things are not that easy – Execution time is generally longer than fetch time + Fetch stage may have to wait – A conditional branch makes the address of the next instruction unknown + Fetch stage must wait or guess the branch outcome

Two Stage Instruction Pipeline

Instruction Pipelining • More stages mean further speedup —Fetch Instruction (FI) —Decode Instruction (DI) —Calculate Operands (CO) —Fetch Operands (FO) —Execute Instruction (EI) —Write Operand (WO) • Characteristics (equal duration assumed) —Execution time for 9 instructions is reduced from 54 to 14 time units —Some instructions may not go through all 6 stages – LOAD does not need the WO stage —Some stages cannot always be performed in parallel – FI, FO, and WO stages all involve a memory access

Timing Diagram for Instruction Pipeline Operation
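The 54-versus-14 figure can be checked by generating the timing diagram. The sketch below assumes ideal conditions (equal stage durations, no branches, no memory conflicts), so that instruction i occupies stage number s (counting from 0) at time i + s; the function name `pipeline_schedule` is illustrative.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def pipeline_schedule(n_instructions, stages=STAGES):
    """Return one dict per time unit mapping stage name to instruction number."""
    k = len(stages)
    total_time = k + (n_instructions - 1)
    schedule = []
    for t in range(1, total_time + 1):
        row = {}
        for s, name in enumerate(stages):
            instr = t - s   # instruction that entered the pipe s time units ago
            if 1 <= instr <= n_instructions:
                row[name] = instr
        schedule.append(row)
    return schedule


schedule = pipeline_schedule(9)
pipelined_time = len(schedule)        # time units with a 6-stage pipeline
sequential_time = 9 * len(STAGES)     # time units without pipelining
```

With 9 instructions and 6 stages the schedule has 14 rows, against 9 × 6 = 54 time units when each instruction must finish before the next starts.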

Instruction Pipelining • Factors that limit performance enhancement —Stages may not be of equal duration —Conditional branch instruction – Can invalidate several instruction fetches —Interrupt —Data dependency – CO stage may depend on the contents of a register that could be altered by a previous instruction that is still in the pipeline – System needs to contain logic to resolve this conflict

Effect of a Conditional Branch

Figure 12.12 Six-Stage Instruction Pipeline (flowchart): Fetch Instruction (FI) → Decode Instruction (DI) → Calculate Operands (CO) → Unconditional branch? If yes, update PC and empty the pipe. If no, Fetch Operands (FO) → Execute Instruction (EI) → Write Operands (WO) → Branch or interrupt? If yes, update PC and empty the pipe. If no, continue with the next instruction.

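Figure 12.13 An Alternative Pipeline Depiction (each row shows the instruction occupying each stage at that time; "-" marks an empty stage)

(a) No branches

Time  FI   DI   CO   FO   EI   WO
1     I1   -    -    -    -    -
2     I2   I1   -    -    -    -
3     I3   I2   I1   -    -    -
4     I4   I3   I2   I1   -    -
5     I5   I4   I3   I2   I1   -
6     I6   I5   I4   I3   I2   I1
7     I7   I6   I5   I4   I3   I2
8     I8   I7   I6   I5   I4   I3
9     I9   I8   I7   I6   I5   I4
10    -    I9   I8   I7   I6   I5
11    -    -    I9   I8   I7   I6
12    -    -    -    I9   I8   I7
13    -    -    -    -    I9   I8
14    -    -    -    -    -    I9

(b) With conditional branch (I3 branches to I15, so I4 to I7 are discarded at time 8)

Time  FI   DI   CO   FO   EI   WO
1     I1   -    -    -    -    -
2     I2   I1   -    -    -    -
3     I3   I2   I1   -    -    -
4     I4   I3   I2   I1   -    -
5     I5   I4   I3   I2   I1   -
6     I6   I5   I4   I3   I2   I1
7     I7   I6   I5   I4   I3   I2
8     I15  -    -    -    -    I3
9     I16  I15  -    -    -    -
10    -    I16  I15  -    -    -
11    -    -    I16  I15  -    -
12    -    -    -    I16  I15  -
13    -    -    -    -    I16  I15
14    -    -    -    -    -    I16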

Pipeline Performance • Measures of performance —Cycle time can be determined as $\tau = \max_{1 \le i \le k}[\tau_i] + d = \tau_m + d$, where $\tau_i$ = delay of stage $i$, $\tau_m$ = maximum stage delay, $k$ = number of stages in the instruction pipeline, $d$ = time delay of a latch, needed to advance signals and data from one stage to the next —We can ignore $d$ since $\tau_m \gg d$ —Total time $T_k$ to execute $n$ instructions is $T_k = [k + (n - 1)]\tau$ —Thus the speedup factor is defined as $S_k = T_1 / T_k = nk\tau / ([k + (n - 1)]\tau) = nk / [k + (n - 1)]$

Speedup Factors with Pipelining
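The speedup formula is easy to evaluate numerically. The sketch below assumes the idealized model of the formula itself (no branches, every stage taking one cycle τ); the function names `pipeline_time` and `speedup` are illustrative.

```python
def pipeline_time(n, k, tau=1.0):
    """Total time T_k = [k + (n - 1)] * tau for n instructions on k stages."""
    return (k + (n - 1)) * tau


def speedup(n, k):
    """Speedup S_k = n*k / [k + (n - 1)] relative to an unpipelined processor."""
    return (n * k) / (k + (n - 1))


# The 9-instruction, 6-stage example: 54 time units reduced to 14.
example = speedup(9, 6)

# For a long instruction stream the speedup approaches the stage count k.
long_run = speedup(1_000_000, 6)
```

For the 9-instruction example the speedup is 54/14, a little under 4, while for very long instruction streams it approaches k = 6. The k − 1 cycles needed to fill the pipe are amortized over more instructions as n grows.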

Dealing with Branches • Approaches for dealing with branches —Multiple streams —Prefetch branch target —Loop buffer —Branch prediction —Delayed branch • Multiple streams —Have two pipelines – Prefetch each branch path into a separate pipeline – Use the appropriate pipeline once the branch is resolved —Problems – There may be contention delays for access to registers and memory – Each additional branch that enters the pipeline before the first is resolved needs an additional stream

Dealing with Branches • Prefetch branch target —Target of branch is prefetched in addition to the instruction following the branch —Keep target until branch is executed —Used in IBM 360/91 • Loop buffer —Contains the n most recently fetched instructions, in sequence —Whenever a branch is to be taken, the buffer is checked first —Well suited to dealing with loops – If the loop buffer is large enough to contain all the instructions in a loop, we need to fetch them from memory only once —Used in some CDC machines and the CRAY-1

Loop Buffer Diagram
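The benefit of a loop buffer can be illustrated by counting memory fetches for a loop body. This is a simplified sketch: it checks the buffer on every fetch rather than only on taken branches, and it evicts the oldest entry when full. The class `LoopBuffer` and the function `count_memory_fetches` are hypothetical names.

```python
from collections import OrderedDict


class LoopBuffer:
    """Small buffer holding the most recently fetched instructions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # address -> instruction, oldest first

    def record(self, address, instruction):
        self.entries[address] = instruction
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # drop the oldest entry

    def lookup(self, address):
        return self.entries.get(address)


def count_memory_fetches(loop_addresses, iterations, capacity):
    """Count fetches that miss the buffer and must go to memory."""
    buffer = LoopBuffer(capacity)
    memory_fetches = 0
    for _ in range(iterations):
        for address in loop_addresses:
            if buffer.lookup(address) is None:
                memory_fetches += 1
                buffer.record(address, f"instr@{address}")
    return memory_fetches


loop = [10, 11, 12, 13]                          # a 4-instruction loop body
fits = count_memory_fetches(loop, iterations=5, capacity=8)
too_small = count_memory_fetches(loop, iterations=5, capacity=2)
```

When the buffer holds the whole loop, each instruction is fetched from memory once (4 fetches); when it is too small, every instruction is evicted before it is needed again and all 20 fetches go to memory.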

Dealing with Branches • Branch prediction —Predict never taken —Predict always taken —Predict by opcode —Taken/not taken switch —Branch history table • Predict never taken —Assume that the branch will not be taken – Always fetch the next sequential instruction —Used in MC68020 & VAX 11/780 —VAX will not prefetch the instruction after the branch if a page fault would result
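The two static policies can be compared on a recorded sequence of branch outcomes. The sketch below is an illustration only: the trace is a made-up loop-closing branch (taken nine times, then not taken once), not measured data, and `mispredictions` is a hypothetical helper.

```python
def mispredictions(outcomes, predict_taken):
    """Count outcomes that differ from a fixed (static) prediction."""
    return sum(1 for taken in outcomes if taken != predict_taken)


# Hypothetical trace: a loop-closing branch taken 9 times, then falling through.
trace = [True] * 9 + [False]

never_taken_misses = mispredictions(trace, predict_taken=False)
always_taken_misses = mispredictions(trace, predict_taken=True)
```

On this trace, predict-never-taken is wrong 9 times and predict-always-taken only once. Each wrong guess means the instructions fetched after the branch must be discarded, so the better policy depends on how the program's branches actually behave.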
