Chapter 13 Reduced Instruction Set Computers

Contents • Instruction execution characteristics • Use of a large register file • Compiler-based register optimization • Reduced instruction set architecture • RISC pipelining • MIPS R4000 • SPARC • RISC vs CISC controversy

Major Advances in Computers • The family concept —IBM System/360 in 1964 —DEC PDP-8 —Separates architecture from implementation • Microprogrammed control unit —Suggested by Wilkes in 1951 —Introduced by IBM on the S/360 line in 1964 —Each machine instruction is interpreted as a sequence of microinstructions • Cache memory —IBM S/360 model 85 in 1968

Major Advances in Computers • Microprocessors —Intel 4004 in 1971 • Pipelining —Introduces parallelism into instruction execution • Multiple processors • RISC architecture —Large number of GPRs – Use of compiler technique to optimize register usage —Limited and simple instruction set —Emphasis on optimizing the instruction pipeline

Comparison of processors

13.1 Execution Characteristics • Evolution of program execution —Problems with software development – High cost and unreliability —More powerful and complex languages are developed – Can express algorithms more concisely – Take care of the detail —Semantic gap problem – Difference between HLL operations and machine instructions – Execution inefficiency, excessive program size, and compiler complexity —Architecture to close this gap – Large instruction set, many addressing modes, and HLL statements implemented in hardware —Simple architecture : RISC

Execution Characteristics • Aspects of instruction execution —Operations performed – Functions performed by CPU —Operands used – Operand characteristics determine memory organization and addressing modes —Execution sequencing – Determines control and pipeline organization

Relative Dynamic Frequency of HLL Operations

Analysis • Assignment statements predominate —Simple movement of data is highly important • Conditional statements (IF, LOOP) are also frequent —Sequence control mechanism is important • Results do not reveal which statements use the most time in execution of a typical program —Which statement causes the execution of the most machine-language instructions?

Weighted Relative Dynamic Frequency For the data in columns four and five, each value in the second and third columns is multiplied by the number of machine instructions produced by the compiler and then normalized. The sixth and seventh columns are obtained by multiplying the frequency of occurrence of each statement type by the relative number of memory references caused by each statement.
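The weighting described above can be sketched in a few lines of Python. The statement frequencies and per-statement instruction counts below are made-up illustrative values, not the figures from the study, and the names `occurrence`, `instr_per_stmt`, and `weighted` are hypothetical.

```python
# Illustrative inputs only: these are NOT the measured values from the study.
# Relative dynamic frequency of each HLL statement type (sums to 1.0).
occurrence = {"ASSIGN": 0.45, "LOOP": 0.05, "CALL": 0.15, "IF": 0.29, "OTHER": 0.06}
# Assumed average number of machine instructions the compiler emits per statement.
instr_per_stmt = {"ASSIGN": 2, "LOOP": 20, "CALL": 15, "IF": 3, "OTHER": 4}


def weighted(occurrence, weight):
    """Multiply each frequency by its weight, then normalize so the results sum to 1."""
    raw = {stmt: occurrence[stmt] * weight[stmt] for stmt in occurrence}
    total = sum(raw.values())
    return {stmt: value / total for stmt, value in raw.items()}


machine_weighted = weighted(occurrence, instr_per_stmt)
for stmt, share in sorted(machine_weighted.items(), key=lambda kv: -kv[1]):
    print(f"{stmt:7s} {share:.1%}")
```

With these assumed inputs, CALL rises to the top of the ranking even though ASSIGN occurs most often, which is the kind of shift the weighted columns are meant to expose. The memory-reference weighting works the same way with a different weight table.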

Operands • Dynamic frequency of occurrence of classes of variables —Mainly local scalar variables – Optimization should concentrate on accessing local variables —One study shows that each instruction references, on average, 0.5 operands in memory and 1.4 registers – Fast operand accessing is important

Procedure Calls Procedure Arguments and Local Scalar Variables

Procedure Calls • Most time-consuming operations • Two important aspects —Number of parameters passed —Depth of nesting • Studies show —Number of words required per procedure is not large —Nesting is not that deep • Operand references are highly localized

Implications • Conclusions of experiments —Instruction set architecture close to HLL is not the most effective design —We need to optimize performance of the most time consuming parts of programs • Characteristics of RISC architecture —Large number of registers – To optimize operand referencing —Careful design of pipelines – There are many conditional branch and call instructions —Simplified (reduced) instruction set

13.2 Use of Large Register File • Register size is limited, so —We need to keep most frequently accessed operands —We need to minimize register-memory operations • Software solution —Require compiler to allocate registers – Allocate based on most used variables in a given time – Requires sophisticated program analysis • Hardware solution —Have more registers —Thus more variables will be in registers

Register Windows • Organization of large set of registers —Most registers for local scalars – A few for global variables —Definition of local changes with each procedure call – Only a few parameters and local variables – Range of call nesting depth is relatively narrow – So multiple small sets of registers can be used —Register window – Consists of 3 fixed-size areas + Parameter, local, and temporary registers —Circular buffer of register windows – Used in SPARC and IA-64 – N-window register file can hold N-1 procedure activations – Berkeley RISC uses 8 windows of 16 registers each

Overlapping Register Windows

Circular Buffer diagram

Operation of Circular Buffer • When a call is made, a current window pointer is moved to show the currently active register window • If all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memory • A saved window pointer identifies the window most recently saved to memory, so a later return knows which window to restore
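The call and return behavior of the circular buffer can be sketched as a small Python model. This is a simplified illustration under stated assumptions: the class `RegisterWindowFile` and its fields are hypothetical, register contents and window overlap are not modeled, and only the bookkeeping (window pointer, spills, restores) is tracked.

```python
class RegisterWindowFile:
    """Toy model of a circular buffer of register windows (bookkeeping only)."""

    def __init__(self, n_windows):
        self.n = n_windows
        self.capacity = n_windows - 1  # N windows hold at most N-1 activations
        self.cwp = 0                   # current window pointer
        self.resident = 1              # activations held in registers (the main procedure)
        self.saved = 0                 # windows spilled to memory
        self.overflows = 0             # spills caused by calls
        self.underflows = 0            # restores caused by returns

    def call(self):
        if self.resident == self.capacity:
            # All usable windows are busy: spill the oldest one to memory.
            self.saved += 1
            self.resident -= 1
            self.overflows += 1
        self.cwp = (self.cwp + 1) % self.n
        self.resident += 1

    def ret(self):
        if self.resident == 0:
            raise RuntimeError("return with no active procedure")
        self.cwp = (self.cwp - 1) % self.n
        self.resident -= 1
        if self.resident == 0 and self.saved > 0:
            # The caller's window was spilled earlier: bring it back.
            self.saved -= 1
            self.resident += 1
            self.underflows += 1


rf = RegisterWindowFile(n_windows=4)
for _ in range(4):   # nest four calls deep
    rf.call()
spills_after_calls = rf.overflows
for _ in range(4):   # unwind back to the main procedure
    rf.ret()
```

In this model, memory traffic occurs only when the nesting depth exceeds what the windows can hold, which is why a narrow range of call depth makes the scheme effective.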

Global Variables • Two options —Assign memory locations – Inefficient for frequently accessed global variables —Assign a set of global registers – Fixed and available to all procedures

Large Register File vs Cache

Referencing a Scalar - Window Based Register File

Referencing a Scalar - Cache

13.3 Compiler Based Register Optimizing • Case when only a small number of registers is available —Compiler is responsible for the optimized usage • Approach —Variable is assigned to a symbolic(virtual) register —Symbolic registers whose usage does not overlap can share the same real register —Which symbolic registers to which real registers? – Graph coloring can be used here – If two symbolic registers are live during the same program fragment, they are joined by an edge to depict interference – Try to color the graph with n colors, where n is the number of registers
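The approach above can be sketched in Python as follows. This is a simple greedy heuristic, not necessarily the algorithm any particular compiler uses, and it does not guarantee a minimal coloring. The live ranges, the function names `build_interference` and `color`, and the half-open interval representation are all assumptions made for illustration.

```python
def build_interference(live_ranges):
    """Join two symbolic registers by an edge when their live ranges overlap.

    live_ranges maps a symbolic register name to a half-open interval (start, end).
    """
    names = list(live_ranges)
    graph = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
            if s1 < e2 and s2 < e1:  # intervals overlap
                graph[a].add(b)
                graph[b].add(a)
    return graph


def color(graph, n_registers):
    """Greedily assign real registers (colors); nodes that cannot be colored are spilled."""
    assignment, spilled = {}, []
    # Visit the most constrained nodes (highest degree) first.
    for node in sorted(graph, key=lambda n: len(graph[n]), reverse=True):
        used = {assignment[m] for m in graph[node] if m in assignment}
        free = [r for r in range(n_registers) if r not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)  # would be kept in memory instead
    return assignment, spilled


# Hypothetical live ranges for six symbolic registers.
live = {"A": (0, 4), "B": (1, 3), "C": (2, 6), "D": (4, 7), "E": (5, 8), "F": (6, 9)}
graph = build_interference(live)
with_three, spilled_three = color(graph, n_registers=3)
with_two, spilled_two = color(graph, n_registers=2)
```

With three real registers this example colors all six symbolic registers; with two, some symbolic registers cannot be colored and must live in memory.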

Graph Coloring Problem • Given a graph consisting of nodes and edges, assign colors to nodes such that adjacent nodes have different colors, and do this in such a way as to minimize the number of different colors • Graph theory itself dates back to the Königsberg bridges problem —Solved by Euler

Graph Coloring Approach

13.4 RISC Architecture • Why CISC? —Richer instruction sets – Larger number and more complex instructions —Reasons – To simplify compilers – To improve performance —Can CISC simplify compilers? – If there are machine instructions that resemble HLL statements, task can be simplified – But complex instructions are hard to exploit – Optimizing the code is much more difficult with a complex instruction set

RISC Architecture • Why CISC? —Is CISC program smaller? – Not quite – CISC program may have fewer instructions – But each instruction is longer + Longer opcodes and address fields —Is CISC program faster? – The claim: a complex HLL operation will execute more quickly as a single machine instruction rather than as a series of more primitive instructions + But use of complex instructions is quite limited – Control unit must be made more complex, which can slow the execution of simple instructions

Code Size Relative to RISC I

RISC Architecture • Characteristics of RISC —Common properties – One instruction per cycle – Register-to-register operations – Simple addressing modes – Simple instruction formats —One machine instruction per machine cycle – Machine instruction is directly executed by H/W —Register-to-register operations – Only LOAD/STORE instructions access memory – Instruction set and control unit can be simplified + RISC may have only one or two ADD instructions + VAX(CISC) has 25 different ADD instructions

RISC Architecture —Simple addressing modes – Almost all instructions use simple register addressing —Simple instruction formats – Only a few formats are used – Instruction length is fixed and aligned on word boundaries
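The register-to-register property can be illustrated with a toy load/store interpreter in Python. The three-instruction set, the tuple encoding, and the function name `run` are hypothetical and chosen only for this sketch; they do not correspond to any real machine.

```python
def run(program, memory):
    """Execute a toy load/store program and count memory accesses.

    Only LOAD and STORE touch memory; ADD works purely on registers.
    """
    regs = {}
    mem_accesses = 0
    for op, *args in program:
        if op == "LOAD":       # LOAD rd, addr
            rd, addr = args
            regs[rd] = memory[addr]
            mem_accesses += 1
        elif op == "STORE":    # STORE rs, addr
            rs, addr = args
            memory[addr] = regs[rs]
            mem_accesses += 1
        elif op == "ADD":      # ADD rd, ra, rb
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem_accesses


# A = B + C; D = A + B, with B and the intermediate A kept in registers.
memory = {"B": 2, "C": 3}
program = [
    ("LOAD", "r1", "B"),
    ("LOAD", "r2", "C"),
    ("ADD", "r3", "r1", "r2"),
    ("STORE", "r3", "A"),
    ("ADD", "r4", "r3", "r1"),   # reuses r3 and r1 without touching memory
    ("STORE", "r4", "D"),
]
accesses = run(program, memory)
```

Because the second addition reuses values already in registers, the two statements need four memory accesses in this sketch, where a memory-to-memory form would read and write operands for every statement.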

Two Comparisons

RISC Architecture • Benefits of RISC approach —Performance – More effective optimizing compilers can be developed – Most instructions are directly executed by H/W – Instruction pipelining can be applied more effectively – More responsive to interrupts —VLSI implementation – Single-chip processor – Devote chip area to those activities that occur frequently + Simple instructions and local scalars + RISC I devotes 6% of its area to control unit + Typical CISC devotes about 50% – Design-and-implementation time

Design and Layout Effort

MMU: Short for memory management unit, the hardware component that manages virtual memory systems. Typically, the MMU is part of the CPU, though in some designs it is a separate chip. The MMU includes a small amount of memory that holds a table matching virtual addresses to physical addresses. This table is called the Translation Look-aside Buffer (TLB). All requests for data are sent to the MMU, which determines whether the data is in RAM or needs to be fetched from the mass storage device. If the data is not in memory, the MMU issues a page fault interrupt.
