Slides for Lecture 5, ENCM 501: Principles of Computer Architecture

ENCM 501: Principles of Computer Architecture
Winter 2014 Term
Slides for Lecture 5
Steve Norman, PhD, PEng
Electrical & Computer Engineering
Schulich School of Engineering
University of Calgary
23 January, 2014

Previous Lecture

◮ a little more about die yield
◮ measuring and reporting computer performance
◮ quantitative principles of computer design

Today's Lecture

◮ ISA design ideas
◮ the ISA view of memory
◮ addressing modes

Related reading in Hennessy & Patterson: Sections A.1–A.3

ISA versus Microarchitecture

As stated before, here are two important parts of computer architecture:

◮ What instructions are available to applications programmers? (Usually indirectly, via compilers.) This is often called instruction set architecture, or ISA.
◮ Given the ISA, how exactly are instructions handled by processors: how deep are pipelines; can instructions be executed out of order? How is the memory system organized to minimize loss of clock cycles in fetching instructions and reading and writing data? This category of concern is sometimes called microarchitecture or organization.

In designing a new microarchitecture for an existing ISA, the goals are

◮ correct machine-language programs must continue to run correctly;
◮ performance (perhaps running times of tasks, perhaps energy spent per task, perhaps something else) should be improved.

In designing a brand-new ISA or extending an existing ISA in a major way, the concerns are

◮ (one level up, software) making performance wins as straightforward as possible for compiler writers;
◮ (one level down, microarchitecture) making hardware implementation reasonable in terms of design and fabrication costs per chip, chip area, energy and power concerns, and so on.

Classification of ISAs

Section A.2 of the textbook identifies four ISA classes:

◮ stack
◮ accumulator
◮ register-memory
◮ register-register/load-store (which is often shortened to load-store)

Stack and accumulator ISAs are now mostly of historical interest, but important register-memory and load-store architectures are in use today.
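To make the four ISA classes a little more concrete, here is a minimal sketch that writes out one plausible instruction sequence per class for the statement C = A + B and counts instructions and memory operands. The mnemonics (PUSH, LOAD, ADD, and so on) and the register names are generic illustrations, not the syntax of any real assembler, and the helper names `count_memory_operands` and `summarize` are made up for this example.

```python
# Instruction sequences for C = A + B in each ISA class.
# Mnemonics are generic, textbook-style illustrations (an assumption),
# not the syntax of any particular real machine.
SEQUENCES = {
    "stack": ["PUSH A", "PUSH B", "ADD", "POP C"],
    "accumulator": ["LOAD A", "ADD B", "STORE C"],
    "register-memory": ["LOAD R1, A", "ADD R3, R1, B", "STORE R3, C"],
    "load-store": ["LOAD R1, A", "LOAD R2, B", "ADD R3, R1, R2", "STORE R3, C"],
}

# In this toy notation, A, B and C name memory locations; R1..R3 are registers.
MEMORY_NAMES = {"A", "B", "C"}


def count_memory_operands(instruction):
    """Count the operands of one instruction that name a memory location."""
    parts = instruction.replace(",", " ").split()
    return sum(1 for operand in parts[1:] if operand in MEMORY_NAMES)


def summarize(sequences):
    """Map each ISA class to (instruction count, total memory operands)."""
    return {
        name: (len(code), sum(count_memory_operands(i) for i in code))
        for name, code in sequences.items()
    }


if __name__ == "__main__":
    for name, (n_instr, n_mem) in summarize(SEQUENCES).items():
        print(f"{name:16s} instructions={n_instr} memory operands={n_mem}")
```

In this sketch every class touches memory three times (read A, read B, write C). What differs is which instructions are allowed to do so: in the load-store sequence the ADD names only registers, while in the register-memory sequence the ADD itself reads B from memory.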

Register-memory architectures

In this kind of architecture arithmetic/logical instructions such as ADD, SUB, AND, OR, and SHIFT generally have two operands, one of which is allowed to be in memory.

x86 and x86-64 are register-memory ISAs. Example instructions from gcc output for Linux on x86-64 (operands are source first, destination second):

    mov -12(%rbp), %eax
    salq $2, %rax
    addq -32(%rbp), %rax
    movl (%rax), %eax
    mov %eax, %eax
    addq %rax, -8(%rbp)
    addl $1, -12(%rbp)

Note that if one operand is in memory, the instruction requires a memory read, a memory write, or both. Here an add follows a read:

    # %rax += Mem[-32 + %rbp]
    addq -32(%rbp), %rax

This needs a read, then an add, then a write:

    # Mem[-8 + %rbp] += %rax
    addq %rax, -8(%rbp)

Some instructions don't access data memory at all:

    addq $4, %rdx      # %rdx += 4
    addq %rcx, %rax    # %rax += %rcx

Load-store architectures

In this kind of architecture, arithmetic/logical instructions do not access memory.

Instead, data memory access is isolated within load (memory read) and store (memory write) instructions.

Arithmetic/logical instructions typically have three operands. The destination is a register, and sources are either two registers or one register and one immediate value.

Here are MIPS64 examples:

    DADDIU R17, R16, 800   # R17 = R16 + 800
    DADDU R18, R18, R8     # R18 = R18 + R8

RISC versus CISC

CISC: Complex Instruction Set Computer
RISC: Reduced Instruction Set Computer

The terms RISC and CISC were introduced in the early 1980s to contrast instruction set design ideas that had been dominant up until that time (CISC) with new design ideas (RISC).

It's impossible to give a fair portrayal of CISC ideas in one or two slides, but some important design goals in CISC were:

◮ helpfulness to humans writing assembly language code;
◮ helping compiler writers by providing instructions that were close matches to expressions in higher-level languages.

Classic CISC architectures

VAX: Produced by Digital Equipment Corporation. For a long time this was the dominant machine in university research labs. A lot of important development of the Unix operating system was done on "VAXen".

Motorola MC68000: A really important series of microprocessor designs. Used in the first Apple Macintosh (1984), and used in Macs until about 1996. Used in the first workstations from Sun Microsystems. Used in a huge number of embedded applications.

A classic CISC instruction

VAX and 68000 were both "memory-memory" architectures.

Example 68000 memory-to-memory copy:

    MOVE.W (A0)+, (A1)+

This would read memory using address register A0, write memory using address register A1, and update both registers to point to the next 16-bit word in memory!

How many instructions would a similar operation in MIPS64 require?
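As one way to think about the question on the memory-to-memory copy slide, the sketch below models the post-increment copy as a single step and then as a sequence of load-store style steps. Everything here is a toy model under stated assumptions: memory is a Python dict keyed by byte address and holding 16-bit values, registers are a dict, the temporary register name "T" is hypothetical, and the four-step breakdown (load, store, two address updates) is one plausible decomposition rather than the only possible answer.

```python
def move_w_postinc(mem, regs):
    """Toy model of a single 68000-style post-increment word copy.

    Reads the 16-bit word addressed by A0, writes it to the address in A1,
    then advances both address registers by 2 bytes.
    Returns the number of instructions modelled (1).
    """
    mem[regs["A1"]] = mem[regs["A0"]]
    regs["A0"] += 2
    regs["A1"] += 2
    return 1


def load_store_copy(mem, regs):
    """The same effect written as load-store style steps.

    Each statement stands for one simple instruction; only the load and
    the store touch memory. "T" is a hypothetical temporary register.
    Returns the number of instructions modelled (4).
    """
    regs["T"] = mem[regs["A0"]]   # load halfword into a temporary register
    mem[regs["A1"]] = regs["T"]   # store halfword from the temporary
    regs["A0"] += 2               # add immediate: advance source pointer
    regs["A1"] += 2               # add immediate: advance destination pointer
    return 4


if __name__ == "__main__":
    mem = {0x100: 0xBEEF, 0x200: 0x0000}
    regs = {"A0": 0x100, "A1": 0x200}
    count = load_store_copy(mem, regs)
    print(count, hex(mem[0x200]), hex(regs["A0"]), hex(regs["A1"]))
```

Both functions leave memory and the two address registers in the same state; the difference is that the load-store version spends four simple instructions, each of which updates exactly one place.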

RISC design goals

Here are one non-goal and two important goals:

◮ Convenience for humans writing assembly language is not important.
◮ Supporting efficient microarchitecture, especially pipelined processing of instructions, is very important.
◮ Making the microarchitecture reasonably simple and clear is important to writers of optimizing compilers: compilers should be able to schedule instructions efficiently.

MIPS: A classic RISC architecture

MIPS is one of the earliest RISC architectures, and of the RISC architectures that got serious commercial use, perhaps the "purest" RISC ISA.

Notably, almost every MIPS instruction makes an update in only one place:

◮ a register: arithmetic/logical instructions, and loads
◮ memory: stores
◮ special update to program counter: branches and jumps

What common MIPS instruction makes two updates?

The basic CISC versus RISC tradeoff

Suppose a program is compiled for a CISC machine and a RISC machine, assuming similar integrated circuit technology for both processors.

CPU time = IC × CPI × clock period

The RISC machine code will tend to have a larger IC than the CISC machine code, but if the RISC microarchitecture and compiler are good, the CPI for the RISC machine will be much lower than the CPI of the CISC machine.

This turned out to be true for most practical applications, and by 1990 or so, RISC designs dominated the market for "compute-intensive" applications.

But x86-64 rules in 2014 ... What happened?

x86 and its successor x86-64 are most definitely CISC ISAs. However ...

◮ Intel made enormous revenue from sales to PC manufacturers, and spent it well on R and D.
◮ Moore's Law has held from its first expression until at least the present day.
◮ It would have been ridiculous in the 1980s to dedicate hardware to translating CISC instructions to RISC "micro-ops" that are relatively easy to pipeline.
◮ By sometime in the 1990s the chip area needed to translate CISC instructions to RISC micro-ops became small enough to make translation practical.

(Of course, putting the story on one slide is oversimplifying things!)

A surprising benefit of a CISC ISA

IC (instruction count) for a program compiled for CISC tends to be lower than IC for the same program compiled for RISC.

(Also, in the specific case of x86, some commonly-used instructions are smaller than the usual 4-byte size of a RISC instruction.)

Why does this matter, in an era of very large disk capacity and very large DRAM capacity?

The rest of the lecture

This will continue under the document camera, and will cover as much of the following as time permits:

◮ the ISA view of data memory: alignment and endianness
◮ the concept of addressing modes
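Returning to the relation CPU time = IC × CPI × clock period from the tradeoff slide, the following sketch plugs in numbers to show how a larger instruction count can still give a shorter running time when CPI is much lower. All of the figures (instruction counts, CPI values, and the 10 ns clock period) are invented for illustration and are not measurements of any real machine; the function name `cpu_time` is likewise just a label for this example.

```python
def cpu_time(instruction_count, cpi, clock_period_s):
    """CPU time in seconds: IC x CPI x clock period."""
    return instruction_count * cpi * clock_period_s


# Hypothetical machines built with similar circuit technology,
# so both are given the same 10 ns clock period (an assumption).
CLOCK_PERIOD_S = 10e-9

# Hypothetical CISC build of a program: fewer instructions, higher CPI.
cisc_time = cpu_time(1.0e9, 6.0, CLOCK_PERIOD_S)

# Hypothetical RISC build of the same program: 50% more instructions,
# but a much lower CPI.
risc_time = cpu_time(1.5e9, 1.5, CLOCK_PERIOD_S)

if __name__ == "__main__":
    print(f"CISC: {cisc_time:.1f} s")
    print(f"RISC: {risc_time:.1f} s")
    print(f"RISC speedup over CISC: {cisc_time / risc_time:.2f}x")
```

With these made-up inputs the RISC build executes 1.5 times as many instructions but needs a quarter of the cycles per instruction, so its CPU time is lower. Different inputs could easily reverse the result, which is why all three factors have to be considered together.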

Upcoming Topics

◮ more about ISA design
◮ basics of caches

Related reading in Hennessy & Patterson: Sections A.4–A.7, B.1–B.2
