Slides for Lecture 12, ENCM 501: Principles of Computer Architecture

ENCM 501 W14 Slides for Lecture 12

slide 1/19
Slides for Lecture 12
ENCM 501: Principles of Computer Architecture
Winter 2014 Term
Steve Norman, PhD, PEng
Electrical & Computer Engineering
Schulich School of Engineering
University of Calgary
25 February, 2014

slide 2/19
Previous Lecture
◮ more about multi-level caches
◮ classifying cache misses: the 3 C’s
◮ introduction to virtual memory

slide 3/19
Today’s Lecture
◮ Continued explanation of virtual memory.
Related reading in Hennessy & Patterson: Sections B.4–B.5

slide 4/19
Quick review of address translation
[Figure: a virtual address is split into a virtual page number and a page offset. The virtual page number goes through translation to give the physical page number; the page offset is a straight copy (no translation!). The physical page number and the page offset together form the physical address.]
The master list of VPN-to-PPN translations for a single process is maintained by the O/S kernel in a data structure called a page table.
TLBs are circuits capable of doing some of these translations very quickly.

slide 5/19
A couple of questions about address translation (1)
Processes 98 and 99 are running at the same time.
Suppose that 0x7fffff567 is the VPN for a page for process 98’s stack, and the corresponding PPN is 0x13579bd.
Suppose that 0x7fffff567 is also the VPN for a page for process 99’s stack.
What can we conclude about the VPN-to-PPN translation for VPN 0x7fffff567 in process 99?

slide 6/19
A couple of questions about address translation (2)
As on the previous slide, processes 98 and 99 are running at the same time.
Suppose that 0x000000400 is the VPN for a page for process 98’s instructions, and the corresponding PPN is 0x1234567.
Suppose that 0x000000400 is also the VPN for a page for process 99’s instructions.
What can we conclude about the VPN-to-PPN translation for VPN 0x000000400 in process 99?
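The translation reviewed on slide 4 can be sketched in a few lines of Python. This is only an illustration, not how hardware or a kernel implements it. It assumes 4 KB pages (a 12-bit page offset) and models each process's page table as a plain dict from VPN to PPN. The names `split_virtual_address` and `translate` are made up for this sketch, and the PPN used for process 99 is a hypothetical value chosen only to show that each process has its own table.

```python
PAGE_OFFSET_BITS = 12  # assumption: 4 KB pages, so the offset is bits 11-0


def split_virtual_address(va):
    """Split a virtual address into (VPN, page offset)."""
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    return vpn, offset


def translate(page_table, va):
    """Translate a virtual address using one process's page table.

    page_table is a dict mapping VPN -> PPN. A missing VPN raises
    KeyError, which stands in for a fault in this toy model.
    """
    vpn, offset = split_virtual_address(va)
    ppn = page_table[vpn]                       # translation of the VPN
    return (ppn << PAGE_OFFSET_BITS) | offset   # offset is a straight copy


# One page table per process. The process 98 entry comes from slide 5;
# the process 99 PPN is hypothetical.
page_table_98 = {0x7fffff567: 0x13579bd}
page_table_99 = {0x7fffff567: 0x2468ace}

va = (0x7fffff567 << PAGE_OFFSET_BITS) | 0xabc
pa_98 = translate(page_table_98, va)
pa_99 = translate(page_table_99, va)
```

Because the lookup goes through a per-process table, the same virtual address can produce different physical addresses in processes 98 and 99.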

slide 7/19
Linux / Mac OS X virtual address spaces on x86-64
Pointers are 64 bits wide, but only the least significant 48 bits are used in a virtual address.
[Figure: map of byte addresses, highest at the top.]
0xffff ffff ffff ffff down to 0xffff 8000 0000 0000: virtual address space for O/S kernel
between those two regions: HUGE range of invalid addresses
0x0000 7fff ffff ffff down to 0x0000 0000 0000 0000: virtual address space for user processes
(For 64-bit Microsoft Windows, the picture is either identical, or not quite the same but very similar.)

slide 8/19
A page table for an x86-64 Linux process
The normal page size is 4 KB. So bits 11–0 of an address are page offset, and bits 46–12 of a virtual address are VPN (virtual page number).
Conceptually, a page table is just an array of PTEs (page table entries), where the indexes are VPNs:

VPN              entry
0x7fff ffff f    64-bit PTE
0x7fff ffff e    64-bit PTE
...              ...
0x0000 0000 1    64-bit PTE
0x0000 0000 0    64-bit PTE

slide 9/19
Suppose that a page table really is just a big array, as shown on the previous slide.
How much space would such a page table occupy?
The answer to the above question is a totally unreasonable number, so we’ll need to use more complex and much more space-efficient data structures for page tables.
Let’s worry about the data structures later, and continue for a while with the simple model that a page table is just a big array of PTEs.

slide 10/19
What information is in a PTE?
A PTE answers several different questions about a virtual page. Here is an incomplete list:
◮ First, does the virtual page even exist? (For a typical x86-64 Linux process, the vast majority of VPNs in the range from 0x0000 0000 0 to 0x7fff ffff f correspond to non-existent virtual pages.)
◮ If the page exists, is it present in physical memory?
◮ If the page is present, what is the PPN (physical page number)?
◮ What are the permissions for the page: can the process write to the page, and can it fetch instructions from the page?

slide 11/19
PTE formats in x86-64 Linux (1)
First, let’s look at a PTE for a page that does not exist.
I haven’t found documentation to confirm this, but I’m pretty sure that 64 zeros indicate that there is no page corresponding to a VPN:

bit numbers within PTE    contents
63–0                      0 0 0 0 0 ... (all zeros)

slide 12/19
PTE formats in x86-64 Linux (2)
Now let’s look at a PTE for a page that does exist, and is present in physical memory.
How can a page exist but NOT be present in physical memory?
Okay, back to the PTE format for a page that is present ...

bit numbers within PTE    contents
63                        XD
62–52                     unused bits
51–12                     up to 40 bits for PPN
11–9                      unused bits
8–2                       more page status bits
1                         R/W
0                         P (1 for a page that is present)

Let’s make some notes about the P, R/W and XD bits.
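The flat-array question on slide 9 and the PTE layout on slides 11 and 12 can be worked through in Python. This is a sketch, not kernel code. It assumes one 8-byte PTE for every VPN from 0x0000 0000 0 to 0x7fff ffff f, and it uses the usual x86-64 meanings of the flag bits, which the slides name but do not spell out: P = 1 means present, R/W = 1 means writable, XD = 1 means instruction fetch is disabled. Treating an all-zero PTE as "no such page" is the slides' own unconfirmed assumption. The function names are made up for this example.

```python
PTE_BYTES = 8                        # each PTE is 64 bits
NUM_USER_VPNS = 0x7ffffffff + 1      # VPNs 0x0000 0000 0 .. 0x7fff ffff f


def flat_page_table_bytes():
    """Size of a page table stored as one big array of PTEs."""
    return NUM_USER_VPNS * PTE_BYTES


P_BIT = 1 << 0                       # present
RW_BIT = 1 << 1                      # writable when set
XD_BIT = 1 << 63                     # execute-disable when set
PPN_SHIFT = 12                       # PPN occupies bits 51-12
PPN_MASK = (1 << 40) - 1             # up to 40 bits of PPN


def decode_pte(pte):
    """Answer the slide 10 questions for one 64-bit PTE value."""
    if pte == 0:
        # Assumption from slide 11: 64 zeros means the page does not exist.
        return {"exists": False}
    if not pte & P_BIT:
        # Page exists but is not in physical memory; other bits not decoded.
        return {"exists": True, "present": False}
    return {
        "exists": True,
        "present": True,
        "ppn": (pte >> PPN_SHIFT) & PPN_MASK,
        "writable": bool(pte & RW_BIT),
        "executable": not (pte & XD_BIT),
    }


table_bytes = flat_page_table_bytes()                    # 2**38 bytes
stack_pte = XD_BIT | (0x13579bd << PPN_SHIFT) | RW_BIT | P_BIT
stack_info = decode_pte(stack_pte)
```

With 2^35 VPNs and 8 bytes per PTE, the flat array would take 2^38 bytes, which is 256 GiB per process. That is the "totally unreasonable number" that motivates more space-efficient page table structures.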

slide 13/19
PTE formats in x86-64 Linux (3)
And here is a PTE for a page that exists, but is not present in physical memory.

bit numbers within PTE    contents
63–1                      page location on disk, other info about page
0                         P (0 for a page that is not present)

We won’t go into detail about bits 63–1, but if the assumption on slide 11 is correct, they must not all be zero.
Source for information on this slide and slide 12: Bryant, R. E. and O’Hallaron, D. R., Computer Systems: A Programmer’s Perspective, 2nd ed., published by Prentice Hall.

slide 14/19
Review of P3/P4 memory system structure
[Figure: block diagram with these components: CORE, I-TLB, L1 I-CACHE, D-TLB, L1 D-CACHE, UNIFIED L2 CACHE, DRAM CONTROLLER, DRAM MODULES.]
On every instruction fetch, the I-TLB must attempt to translate a virtual instruction address into a physical instruction address.
On every data read or write, the D-TLB must attempt to translate a virtual data address into a physical data address.

slide 15/19
TLB structure
A TLB is essentially a cache for page table information.
A page table is a complete list of the statuses of all of the virtual pages belonging to a process.
A TLB contains some of the most recently accessed information in a page table.

slide 16/19
TLB hits
Let’s outline:
◮ how a TLB hit is detected;
◮ what happens as a result of a TLB hit.

slide 17/19
Simple TLB misses
The simplest form of a TLB miss occurs when there is a valid VPN-to-PPN translation, which is in the page table, but not in the TLB.
Let’s describe how such a TLB miss is handled.

slide 18/19
DRAM, disk storage and flash memory
Here’s a story that is simple, easy to understand, but not actually true ...
◮ Instructions and data belonging to the kernel and to processes are in DRAM.
◮ I-caches and D-caches allow processor cores to access instructions and data much faster than if all such accesses really had to go to DRAM.
◮ Non-volatile storage, such as magnetic disks and flash memory arrays, is used for file storage.
That’s actually a good model to start with, but it’s wrong! What is a more accurate model?
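Returning to the TLB hit and simple TLB miss cases on slides 15 to 17, here is a small Python model of a TLB as a cache of page table information. The slides do not specify the organization, so this is a sketch under stated assumptions: the TLB is fully associative, replacement is least-recently-used, the page table is a dict from VPN to PPN, and a miss is always the simple kind (the translation is valid and is in the page table). The class name `TinyTLB` and its fields are invented for this example.

```python
from collections import OrderedDict


class TinyTLB:
    """Toy fully associative TLB with LRU replacement (both assumed)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # VPN -> PPN, least recently used first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        """Return the PPN for vpn, refilling from page_table on a miss."""
        if vpn in self.entries:
            # TLB hit: the VPN matches a stored entry, so the PPN is
            # available without consulting the page table.
            self.hits += 1
            self.entries.move_to_end(vpn)      # mark as most recently used
            return self.entries[vpn]
        # Simple TLB miss: fetch the translation from the page table.
        self.misses += 1
        ppn = page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
        self.entries[vpn] = ppn
        return ppn


page_table = {1: 10, 2: 20, 3: 30}     # hypothetical VPN -> PPN entries
tlb = TinyTLB(capacity=2)
results = [tlb.lookup(vpn, page_table) for vpn in (1, 1, 2, 3, 1)]
```

In this run the second access to VPN 1 hits, while the last access misses again because VPN 1 was evicted when VPN 3 was brought in.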

slide 19/19
Upcoming Topics
Short-term:
◮ Completion of material on virtual memory.
◮ Simple pipelining.
Related reading in Hennessy & Patterson: Sections B.4–B.5, Appendix C.
Big topics for the second half of the course:
◮ Instruction-level parallelism.
◮ Thread-level parallelism.
Related reading in Hennessy & Patterson: Appendix C, Chapters 3 and 5.
