

SLIDE 1

Administrivia

  • Mini project deadline: today

– Attach the capture of the evaluation run output

  • Guest lecture on Friday

– Algorithmic Verification of Stability of Hybrid Systems by Dr. Pavithra Prabhakar (K-State)

SLIDE 2

Administrivia

  • Project proposal due: 2/27

– Original research

  • Related to real-time embedded systems/CPS

– Building a cyber-physical system (robot)

  • Must include real-time performance evaluation on a selected hardware platform

– Repeating the evaluation of a chosen paper

  • Any one of the suggested papers.

SLIDE 3

Real-Time DRAM Controller

Heechul Yun

SLIDE 4

Multicore for Embedded Systems

  • Benefits of multicore processors

– Lots of sensor data to process
– More performance, less cost
– Save space, weight, power (SWaP)

SLIDE 5

Challenges: Shared Resources

[Figure: unicore (tasks T1, T2 on one core with its own memory hierarchy) vs. multicore (tasks T1..T8 on Cores 1..4 contending in a shared memory hierarchy), with a performance impact]
SLIDE 6

Why is DRAM Important?

  • Why do we need bigger and faster memory?
  • Data intensive computing

– Bigger, more complex applications
– Large amounts of data processing

SLIDE 7

Why is DRAM Important?

  • Parallelism

– Out-of-order core

  • A single core can generate many memory requests

– Multicore

  • Multiple cores share DRAM

– Accelerator

  • GPU

SLIDE 8

Memory Performance Isolation

  • Q. How to guarantee predictable memory performance?

[Figure: Core1..Core4 with LLC partitions Part 1..Part 4, a shared memory controller, and DRAM]
SLIDE 9

Memory System Architecture

[Figure: Cores 0..3, each with a private L2 cache, share an L3 cache and a DRAM interface to the DRAM memory controller and DRAM banks]

This slide is from Prof. Onur Mutlu

SLIDE 10

DRAM Organization

  • Channel
  • Rank
  • Chip
  • Bank
  • Row
  • Column

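To make the channel/rank/chip/bank/row/column hierarchy concrete, the sketch below decodes a physical address into DRAM coordinates. The field widths and bit ordering are hypothetical, chosen only for illustration; real controllers use platform-specific mappings.

```python
def decode(addr, col_bits=10, chan_bits=1, bank_bits=3, rank_bits=1):
    """Split a physical address into DRAM coordinates.
    Field widths and their ordering are illustrative assumptions."""
    col = addr & ((1 << col_bits) - 1)    # lowest bits: column
    addr >>= col_bits
    chan = addr & ((1 << chan_bits) - 1)  # then channel (interleaving)
    addr >>= chan_bits
    bank = addr & ((1 << bank_bits) - 1)  # then bank
    addr >>= bank_bits
    rank = addr & ((1 << rank_bits) - 1)  # then rank
    addr >>= rank_bits
    return {"channel": chan, "rank": rank, "bank": bank,
            "row": addr, "column": col}  # remaining high bits: row

print(decode(0x12345678))
```

Where the bank bits sit in the address decides which addresses collide on the same bank, which matters for the bank-parallelism and private-banking discussion on the later slides.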

SLIDE 11

The DRAM subsystem

[Figure: processor connected over memory channels to DIMMs (dual in-line memory modules)]

This slide is from Prof. Onur Mutlu

SLIDE 12

Breaking down a DIMM

[Figure: side, front, and back views of a DIMM (dual in-line memory module)]

This slide is from Prof. Onur Mutlu

SLIDE 13

Breaking down a DIMM

[Figure: front and back of a DIMM; Rank 0 is the collection of 8 chips on the front, Rank 1 the collection on the back]

This slide is from Prof. Onur Mutlu

SLIDE 14

Rank

[Figure: Rank 0 (front) and Rank 1 (back) share the memory channel's data bus Data<0:63> and address/command bus; chip-select signals CS<0:1> select the rank]

This slide is from Prof. Onur Mutlu

SLIDE 15

Breaking down a Rank

[Figure: Rank 0 consists of Chip 0..Chip 7; chip i drives data bits <8i : 8i+7> of Data<0:63>]

This slide is from Prof. Onur Mutlu

SLIDE 16

Breaking down a Chip

[Figure: Chip 0 contains multiple banks (Bank 0, ...), all sharing the chip's 8-bit data interface <0:7>]

This slide is from Prof. Onur Mutlu

SLIDE 17

Breaking down a Bank

[Figure: Bank 0 is an array of rows (row 0 .. row 16k-1), each 2 kB wide; an activated row is held in the row buffer, from which one column (1 B on this chip) at a time is read out over the 8-bit interface <0:7>]

This slide is from Prof. Onur Mutlu

SLIDE 18

Example: Transferring a cache block

[Figure: a 64 B cache block in the physical memory space (addresses 0x00..0x40) maps to Channel 0, DIMM 0, Rank 0]

This slide is from Prof. Onur Mutlu

SLIDE 19

Example: Transferring a cache block

[Figure: the 64 B cache block is spread across Rank 0's Chip 0..Chip 7, which drive data lines <0:7> .. <56:63>]

This slide is from Prof. Onur Mutlu

SLIDE 20

Example: Transferring a cache block

[Figure: Row 0, Col 0 is addressed in every chip of Rank 0]

This slide is from Prof. Onur Mutlu

SLIDE 21

Example: Transferring a cache block

[Figure: each chip supplies 1 B from Row 0, Col 0, so the rank transfers 8 B of the block in one I/O cycle]

This slide is from Prof. Onur Mutlu

SLIDE 22

Example: Transferring a cache block

[Figure: the next I/O cycle addresses Row 0, Col 1 in every chip]

This slide is from Prof. Onur Mutlu

SLIDE 23

Example: Transferring a cache block

[Figure: Col 1 likewise yields 8 B, the next chunk of the block]

This slide is from Prof. Onur Mutlu

SLIDE 24

Example: Transferring a cache block

[Figure: the transfer proceeds column by column (Row 0, Col 0, Col 1, ...) across all chips of Rank 0]

A 64B cache block takes 8 I/O cycles to transfer. During the process, 8 columns are read sequentially.


This slide is from Prof. Onur Mutlu
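The arithmetic on this slide can be checked directly: eight x8 chips deliver 8 bytes per I/O cycle, so a 64 B cache block needs 8 cycles, one column read per cycle.

```python
# Sanity-check the transfer arithmetic: a rank of eight x8 chips
# delivers 8 bytes per I/O cycle, so a 64 B block takes 8 cycles.
CHIPS_PER_RANK = 8
BITS_PER_CHIP = 8          # x8 DRAM chips
CACHE_BLOCK_B = 64         # cache block size in bytes

bytes_per_cycle = CHIPS_PER_RANK * BITS_PER_CHIP // 8
cycles = CACHE_BLOCK_B // bytes_per_cycle
print(bytes_per_cycle, cycles)   # 8 8
```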

SLIDE 25

DRAM Organization

[Figure: Core1..Core4, shared L3, memory controller (MC), and a DRAM DIMM with Banks 1..4]

  • DRAM has multiple banks
  • Different banks can be accessed in parallel

SLIDE 26

Best-case

[Figure: Core1..Core4 each access a different DRAM bank through the memory controller (fast)]

  • Peak = 10.6 GB/s

– DDR3 1333 MHz
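The 10.6 GB/s peak follows from transfer rate times bus width: DDR3-1333 performs about 1333 million transfers per second on a 64-bit (8-byte) channel.

```python
# Derive the peak bandwidth quoted on the slide.
transfers_per_sec = 1333e6   # DDR3-1333: ~666.7 MHz clock, double data rate
bus_bytes = 8                # 64-bit channel = 8 bytes per transfer
peak = transfers_per_sec * bus_bytes / 1e9
print(peak)                  # 10.664, i.e. the ~10.6 GB/s on the slide
```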

SLIDE 27

Best-case

[Figure: Core1..Core4 each access a different DRAM bank through the memory controller (fast)]

  • Peak = 10.6 GB/s

– DDR3 1333 MHz

  • Out-of-order processors

SLIDE 28

Most-cases

[Figure: the cores' requests interleave unpredictably across Banks 1..4 (a mess)]

  • Performance = ??
SLIDE 29

Worst-case

  • 1-bank bandwidth

– Less than peak bandwidth
– How much?

[Figure: all four cores contend on a single DRAM bank (slow)]

SLIDE 30

DRAM Chip

[Figure: a DRAM chip with Banks 1..4; activate loads a row (Rows 1..5) into Bank 1's row buffer, precharge writes it back, and read/write accesses a column. Example: READ (Bank 1, Row 3, Col 7) returns Col 7 of the buffered row]

  • State-dependent access latency

– Row miss: 19 cycles, Row hit: 9 cycles

(*) PC6400-DDR2 with 5-5-5 (RAS-CAS-CL latency setting)
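The state-dependent latency can be captured in a minimal open-row bank model, a sketch using the hit/miss cycle counts quoted above:

```python
# Minimal open-row bank model: a row hit costs 9 cycles, a row
# miss 19 cycles (the slide's PC6400-DDR2 5-5-5 numbers).
ROW_HIT, ROW_MISS = 9, 19

class Bank:
    def __init__(self):
        self.open_row = None        # row currently in the row buffer

    def access(self, row):
        if row == self.open_row:
            return ROW_HIT          # CAS only
        self.open_row = row         # precharge old row, activate new one
        return ROW_MISS

bank = Bank()
latencies = [bank.access(r) for r in (3, 3, 3, 7)]
print(latencies)   # [19, 9, 9, 19]
```

The same access can thus take roughly twice as long depending on what the previous access left in the row buffer, which is exactly why DRAM latency is state dependent.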

SLIDE 31

DDR3 Timing Parameters


Kim et al., “Bounding Memory Interference Delay in COTS-based Multi-Core Systems,” RTAS’14

SLIDE 32

DRAM Controller

  • Service DRAM requests (from CPU) while obeying timing/resource constraints

– Translate requests to DRAM command sequences
– Timing constraints: e.g., minimum write-to-read delay, activation time, …
– Resource conflicts: bank, bus, channel

  • Maximize performance

– Buffering, reordering, pipelining in scheduling requests

SLIDE 33

DRAM Controller

  • Request queue

– Buffer read/write requests from CPU cores
– Unpredictable queuing delay due to reordering


Bruce Jacob et al, “Memory Systems: Cache, DRAM, Disk” Fig 13.1.

SLIDE 34

Request Reordering

  • Improve row hit ratio and throughput
  • Unpredictable queuing delay

Initial queue (2 row switches):

– Core1: READ Row 1, Col 1
– Core2: READ Row 2, Col 1
– Core1: READ Row 1, Col 2

Reordered queue (1 row switch):

– Core1: READ Row 1, Col 1
– Core1: READ Row 1, Col 2
– Core2: READ Row 2, Col 1
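The row-switch counts in this example can be verified with a few lines; grouping same-row requests (as an FR-FCFS-style scheduler does) is the only scheduling behavior modeled here:

```python
# Count row switches in a request queue, where each entry is the
# row a request targets.
def row_switches(rows):
    return sum(1 for prev, cur in zip(rows, rows[1:]) if prev != cur)

initial = [1, 2, 1]     # rows of: C1(R1,C1), C2(R2,C1), C1(R1,C2)
reordered = [1, 1, 2]   # same-row requests served back to back
print(row_switches(initial), row_switches(reordered))   # 2 1
```

The reordering helps throughput, but Core2's request can now be delayed behind an arbitrary number of Core1 row hits, which is the source of the unpredictable queuing delay.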

SLIDE 35

Row Management Policy

  • Open row

– Keep the row open after an access
– If the next access targets the same row: CAS
– If the next access targets a different row: PRE + ACT + CAS

  • Close row

– Close the row after an access
– Always pay the same (longer) cost: ACT + CAS

  • Adaptive policies

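A rough cost comparison of open-row vs. close-row on two access patterns. The PRE/ACT/CAS cycle costs below are illustrative assumptions, not from the slides, and the first access to a bank is charged as a full row conflict for simplicity:

```python
CAS, PRE, ACT = 9, 5, 5   # assumed cycle costs, for illustration only

def open_row_cost(rows):
    # Keep the row open: CAS on a hit, PRE + ACT + CAS on a conflict.
    total, cur = 0, None
    for r in rows:
        total += CAS if r == cur else PRE + ACT + CAS
        cur = r
    return total

def close_row_cost(rows):
    # Close after every access: every request pays ACT + CAS.
    return len(rows) * (ACT + CAS)

hot = [3, 3, 3, 3, 7]     # good row locality: open-row wins
cold = [1, 2, 3]          # no locality: close-row wins
print(open_row_cost(hot), close_row_cost(hot))     # 65 70
print(open_row_cost(cold), close_row_cost(cold))   # 57 42
```

Close row is slower on average here but every access costs the same, which is why predictability-oriented controllers often prefer it.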

SLIDE 36

Real-Time Memory Controllers

  • Provide guaranteed performance in accessing DRAM


SLIDE 37

Real-Time Memory Controllers

  • Bank grouping

– Each memory request accesses ALL banks

  • Private banking

– Each core has dedicated DRAM banks

  • Scheduling

– Use analysis-friendly scheduling (e.g., round-robin)

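Round-robin scheduling is analysis friendly because a pending request waits at most N_CORES - 1 service slots before its queue's turn. A minimal work-conserving sketch (the four-core setup and request stream are hypothetical):

```python
from collections import deque

N_CORES = 4
queues = {c: deque() for c in range(N_CORES)}   # one request queue per core

# Hypothetical pending requests: (core, request id)
for core, req in [(0, "A"), (1, "B"), (0, "C"), (3, "D")]:
    queues[core].append(req)

order, turn = [], 0
while any(queues.values()):
    if queues[turn]:                  # serve the head of this core's queue
        order.append((turn, queues[turn].popleft()))
    turn = (turn + 1) % N_CORES       # fixed cyclic order
print(order)   # [(0, 'A'), (1, 'B'), (3, 'D'), (0, 'C')]
```

Unlike FR-FCFS, no core's request can be starved by another core's row hits, so per-request delay bounds follow directly from the cycle length.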

SLIDE 38

Real-Time Memory Controllers (RTMC)

  • Predator
  • AMC
  • PRET-MC
  • DcMc
  • MEDUSA
  • Bundling

SLIDE 39

RTMC References

  • “Predator: A Predictable SDRAM Memory Controller.” CODES+ISSS, 2007.

  • “An Analyzable Memory Controller for Hard Real-Time CMPs.” IEEE Embedded Systems Letters, 2009.

  • “PRET DRAM Controller: Bank Privatization for Predictability and Temporal Isolation.” CODES+ISSS, 2011.

  • “A Dual-Criticality Memory Controller (DCmc): Proposal and Evaluation of a Space Case Study.” RTAS, 2015.

  • “Improved DRAM Timing Bounds for Real-Time DRAM Controllers with Read/Write Bundling.” 2016.

  • “A Comprehensive Study of DRAM Controllers in Real-Time Systems.” Danlu Guo, MS Thesis, University of Waterloo, 2016.
