Memory Access Scheduler Matthew Cohen, Alvin Lin 6.884 Complex - PowerPoint PPT Presentation

Memory Access Scheduler
Matthew Cohen, Alvin Lin
6.884 – Complex Digital Systems
May 6th, 2005

Why Use Scheduling?
• Sequential accesses to DRAM are wasteful
• Improve latency and bandwidth of memory requests
• Order requests to take advantage of DRAM characteristics

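DRAM Bank FSM
[State diagram: states Idle and Row Active. Activate Row moves a bank from Idle to Row Active; Reads and Writes occur while the row is active; Bank Precharge returns the bank to Idle.]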
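The bank state machine on this slide can be sketched in a few lines of Python. This is only an illustrative model, not the authors' Bluespec design: the class name `DramBank`, the `open_row` field, and the error handling are assumptions added here, and all DRAM timing is ignored.

```python
from enum import Enum, auto


class BankState(Enum):
    IDLE = auto()
    ROW_ACTIVE = auto()


class DramBank:
    """Two-state model of one DRAM bank (states taken from the slide)."""

    def __init__(self):
        self.state = BankState.IDLE
        self.open_row = None  # hypothetical field: row currently active, if any

    def activate(self, row):
        # Activate Row: Idle -> Row Active
        if self.state is not BankState.IDLE:
            raise RuntimeError("bank must be precharged before activating a row")
        self.state = BankState.ROW_ACTIVE
        self.open_row = row

    def access(self, row):
        # Reads and writes are only legal against the currently active row;
        # the bank stays in Row Active afterwards.
        if self.state is not BankState.ROW_ACTIVE or self.open_row != row:
            raise RuntimeError("requested row is not active")

    def precharge(self):
        # Bank Precharge: Row Active -> Idle
        self.state = BankState.IDLE
        self.open_row = None
```

A request to a different row in the same bank therefore costs a precharge plus an activate before the access, which is the overhead a scheduler tries to hide.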

Memory Access Scheduling
Traditional Scheduling:
Bank 0: Active, R, Precharge, Idle
Bank 1: Idle, Active, R, Precharge, Idle
Bank 2: Idle, Active, R, Precharge, Idle
Bank 3: Idle, Active
Memory Access Scheduling:
Bank 0: Active, R, Idle, Precharge, Idle
Bank 1: Active, R, Idle, Precharge, Idle
Bank 2: Active, R, Idle, Precharge, Idle
Bank 3: Active, Idle, R, Idle, Precharge, Idle
• Avoid data line conflicts (read/write)
• Avoid control line conflicts
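To make the difference between the two timelines concrete, here is a rough cycle-count sketch. The timing parameters (`t_act`, `t_read`, `t_pre`) and the one-command-per-cycle rule are hypothetical values chosen for illustration, not figures from the presentation, and the model only tracks conflicts on the activate commands and the shared data lines.

```python
def sequential_time(n_banks, t_act, t_read, t_pre):
    """Each bank finishes activate, read, and precharge before the next starts."""
    return n_banks * (t_act + t_read + t_pre)


def interleaved_time(n_banks, t_act, t_read, t_pre):
    """Overlap activates across banks; serialize reads on the shared data lines."""
    # Control lines: assume one activate command can be issued per cycle.
    activate_done = [start + t_act for start in range(n_banks)]

    data_free = 0  # first cycle at which the shared data lines are free
    finish = 0
    for done in activate_done:
        read_start = max(done, data_free)  # wait for the row and for the data lines
        data_free = read_start + t_read
        finish = max(finish, data_free + t_pre)  # precharge follows the read
    return finish


# Hypothetical timings: 3-cycle activate, 2-cycle read, 3-cycle precharge.
print(sequential_time(4, 3, 2, 3))   # 32
print(interleaved_time(4, 3, 2, 3))  # 14
```

Under these assumed numbers the interleaved schedule is limited mainly by the data lines, which is why avoiding data and control line conflicts is called out on the slide.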

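High-Level Architecture
[Block diagram: the CPU exchanges instructions with the Instruction Cache Controller and data with the Data Cache Controller; both cache controllers connect to the Memory Scheduler, which connects to the DRAM.]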

Instruction and Data Cache
• Separate I- and D-caches
• Fully parameterizable sizes
• Direct mapped caches
• Write-through, no-write-allocate
• Four words per cache line
Cache line layout: V | Tag | Word 0 | Word 1 | Word 2 | Word 3
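The cache organization listed above (direct mapped, four words per line, write-through, no-write-allocate) can be sketched as follows. This is a behavioral illustration under stated assumptions, not the authors' hardware: 32-bit words (`BYTES_PER_WORD = 4`), a Python dict standing in for main memory, and the class and method names are all hypothetical.

```python
WORDS_PER_LINE = 4  # from the slide
BYTES_PER_WORD = 4  # assumption: 32-bit words
LINE_BYTES = WORDS_PER_LINE * BYTES_PER_WORD


class DirectMappedCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.valid = [False] * num_lines                    # V bit per line
        self.tags = [0] * num_lines                         # Tag per line
        self.data = [[0] * WORDS_PER_LINE for _ in range(num_lines)]

    def split(self, addr):
        """Split a byte address into (tag, line index, word within line)."""
        word = (addr // BYTES_PER_WORD) % WORDS_PER_LINE
        line_addr = addr // LINE_BYTES
        return line_addr // self.num_lines, line_addr % self.num_lines, word

    def load(self, addr, memory):
        """Return (value, hit). A miss fills the whole four-word line."""
        tag, index, word = self.split(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            base = addr - addr % LINE_BYTES
            self.data[index] = [memory.get(base + i * BYTES_PER_WORD, 0)
                                for i in range(WORDS_PER_LINE)]
            self.tags[index] = tag
            self.valid[index] = True
        return self.data[index][word], hit

    def store(self, addr, value, memory):
        tag, index, word = self.split(addr)
        memory[addr] = value  # write-through: memory is always updated
        if self.valid[index] and self.tags[index] == tag:
            self.data[index][word] = value  # keep a cached copy coherent
        # no-write-allocate: a store miss does not bring the line into the cache
```

With write-through and no-write-allocate, only load misses fetch lines from memory, so the pending-request bookkeeping only has to track outstanding reads.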

Incremental Design
• Fully blocking, single word per line
• Fully blocking, four words per line
• Hit under miss
• Miss under miss
• Necessary for full benefits of scheduling

Non-Blocking Cache Architecture
PRB entry layout (four entries shown): BUFTAG | V0 V1 V2 V3 | Tag0 Tag1 Tag2 Tag3
• On cache load miss, add request to Pending Request Buffer (PRB)
• Place µP tag in Tag location, set Valid, issue read request to scheduler with tag = PRB index
• If another read to same line, set tag and valid but no new read request
• On return of data, match tag to PRB line, retrieve µP tag of valid entries, return data to µP

Non-Blocking Cache Architecture
PRB entry layout (four entries shown): BUFTAG | V0 V1 V2 V3 | Tag0 Tag1 Tag2 Tag3
• On cache store request, search PRB
• If already issued read to this line, stall
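The load-miss and store handling described on the two Non-Blocking Cache Architecture slides can be sketched as a small Python model. The field names mirror the slide (BUFTAG, V0..V3, Tag0..Tag3), but the class and method names, the string status codes, and the stall behavior when the PRB is full or a word slot is already taken are assumptions made for illustration.

```python
WORDS_PER_LINE = 4


class PrbEntry:
    def __init__(self):
        self.in_use = False
        self.buftag = None                        # BUFTAG: address of the missing line
        self.valid = [False] * WORDS_PER_LINE     # V0..V3
        self.cpu_tags = [None] * WORDS_PER_LINE   # Tag0..Tag3 (processor tags)


class PendingRequestBuffer:
    def __init__(self, num_entries):
        self.entries = [PrbEntry() for _ in range(num_entries)]

    def _find(self, line_addr):
        for i, entry in enumerate(self.entries):
            if entry.in_use and entry.buftag == line_addr:
                return i
        return None

    def load_miss(self, line_addr, word, cpu_tag):
        """Return (status, prb_index); status is 'issue', 'merge', or 'stall'."""
        idx = self._find(line_addr)
        status = "merge"  # read to this line already outstanding: no new request
        if idx is None:
            free = [i for i, e in enumerate(self.entries) if not e.in_use]
            if not free:
                return "stall", None  # assumption: a full PRB stalls the processor
            idx = free[0]
            self.entries[idx].in_use = True
            self.entries[idx].buftag = line_addr
            status = "issue"  # send a read to the scheduler with tag = idx
        entry = self.entries[idx]
        if entry.valid[word]:
            return "stall", None  # assumption: one pending load per word slot
        entry.valid[word] = True
        entry.cpu_tags[word] = cpu_tag
        return status, idx

    def store_must_stall(self, line_addr):
        # A store stalls if a read to the same line is already outstanding.
        return self._find(line_addr) is not None

    def fill(self, idx, line_data):
        """Data returned for PRB line idx: pair each waiting processor tag with its word."""
        entry = self.entries[idx]
        replies = [(entry.cpu_tags[w], line_data[w])
                   for w in range(WORDS_PER_LINE) if entry.valid[w]]
        self.entries[idx] = PrbEntry()  # free the entry
        return replies
```

Using the PRB index as the request tag lets returned data be matched to its entry without any address comparison, even when the scheduler completes requests out of order.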

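High-Level Architecture
[Block diagram, repeated: the CPU exchanges instructions with the Instruction Cache Controller and data with the Data Cache Controller; both cache controllers connect to the Memory Scheduler, which connects to the DRAM.]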

Scheduler Overview
• Cache misses are sent to the scheduler
• Scheduler is responsible for interfacing with the DRAM
• Requests may be honored out of order

Scheduler Tasks
• Keep waiting buffers of pending memory requests
• Prioritize accesses in waiting buffer
• Respect timing of the DRAM
• Capture data coming back from DRAM
• Keep the DRAM busy!
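One way to picture the first two tasks (keeping waiting buffers and prioritizing among them) is the sketch below. The slides do not state the actual priority rule, so the policy here is an assumption: rotate across banks and, within a bank, prefer the oldest request that targets the already-active row. The `Request` fields and the `Scheduler` class are hypothetical, and DRAM timing and data capture are left out.

```python
from collections import namedtuple

# tag identifies the request (for example a PRB index); bank and row locate it in DRAM.
Request = namedtuple("Request", "tag bank row")


class Scheduler:
    def __init__(self, num_banks=4):
        self.waiting = [[] for _ in range(num_banks)]  # one waiting buffer per bank
        self.open_row = [None] * num_banks             # active row per bank, if any
        self.next_bank = 0                             # round-robin pointer

    def enqueue(self, req):
        self.waiting[req.bank].append(req)

    def pick(self):
        """Return the next request to issue, or None if every buffer is empty."""
        n = len(self.waiting)
        for offset in range(n):
            bank = (self.next_bank + offset) % n
            buf = self.waiting[bank]
            if not buf:
                continue
            # Assumed policy: oldest request to the active row, else oldest overall.
            hits = [r for r in buf if r.row == self.open_row[bank]]
            req = hits[0] if hits else buf[0]
            buf.remove(req)
            self.open_row[bank] = req.row
            self.next_bank = (bank + 1) % n
            return req
        return None
```

For example, with requests to rows 5, 7, and 5 queued for bank 0, this policy serves both row-5 requests before switching to row 7, so requests complete out of order relative to arrival.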

Scheduler RTL Design
[Block diagram: instruction and data requests from the cache controllers enter four waiting buffers (Bank 0 through Bank 3); the waiting buffers feed the DRAM, and returned data goes back to the cache controllers.]

Incremental Design
• Blocking In-Order Scheduler
• FIFOs as Waiting Buffers and In-Order Scheduling
• Real Waiting Buffers and Interleaved Scheduling

Infinite Compile Time
• Scheduler exploded in complexity
• Huge amount of combinational logic
• Memory access scheduling is a difficult problem
• DRAM is not designed to work easily with scheduling

Architectural Exploration
• Change cache size to adjust cache miss percentage
• Change PRB size to allow for scheduling optimization
• Larger sizes should yield better results but higher cost
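The first knob above, cache size versus miss percentage, can be explored with a quick software experiment. The sketch below counts misses for a direct-mapped cache with four-word lines under uniformly random loads. It assumes 32-bit words and treats the 512 addresses as word addresses, and it measures miss counts only, not the access times reported by the authors.

```python
import random

WORDS_PER_LINE = 4
BYTES_PER_WORD = 4  # assumption: 32-bit words


def count_misses(cache_bytes, accesses):
    """Count load misses in a direct-mapped cache of the given size."""
    num_lines = cache_bytes // (WORDS_PER_LINE * BYTES_PER_WORD)
    resident = [None] * num_lines  # line address currently held at each index
    misses = 0
    for word_addr in accesses:
        line_addr = word_addr // WORDS_PER_LINE
        index = line_addr % num_lines
        if resident[index] != line_addr:
            misses += 1
            resident[index] = line_addr  # fill the line on a miss
    return misses


rng = random.Random(0)  # fixed seed so the experiment is repeatable
accesses = [rng.randrange(512) for _ in range(6000)]
results = {size: count_misses(size, accesses) for size in (128, 256, 512)}
for size, misses in results.items():
    print(f"{size:4d} byte cache: {misses} misses out of {len(accesses)} loads")
```

Because the accesses are random and the address range is several times larger than any of these caches, most loads miss even at the largest size, which leaves plenty of outstanding requests for the PRB and scheduler to work with.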

[Figure: Total Time to Make 6000 Random Accesses to 512 Addresses. Y-axis: Time (ns), 0 to 60000. X-axis: PRB Lines (ticks at 1, 10, 100). Series: 128 Byte Cache, 256 Byte Cache, 512 Byte Cache.]

Synthesis Results (Area = 196,117.6 µm²)

Conclusion
• Memory is becoming the bottleneck for computer systems
• In-order memory access is simple in logic but wasteful in performance
• Memory access scheduling is much more efficient in theory, but complex in implementation

Acknowledgements
• 6884-bluespec
• 6884-staff
• group1, for teaching us how to use Vector, even if you didn’t realize it…
