A Comprehensive Review of the Challenges and Opportunities Confronting Cache Memory System Performance
R. Kramer, M. Elmlinger, A. Ramamurthy, S. Timmireddy
Photograph of Intel Xeon processor 7500 series die showing cache memories [1]

That’s one estimate of how much a processor’s core bandwidth requirements exceed the ability of the cache to supply data [2,3]

Challenges Confronting Cache Memory System Performance
Some additional astonishing facts…
That’s the estimated increase in cache memory bandwidth that is required for every 10x increase in processor transistor count.
That’s the estimated impact that cache memory has on the overall computer architecture’s power requirements.
… in cost. Cache memory is said to be the most expensive memory in the overall computer system.
And many of those estimates were predicted over 30 years ago! [2,3,4,5]

Today’s Objectives and Contributions
To take you to the forefront of cache memory research opportunities to improve performance
• Advances in Cache Data Management: Prefetching, Bandwidth Management, Scheduling, and Data Placement (Abhishek Ramamurthy)
• Energy Efficiency Opportunities (Mathias Elmlinger)
• Advanced Topics in Cache Memory Research (Pranav Timmireddy)

Advances in Cache Data Management: Prefetching, Bandwidth Management, and Data Placement
• Why is cache prefetching necessary?
• What is the bottleneck involved in prefetching data from cache memory?
• How can cache memory density be improved?

Sandbox Prefetching Mechanism
• Technique is based on the Bloom filter (Burton H. Bloom, 1970).
Sandbox Prefetch Architecture [14]
Sandbox Prefetch Action on L2 Access [14]
• Sandbox Prefetching (SBP) improves Address Mapped Pattern Matching performance by 3.9% in a multicore environment.
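To make the Bloom filter idea concrete, here is a minimal sketch, not the design from [14]. It assumes a candidate prefetcher is tested "in a sandbox": the addresses it would prefetch are recorded in a Bloom filter instead of being fetched, and later accesses that hit the filter raise the candidate's score. The names `BloomFilter` and `sandbox_score`, the filter size, and the hash choice are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=2048, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def sandbox_score(accesses, offset):
    """Score a hypothetical 'line + offset' prefetcher without issuing prefetches.

    Each access first checks whether the candidate would already have
    prefetched this line, then records the line the candidate would
    prefetch next.
    """
    sandbox = BloomFilter()
    score = 0
    for line in accesses:
        if line in sandbox:
            score += 1
        sandbox.add(line + offset)
    return score
```

For a stream of cache-line addresses with a constant stride of 2, the candidate with `offset=2` should score far higher than `offset=1`, so a real prefetcher could be enabled only for the higher-scoring candidate.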

Increasing Multicore Efficiency through Intelligent Bandwidth Shifting
• Technique provides better efficiency through assigning bandwidth for prefetching based on prefetch efficiency.
Base Bandwidth shifting algorithm [16]
• Improved multicore efficiencies by 7% for random workloads.
Modified Base Bandwidth shifting algorithm [16]
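The general idea of assigning prefetch bandwidth by prefetch efficiency can be sketched as follows. This is an illustration only and not the base or modified algorithm of [16]. It assumes that efficiency means useful prefetches divided by issued prefetches, and that bandwidth is tracked as integer shares per core; `shift_bandwidth`, `step`, and `floor` are hypothetical names.

```python
def shift_bandwidth(shares, useful, issued, step=1, floor=1):
    """Move `step` bandwidth units from the least to the most prefetch-efficient core.

    shares: current prefetch bandwidth units per core
    useful: prefetches per core that were later used
    issued: prefetches per core that were issued
    """
    # Assumed definition: efficiency = useful prefetches / issued prefetches.
    efficiency = [u / i if i else 0.0 for u, i in zip(useful, issued)]
    cores = range(len(shares))
    donor = min(cores, key=lambda c: efficiency[c])
    receiver = max(cores, key=lambda c: efficiency[c])

    new_shares = list(shares)
    # Never push the donor below `floor`; do nothing if all cores are equally efficient.
    moved = min(step, new_shares[donor] - floor)
    if efficiency[donor] < efficiency[receiver] and moved > 0:
        new_shares[donor] -= moved
        new_shares[receiver] += moved
    return new_shares
```

Calling the function once per measurement interval gradually shifts bandwidth toward cores whose prefetches are actually used, while the `floor` keeps every core from being starved.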

Adaptive Placement Policies for Data in Cache Memory Systems
• The technique provides a better way of placing the most frequently used data in cache and evicting the least used data block.
Read range and depth range [17]
2x more storage AND ~60% less area
Area and read/write latency of SRAM and STT-RAM [17]
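As a point of reference for "keep the most frequently used data, evict the least used block", here is a plain least-frequently-used (LFU) sketch. It is a swapped-in textbook policy, not the adaptive SRAM/STT-RAM placement policy of [17]; the class name `FrequencyCache` and its interface are assumptions made for illustration.

```python
class FrequencyCache:
    """Toy fully associative cache that evicts the least frequently used block."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts = {}  # block address -> access count

    def access(self, block):
        """Record an access; return the evicted block, or None if nothing was evicted."""
        if block in self.counts:
            self.counts[block] += 1
            return None
        victim = None
        if len(self.counts) >= self.capacity:
            # Evict the resident block with the fewest recorded accesses.
            victim = min(self.counts, key=self.counts.get)
            del self.counts[victim]
        self.counts[block] = 1
        return victim
```

A real adaptive policy would also weigh where a block is placed (for example, which technology or way), which this sketch deliberately leaves out.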

Energy Efficiency
• Crucial factor: energy efficiency
• Why cache?
    • Large fraction of chip size
    • Estimated: 50% of energy dissipation by cache
• Approaches to improve energy efficiency
    • Software Self-Invalidation (L1) and Data Compression (L2)
    • Exploiting row access locality (DRAM)
    • Improve Error Correcting Codes and Error Detection Codes (L1)
    • Isolation nodes and dynamic memory partitioning techniques (L1/L2)

Energy Efficiency: Software Self-Invalidation and Data Compression
• Invalidation
    • Through request
    • Last-touch load/store instructions
L1 cache memory structure [5]

Energy Efficiency: Software Self-Invalidation and Data Compression
• Invalidation
    • Through request
    • Last-touch load/store instructions
(conceptual) L1 gated-Vdd control [5]
• Reduction of up to 10% in terms of leakage energy

Energy Efficiency: Software Self-Invalidation and Data Compression
• Data compression
    • Less memory space used
    • More memory space can be turned off
L2 gated-Vdd control [5]
• Reduction of up to 25% in terms of leakage energy

Energy Efficiency: Exploiting Row Access Locality
DRAM sub-array (left) and DRAM cell (right) [6]
• Timing to access rows based on amount of charge
• Keep track of charge of recently accessed rows
• Table in main memory controller
• Hit: lower timing parameters

Energy Efficiency: Exploiting Row Access Locality
Effect of initial cell charge on bit line voltage [6]
• Timing to access rows based on amount of charge
• Keep track of charge of recently accessed rows
• Table in main memory controller
• Hit: lower timing parameters
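The table lookup described on this slide can be sketched roughly as below. It is a simplified model and not the ChargeCache hardware of [6]. It assumes a small LRU table of recently accessed row addresses in the memory controller, with a hit selecting reduced timing parameters. The cycle counts and the names `ChargeTable` and `timing_for` are hypothetical, and the sketch omits any expiry of entries as cell charge leaks away.

```python
from collections import OrderedDict

# Hypothetical timing values in memory-controller cycles, for illustration only.
DEFAULT_TIMING = {"tRCD": 11, "tRAS": 28}
REDUCED_TIMING = {"tRCD": 7, "tRAS": 20}


class ChargeTable:
    """LRU table of recently accessed (still highly charged) DRAM rows."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self.rows = OrderedDict()  # row address -> True, oldest entry first

    def timing_for(self, row):
        """Return the timing to use for activating `row`, then record the access."""
        hit = row in self.rows
        if hit:
            self.rows.move_to_end(row)  # refresh recency on a hit
        else:
            self.rows[row] = True
            if len(self.rows) > self.capacity:
                self.rows.popitem(last=False)  # evict the least recently used row
        return REDUCED_TIMING if hit else DEFAULT_TIMING
```

A row that was accessed recently is served with the reduced timings; once it falls out of the table, it is treated as a normal row again.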

Energy Efficiency: Exploiting Row Access Locality
DRAM energy reduction of ChargeCache [6]
• Single-core: 1.8% average (max. 6.9%)
• Eight-core: 7.9% average (max. 14.1%)

Advanced Topics in Cache Memory Research
• STM: Cloning the Spatial and Temporal Memory Access Behavior
• RADAR (Runtime-Assisted Dead Region) Management for Last Level Caches

STM: Spatial and Temporal Cloning
• Transition probability table indexed by stride history pattern is used to capture the spatial locality
• A combination of stack distance profile and stride pattern table
STM Framework [19]
Proxy application versus cloning [19]
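A transition probability table indexed by stride history can be built from an address trace along these lines. This is a minimal sketch and not the STM implementation from [19]. The function name `stride_transition_table`, the default history length of 2, and the dictionary layout are assumptions, and the stack distance profile is not modeled.

```python
from collections import Counter, defaultdict


def stride_transition_table(addresses, history_len=2):
    """Map each stride history (tuple of recent strides) to next-stride probabilities."""
    # Stride = difference between consecutive addresses in the trace.
    strides = [b - a for a, b in zip(addresses, addresses[1:])]

    counts = defaultdict(Counter)
    for i in range(history_len, len(strides)):
        history = tuple(strides[i - history_len:i])
        counts[history][strides[i]] += 1

    # Normalize the counts of each history into probabilities.
    return {
        history: {s: n / sum(nexts.values()) for s, n in nexts.items()}
        for history, nexts in counts.items()
    }
```

A clone generator could then sample the next stride from the row that matches its recent stride history, reproducing the spatial pattern without replaying the original addresses.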

STM – Cont’d
STM on different benchmarks [19]
Original vs clone L1 miss rate across different cache prefetchers and configurations

RADAR: Runtime-Assisted Dead Region Management
• Efficient management of LLCs is essential.
• Existing protocols use either dynamic or static techniques.
• RADAR is a hybrid static/dynamic technique which improves LLC efficiency.
• Look Ahead (LA), Look Back (LB), Conservative combined Scheme (CS = LA ∩ LB), Aggressive combined Scheme (AS = LA ∪ LB).
LLC miss rate for different RADAR schemes [21]
• The Aggressive combined Scheme performs best, with more than a 26% reduction in LLC misses over the baseline LRU.
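The two combined schemes are plain set operations on the regions each predictor marks as dead. The sketch below only illustrates that relationship. The function name `combine_predictions` and the region identifiers are hypothetical, and how LA and LB produce their predictions in [21] is not modeled.

```python
def combine_predictions(look_ahead, look_back):
    """Combine dead-region predictions from the Look Ahead and Look Back schemes.

    CS (conservative): a region is dead only if both schemes agree (LA ∩ LB).
    AS (aggressive): a region is dead if either scheme says so (LA ∪ LB).
    """
    la, lb = set(look_ahead), set(look_back)
    return {"CS": la & lb, "AS": la | lb}
```

Because CS is always a subset of AS, the aggressive scheme can evict more regions early, at the risk of evicting some that are still live.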

Taking Research into Reality
Such opportunities do translate into results. An example: two cache bandwidth Quality of Service concepts called CMT (Cache Monitoring Technology) and CAT (Cache Allocation Technology) took over 10 years to go from research to silicon.
Overview of CMT (left) and CAT (right) [24]
On June 4, 2013, Intel introduced the Xeon “Haswell” 4th generation processor employing both CMT and CAT technologies [29].
… providing as high as a 450% improvement [24].

Conclusion and Future Work
While cache memory system advances continue to be made… these advances are consistently offset by the ever-increasing requirements for multicore processors.
One estimate is that by 2020, multicore processors will reach zettaflop (10^21) speeds [25].
Envisioned optical RAM cache architecture [25]
With such demands, the need for additional breakthroughs in the area of cache memory architectures remains critical.
