

SLIDE 1

CS 6958 LECTURE 12 WRAP-UP CACHES

February 19, 2014


SLIDE 4

Ray Coherence

- Processing coherent rays simultaneously results in data locality
  - Lots of research involving collecting coherent rays
  - More on this later

(Figure: coherent vs. incoherent rays)

SLIDE 5

Many-Core Shared Caches

- Suppose each of these nodes maps to the same cache line (but with a different tag)
- All are processed simultaneously

SLIDE 6

Line Size

- How big should lines be?
  - 1 word (4 bytes): equivalent to a larger RF
  - 64B: typical (but seems pretty small)
  - Why not 512B or 1KB?

SLIDE 7

Line Size

- Number of lines = cache size / line size (quick check in the sketch below)
  - What if there is only 1 line?
  - Data access is usually only contiguous to a certain extent (8 or 16 words at a time?)
- Especially true for tree traversal
  - More lines → lower probability of conflict
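
A quick sanity check of the formula in code (the 32KB capacity matches the config example later in the deck; the line sizes are the ones the previous slide asks about):

```python
def num_lines(cache_bytes, line_bytes):
    """Number of lines = cache size / line size."""
    return cache_bytes // line_bytes

# For a fixed 32KB cache, bigger lines mean fewer lines,
# and fewer lines mean a higher probability of conflict.
for line_bytes in (4, 64, 512, 1024):
    print(f"{line_bytes:5}B lines -> {num_lines(32 * 1024, line_bytes):5} lines")
# 4B -> 8192 lines, 64B -> 512, 512B -> 64, 1024B -> 32
```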

SLIDE 8

Overfill / Underfill

- Overfill
  - Transferring too much data from L1, L2, DRAM
  - Locality only goes so far
  - Wastes a lot of energy, occupies DRAM channels
- Underfill
  - Transferring not enough data from L2, DRAM
  - Doesn't amortize expensive activation overheads
- Getting the right balance is tricky
  - Very rarely do we transfer exactly what we need (see the sketch below)
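
The underfill point is easiest to see with numbers. A back-of-the-envelope sketch, where the energy constants are made-up placeholders rather than measurements from the lecture:

```python
ACTIVATE_ENERGY = 1000.0  # fixed cost per DRAM row activation (placeholder units)
BYTE_ENERGY = 1.0         # cost per byte transferred (placeholder units)

def energy_per_useful_byte(transfer_bytes, useful_bytes):
    """The fixed activation cost must be amortized over the useful bytes."""
    return (ACTIVATE_ENERGY + BYTE_ENERGY * transfer_bytes) / useful_bytes

print(energy_per_useful_byte(8, 8))      # underfill: 126.0 per useful byte
print(energy_per_useful_byte(64, 64))    # balanced:   16.6 per useful byte
print(energy_per_useful_byte(1024, 64))  # overfill:   31.6 (wasted transfer)
```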

SLIDE 9

LOAD Stalls

- Data dependence stalls
  - Variable latency (1 – ??)
  - With --disable-usimm, latency is a function of hit rate
- Resource conflicts
  - Two threads trying to read the same bank

(32 threads)                 1 bank    8 banks
Thread issue rate            30%       69%
Resource conflicts (LOAD)    268M      1M

(32 threads)                 4KB       32KB
Thread issue rate            53%       69%
Data stalls (LOAD)           76M       18M

SLIDE 10

Cache Areas

- Function of capacity and number of banks

SLIDE 11

Caches (config-file)

- L1 / L2 lines in the config file have the format:

  name  latency  capacity (words)  banks  log2(line size in words)

- Example: "L1 1 8192 4 4" is a 32KB cache (8192 words) with a 64B (16-word) line size
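
A minimal parser for that line format, following the field layout above (the 4-byte word size is as stated on the Line Size slide; this helper is illustrative, not part of the simulator):

```python
def parse_cache_line(text, word_bytes=4):
    """Parse 'name latency capacity(words) banks log2(linesize in words)'."""
    name, latency, capacity_words, banks, log2_line_words = text.split()
    return {
        "name": name,
        "latency": int(latency),
        "capacity_bytes": int(capacity_words) * word_bytes,
        "banks": int(banks),
        "line_bytes": (1 << int(log2_line_words)) * word_bytes,
    }

print(parse_cache_line("L1 1 8192 4 4"))
# capacity_bytes = 32768 (32KB), line_bytes = 64 -- matching the example
```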

SLIDE 12

Cache Specifications

- samples/configs/dcacheparams.txt
  - All reasonable cache capacity / num banks / line size configurations
  - Some combinations are not feasible and don't exist
  - Specified in bytes, not words!
- Area and energy estimates using Cacti
  - http://www.hpl.hp.com/research/cacti/

SLIDE 13

L1 Hit Rates

- Diminishing returns?
  - Not exactly

SLIDE 14

Hit Rates

- What's the difference between 98% and 99%?

SLIDE 15

Hit Rates

- What's the difference between 98% and 99%?
  - How many fewer reads make it past the cache? Half
  - The right measure is the reduction in misses, not the raw hit-rate delta:
- 0% → 10% == 10% better (misses: 100% → 90%)
- 70% → 80% == 33% better (misses: 30% → 20%)

SLIDE 16

Hit Rates (L1 + L2)

- What is the difference between:
  - L1: 98% → 99%, vs.
  - L1: 98% + L2: 50%

SLIDE 17

Hit Rates (L1 + L2)

- What is the difference between:
  - L1: 98% → 99%, vs.
  - L1: 98% + L2: 50% (worked check below)
- Which is easier to achieve, in terms of:
  - design
  - area
  - energy
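
One reason the two options are even comparable, as a worked check using the standard two-level hit-rate composition (the formula is implied rather than stated on the slide): an L2 that hits 50% of the time catches half of the L1's 2% misses, so both configurations stop 99% of reads before DRAM.

```python
def combined_hit_rate(h_l1, h_l2):
    """Fraction of accesses served by L1 or L2 before reaching DRAM."""
    return h_l1 + (1.0 - h_l1) * h_l2

print(combined_hit_rate(0.98, 0.50))  # 0.99 -- same as a 99% L1 by itself
```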

SLIDE 18

Cache Statistics

System-wide L1 stats (sum of all TMs):
  L1 accesses:       14232064
  L1 hits:           13630310
  L1 misses:           601754
  L1 bank conflicts:   761313
  L1 stores:            49152
  L1 hit rate:       0.957718
  Hit under miss:      357529

- The reported hit rate doesn't include hits under miss
  (hit + H.U.M. rate = 98.3%)
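
Verifying the quoted rates from the printed counters (pure arithmetic on the numbers above):

```python
accesses, hits, hit_under_miss = 14232064, 13630310, 357529

print(hits / accesses)                     # 0.957718... -> the reported L1 hit rate
print((hits + hit_under_miss) / accesses)  # 0.9828...   -> the 98.3% quoted above
```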

SLIDE 19

L1 à L2 Interaction

¨ For L2 to catch extra misses, they must contain

different lines

¤ L2 much larger: address à line mapping changes

L1 L2

L1 line 0, tag 0 L2 line 0 tag 0 L1 line 0, tag 1 L2 line 4 tag 0
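
A sketch of why the mapping changes, with a 4-line L1 and a 16-line L2 (illustrative sizes, chosen to reproduce the mapping example above):

```python
LINE_BYTES = 64             # illustrative line size
L1_LINES, L2_LINES = 4, 16  # illustrative; the L2 is much larger

def where(addr, num_lines):
    """(line index, tag) for a byte address in a direct-mapped cache."""
    block = addr // LINE_BYTES
    return block % num_lines, block // num_lines

a = 0                      # maps to L1 line 0, tag 0
b = L1_LINES * LINE_BYTES  # conflicts with a in L1 (line 0, tag 1)...
print(where(a, L1_LINES), where(a, L2_LINES))  # (0, 0) (0, 0)
print(where(b, L1_LINES), where(b, L2_LINES))  # (0, 1) (4, 0)  ...but L2 line 4
```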

SLIDE 20

L1 à L2 Interaction

¨ If we must evict green line from L1, it is not

completely thrown away

L1 L2

LOAD

SLIDE 21

L1 à L2 Interaction

¨ Extra line (green) is still saved if needed later ¨ Cache hierarchy almost like extra associativity

L1 L2

SLIDE 22

L1 à L2 Interaction

¨ L2 usually shared by multiple L1s

¤ Non-exclusive ¤ Lines contained in L2 may also be contained in L1

L1_0 L1_1 L2
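
A toy model of the non-exclusive behavior described on these slides (the class and the fill-both-levels policy are illustrative simplifications, not the simulator's actual design; modeling L2 eviction, which is omitted here, is what lets L1 lines exist without an L2 copy, as the last slide notes):

```python
class TwoLevel:
    """Toy non-exclusive hierarchy: a shared L2 behind per-TM L1s."""
    def __init__(self, num_l1s):
        self.l1s = [set() for _ in range(num_l1s)]
        self.l2 = set()

    def load(self, l1_id, line):
        if line not in self.l1s[l1_id]:  # L1 miss
            self.l2.add(line)            # line is now in the shared L2
            self.l1s[l1_id].add(line)    # and in the requesting L1
        return line

    def evict_l1(self, l1_id, line):
        self.l1s[l1_id].discard(line)    # L2 copy survives (non-exclusive)

h = TwoLevel(num_l1s=2)
h.load(0, "green")      # L1_0 fetches; line lands in L1_0 and L2
h.evict_l1(0, "green")  # evicted from L1_0...
print("green" in h.l2)  # True -- ...but not completely thrown away
```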

SLIDE 23

L1 à L2 Interaction

¨ Shared cache interaction gets more intricate

L1_0 L1_1 L2

load

SLIDE 24

L1 à L2 Interaction

¨ L1_1 may benefit from someone else’s fetch

L1_0 L1_1 L2

SLIDE 25

L1 à L2 Interaction

¨ If they disagree, L1_0 keeps its own copy

L1_0 L1_1 L2

load Tag mismatch

SLIDE 26

L1 à L2 Interaction

¨ L2 lines replicated in at least one L1 ¨ L1 lines not necessarily in L2

L1_0 L1_1 L2