SLIDE 1

Caching In Depth

SLIDE 2

Today

  • Quiz
  • Design choices in cache architecture

SLIDE 3

Basic Cache Organization

[Diagram: a column of cache lines, each with Tag | valid | dirty | Data fields]

  • Some number of cache lines, each with:
  • Dirty bit -- does this data match what is in memory?
  • Valid -- does this line mean anything at all?
  • Tag -- the high-order bits of the address
  • Data -- the program’s data
  • Note that the index of the line, combined with the tag, uniquely identifies one cache line’s worth of memory.
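The fields above can be sketched as a tiny software model (a hypothetical Python structure for illustration; real hardware stores these as raw bits, and the 1024-line, 32-byte geometry here is just an assumed example):

```python
from dataclasses import dataclass, field

@dataclass
class CacheLine:
    valid: bool = False   # does this line mean anything at all?
    dirty: bool = False   # does the data differ from what is in memory?
    tag: int = 0          # high-order bits of the cached address
    # one cache line's worth of the program's data (32 bytes assumed)
    data: bytearray = field(default_factory=lambda: bytearray(32))

# A cache is some number of such lines; a line's index in this list,
# combined with its tag, identifies one line-sized block of memory.
cache = [CacheLine() for _ in range(1024)]
```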

SLIDE 4

Cache Geometry Calculations

  • Addresses break down into: tag, index, and offset.
  • How they break down depends on the “cache geometry”:
  • Cache lines = L
  • Cache line size = B
  • Address length = A (32 bits in our case)
  • Index bits = log2(L)
  • Offset bits = log2(B)
  • Tag bits = A - (index bits + offset bits)
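The geometry formulas above can be written as one small helper (a sketch; the function name and 32-bit default are assumptions, not from the slides):

```python
from math import log2

def cache_geometry(lines, line_size, addr_bits=32):
    """Split an address into tag/index/offset widths for a direct-mapped cache."""
    index_bits = int(log2(lines))        # index bits = log2(L)
    offset_bits = int(log2(line_size))   # offset bits = log2(B)
    tag_bits = addr_bits - (index_bits + offset_bits)
    return tag_bits, index_bits, offset_bits
```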

SLIDE 5

Practice

  • 1024 cache lines. 32 bytes per line.
  • Index bits: 10
  • Tag bits: 17
  • Offset bits: 5
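Checking the answers above directly (a quick sketch, assuming the 32-bit addresses used throughout the deck):

```python
from math import log2

lines, line_size, addr_bits = 1024, 32, 32
index_bits = int(log2(lines))                     # 10
offset_bits = int(log2(line_size))                # 5
tag_bits = addr_bits - index_bits - offset_bits   # 17
```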

SLIDE 9

Practice

  • 32KB cache. 64-byte lines.
  • Index bits: 9
  • Offset bits: 6
  • Tag bits: 17
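Here the line count is derived first, from the cache size and line size (a sketch assuming 32-bit addresses):

```python
from math import log2

cache_size, line_size, addr_bits = 32 * 1024, 64, 32
lines = cache_size // line_size                   # 512 lines
index_bits = int(log2(lines))                     # 9
offset_bits = int(log2(line_size))                # 6
tag_bits = addr_bits - index_bits - offset_bits   # 17
```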

SLIDE 13

The basic read algorithm

    {tag, index, offset} = address;
    if (isRead) {
      if (tags[index] == tag) {
        return data[index];
      } else {
        l = chooseLine(...);
        if (l is dirty) {
          WriteBack(l);
        }
        Load address into line l;
        return data[l];
      }
    }

Which line to evict?
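The pseudocode above can be turned into a runnable sketch. This is a hypothetical direct-mapped model (so the `chooseLine` step has exactly one candidate); `memory` is a plain dictionary standing in for the next level of the hierarchy, and the tiny 4-line geometry is assumed for illustration:

```python
LINES, LINE_SIZE = 4, 16          # tiny cache for illustration

tags = [None] * LINES
data = [None] * LINES
dirty = [False] * LINES
memory = {}                       # line-start address -> line contents

def split(addr):
    offset = addr % LINE_SIZE
    index = (addr // LINE_SIZE) % LINES
    tag = addr // (LINE_SIZE * LINES)
    return tag, index, offset

def read(addr):
    tag, index, offset = split(addr)
    if tags[index] == tag:                      # hit
        return data[index]
    # miss: direct mapped, so the only line we can use is this one
    if dirty[index]:                            # write back the old line first
        old_start = (tags[index] * LINES + index) * LINE_SIZE
        memory[old_start] = data[index]
        dirty[index] = False
    data[index] = memory.get(addr - offset, 0)  # load line from memory
    tags[index] = tag
    return data[index]
```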

SLIDE 15

The basic write algorithm

    {tag, index, offset} = address;
    if (isWrite) {
      if (tags[index] == tag) {
        data[index] = data; // Should we just update locally?
        dirty[index] = true;
      } else {
        l = chooseLine(...); // maybe no line?
        if (l is dirty) {
          WriteBack(l);
        }
        if (l exists) {
          data[l] = data;
        }
      }
    }

Where to write data? Should we evict something? What should we evict?

SLIDE 19

Write Design Choices

  • Remember: all decisions are only for this cache. The lower levels of the hierarchy might make different decisions.
  • Where to write data?
  • Write-through -- writes go to this cache and the next lower level of the hierarchy.
  • No-write-through -- writes only affect this level.
  • Should we evict anything?
  • Write-allocate -- bring the modified line into the cache, then modify it.
  • No-write-allocate -- update the cache line where you find it in the hierarchy. Do not bring it “closer”.
  • What to evict?
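These choices can be sketched in one hypothetical write routine. The `WRITE_THROUGH` flag and `write_allocate` parameter are illustrative names (not from the slides), and the direct-mapped model below reuses the same tiny assumed geometry as the read sketch:

```python
WRITE_THROUGH = False             # policy knob for this one cache level

LINES, LINE_SIZE = 4, 16
tags = [None] * LINES
data = [0] * LINES
dirty = [False] * LINES
memory = {}                       # line-start address -> value

def write(addr, value, write_allocate=True):
    offset = addr % LINE_SIZE
    index = (addr // LINE_SIZE) % LINES
    tag = addr // (LINE_SIZE * LINES)
    line_start = addr - offset
    if tags[index] == tag:                    # hit
        data[index] = value
        if WRITE_THROUGH:
            memory[line_start] = value        # also update the next level
        else:
            dirty[index] = True               # defer the write (write-back)
    elif write_allocate:                      # miss: bring the line "closer"
        if dirty[index]:                      # write back the evicted line
            old_start = (tags[index] * LINES + index) * LINE_SIZE
            memory[old_start] = data[index]
        tags[index], data[index] = tag, value
        dirty[index] = not WRITE_THROUGH
        if WRITE_THROUGH:
            memory[line_start] = value
    else:                                     # no-write-allocate: memory only
        memory[line_start] = value
```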

SLIDE 20

Dealing with Interference

  • By bad luck or pathological happenstance, a particular line in the cache may be highly contended.
  • How can we deal with this?

SLIDE 21

Associativity

  • (Set) associativity means providing more than one place for a cache line to live.
  • The level of associativity is the number of possible locations:
  • 2-way set associative
  • 4-way set associative
  • One group of lines corresponds to each index; it is called a “set”.
  • Each line in a set is called a “way”.

SLIDE 22

Associativity

[Diagram: a 2-way set-associative cache with Set 0 through Set 3; each set holds Way 0 and Way 1, and each line has Tag | valid | dirty | Data fields]

SLIDE 23

New Cache Geometry Calculations

  • Addresses break down into: tag, index, and offset.
  • How they break down depends on the “cache geometry”:
  • Cache lines = L
  • Cache line size = B
  • Address length = A (32 bits in our case)
  • Associativity = W
  • Index bits = log2(L/W)
  • Offset bits = log2(B)
  • Tag bits = A - (index bits + offset bits)
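The updated formulas can be sketched as a helper that now takes the associativity into account (function name and 32-bit default are assumptions):

```python
from math import log2

def cache_geometry(lines, line_size, ways, addr_bits=32):
    """Address field widths for a W-way set-associative cache."""
    sets = lines // ways                 # lines are grouped into L/W sets
    index_bits = int(log2(sets))         # index bits = log2(L/W)
    offset_bits = int(log2(line_size))   # offset bits = log2(B)
    tag_bits = addr_bits - (index_bits + offset_bits)
    return tag_bits, index_bits, offset_bits
```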

SLIDE 24

Practice

  • 32KB, 2048 lines, 4-way associative.
  • Line size: 16B
  • Sets: 512
  • Index bits: 9
  • Tag bits: 19
  • Offset bits: 4
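Checking the answers above step by step (a sketch assuming 32-bit addresses):

```python
from math import log2

cache_size, lines, ways, addr_bits = 32 * 1024, 2048, 4, 32
line_size = cache_size // lines                   # 16 bytes per line
sets = lines // ways                              # 512 sets
index_bits = int(log2(sets))                      # 9
offset_bits = int(log2(line_size))                # 4
tag_bits = addr_bits - index_bits - offset_bits   # 19
```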

SLIDE 30

Full Associativity

  • In the limit, a cache can have one, large set. The cache is then fully associative.
  • A one-way associative cache is also called direct mapped.

SLIDE 31

Eviction in Associative Caches

  • We must choose which line in a set to evict if we have associativity.
  • How we make the choice is called the cache eviction policy:
  • Random -- always a choice worth considering. Hard to implement true randomness.
  • Least recently used (LRU) -- evict the line that was last used the longest time ago.
  • Prefer clean -- try to evict clean lines to avoid the write-back.
  • Farthest future use -- evict the line whose next access is farthest in the future. This is provably optimal. It is also difficult to implement.
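An LRU policy for one set can be sketched with an ordered map (a hypothetical `LRUSet` model; `load_line` stands in for a fill from the next level of the hierarchy):

```python
from collections import OrderedDict

class LRUSet:
    """One set of a W-way cache; evicts the least recently used line."""
    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()     # tag -> data, oldest first

    def access(self, tag, load_line):
        if tag in self.lines:              # hit: mark most recently used
            self.lines.move_to_end(tag)
            return self.lines[tag]
        if len(self.lines) >= self.ways:   # set full: evict the LRU line
            self.lines.popitem(last=False)
        self.lines[tag] = load_line(tag)   # fill from the next level
        return self.lines[tag]
```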

SLIDE 32

The Cost of Associativity

  • Increased associativity requires multiple tag checks.
  • N-way associativity requires N parallel comparators.
  • This is expensive in hardware and potentially slow.
  • The fastest way is to use a “content addressable memory” (CAM). CAMs embed comparators in the memory array -- try instantiating one in Xilinx.
  • This limits the associativity of L1 caches to 2-4; L2 caches may go up to 16-way.

SLIDE 33

Increasing Bandwidth

  • A single, standard cache can service only one operation at a time.
  • We would like to have more bandwidth, especially in modern multi-issue processors.
  • There are two choices:
  • Extra ports
  • Banking

SLIDE 34

Extra Ports

  • Pros: Uniformly supports multiple accesses. Any N addresses can be accessed in parallel.
  • Cons: Costly in terms of area.
  • Remember: SRAM size increases quadratically with the number of ports.

SLIDE 35

Banking

  • Multiple, independent caches, each assigned one part of the address space (use some bits of the address).
  • Pros: Efficient in terms of area. Four banks of size N/4 are only a bit bigger than one cache of size N.
  • Cons: Only one access per bank. If you are unlucky, you don’t get the extra bandwidth.
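Bank selection can be sketched as follows (hypothetical parameters; this uses the address bits just above the line offset, so consecutive lines map to different banks):

```python
NUM_BANKS = 4      # assumed bank count
LINE_SIZE = 64     # assumed line size in bytes

def bank_of(addr):
    # Pick a bank from the bits just above the line offset so that
    # consecutive cache lines land in different banks.
    return (addr // LINE_SIZE) % NUM_BANKS

def conflict(addr_a, addr_b):
    # Two accesses can proceed in parallel only if they hit different
    # banks; same bank means one of them must wait.
    return bank_of(addr_a) == bank_of(addr_b)
```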