Summary NUCA is giving us more capacity, but further away 40 - PowerPoint PPT Presentation

Jigsaw: Scalable Software-Defined Caches
Nathan Beckmann and Daniel Sanchez, MIT CSAIL
PACT’13, Edinburgh, Scotland, Sep 11, 2013

Summary ¨ NUCA is giving us more capacity, but further away 40 ¨ Applications have widely libquantum MPKI zeusmp varying cache behavior sphinx3 0 16MB Cache Size ¨ Cache organization should adapt to application ¨ Jigsaw uses physical cache resources as building blocks of virtual caches, or shares

Approach (slide 3)
- Jigsaw uses physical cache resources as building blocks of virtual caches, or shares
[Figure: tiled multicore with cache banks; MPKI (0 to 40) vs. cache size (0 to 16MB) for libquantum, zeusmp, and sphinx3]

Agenda (slide 4)
- Introduction
- Background
  - Goals
  - Existing Approaches
- Jigsaw Design
- Evaluation

Goals (slide 5)
- Make effective use of cache capacity
- Place data for low latency
- Provide capacity isolation for performance
- Have a simple implementation

Existing Approaches: S-NUCA (slide 6)
Spread lines evenly across banks
- High Capacity
- High Latency
- No Isolation
- Simple

Existing Approaches: Partitioning (slide 7)
Isolate regions of cache between applications.
- High Capacity
- High Latency
- Isolation
- Simple
- Jigsaw needs partitioning; uses Vantage to get strong guarantees with no loss in associativity

Existing Approaches: Private (slide 8)
Place lines in local bank
- Low Capacity
- Low Latency
- Isolation
- Complex: LLC directory

Existing Approaches: D-NUCA (slide 9)
Placement, migration, and replication heuristics
- High Capacity
  - But beware of over-replication and restrictive mappings
- Low Latency
  - Don’t fully exploit capacity vs. latency tradeoff
- No Isolation
- Complexity Varies
  - Private-baseline schemes require LLC directory

Existing Approaches: Summary (slide 10)

               S-NUCA   Partitioning   Private   D-NUCA
High Capacity  Yes      Yes            No        Yes
Low Latency    No       No             Yes       Yes
Isolation      No       Yes            Yes       No
Simple         Yes      Yes            No        Depends

Jigsaw (slide 11)
- High Capacity: Any share can take full capacity, no replication
- Low Latency: Shares allocated near cores that use them
- Isolation: Partitions within each bank
- Simple: Low overhead hardware, no LLC directory, software-managed

Agenda (slide 12)
- Introduction
- Background
- Jigsaw Design
  - Operation
  - Monitoring
  - Configuration
- Evaluation

Jigsaw Components (slides 13-14)
[Figure: three components (Operation, Monitoring, Configuration) linked by arrows labeled Accesses, Miss Curves, and Size & Placement]

Operation: Access (slide 16)
- Data → shares, so no LLC coherence required
[Figure: a core issues LD 0x5CA1AB1E; labels show L1I, L1D, L2, TLB, STB, a classifier, and the LLC holding Share 1, Share 2, Share 3, ..., Share N]

Data Classification (slide 17)
- Jigsaw classifies data based on access pattern
  - Thread, Process, Global, and Kernel
    - 6 thread shares
    - 2 process shares
    - 1 global share
    - 1 kernel share
- Data lazily re-classified on TLB miss
  - Similar to R-NUCA but…
    - R-NUCA: Classification → Location
    - Jigsaw: Classification → Share (sized & placed dynamically)
  - Negligible overhead
[Figure label: Operating System]

Operation: Share-bank Translation Buffer (slide 18)
- Gives unique location of the line in the LLC
- Address, Share → Bank, Partition
- 400 bytes; low overhead
- Hash lines proportionally
[Figure: the address (from L1 miss, 0x5CA1AB1E) and the share id (from TLB, 2706) feed the STB (4 entries, associative, exception on miss). A hash H of the address selects one of the matching entry's bank/partition slots (Bank/Part 0 ... Bank/Part 63, with example values 1/3, 3/5, 1/3, 0/8) taken from the share config. Address 0x5CA1AB1E maps to bank 3, partition 5.]
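The lookup on this slide can be sketched in Python under some loud assumptions. The names `STB`, `make_entry`, and `fold_hash` are hypothetical, the XOR-fold hash merely stands in for the unspecified hash H, 64-byte lines are assumed, and a `KeyError` stands in for the exception on an STB miss. The sketch only shows how filling an entry's 64 slots in proportion to each bank/partition's capacity spreads lines proportionally.

```python
LINE_BITS = 6                 # assumption: 64-byte cache lines
SLOT_BITS = 6                 # 64 bank/partition slots per entry, as in the figure
NUM_SLOTS = 1 << SLOT_BITS


def fold_hash(line_addr):
    """Hypothetical stand-in for H: XOR-fold the line address into 6 bits."""
    h = 0
    while line_addr:
        h ^= line_addr & (NUM_SLOTS - 1)
        line_addr >>= SLOT_BITS
    return h


def make_entry(slot_counts):
    """Build the slot array for one share.

    slot_counts maps (bank, partition) -> number of slots; giving a target
    more slots sends it a proportionally larger fraction of the lines.
    """
    slots = []
    for target, count in slot_counts.items():
        slots.extend([target] * count)
    if len(slots) != NUM_SLOTS:
        raise ValueError("slot counts must sum to %d" % NUM_SLOTS)
    return slots


class STB:
    def __init__(self):
        self.entries = {}     # share id -> slot array

    def configure(self, share_id, slot_counts):
        self.entries[share_id] = make_entry(slot_counts)

    def lookup(self, share_id, addr):
        # A missing share raises KeyError, modelling "exception on miss".
        slots = self.entries[share_id]
        return slots[fold_hash(addr >> LINE_BITS)]


stb = STB()
# Hypothetical share 2706: three quarters of its lines in bank 0 / partition 1,
# one quarter in bank 3 / partition 5.
stb.configure(2706, {(0, 1): 48, (3, 5): 16})
bank, partition = stb.lookup(2706, 0x5CA1AB1E)
```

Each address resolves to exactly one (bank, partition) pair, which is why the slide can claim a unique location per line.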

Monitoring (slide 20)
- Software requires miss curves for each share
- Add utility monitors (UMONs) per tile to produce miss curves
- Dynamic sampling to model full LLC at each bank; see paper
[Figure: UMON tag array (Way 0 ... Way N-1) with per-way hit counters (717,543; 117,030; 213,021; 32,103), which yield a miss curve of misses vs. cache size]
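As a rough illustration of how per-way hit counters turn into a miss curve, here is a minimal Python sketch. It assumes the usual utility-monitor convention that counter i counts hits at LRU stack position i, so a cache with w ways would hit on positions 0 to w-1. The function name `miss_curve` and the miss count are hypothetical, and reading the figure's four counter values as ways 0 to 3 in order is also an assumption.

```python
def miss_curve(way_hits, misses):
    """Return curve where curve[w] is the miss count with w ways.

    way_hits[i]: hits observed at LRU stack position i.
    misses: accesses that missed in every monitored way.
    """
    total_accesses = sum(way_hits) + misses
    curve = [total_accesses]          # with zero ways, every access misses
    for hits in way_hits:
        # Adding one more way converts that position's hits from misses.
        curve.append(curve[-1] - hits)
    return curve


# Counter values as they appear in the figure; the miss count is made up.
curve = miss_curve([717543, 117030, 213021, 32103], misses=50000)
```

The resulting list is non-increasing by construction, which is the shape the sizing step expects.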

Configuration (slide 21)
- Software decides share configuration
- Approach: Size → Place
  - Solving independently is simple
  - Sizing is hard, placing is easy
[Figure: a miss curve (misses vs. size) feeds SIZE, whose output feeds PLACE onto the LLC]

Configuration: Sizing (slide 22)
- Partitioning problem: Divide cache capacity of S among P partitions/shares to maximize hits
- Use miss curves to describe partition behavior
- NP-complete in general
- Existing approaches:
  - Hill climbing is fast but gets stuck in local optima
  - UCP Lookahead is good but scales quadratically: O(P × S²)
    (Utility-based Cache Partitioning, Qureshi and Patt, MICRO’06)
- Can we scale Lookahead?

Configuration: Lookahead (slides 23-37)
- UCP Lookahead:
  - Scan miss curves to find allocation that maximizes average cache utility (hits per byte)
[Figure: misses vs. size up to LLC size, with the step of maximum cache utility marked]
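To make the scan concrete, here is a simplified Python sketch of a Lookahead-style allocator. It is not the UCP implementation: the name `lookahead` is hypothetical, capacity is counted in abstract units rather than bytes, no minimum allocation is enforced, and the loop stops once no step yields a gain. Each round it scans every partition's miss curve for the step with the highest hits gained per unit, then grants that whole step.

```python
def lookahead(miss_curves, capacity):
    """miss_curves[p][s] = misses of partition p when given s units."""
    alloc = [0] * len(miss_curves)
    remaining = capacity
    while remaining > 0:
        best = None                       # (utility, partition, step)
        for p, curve in enumerate(miss_curves):
            max_step = min(remaining, len(curve) - 1 - alloc[p])
            for step in range(1, max_step + 1):
                gain = curve[alloc[p]] - curve[alloc[p] + step]
                utility = gain / step     # average hits gained per unit
                if best is None or utility > best[0]:
                    best = (utility, p, step)
        if best is None or best[0] <= 0:
            break                         # nothing left worth allocating
        _, p, step = best
        alloc[p] += step
        remaining -= step
    return alloc


# Partition 0 has a plateau followed by a cliff; partition 1 is convex.
result = lookahead([[10, 10, 10, 2], [10, 7, 5, 4]], capacity=4)
```

Looking several units ahead lets the allocator see past the plateau in the first curve, which a one-unit-at-a-time hill climber would never cross. The nested scan over partitions and step sizes in every round is where the quadratic cost comes from.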

Configuration: Lookahead (slide 40)
- Observation: Lookahead traces the convex hull of the miss curve
[Figure: misses vs. size up to LLC size, with the step of maximum cache utility marked]

Convex Hulls (slide 41)
- The convex hull of a curve is the set containing all lines between any two points on the curve, or “the curve connecting the points along the bottom”
[Figure: two miss curves (misses vs. size up to LLC size) shown with their convex hulls]
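For a miss curve sampled at sizes 0, 1, 2, ..., the "points along the bottom" can be found with a standard monotone-chain scan. The Python sketch below is an illustration and not code from the talk. The name `lower_hull` is hypothetical, and the function returns the sizes whose points lie on the lower hull.

```python
def lower_hull(curve):
    """Return the sizes (indices) on the lower convex hull of a miss curve.

    curve[s] is the miss count at size s; sizes are visited in increasing
    order, so a single left-to-right pass with a stack is enough.
    """
    hull = []
    for x, y in enumerate(curve):
        while len(hull) >= 2:
            ax, ay = hull[-2], curve[hull[-2]]
            bx, by = hull[-1], curve[hull[-1]]
            # Cross product of (b - a) and (p - a); a non-positive value means
            # b lies on or above the segment from a to p, so b is not on the hull.
            cross = (bx - ax) * (y - ay) - (by - ay) * (x - ax)
            if cross > 0:
                break
            hull.pop()
        hull.append(x)
    return hull


plateau_then_cliff = lower_hull([10, 10, 10, 2])
already_convex = lower_hull([10, 7, 5, 4])
```

Every point is pushed once and popped at most once, so the scan runs in time linear in the number of points.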

Configuration: Peekahead (slide 42)
- There are well-known linear-time algorithms to compute convex hulls
- Peekahead algorithm is an exact, linear-time implementation of UCP Lookahead
[Figure: two miss curves (misses vs. size up to LLC size) shown with their convex hulls]

Configuration: Peekahead (slide 43)
- Peekahead computes all convex hulls encountered during allocation in linear time
  - Starting from every possible allocation
  - Up to any remaining cache capacity
[Figure: two miss curves (misses vs. size up to LLC size) shown with their convex hulls]

Configuration: Peekahead (slide 44)
- Knowing the convex hull, each allocation step is O(log P)
  - Convex hulls have decreasing slope → decreasing average cache utility → only consider next point on hull
  - Use max-heap to compare between partitions
[Figure: Best Step?]
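The heap-based step can be sketched in Python as follows. This is a deliberately simplified illustration, not the full Peekahead algorithm: it uses one hull per partition computed from size zero, and when the next hull segment does not fit in the remaining capacity it drops that partition instead of recomputing a hull for the smaller capacity, which the slides suggest the real algorithm handles. The names `lower_hull` and `allocate_on_hulls` are hypothetical.

```python
import heapq


def lower_hull(curve):
    """Sizes on the lower convex hull of curve (monotone-chain scan)."""
    hull = []
    for x, y in enumerate(curve):
        while len(hull) >= 2:
            ax, bx = hull[-2], hull[-1]
            cross = (bx - ax) * (y - curve[ax]) - (curve[bx] - curve[ax]) * (x - ax)
            if cross > 0:
                break
            hull.pop()
        hull.append(x)
    return hull


def allocate_on_hulls(miss_curves, capacity):
    hulls = [lower_hull(c) for c in miss_curves]
    pos = [0] * len(hulls)            # current hull vertex of each partition
    alloc = [0] * len(hulls)
    heap = []                         # entries: (-utility, partition)

    def push_next_segment(p):
        if pos[p] + 1 < len(hulls[p]):
            a, b = hulls[p][pos[p]], hulls[p][pos[p] + 1]
            utility = (miss_curves[p][a] - miss_curves[p][b]) / (b - a)
            # heapq is a min-heap, so negate to pop the largest utility first.
            heapq.heappush(heap, (-utility, p))

    for p in range(len(hulls)):
        push_next_segment(p)

    remaining = capacity
    while heap and remaining > 0:
        neg_utility, p = heapq.heappop(heap)
        if neg_utility >= 0:
            break                     # best remaining step gains nothing
        step = hulls[p][pos[p] + 1] - hulls[p][pos[p]]
        if step > remaining:
            continue                  # simplification: skip, do not re-hull
        alloc[p] += step
        remaining -= step
        pos[p] += 1
        push_next_segment(p)          # only the next hull point is considered
    return alloc


result = allocate_on_hulls([[10, 10, 10, 2], [10, 7, 5, 4]], capacity=4)
```

Because slopes along a hull only get shallower, each partition needs just one candidate in the heap at a time, so every allocation step costs one pop and at most one push.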
