
Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors

ACM IEEE 37th International Symposium on Computer Architecture. Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors. Enric Herrero, José González, Ramon Canal. Universitat


  1. ACM IEEE 37th International Symposium on Computer Architecture. Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors. Enric Herrero¹, José González², Ramon Canal¹. ¹Universitat Politècnica de Catalunya, ²Intel Barcelona.

  2. Outline
     - Motivation
     - Related Work
     - Elastic Cooperative Caching
     - Evaluation
     - Conclusions

  3. Motivation
     - Find the optimal cache organization for tiled microarchitectures.
     - Desired behavior, and how to obtain it:
       - Scalable: avoid centralized structures.
       - Minimize access latency: data placement based on proximity.
       - Minimize inter-thread interference: private cache partitions.
       - Minimize off-chip misses: dynamic cache allocation.

  4. Motivation: Application Taxonomy (extended classification from Qureshi et al. [MICRO'06])
     - Saturating Utility
     - Low Utility
     - Shared High Utility
     - Private High Utility

  5. Related Work
     - Prior proposals:
       - Reactive NUCA [ISCA'09]
       - Adaptive Selective Replication [MICRO'06]
       - Adaptive Shared/Private NUCA [HPCA'07]
     - Limitations noted on the slide:
       - OS-page granularity.
       - Software based.
       - Common shared cache space.
       - Adjusts replication but not the amount of cache per node.
       - Centralized structures.

  6. Elastic Cooperative Caching – Structure (Herrero et al. [PACT’08])
     - Private partition: only the local core can allocate.
     - Shared partition: allocates evicted blocks from all private regions.
     - Every N cycles, the cache is repartitioned based on LRU hits in the shared and private (S&P) partitions.
     - Evicted blocks from the private partition are distributed among nodes.
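
The periodic repartitioning on this slide could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' hardware design: the class name `ElasticCache`, the one-way-per-epoch step, the "more LRU hits wins" comparison, and the floor of one way per partition are all hypothetical. The slide only states that every N cycles the cache is repartitioned based on hits to LRU blocks in the shared and private partitions.

```python
from dataclasses import dataclass


@dataclass
class ElasticCache:
    """Toy model of one node's cache, split by ways into private and shared partitions."""
    total_ways: int = 8
    private_ways: int = 4       # the remaining ways form the shared partition
    private_lru_hits: int = 0   # hits to LRU blocks of the private partition this epoch
    shared_lru_hits: int = 0    # hits to LRU blocks of the shared partition this epoch

    @property
    def shared_ways(self) -> int:
        return self.total_ways - self.private_ways

    def record_lru_hit(self, partition: str) -> None:
        """Count a hit to the least recently used block of a partition."""
        if partition == "private":
            self.private_lru_hits += 1
        elif partition == "shared":
            self.shared_lru_hits += 1
        else:
            raise ValueError(f"unknown partition: {partition!r}")

    def repartition(self) -> None:
        """Run once every N cycles (assumed rule, not taken from the slides).

        A hit to an LRU block suggests the partition is using all of its
        capacity, so the partition with more LRU hits gains one way.
        """
        if self.private_lru_hits > self.shared_lru_hits and self.shared_ways > 1:
            self.private_ways += 1
        elif self.shared_lru_hits > self.private_lru_hits and self.private_ways > 1:
            self.private_ways -= 1
        # Start a fresh measurement epoch.
        self.private_lru_hits = 0
        self.shared_lru_hits = 0
```

Because each node only reads its own counters, a rule of this shape needs no centralized structure, which matches the scalability goal stated in the motivation.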

  7. Elastic Cooperative Caching – Adaptive Spilling
     - ElasticCC opportunity: not only repartition, but also decide which nodes can use shared partitions.

     Type                  Working Set Size  Sharing  Local Reuse  Private Cache Size  Spilling
     Saturating Utility    Small/Medium      H/L      H/L          Small/Medium        No
     Low Utility           Big               Low      Low          Small               No
     Shared High Utility   Big               High     H/L          Small               Yes
     Private High Utility  Big               Low      High         Big                 Yes

     - Spill shared blocks, or blocks from caches with 75% or more private cache space.
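
The spilling rule at the end of this slide can be written as a small predicate. This is a sketch only: the function name `may_spill`, its parameters, and the choice to measure private cache space in ways are assumptions made for illustration. The 75% threshold and the "shared block" condition come from the slide.

```python
def may_spill(block_is_shared: bool, private_ways: int, total_ways: int,
              threshold: float = 0.75) -> bool:
    """Return True if an evicted block may be spilled to other nodes.

    Slide rule: spill shared blocks, or blocks from caches whose private
    partition holds 75% or more of the cache space.
    Assumption: cache space is measured in ways.
    """
    if total_ways <= 0:
        raise ValueError("total_ways must be positive")
    private_fraction = private_ways / total_ways
    return block_is_shared or private_fraction >= threshold
```

For example, under these assumptions a node with 6 of 8 ways private is allowed to spill, while a node with 5 of 8 ways private may only spill shared blocks.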

  8. Elastic Cooperative Caching – Structure
     - Desired behavior:
       - Scalable
       - Minimize access latency
       - Minimize inter-thread interference
       - Minimize off-chip misses
     - Mechanisms shown on the slide:
       - Distributed cache among nodes.
       - Local allocation.
       - Private regions.
       - Cache partitioning.
       - Dynamic cache allocation.
       - Independent local repartitioning units.

  9. Evaluation – Studied Configurations
     - 16 processors
     - Pairs of SPEC OMP’01 benchmarks from each of the previous categories.
     - Configurations:
       - Shared Memory
       - Private Memory
       - Distributed Cooperative Caching (DCC)
       - Adaptive Selective Replication (ASR)
       - Elastic Cooperative Caching
       - ElasticCC + Adaptive Spilling
       - Ideal: Fixed Half Private/Half Shared, 2xL2

  10. Evaluation – Performance & Efficiency: +24% and +12% over ASR

  11. Evaluation – Off-Chip Misses & Reuse: 19% over DCC, 16% over ASR

  12. Evaluation – Cache Behavior
      - Gafort – Low Utility
      - Apsi, Art, Equake – Saturating Utility
      - Ammp – Shared High Utility
      - Swim – Private High Utility

  13. Evaluation – Cache Behavior: Gafort – Low Utility. No reuse, does not benefit from caches.

  14. Evaluation – Cache Behavior: Apsi, Art, Equake – Saturating Utility. Benefits from a given amount of extra cache.

  15. Evaluation – Cache Behavior: Ammp – Shared High Utility. Benefits from shared cache space.

  16. Evaluation – Cache Behavior: Swim – Private High Utility. Always benefits from extra cache.

  17. Evaluation – Temporal Cache Behavior: Gafort-Equake execution, Equake Thread 1

  18. Conclusions
      - Elastic Cooperative Caching
        - Distributed organization
        - Adaptive behavior to application requirements

      Metric             Over DCC  Over ASR
      Performance        +27%      +12%
      Energy-Efficiency  +71%      +24%
      Off-Chip Misses    -19%      -16%

  19. ACM IEEE 37th International Symposium on Computer Architecture. Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors. Enric Herrero¹, José González², Ramon Canal¹. ¹Universitat Politècnica de Catalunya, ²Intel Barcelona. eherrero@ac.upc.edu
