Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors


ACM IEEE 37th International Symposium on Computer Architecture. Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors. Enric Herrero, José González, Ramon Canal. Universitat


slide-1
SLIDE 1

Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors

Enric Herrero¹, José González², Ramon Canal¹

¹Universitat Politècnica de Catalunya ²Intel Barcelona

ACM IEEE 37th International Symposium on Computer Architecture

slide-2
SLIDE 2

Outline

• Motivation
• Related Work
• Elastic Cooperative Caching
• Evaluation
• Conclusions

slide-3
SLIDE 3

Motivation

• Find optimal cache organization for tiled microarchitectures
• Desired behavior
  • Scalable
  • Minimize access latency
  • Minimize inter-thread interference
  • Minimize off-chip misses

Avoid centralized structures. Data placement based on proximity. Private cache partitions. Dynamic cache allocation.

slide-4
SLIDE 4

Motivation

• Application Taxonomy
  • Saturating Utility
  • Low Utility
  • Shared High Utility
  • Private High Utility

Extended classification from Qureshi et al. [MICRO'06]

slide-5
SLIDE 5

Related Work

• Reactive NUCA [ISCA'09]
• Adaptive Selective Replication [MICRO'06]
• Adaptive Shared/Private NUCA [HPCA'07]

OS-page granularity. Software based. Common shared cache space. Adjusts replication but not amount of cache per node. Centralized structures.


slide-6
SLIDE 6

Elastic Cooperative Caching – Structure

Herrero et al. [PACT’08]

• Allocates evicted blocks from all private regions.
• Only local core can allocate.
• Distributes evicted blocks from private partition among nodes.
• Every N cycles repartitions cache based on LRU hits in shared and private (S&P) partitions.
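The periodic repartitioning step on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the `NodeCache` class, the 8-way associativity, the one-way step size, and the rule "grow whichever partition saw more hits on its LRU blocks" are all hypothetical. The slide states only that the cache is repartitioned every N cycles based on LRU hits in the shared and private partitions.

```python
from dataclasses import dataclass

TOTAL_WAYS = 8         # assumed associativity of the local cache (hypothetical)
MIN_PRIVATE_WAYS = 1   # assumed floor so the local core always keeps some space


@dataclass
class NodeCache:
    """Per-node state for a hypothetical local repartitioning unit."""
    private_ways: int = TOTAL_WAYS // 2
    private_lru_hits: int = 0   # hits on LRU blocks of the private partition
    shared_lru_hits: int = 0    # hits on LRU blocks of the shared partition

    @property
    def shared_ways(self) -> int:
        return TOTAL_WAYS - self.private_ways

    def repartition(self) -> None:
        """Meant to be called once every N cycles; moves at most one way."""
        if self.private_lru_hits > self.shared_lru_hits:
            # Private LRU blocks are still reused: the private partition is tight.
            self.private_ways = min(TOTAL_WAYS, self.private_ways + 1)
        elif self.shared_lru_hits > self.private_lru_hits:
            # Shared LRU blocks are reused more: return a way to the shared partition.
            self.private_ways = max(MIN_PRIVATE_WAYS, self.private_ways - 1)
        # Start a fresh measurement interval.
        self.private_lru_hits = 0
        self.shared_lru_hits = 0
```

Because each node keeps its own counters and decides locally, a scheme of this shape needs no centralized structure, which matches the "independent local repartitioning units" goal stated elsewhere in the deck.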

slide-7
SLIDE 7

Elastic Cooperative Caching – Adaptive Spilling

• ElasticCC opportunity: not only repartition but also decide which nodes can use shared partitions.

Type                 | Working Set Size | Sharing | Local Reuse | Private Cache Size | Spilling
Saturating Utility   | Small/Medium     | H/L     | H/L         | Small/Medium       | No
Low Utility          | Big              | Low     | Low         | Small              | No
Shared High Utility  | Big              | High    | H/L         | Small              | Yes
Private High Utility | Big              | Low     | High        | Big                | Yes

Spill shared blocks or blocks from caches with 75% or more private cache space.
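The spilling rule stated on this slide can be written as a single predicate. The sketch below is a minimal illustration: the function name `should_spill`, its parameters, and the use of cache ways to measure private space are assumptions made for the example; only the 75% threshold and the "shared block" condition come from the slide.

```python
SPILL_PRIVATE_FRACTION = 0.75  # threshold stated on the slide


def should_spill(block_is_shared: bool, private_ways: int, total_ways: int) -> bool:
    """Return True if a block evicted from the private partition may be spilled.

    Hypothetical helper: private space is measured in cache ways here,
    which is an assumption of this sketch.
    """
    if total_ways <= 0:
        raise ValueError("total_ways must be positive")
    private_fraction = private_ways / total_ways
    # Spill shared blocks, or any block from a cache that is mostly private.
    return block_is_shared or private_fraction >= SPILL_PRIVATE_FRACTION
```

For example, with an 8-way cache, a non-shared block from a node holding 6 private ways is spilled (6/8 = 0.75), while the same block from a node holding 5 private ways is not.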

slide-8
SLIDE 8

Elastic Cooperative Caching – Structure

Cache Partitioning. Dynamic Cache Allocation. Independent local repartitioning units. Distributed cache among nodes. Local allocation. Private Regions.

• Desired behavior
  • Scalable
  • Minimize access latency
  • Minimize inter-thread interference
  • Minimize off-chip misses

slide-9
SLIDE 9

Evaluation – Studied Configurations

• 16 processors
• Pairs of SPEC OMP’01 benchmarks from each of the previous categories
• Configurations
  • Shared Memory
  • Private Memory
  • Distributed Cooperative Caching (DCC)
  • Adaptive Selective Replication (ASR)
  • Elastic Cooperative Caching
  • ElasticCC + Adaptive Spilling
  • Ideal: Fixed Half Private/Half Shared 2xL2

slide-10
SLIDE 10

Evaluation – Performance & Efficiency

Performance: +12% over ASR
Energy-Efficiency: +24% over ASR

slide-11
SLIDE 11

Evaluation – Off-Chip Misses & Reuse

19% Over DCC 16% Over ASR

slide-12
SLIDE 12

Evaluation – Cache Behavior

• Gafort – Low Utility
• Apsi, Art, Equake – Saturating Utility
• Ammp – Shared High Utility
• Swim – Private High Utility

slide-13
SLIDE 13

Evaluation – Cache Behavior

Gafort – Low Utility: no reuse, does not benefit from caches.

slide-14
SLIDE 14

Evaluation – Cache Behavior

Apsi, Art, Equake – Saturating Utility: benefits from a given amount of extra cache.

slide-15
SLIDE 15

Evaluation – Cache Behavior

Ammp – Shared High Utility: benefits from shared cache space.

slide-16
SLIDE 16

Evaluation – Cache Behavior

Swim – Private High Utility: always benefits from extra cache.

slide-17
SLIDE 17

Evaluation – Temporal Cache Behavior

Gafort-Equake execution, Equake Thread 1

slide-18
SLIDE 18

Conclusions

• Elastic Cooperative Caching
  • Distributed organization
  • Adaptive behavior to application requirements

Performance: +27% over DCC, +12% over ASR
Off-Chip Misses: 19% over DCC, 16% over ASR
Energy-Efficiency: +71% over DCC, +24% over ASR

slide-19
SLIDE 19

Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors

Enric Herrero¹, José González², Ramon Canal¹

¹Universitat Politècnica de Catalunya ²Intel Barcelona

eherrero@ac.upc.edu

ACM IEEE 37th International Symposium on Computer Architecture