Low Power Cache Design Ching-Long Su and Alvin M Despain from - - PDF document




Low Power Cache Design

M.Bilal Paracha Hisham Chowdhury Ali Raza

Acknowledgements

  • Ching-Long Su and Alvin M. Despain, University of Southern California, “Cache Design Trade-offs for Power and Performance Optimization: A Case Study”
  • C.-L. Su and Alvin M. Despain, “Cache Designs for Energy Efficiency”
  • Zhichun Zhu and Xiaodong Zhang, College of William and Mary, “Access-Mode Predictions for Low-Power Cache Design”
  • M. D. Powell, A. Agarwal, T. N. Vijaykumar, B. Falsafi, and K. Roy, “Reducing Set-Associative Cache Energy via Way-Prediction and Selective Direct-Mapping,” MICRO 2001

Today’s talk

  • Abstract
  • Introduction
  • Use of cache in microprocessors
  • Different designs to optimize cache energy and power consumption
  • Design Trade-offs for Power & Performance Optimization
      • Vertical Cache Partitioning
      • Horizontal Cache Partitioning
      • Gray Code Addressing
  • Set-Associative Cache Energy Reduction
      • Way Prediction
      • Selective Direct-Mapping
  • Access Mode Prediction (AMP)
      • Advantages over Way Prediction and Phased Cache
      • Different prediction techniques
  • Evaluation Results
      • Cache Access Times
      • Miss Rates
      • Cache Energy Consumption
  • Conclusion
  • Acknowledgements

Abstract

  • Usage of caches in modern microprocessors
  • Caches are designed for high performance and ignore power consumption
  • Research activities towards low-power cache design

Introduction

  • Caches use 30-60% of processor energy in embedded systems
  • Use of caches in high-performance machines
  • Various designs to optimize energy consumption


Use of cache in microprocessors

  • High-performance products go mobile (notebooks, PDAs, etc.)
  • Caches as temporary storage devices
  • Design of components with low power consumption

Designs to optimize cache energy consumption

Vertical Cache Partitioning

  • Block buffer
  • Block hit/miss
  • Block size
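As a rough illustration of the block buffer idea, the sketch below assumes a single-entry buffer that holds the most recently accessed block, so that a repeated access to that block (a block hit) does not need to touch the cache arrays. The function name `block_buffer_hits`, the single-entry buffer, and the sequential word-address trace are assumptions made for illustration and are not taken from the slides.

```python
def block_buffer_hits(word_addresses, block_size_words):
    """Count accesses served by an assumed single-entry block buffer."""
    buffered_block = None  # block number currently held in the buffer
    hits = 0
    for addr in word_addresses:
        block = addr // block_size_words
        if block == buffered_block:
            hits += 1  # block hit: the cache arrays are not accessed
        else:
            buffered_block = block  # block miss: access the cache, refill the buffer
    return hits


# Hypothetical trace: 16 sequential word addresses.
trace = list(range(16))
for size in (2, 4, 8):
    print(size, block_buffer_hits(trace, size))
```

On this sequential trace a larger block size gives more block hits, which suggests why block size is listed as a design parameter; real traces with less spatial locality would behave differently.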

Horizontal Cache Partitioning

  • Cache segments
  • Cache sub-banks
  • Reduction in cache accesses
  • Hit time, an advantage

Gray Code Addressing

  • Gray code vs. 2’s complement
  • Minimizes bit switches
  • 2’s complement: 31 bits change
  • Gray code: 16 bits change
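The slide does not state which address sequence produces the 31 and 16 figures. One sequence that reproduces them is a sequential sweep over addresses 0 to 16, which is assumed in the sketch below. The helper names `to_gray` and `bit_switches` are illustrative.

```python
def to_gray(n: int) -> int:
    """Return the reflected-binary Gray code of n."""
    return n ^ (n >> 1)


def bit_switches(codes):
    """Count the bits that flip between consecutive codes in a sequence."""
    return sum(bin(a ^ b).count("1") for a, b in zip(codes, codes[1:]))


# Assumed sequence: a sequential sweep over addresses 0..16.
addresses = list(range(17))
binary_switches = bit_switches(addresses)
gray_switches = bit_switches([to_gray(a) for a in addresses])
print(binary_switches, gray_switches)  # prints "31 16"
```

Each step between consecutive Gray codes flips exactly one bit, so 16 steps cost 16 bit switches, while plain binary counting also pays for carry propagation.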

Evaluation Results

  • <dm,2>: a direct-mapped cache with a block size of 2 words
  • <dm,4>: a direct-mapped cache with a block size of 4 words
  • <dm,8>: a direct-mapped cache with a block size of 8 words
  • <2lru,2>: a 2-way set-associative cache with a block size of 2 words
  • <2lru,4>: a 2-way set-associative cache with a block size of 4 words
  • <2lru,8>: a 2-way set-associative cache with a block size of 8 words
  • <4lru,2>: a 4-way set-associative cache with a block size of 2 words
  • <4lru,4>: a 4-way set-associative cache with a block size of 4 words
  • <4lru,8>: a 4-way set-associative cache with a block size of 8 words


Cache Access Time

  • A direct-mapped cache takes less time to access than a set-associative cache
  • Access time for a 1 KB cache: 4.79 ns direct-mapped vs. 7.15 ns set-associative
  • The 2-way set-associative cache is approximately 50% slower than the direct-mapped cache

Energy consumption vs Cache Size

Energy Consumption

Reducing Set Associative Cache Energy Via Way Prediction and Selective Direct mapping

Cache Access Energy Reduction Techniques

  • Energy dissipation in the data array is much larger than in the tag array, so energy optimizations are applied only to the data array
  • Selective Direct-Mapping for D-Caches
  • Way Prediction for I-Caches

Different Design Techniques

a) Conventional Parallel Access


b) Sequential Access

c) Way Prediction

d) Selective Direct Mapping (DM)

Prediction Framework for Selective Direct Mapping (DM)

Access Mode Prediction for Low Power Cache Design

Different cache access modes

Phased Cache:

  • Compares the tag with all the tags in the selected set; only if a tag matches does it access the data
  • Reads all n tags on every access, so a hit costs more energy than a correctly predicted way-prediction access

Access flow: Access the set → Access all n tags → Access the data corresponding to the matching tag


Way Prediction:

  • Accesses only the predicted tag and data
  • Efficient when the hit rate is high
  • Not very efficient on a miss (it has to access the rest of the tag and data elements)

Access flow: Access the set → Way prediction → Access the predicted tag and data sub-arrays in the set → Prediction correct? Yes: proceed. No: compare against the rest of the tag and data arrays.

Access Mode Prediction (AMP)

  • Prediction-based approach
  • Way Prediction is better when the hit rate is very high
  • When the hit rate is low, the Phased Cache approach is preferable
  • Predicts whether a cache access will result in a hit or a miss: if it predicts a hit, Way Prediction is used; otherwise the Phased Cache approach is used
  • The accuracy of the access-mode prediction determines the efficiency of the approach
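The selection rule above can be sketched as a small dispatcher. To keep the example short it uses a placeholder "repeat the last outcome" hit/miss predictor, which is an assumption for illustration only and not one of the predictors evaluated in the talk. The names `Mode`, `choose_mode`, and `amp_modes` are likewise hypothetical.

```python
from enum import Enum


class Mode(Enum):
    WAY_PREDICTION = "way-prediction"
    PHASED = "phased"


def choose_mode(predicted_hit):
    """AMP rule: a predicted hit uses way prediction, a predicted miss uses phased access."""
    return Mode.WAY_PREDICTION if predicted_hit else Mode.PHASED


def amp_modes(outcomes, initial_prediction=True):
    """Pick a mode per access and report the hit/miss prediction accuracy."""
    modes = []
    correct = 0
    predicted_hit = initial_prediction
    for actual_hit in outcomes:
        modes.append(choose_mode(predicted_hit))
        correct += int(predicted_hit == actual_hit)
        predicted_hit = actual_hit  # placeholder predictor: repeat the last outcome
    return modes, correct / len(outcomes)


# Hypothetical hit/miss trace (True = hit).
trace = [True, True, False, False, True]
modes, accuracy = amp_modes(trace)
print([m.value for m in modes], accuracy)
```

Every access where the prediction is wrong runs in the less suitable mode, which is why prediction accuracy drives the efficiency of the scheme.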

Power Consumption:

  • A cache with perfect AMP and perfect Way Prediction achieves the lower bound on power consumption for a conventional set-associative cache
  • For a correctly predicted hit in an n-way way-prediction cache, the energy consumed is E_tag + E_data, compared with n × E_tag + E_data in the phased cache
  • A miss in the way-prediction cache consumes (n + 1) × E_tag + (n + 1) × E_data, compared with (n + 1) × E_tag + E_data in the phased cache
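The per-access figures above can be combined into an average energy per access as a function of hit rate. The sketch below does that arithmetic under two simplifying assumptions that are not from the slides: every hit in the way-prediction cache is predicted correctly, and the values n = 4, E_tag = 1, E_data = 4 are arbitrary units chosen only for illustration.

```python
def way_prediction_energy(hit_rate, n, e_tag, e_data):
    """Average energy per access, assuming every hit is correctly predicted."""
    hit = e_tag + e_data
    miss = (n + 1) * e_tag + (n + 1) * e_data
    return hit_rate * hit + (1 - hit_rate) * miss


def phased_energy(hit_rate, n, e_tag, e_data):
    """Average energy per access for the phased cache."""
    hit = n * e_tag + e_data
    miss = (n + 1) * e_tag + e_data
    return hit_rate * hit + (1 - hit_rate) * miss


def perfect_amp_energy(hit_rate, n, e_tag, e_data):
    """Perfect AMP: way prediction on every hit, phased access on every miss."""
    hit = e_tag + e_data
    miss = (n + 1) * e_tag + e_data
    return hit_rate * hit + (1 - hit_rate) * miss


def break_even_hit_rate(n, e_tag, e_data):
    """Hit rate at which way prediction and phased access cost the same.

    Setting the two averages equal gives
    h * (n - 1) * e_tag = (1 - h) * n * e_data.
    """
    return n * e_data / ((n - 1) * e_tag + n * e_data)


n, e_tag, e_data = 4, 1.0, 4.0  # assumed values, arbitrary units
print(break_even_hit_rate(n, e_tag, e_data))
for h in (0.70, 0.95):
    print(h,
          way_prediction_energy(h, n, e_tag, e_data),
          phased_energy(h, n, e_tag, e_data),
          perfect_amp_energy(h, n, e_tag, e_data))
```

With these assumed values the break-even hit rate is 16/19, roughly 0.84: above it way prediction is cheaper, below it the phased cache is cheaper, and perfect AMP is never worse than either.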

Different Predictors

  • Saturating counter:
      • Similar to the saturating counter for branch prediction used in project 2
      • Maintains a two-bit counter that increments on a cache hit and decrements on a cache miss
  • Two-level adaptive predictor:
      • Adaptive two-level prediction using a global pattern history table (GAg)
          • A K-bit history register records the results of the most recent K accesses
          • For a hit the register records a 1, otherwise a 0
          • The K-bit history indexes a global pattern history table with 2^K entries; each entry is a 2-bit saturating counter
      • Per-address two-level prediction using a global pattern history table (PAg)
          • Each set has its own access history register
          • All history registers index a single pattern history table
  • Correlation predictor
  • Gshare predictor:
      • The XOR of the global access history with the current reference set provides the index into the global pattern history table
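The counter, GAg, and gshare schemes described above can be sketched in a few lines. This is a minimal illustration and not the evaluated implementation: the class names, the initial counter value of 2, and the "predict a hit when the counter is 2 or 3" threshold are assumptions, and the PAg and correlation predictors are omitted.

```python
class TwoBitCounter:
    """2-bit saturating counter: counts up on a hit, down on a miss."""

    def __init__(self, value=2):
        self.value = value  # range 0..3; starting at 2 is an assumption

    def predict_hit(self):
        return self.value >= 2  # assumed threshold

    def update(self, hit):
        if hit:
            self.value = min(3, self.value + 1)
        else:
            self.value = max(0, self.value - 1)


class GAgPredictor:
    """K-bit global history register indexing 2^K two-bit counters."""

    def __init__(self, k):
        self.k = k
        self.history = 0
        self.table = [TwoBitCounter() for _ in range(2 ** k)]

    def _index(self, set_index):
        return self.history  # GAg ignores the referenced set

    def predict_hit(self, set_index=0):
        return self.table[self._index(set_index)].predict_hit()

    def update(self, hit, set_index=0):
        # Train the counter selected by the old history, then shift in the outcome.
        self.table[self._index(set_index)].update(hit)
        self.history = ((self.history << 1) | int(hit)) & ((1 << self.k) - 1)


class GsharePredictor(GAgPredictor):
    """Same table, indexed by global history XOR the referenced set index."""

    def _index(self, set_index):
        return (self.history ^ set_index) & ((1 << self.k) - 1)


def mispredictions(predictor, outcomes, set_index=0):
    """Count wrong hit/miss predictions over a trace of outcomes."""
    wrong = 0
    for hit in outcomes:
        wrong += int(predictor.predict_hit(set_index) != hit)
        predictor.update(hit, set_index)
    return wrong


# Hypothetical trace: strictly alternating hit, miss, hit, miss, ...
trace = [True, False] * 10
print(mispredictions(GAgPredictor(k=2), trace))
print(mispredictions(GsharePredictor(k=2), trace, set_index=3))
```

On an alternating trace a lone two-bit counter keeps oscillating, whereas the history-indexed tables can learn the pattern after the first few accesses, which is the motivation for the two-level schemes.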

Misprediction rate of different predictors

Conclusion

  • Cache designs can be modified to obtain maximum performance and optimal energy consumption
  • Experiments suggest that:
      • Direct-mapped caches (instruction and data) consume less energy with dynamic logic
      • Set-associative caches consume less energy with static logic
  • Circuit-level techniques alone can no longer keep power dissipation at a reasonable level
  • Power must also be reduced at the architectural level, through schemes that reduce on-chip cache power consumption


Questions…???