Low Power Cache Design



  1. Low Power Cache Design
     M. Bilal Paracha, Hisham Chowdhury, Ali Raza

     Acknowledgements
     - Ching-Long Su and Alvin M. Despain, University of Southern California, "Cache Design Trade-offs for Power and Performance Optimization: A Case Study"
     - C.L and Alvin M. Despain, "Cache Designs for Energy and Efficiency"
     - Zhichun Zhu and Xiaodong Zhang, College of William and Mary, "Access Mode Predictions for Low-Power Cache Design"
     - M. D. Powell, A. Agrawal, T. N. Vijaykumar, B. Falsafi, and K. Roy, "Reducing Set-Associative Cache Energy via Selective Direct-Mapping and Way Prediction," MICRO 2001

     Today's talk
     - Abstract
     - Introduction
     - Acknowledgements
     - Use of cache in microprocessors
     - Different designs to optimize cache energy and power consumption
     - Design Trade-offs for Power & Performance Optimization
       - Vertical Cache Partitioning
       - Horizontal Cache Partitioning
       - Gray Code Addressing
     - Set-Associative Cache Energy Reduction
       - Way Prediction
       - Selective Direct-Mapping
     - Access Mode Prediction (AMP)
       - Advantages over Way Prediction and Phased Cache
       - Different prediction techniques
     - Evaluation Results
       - Cache Access Times
       - Miss Rates
       - Cache Energy Consumption
     - Conclusion

     Abstract
     - Usage of caches in modern microprocessors
     - Caches are designed for high performance and ignore power consumption
     - Research activities towards low power cache design

     Introduction
     - Cache uses 30-60% of processor energy in embedded systems
     - Use of caches in high performance machines
     - Various designs to optimize energy consumption

  2. Use of cache in microprocessors
     - High performance products go mobile (notebooks, PDAs, etc.)
     - Caches as temporary storage devices
     - Design of components with low power consumption

     Designs to optimize cache energy consumption

     Vertical Cache Partitioning
     - Block buffer
     - Block hit/miss
     - Block size
     - Reduction of cache accesses

     Horizontal Cache Partitioning
     - Cache segments
     - Cache sub-banks
     - Hit time, an advantage

     Gray Code Addressing
     - Gray code vs. 2's complement
     - Minimizes bit switches
     - 2's complement: 31 bits change
     - Gray code: 16 bits change

     Evaluation Results (cache configurations)
     - <dm,2>: a direct-mapped cache with block size 2 words
     - <dm,4>: a direct-mapped cache with block size 4 words
     - <dm,8>: a direct-mapped cache with block size 8 words
     - <2lru,2>: a 2-way set-associative cache with block size 2 words
     - <2lru,4>: a 2-way set-associative cache with block size 4 words
     - <2lru,8>: a 2-way set-associative cache with block size 8 words
     - <4lru,2>: a 4-way set-associative cache with block size 2 words
     - <4lru,4>: a 4-way set-associative cache with block size 4 words
     - <4lru,8>: a 4-way set-associative cache with block size 8 words
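The Gray code figures on this slide (31 bit changes for 2's complement, 16 for Gray code) can be checked with a short sketch. The slide does not say which address sequence was counted; the code below assumes sequential addresses 0 through 16, which happens to reproduce both numbers. The helper names `to_gray` and `bit_switches` are illustrative only.

```python
def to_gray(n: int) -> int:
    """Return the reflected binary Gray code of n."""
    return n ^ (n >> 1)


def bit_switches(values):
    """Total number of bits that flip between consecutive values."""
    return sum(bin(a ^ b).count("1") for a, b in zip(values, values[1:]))


# Assumption: the slide counts switches over the sequence 0, 1, ..., 16.
addresses = list(range(17))
binary_switches = bit_switches(addresses)
gray_switches = bit_switches([to_gray(a) for a in addresses])
print(binary_switches, gray_switches)
```

Each step in a Gray code sequence flips exactly one bit, so the Gray total equals the number of steps, while a plain binary counter pays extra flips on every carry.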

  3. Cache Access Time
     - A direct-mapped cache takes less time to access than a set-associative cache
     - Access time for a 1 KB cache: 4.79 ns direct-mapped, 7.15 ns set-associative
     - A 2-way set-associative cache is approximately 50% slower than a direct-mapped cache

     Energy Consumption vs. Cache Size (figure)

     Energy Consumption (figure)

     Reducing Set-Associative Cache Energy via Way Prediction and Selective Direct-Mapping

     Cache Access Energy Reduction Techniques
     - Energy dissipation in the data array is much larger than in the tag array, so energy optimizations are applied to the data array only
     - Selective direct-mapping for D-caches
     - Way prediction for I-caches

     Different Design Techniques
     a) Conventional Parallel Access (figure)
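As a quick arithmetic check of the "approximately 50% slower" claim, the two access times quoted for the 1 KB cache can be compared directly. This sketch assumes the 7.15 ns figure refers to the 2-way set-associative configuration, which the slide does not state explicitly.

```python
dm_ns = 4.79  # direct-mapped access time from the slide
sa_ns = 7.15  # set-associative access time (assumed to be the 2-way case)

# Relative slowdown of the set-associative cache over the direct-mapped one.
slowdown = sa_ns / dm_ns - 1.0
print(f"{slowdown:.1%}")
```

The ratio comes out slightly under one half, consistent with the rounded figure on the slide.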

  4. Different Design Techniques (continued)
     b) Sequential Access (figure)
     c) Way Prediction (figure)
     d) Selective Direct-Mapping (DM) (figure)

     Prediction Framework for Selective Direct-Mapping (DM) (figure)

     Access Mode Prediction for Low Power Cache Design

     Different cache access modes
     - Phased cache: compares the tag with all the tags in a particular set, and accesses the data only if a tag matches
     - Consumes energy, not efficient
     - Flow: access the set → access all n tags → access the data corresponding to the matching tag
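The phased access flow described above can be contrasted with conventional parallel access in a minimal sketch that counts tag-array and data-array reads for one set. All names here (`AccessCost`, `parallel_lookup`, `phased_lookup`) are hypothetical, and the model covers only the lookup itself, not the refill after a miss.

```python
from dataclasses import dataclass


@dataclass
class AccessCost:
    tag_reads: int = 0
    data_reads: int = 0


def parallel_lookup(set_tags, set_data, tag):
    """Conventional parallel access: read every tag and every data way."""
    cost = AccessCost(tag_reads=len(set_tags), data_reads=len(set_data))
    for way, stored in enumerate(set_tags):
        if stored == tag:
            return set_data[way], cost
    return None, cost


def phased_lookup(set_tags, set_data, tag):
    """Phased access: read all n tags first, then only the matching data way."""
    cost = AccessCost(tag_reads=len(set_tags))
    for way, stored in enumerate(set_tags):
        if stored == tag:
            cost.data_reads = 1
            return set_data[way], cost
    return None, cost  # miss: no data way is read in this sketch


# Example 4-way set with made-up tags and block contents.
tags = [0x1A, 0x2B, 0x3C, 0x4D]
data = ["blk0", "blk1", "blk2", "blk3"]
value, cost = phased_lookup(tags, data, 0x3C)
print(value, cost)
```

On a hit, the phased lookup reads one data way instead of n, at the price of serializing the tag and data phases.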

  5. Way Prediction
     - Accesses only the predicted tag and data
     - Efficient when the hit rate is high
     - Not very efficient when there is a miss (has to access the rest of the tag and data elements)
     - Flow: access the set → way prediction → access the predicted data and tag sub-array in the set → prediction correct? Yes: proceed. No: compare the rest of the data and tag array

     Access Mode Prediction (AMP)
     - Prediction-based approach
     - Way prediction is better when the hit rate is very high
     - When the hit rate is low, the phased cache approach is preferable
     - Predicts whether a cache access will result in a hit or a miss. If it predicts a hit, way prediction is used; otherwise the phased cache approach is used
     - The accuracy of the access mode prediction determines the efficiency of the approach

     Different Predictors
     - Saturating counter
       - Similar to the saturating counter for branch prediction used in Project 2
       - Maintains a two-bit counter that increments on a cache hit and decrements on a cache miss
     - Two-level adaptive predictor
       - Adaptive two-level branch prediction using a global pattern history table (GAg)
         - A K-bit history register records the results of the most recent K accesses
         - The register records a 1 for a hit, otherwise a 0
         - The K bits index a global pattern history table with 2^K entries; each entry is a 2-bit saturating counter
       - Per-address two-level prediction with a global pattern history table (PAg)
         - Each set has its own access history register
         - All history registers index a single pattern history table
     - Correlation predictor
     - Gshare predictor
       - The XOR of the global access history with the current reference set provides the index into the global pattern history table

     Power Consumption
     - With perfect AMP and perfect way prediction, power consumption reaches the lower bound for a conventional set-associative cache
     - For a predicted hit, the way-prediction cache consumes E_tag + E_data, compared with n × E_tag + E_data in the phased cache
     - A miss in the way-prediction cache consumes (n + 1) × E_tag + (n + 1) × E_data, compared with (n + 1) × E_tag + E_data in the phased cache

     Misprediction rate of different predictors (figure)

     Conclusion
     - Cache designs can be modified to obtain maximum performance and optimal energy consumption
     - Experiments suggest that
       - direct-mapped caches (instruction and data) consume less energy for dynamic logic
       - set-associative caches consume less energy for static logic
     - Circuit-level techniques can no longer keep power dissipation at a reasonable level
     - Power reduction is done at the architectural level, through different schemes for reducing on-chip cache power consumption
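The access mode prediction idea and the energy expressions on this slide can be combined in a minimal sketch: a two-bit saturating counter predicts hit or miss, the prediction selects way-prediction or phased mode, and each access is charged according to the slide's formulas. Several details are assumptions not given in the source: the counter predicts a hit when its state is 2 or 3, it starts at 3, the way prediction itself is taken to be correct on every hit, and the energy values used in the example are arbitrary placeholders.

```python
def way_prediction_energy(n, e_tag, e_data, hit):
    """Energy of one access in way-prediction mode, per the slide's formulas."""
    return e_tag + e_data if hit else (n + 1) * e_tag + (n + 1) * e_data


def phased_energy(n, e_tag, e_data, hit):
    """Energy of one access in phased mode, per the slide's formulas."""
    return n * e_tag + e_data if hit else (n + 1) * e_tag + e_data


class SaturatingCounter:
    """Two-bit counter: increments on a cache hit, decrements on a miss."""

    def __init__(self, state=3):  # assumed initial state
        self.state = state

    def predict_hit(self):
        return self.state >= 2  # assumed threshold

    def update(self, hit):
        self.state = min(3, self.state + 1) if hit else max(0, self.state - 1)


def amp_energy(outcomes, n, e_tag, e_data):
    """Total energy when the predicted access mode is chosen per access."""
    counter = SaturatingCounter()
    total = 0.0
    for hit in outcomes:
        if counter.predict_hit():
            total += way_prediction_energy(n, e_tag, e_data, hit)
        else:
            total += phased_energy(n, e_tag, e_data, hit)
        counter.update(hit)
    return total


# Placeholder trace and energy units for a 4-way cache.
trace = [True, False, False, False, True]
print(amp_energy(trace, n=4, e_tag=1, e_data=10))
```

Because a mispredicted access pays the cost of the wrong mode, the total depends heavily on predictor accuracy, which is the point the slide makes about AMP efficiency.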

  6. Questions?
