
Green-CM: Energy efficient contention management for Transactional Memory

Green-CM: Energy efficient contention management for Transactional Memory. Shady Alaa, Paolo Romano - INESC-ID/IST; Mats Brorsson - KTH. Agenda: Introduction, Related work, Architecture, Green-CM, Evaluation, Conclusion. ICPP


  1. Green-CM: Energy efficient contention management for Transactional Memory • Shady Alaa, Paolo Romano – INESC-ID/IST • Mats Brorsson – KTH

  2. Agenda • Introduction • Related work • Architecture • Green-CM • Evaluation • Conclusion ICPP 2015 - Green-CM 2

  3. Introduction • Multicores are everywhere – Complex programming • Locks • Deadlocks – Transactional memory • Atomic blocks • Transparent to the programmer • Example: atomic{ if(bal>amount) withdraw(amount); } [Figure: Core 1 to Core 4 sharing main memory]
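The atomic block on this slide can be approximated in plain Python to show what "abort and retry" means in practice. This is a minimal sketch only: the `Account` class, its version counter, and `atomic_withdraw` are hypothetical names invented for illustration, and the optimistic validate-then-commit scheme is a generic stand-in, not the mechanism used by Green-CM or by any particular TM library.

```python
import threading


class Account:
    """Toy account with a version counter (hypothetical, for illustration only)."""

    def __init__(self, balance):
        self.balance = balance
        self.version = 0
        self._commit_lock = threading.Lock()  # guards only the commit step

    def atomic_withdraw(self, amount):
        """Emulates: atomic { if (bal > amount) withdraw(amount); }

        Returns (withdrawn, retries).
        """
        retries = 0
        while True:
            # Read phase: take a snapshot without holding the lock.
            seen_version = self.version
            seen_balance = self.balance
            withdrawn = seen_balance > amount
            new_balance = seen_balance - amount if withdrawn else seen_balance
            # Commit phase: validate the snapshot, then publish the write.
            with self._commit_lock:
                if self.version == seen_version:
                    if withdrawn:
                        self.balance = new_balance
                        self.version += 1
                    return withdrawn, retries
            # Another thread committed in between: abort and retry.
            retries += 1
```

The `retries` count returned here is the kind of information a contention manager would use to decide how long an aborted transaction should wait before restarting.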

  4. Introduction • Energy efficiency – First-order design concern – Battery-powered devices – Data centers • Goal – Transactional memory that is efficient in terms of both energy and performance

  5. Introduction • Contention manager – Minimizes contention – Decides which transaction to abort – Decides when to restart an aborted transaction • Energy efficiency: – Wait implementation – DVFS (dynamic voltage and frequency scaling)

  6. Related work • Little work in the literature – Mainly HTM • Clock gating processors upon abort – Lowering frequency upon abort • Using simulators • Studies – HTM consumes less energy • Does not fit all workloads – Need for adaptability • Using DVFS in TM – Fastlane • Designed for a low number of threads

  7. Architecture [Diagram: a Controller monitors throughput and energy; it tunes α, Τ for the Hybrid Wait Implementation and β for the Asymmetric Contention Manager. On a Tx abort, the Contention Manager receives the no. of retries and the core on which the tx is executing and produces a back-off duration; the Wait Implementation performs the back-off; at the end of the back-off the Tx restarts.]

  8. Architecture [Architecture diagram, as on slide 7]

  9. Implementing waits • Building block for contention managers • Drastic effect on energy consumption • Can be implemented in two ways: – Busy waiting – Sleeping

  10. Implementing waits • Busy waiting – Fine granularity – Similar to actual work • Sleeping – Coarse granularity – Low energy consumption – Expensive

  11. Implementing waits • Hybrid approach – Either busy wait or sleep • Adaptive fashion – How to determine the threshold • Cost of sleep
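The hybrid approach above can be sketched as a single function that spins for short back-offs and sleeps for long ones. This is an illustrative sketch, not the authors' implementation: the name `hybrid_wait`, the nanosecond units, and the way the threshold is passed in are all assumptions.

```python
import time


def hybrid_wait(duration_ns, threshold_ns):
    """Wait for duration_ns; busy-wait below the threshold, sleep otherwise.

    Returns the mode that was used ("busy" or "sleep").
    Units and names are assumptions made for this sketch.
    """
    if duration_ns < threshold_ns:
        # Fine-grained wait: spin on the clock until the deadline passes.
        deadline = time.perf_counter_ns() + duration_ns
        while time.perf_counter_ns() < deadline:
            pass
        return "busy"
    # Coarse-grained wait: hand the core back to the OS.
    time.sleep(duration_ns / 1e9)
    return "sleep"
```

With a fixed threshold, the split between the two modes is static; the deck's point is that the best threshold depends on the workload, which is why it is tuned at run time.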

  12. Implementing waits • No one size fits all [Figure: Static Thresholds. EDP / best EDP versus threshold (axis ticks from 100 to 1x10^7) for Intruder and Kmeans]

  13. Architecture [Architecture diagram, as on slide 7]

  14. Asymmetric CM • DVFS – Variable operating frequency • Exploiting DVFS – Boosting active threads – Reducing freq. of backing off threads • Enabling DVFS – Manual control is expensive – How to favor automatic boosting [Figure: P-states. P0 3.0 GHz, P1 2.4 GHz, P2 2.2 GHz, P3 2.0 GHz, P4 1.8 GHz, P5 1.6 GHz, P6 1.4 GHz]

  15. Asymmetric CM • Linear backoff cores: – Shorter backoff periods – Mainly busy waiting • Exp. backoff cores: – Longer backoff periods – Mainly sleep waiting • Favor boosting – When enough cores are in sleep states [Figure: 8 core processor. Two boosted cores use linear backoff and busy waiting; the other six cores use exp. backoff and sleep]
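The asymmetric policy can be sketched as a back-off function that grows linearly on a few designated cores and exponentially on the rest. Everything concrete here is an assumption made for illustration: the function names, the base and cap values, and the rule that the lowest-numbered cores are the boosted ones.

```python
def is_boosted(core_id, num_boosted):
    """Assumed rule: the first num_boosted cores are the linear-backoff cores."""
    return core_id < num_boosted


def backoff_duration_ns(retries, core_id, num_boosted,
                        base_ns=1_000, cap_ns=10_000_000):
    """Back-off length after `retries` aborts on the given core.

    Linear growth on boosted cores (shorter waits), exponential growth on
    the others (longer waits). base_ns and cap_ns are illustrative values.
    """
    if is_boosted(core_id, num_boosted):
        duration = base_ns * retries
    else:
        duration = base_ns * (2 ** retries)
    return min(duration, cap_ns)
```

For example, after 3 retries with two boosted cores, core 0 waits 3,000 ns while core 5 waits 8,000 ns. Combined with a busy-wait/sleep threshold, the short linear waits tend to be spun and the long exponential waits tend to be slept, which matches the split shown on the slide.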

  16. Asymmetric CM • Increased contention? – Cores not backing off exponentially • Control number of cores to be boosted

  17. Asymmetric CM [Figure: Static No. of Boosted Threads. EDP / best EDP versus no. of boosted threads (up to 16) for Intruder, Genome, STM7, Kmeans and Memcached]

  18. Architecture [Architecture diagram, as on slide 7]

  19. Controller • Online, lightweight • Hill climbing • Challenges: – Collection of energy measurements – Multi-dimensional • Different exploration strategies – Stabilization – Random jumps

  20. Controller • Tuning α (threshold for hybrid) [Figure: EDP / best EDP per benchmark (Intruder, Kmeans, Memcached, STM7, Average) for the strategies no stab, stab, stab jmp 1 and stab jmp 10]

  21. Controller • Tuning β (no. of boosted threads) [Figure: EDP / best EDP per benchmark (Intruder, Kmeans, Memcached, STM7, Average) for the strategies no stab, stab, stab jmp 1 and stab jmp 10]

  22. Controller • Merging the learners [Figure: Coupling the Tuners. EDP / best EDP per benchmark (Intruder, Kmeans, Memcached, STM7, Average) for independent, bidim and paired combinations of the stab, stab jmp 1 and stab jmp 10 strategies]

  23. Evaluation [Figure: Intruder. EDP-GreenCM / EDP versus number of threads (4, 8, 16, 32, 48, 64)]
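The ratio plotted in the evaluation figures can be reproduced with a few lines, assuming EDP denotes the usual energy-delay product (energy multiplied by execution time). The function names and the sample numbers below are made up for illustration and are not results from the presentation.

```python
def edp(energy_joules, time_seconds):
    """Energy-delay product: lower is better."""
    return energy_joules * time_seconds


def normalized_edp(green, baseline):
    """EDP of one run divided by the EDP of a baseline run.

    Each argument is an (energy_joules, time_seconds) pair.
    A value below 1 means the first run is more efficient.
    """
    return edp(*green) / edp(*baseline)


# Hypothetical runs: 80 J in 9 s versus a baseline of 100 J in 10 s.
ratio = normalized_edp((80.0, 9.0), (100.0, 10.0))
print(f"normalized EDP = {ratio:.2f}")
```

Because EDP is a product, the normalized value factors into an energy ratio times a time ratio (0.8 × 0.9 in the made-up example), so a gain can come from either component.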

  24. Evaluation [Figure: STM7. EDP-GreenCM / EDP versus number of threads (4, 8, 16, 32, 48, 64)]

  25. Evaluation [Figure: Memcached. EDP-GreenCM / EDP versus number of threads (4, 8, 16, 32, 48, 64)]

  26. Evaluation [Figure: Intruder, 64 threads. % of total cores in each P-state (p0 to p6) for spin, no-asym and asym]

  27. Conclusion • Implementation of waits has a significant impact on energy efficiency • Experimental results (obtained on a real system) contradict previously published ones based on simulation • Exploiting DVFS enhances energy efficiency • Self-tuning is needed to adapt to different workloads

  28. THANK YOU

  29. Evaluation [Figure: Intruder. Energy-GreenCM / Energy versus number of threads (4, 8, 16, 32, 48, 64)]

  30. Evaluation [Figure: Intruder. Time-GreenCM / Time versus number of threads (4, 8, 16, 32, 48, 64)]
