Green-CM: Energy efficient contention management for Transactional Memory
Shady Alaa, Paolo Romano - INESC-ID/IST
Mats Brorsson - KTH
Agenda
- Introduction
- Related work
- Architecture
- Green-CM
- Evaluation
- Conclusion
ICPP 2015 - Green-CM 2
Introduction
- Multicores are everywhere
– Complex programming
  - Locks
  - Deadlocks
– Transactional memory
  - Atomic blocks
  - Transparent to the programmer
[Figure: four cores (Core 1 to Core 4) sharing main memory]
atomic{ if(bal>amount) withdraw(amount); }
Introduction
- Energy efficiency
– First-order design choice
– Battery-based devices
– Data centers
- Goal
– Transactional memory that is efficient in terms of both energy and performance
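The plots later in the deck report EDP, which combines both goals in one number. A minimal sketch, assuming the usual definition of the energy-delay product (energy multiplied by execution time) and normalization against the best configuration, as in the "EDP / best EDP" axes. The function names and the measurements are hypothetical, not taken from the talk.

```python
def edp(energy_joules, time_seconds):
    """Energy-delay product: lower is better on both axes."""
    return energy_joules * time_seconds


def normalized_edp(edps):
    """Divide each configuration's EDP by the best (smallest) EDP."""
    best = min(edps.values())
    return {name: value / best for name, value in edps.items()}


# Made-up measurements, for illustration only: (energy in J, time in s).
runs = {"config_a": (120.0, 2.0), "config_b": (100.0, 3.0)}
scores = normalized_edp({name: edp(e, t) for name, (e, t) in runs.items()})
```

Here `config_b` uses less energy but runs longer, so its EDP is worse than that of `config_a`.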
Introduction
- Contention Manager
– Minimize contention
– Which transaction to abort
– When to restart an aborted transaction
- Energy efficiency:
– Wait implementation
– DVFS (dynamic voltage and frequency scaling)
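The "when to restart" decision is commonly made with a backoff policy. A minimal sketch of a generic retry loop with randomized exponential backoff; this is an illustrative assumption, not Green-CM's actual policy, and `Abort`, `run_with_backoff`, and the default parameters are hypothetical names and values.

```python
import random
import time


class Abort(Exception):
    """Raised by a (hypothetical) transaction body when it must abort."""


def run_with_backoff(tx_body, base_delay=1e-6, max_retries=10, wait=time.sleep):
    """Retry tx_body until it commits, waiting longer after each abort."""
    for retries in range(max_retries + 1):
        try:
            return tx_body(), retries
        except Abort:
            # Randomized exponential backoff: the window doubles per abort.
            wait(random.uniform(0, base_delay * 2 ** retries))
    raise RuntimeError("transaction did not commit")
```

The `wait` parameter is where the energy question enters: the same delay can be served by spinning or by sleeping.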
Related work
- Few works in the literature
– Mainly HTM (hardware transactional memory)
- Clock gating processors upon abort
– Lowering frequency upon abort
- Using a simulator
- Studies
– HTM consumes less energy
- Does not fit all workloads
– Need for adaptability
- Using DVFS in TM
– Fastlane
- Designed for a low number of threads
Architecture
[Figure: Green-CM architecture diagram. Components: Asymmetric Contention Manager, Hybrid Wait Implementation, Controller. Labels: Tx abort (no. of retries, core on which tx is executing); back-off duration; End back-off; Restart Tx; Throughput; Energy; Tuning of β; Tuning of α, Τ]
Implementing waits
- Building block for contention managers
- Drastic effect on energy consumption
- Can be implemented in two ways:
– Busy waiting
– Sleeping
Implementing waits
- Busy waiting
– Fine granularity
– Energy cost similar to doing actual work
- Sleeping
– Coarse granularity
– Low energy consumption
– Expensive
Implementing waits
- Hybrid approach
– Either busy wait or sleep
- Adaptive fashion
– How to determine the threshold
- Cost of sleep
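The hybrid approach can be sketched as a single function that picks a strategy per wait. This is a minimal illustration under stated assumptions: `hybrid_wait` and its parameters are hypothetical names, and `threshold_s` plays the role of the threshold that the deck later calls α.

```python
import time


def hybrid_wait(duration_s, threshold_s, sleep=time.sleep, clock=time.perf_counter):
    """Busy wait for short backoffs, sleep for long ones.

    Returns the strategy used: "busy" or "sleep".
    """
    if duration_s < threshold_s:
        deadline = clock() + duration_s
        while clock() < deadline:
            pass  # spin: fine-grained, but keeps the core active
        return "busy"
    sleep(duration_s)  # coarse-grained; lets the core idle
    return "sleep"
```

A wait shorter than the cost of going to sleep is better served by spinning, which is why the threshold matters.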
Implementing waits
- Static Thresholds
[Figure: EDP / best EDP vs. static threshold (log scale, 100 to 1x10^7) for Intruder and Kmeans]
No one size fits all
Asymmetric CM
- DVFS
– Variable operating frequency
- Exploiting DVFS
– Boosting active threads
– Reducing the frequency of backing-off threads
- Enabling DVFS
– Manual control is expensive
– How to favor automatic boosting
P-states: P0 3.0 GHz, P1 2.4 GHz, P2 2.2 GHz, P3 2.0 GHz, P4 1.8 GHz, P5 1.6 GHz, P6 1.4 GHz
Asymmetric CM
[Figure: 8-core processor; two cores use linear backoff, the other six use exponential backoff]
- Linear backoff cores:
– Shorter backoff periods
– Mainly busy waiting backoffs
- Exp. backoff cores:
– Longer backoff periods
– Mainly sleep waiting
- Favor boosting
– When enough cores are in sleep states
[Figure: 8-core processor; the two linear backoff cores busy wait and are boosted, while the six exp. backoff cores sleep]
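The split between linear and exponential backoff cores can be sketched as a per-core backoff function. This is a simplified illustration, not the talk's implementation: it omits randomization, uses arbitrary time units, and `backoff_duration` and `boosted_cores` are hypothetical names.

```python
def backoff_duration(core_id, retries, boosted_cores, base=1.0):
    """Backoff length in arbitrary time units for an aborted transaction.

    Cores in boosted_cores back off linearly (short waits); all other
    cores back off exponentially (long waits).
    """
    if core_id in boosted_cores:
        return base * (retries + 1)
    return base * 2 ** retries


boosted = {0, 4}  # hypothetical choice: 2 of 8 cores use linear backoff
table = {core: backoff_duration(core, retries=5, boosted_cores=boosted)
         for core in range(8)}
```

After five retries the linear cores wait 6 units and the exponential cores wait 32, so the latter are far more likely to cross a sleep threshold and free up headroom for boosting.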
Asymmetric CM
- Increased contention?
– Cores not backing off exponentially
- Control number of cores to be boosted
Asymmetric CM
- No. of Boosted Threads
[Figure: EDP / best EDP vs. static no. of boosted threads for Intruder, Kmeans, Genome, Memcached, STM7]
Controller
- Online, lightweight
- Hill climbing
- Challenges:
– Collection of energy measurements
– Multi-dimensional
- Different exploration strategies
– Stabilization
– Random jumps
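A minimal sketch of hill climbing with optional random jumps over one tunable parameter. It is one possible reading of the bullets above, not the controller from the talk: stabilization is left out because the deck does not define it, and `hill_climb`, `jump_every`, and the cost functions are hypothetical.

```python
import random


def hill_climb(cost, values, start, steps, jump_every=0, rng=None):
    """Minimize cost over the ordered list `values` by local moves.

    jump_every > 0 adds a random jump every `jump_every` steps, which is
    kept only if it improves on the best point seen so far.
    """
    rng = rng or random.Random(0)
    best = start
    best_cost = cost(values[best])
    for step in range(1, steps + 1):
        if jump_every and step % jump_every == 0:
            candidates = [rng.randrange(len(values))]
        else:
            candidates = [i for i in (best - 1, best + 1) if 0 <= i < len(values)]
        for i in candidates:
            c = cost(values[i])
            if c < best_cost:
                best, best_cost = i, c
    return values[best]
```

Without jumps the search stops at the first local minimum it reaches; the jumps give it a chance to escape when the cost curve has more than one valley. In an online setting, `cost` would be a measured quantity such as EDP over the last sampling interval.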
Controller
- Tuning α (threshold for hybrid)
[Figure: EDP / best EDP per benchmark (Intruder, Kmeans, Memcached, STM7, Average) for the strategies no stab, stab, stab jmp 1, stab jmp 10]
Controller
- Tuning β (no. of boosted threads)
[Figure: EDP / best EDP per benchmark (Intruder, Kmeans, Memcached, STM7, Average) for the strategies no stab, stab, stab jmp 1, stab jmp 10]
Controller
- Merging the learners
Coupling the Tuners
[Figure: EDP / best EDP per benchmark (Intruder, Kmeans, Memcached, STM7, Average); legend as extracted: "independent stab jmp 1 stab – stab stab jmp 1 – stab stab jmp 10 – stab bidim stab jmp 1"]
Evaluation
[Figure: EDP-GreenCM / EDP vs. number of threads (4, 8, 16, 32, 48, 64), Intruder]
Evaluation
[Figure: EDP-GreenCM / EDP vs. number of threads (4, 8, 16, 32, 48, 64), STM7]
Evaluation
[Figure: EDP-GreenCM / EDP vs. number of threads (4, 8, 16, 32, 48, 64), Memcached]
Evaluation
[Figure: % of total cores in each P-state (p0 to p6) under spin, no-asym, and asym; Intruder, 64 threads]
Conclusion
- Implementation of waits has a significant impact on energy efficiency
- Experimental results (obtained on a real system) contradict previously published ones based on simulation
- Exploiting DVFS enhances energy efficiency
- Self-tuning is needed to adapt to different workloads
THANK YOU
Evaluation
[Figure: Energy-GreenCM / Energy vs. number of threads (4, 8, 16, 32, 48, 64), Intruder]
Evaluation
[Figure: Time-GreenCM / Time vs. number of threads (4, 8, 16, 32, 48, 64), Intruder]