Automatic Management of TurboMode

  1. Automatic Management of TurboMode
     David Lo, Christos Kozyrakis
     Stanford University
     http://mast.stanford.edu

  2. Executive Summary
     - TurboMode overclocks cores to exhaust the thermal budget
       - An important performance feature of multi-core x86 servers
     - Challenge: TurboMode does not always benefit workloads
       - Naively turning TurboMode on often leads to high energy waste
     - Solution: predictive model to manage TurboMode (on/off)
       - Using machine learning on performance counter data
       - Eliminates negative cases, boosts ED and ED² by 47% and 68%, respectively
     HPCA-20, February 19, 2014

  3. What is TurboMode (TM)?
     - Dynamic overclocking of cores to exhaust the thermal budget
       - Matches actual power consumption to max design TDP
       - Big performance gains: up to 60% frequency boost
       - Found on all modern x86 multi-cores
     - TurboMode control
       - Black-box HW control decides when and how much to overclock
       - SW has limited control: it can only turn TurboMode on/off
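
The on/off switch available to software can be illustrated with a minimal sketch. It assumes a Linux system using the `intel_pstate` driver, where the kernel exposes the switch as the sysfs file `/sys/devices/system/cpu/intel_pstate/no_turbo`. The talk does not say which mechanism the authors used, so the path and the helper names `set_turbo` and `turbo_enabled` are assumptions for illustration only. Writing to the real file requires root, and other drivers and vendors expose a different knob.

```python
from pathlib import Path

# Assumption: Linux with the intel_pstate driver. This path is not from the talk.
NO_TURBO = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")


def set_turbo(enabled: bool, knob: Path = NO_TURBO) -> None:
    """Turn TurboMode on or off by writing the inverted flag.

    The file holds "1" when turbo is disabled and "0" when it is allowed.
    """
    knob.write_text("0" if enabled else "1")


def turbo_enabled(knob: Path = NO_TURBO) -> bool:
    """Return True when the knob currently allows TurboMode."""
    return knob.read_text().strip() == "0"
```

Passing the path as a parameter keeps the helpers testable against an ordinary file instead of the live sysfs entry.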

  4. Characterizing TurboMode
     - Evaluate the effects of TM across the board
       - Efficiency metrics: EDP, ED²P, throughput/W, throughput/$, …
       - Many hardware platforms: Intel/AMD, server/notebook
       - Many workloads: SpecCPU, SpecPower, websearch, …
     - Characterization
       - Run with TM on and TM off
       - Compare the impact on all efficiency metrics

  5. Efficiency Metrics
     - Guidelines
       - We all care about performance and energy consumption
       - Capture both latency and throughput workloads
     - Metric recap
       - ED: latency & energy
       - ED²: latency & energy, weighted more towards latency (think servers)
       - Throughput/W: throughput & energy
       - Throughput/$: throughput & cost efficiency (think datacenter TCO)
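
As a rough sketch of how these metrics can be computed, the block below uses the standard definitions ED = energy × delay and ED² = energy × delay², plus throughput divided by average power. The function names, the relative-improvement formula, and the numbers in the usage note are illustrative assumptions, not values from the talk. Throughput/$ is omitted because it needs a cost model that the slides do not give.

```python
def ed(energy_j: float, delay_s: float) -> float:
    """Energy-delay product (lower is better)."""
    return energy_j * delay_s


def ed2(energy_j: float, delay_s: float) -> float:
    """Energy-delay-squared product: weights delay more heavily than ED."""
    return energy_j * delay_s ** 2


def throughput_per_watt(throughput: float, avg_power_w: float) -> float:
    """Throughput (e.g. queries per second) per watt (higher is better)."""
    return throughput / avg_power_w


def improvement(metric_off: float, metric_on: float) -> float:
    """Fractional improvement of TM on over TM off for a lower-is-better metric.

    Assumption: improvement is measured relative to the TM-off value.
    """
    return (metric_off - metric_on) / metric_off
```

With made-up numbers, a run that takes 100 s and 5000 J with TM off versus 90 s and 5850 J with TM on gets about 5% worse on ED but about 5% better on ED², which shows how the preferred setting can depend on the metric.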

  6. Evaluation Hardware
     - Intel Sandy Bridge server [SBServer]: 19% max boost
     - Intel Sandy Bridge mobile [SBMobile]: 44% max boost
     - AMD Interlagos [ILServer]: 59% max boost
     - Intel Ivy Bridge server [IBServer]: 12% max boost
     - Intel Haswell server [HServer]: 13% max boost

  7. Evaluation Workloads
     - Representative of multiple domains
     - CPU, memory, and IO workloads
     - Single-threaded SpecCPU benchmarks
     - Multi-programmed SpecCPU mixes
     - Multi-threaded PARSEC >100 configs
     - Enterprise SPECpower_ssj2008
     - Websearch

  8. Observation: No Optimal On/Off Setting
     [Figure: three panels (Sandy Bridge Server, Interlagos Server, Sandy Bridge Mobile). Y-axis: % improvement over TurboMode off (ticks from -50% to 75%, plus a 127% label). X-axis: Workload (Mix 1, Mix 2, Websearch). Series: ED, ED², QPS/W, QPS/$.]

  9. Observation: TM leads to High Variance on Efficiency
     [Figure: Sandy Bridge Server ED². Y-axis: ED² improvement (-30% to 30%). X-axis: App Mix (1 to 82). Annotations: ~50% of mixes benefit from TM; ~50% of mixes suffer due to TM.]

  10. Characterization Analysis
      - TurboMode mostly benefits CPU-bound workloads
        - Boost in performance and efficiency from higher frequency
        - SpecCPU mixes of CPU-intensive workloads, SpecPower, websearch, …
      - TurboMode is ineffective when memory/IO bound
        - Interference on memory/IO really aggravates this
        - Small/no performance gain, high energy waste with higher frequency
        - SpecCPU mixes of memory-intensive workloads, canneal, streamcluster, …
      - Applications have multiple phases
        - CPU bound vs. memory/IO bound
        - SpecCPU mixes

  11. TurboMode Control
      - Naïve TM control
        - Always off: misses the boost on CPU-bound applications
        - Always on: suffers inefficiency on interference-bound applications
      - Need dynamic TM control
        - Understands the applications running and the metric of interest
        - Predicts the optimal setting (on/off), adjusts dynamically to phases
        - No a priori knowledge of applications, no new hardware needed

  12. Predictive Model for TurboMode
      - Idea: use runtime info to dynamically predict TM benefits
        - Focus primarily on detecting memory interference
      - Build a predictive model based on performance counters
        - Use performance counters & the model to predict interference severity
        - If too severe, turn off TurboMode

  13. Autoturbo: Predictive Control for TurboMode
      [Diagram: performance counters are sampled per core (Core 1 to Core N, each running an app) and fed to a classifier built from training data. The classifier outputs app properties per core; a TM on/off heuristic combines them with the metric to enable or disable TurboMode.]
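
The flow in this diagram (sample per-core counters, classify each core, apply an on/off heuristic) can be sketched as a single control step. Everything named here is hypothetical: the `Counters` layout, the callables passed in, and especially the majority-style threshold in `decide_turbo`, which is a stand-in because the slides do not spell out the actual heuristic.

```python
from typing import Callable, Dict, List

Counters = Dict[str, float]  # hypothetical layout: counter name -> sampled value


def decide_turbo(per_core: List[Counters],
                 is_interference_bound: Callable[[Counters], bool],
                 max_bound_fraction: float = 0.5) -> bool:
    """Return True to enable TurboMode, False to disable it.

    Stand-in heuristic (an assumption): disable TurboMode when more than
    max_bound_fraction of the sampled cores look interference-bound.
    """
    if not per_core:
        return True  # nothing sampled: leave TurboMode enabled
    bound = sum(1 for counters in per_core if is_interference_bound(counters))
    return bound / len(per_core) <= max_bound_fraction


def control_step(sample: Callable[[], List[Counters]],
                 classify: Callable[[Counters], bool],
                 apply: Callable[[bool], None]) -> bool:
    """One iteration: sample counters, classify each core, apply the decision."""
    enable = decide_turbo(sample(), classify)
    apply(enable)
    return enable
```

A real daemon would call `control_step` periodically, with `sample` reading hardware counters and `apply` toggling the platform's TurboMode switch.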

  14. Training the Predictive Model
      [Diagram: three stages, Raw training data → Feature selection → Model selection.]
      - Raw training data
        - Single SpecCPU, TurboMode on
        - Single SpecCPU, TurboMode off
        - Single SpecCPU + stream, TurboMode on
        - Single SpecCPU + stream, TurboMode off
      - Model selection (percentages as shown on the slide)
        - Naïve Bayes: 85%
        - Logistic Regression: 81%
        - Nearest Neighbors: 73%
        - Decision Tree: 75%
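
To make the model-selection stage concrete, here is a minimal pure-Python Gaussian Naïve Bayes classifier, one of the model families named on the slide. It is only a sketch: the feature choice, the labels, and the tiny data set in the usage note are synthetic and hypothetical, and the talk does not describe the authors' implementation or tooling.

```python
import math
from collections import defaultdict


def fit_gnb(X, y):
    """Fit per-class log priors, feature means, and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict_gnb(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))
```

For example, training on synthetic rows of `[stall_fraction, l2_mpki]` such as `[0.10, 1.0]` labeled `"cpu"` and `[0.70, 20.0]` labeled `"mem"` lets `predict_gnb` separate low-stall samples from high-stall ones.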

  15. Model Validation
      - Model accuracy: ~90% on cross-validation
      - Best counters: those that indicate a memory-bound workload
        - SBServer/SBMobile: % cycles with outstanding memory requests, …
        - ILServer: L2 MPKI, # requests to memory/instruction, …
      - CPU/thermal intensity counters don’t correlate strongly!
        - E.g., floating-point intensity counters
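
Two of the counter-derived features named above can be computed from raw counts as follows. MPKI is the usual misses-per-thousand-instructions ratio. The function and parameter names are hypothetical, and the actual hardware event names differ by platform and are not given in the slides.

```python
def l2_mpki(l2_misses: int, instructions: int) -> float:
    """L2 cache misses per thousand retired instructions."""
    return 1000.0 * l2_misses / instructions


def outstanding_mem_fraction(cycles_with_outstanding: int,
                             total_cycles: int) -> float:
    """Fraction of cycles that had at least one outstanding memory request."""
    return cycles_with_outstanding / total_cycles
```

Both features are ratios, so they stay comparable across sampling intervals of different lengths.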

  16. Autoturbo Evaluation
      - Used autoturbo in conjunction with workloads
        - Evaluation workloads are apps other than single-threaded SpecCPU
        - Measure efficiency metrics
      - Compare against
        - Baseline: TurboMode is always off
        - Naïve TM: TurboMode is always on
        - Static oracle: TurboMode is on if it leads to a benefit for the overall run
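
The three comparison points can be expressed as choices over the phases of a run. The sketch below scores a run by whole-run ED² and shows why a per-phase policy can in principle do better than a static oracle that must pick one setting for the entire run. The phase representation and all numbers in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical phase record: ((energy_off_J, delay_off_s), (energy_on_J, delay_on_s))
Phase = Tuple[Tuple[float, float], Tuple[float, float]]


def run_ed2(phases: Sequence[Phase], turbo_on: Sequence[bool]) -> float:
    """ED² of the whole run given a per-phase TurboMode choice (lower is better)."""
    energy = sum(phase[int(on)][0] for phase, on in zip(phases, turbo_on))
    delay = sum(phase[int(on)][1] for phase, on in zip(phases, turbo_on))
    return energy * delay ** 2


def static_oracle(phases: Sequence[Phase]) -> bool:
    """Return True if always-on beats always-off for the run as a whole."""
    n = len(phases)
    return run_ed2(phases, [True] * n) < run_ed2(phases, [False] * n)


def policy_scores(phases: Sequence[Phase], dynamic: List[bool]) -> dict:
    """ED² for the baseline, naïve, static-oracle, and a given dynamic policy."""
    n = len(phases)
    return {
        "baseline": run_ed2(phases, [False] * n),
        "naive": run_ed2(phases, [True] * n),
        "static_oracle": run_ed2(phases, [static_oracle(phases)] * n),
        "dynamic": run_ed2(phases, dynamic),
    }
```

With a CPU-bound phase `((5000, 100), (5850, 85))` followed by a memory-bound phase `((5000, 100), (6000, 99))`, the static oracle keeps TurboMode off, while enabling it only for the first phase yields a lower ED² than either static setting.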

  17. Autoturbo results
      [Figure: two panels. Sandy Bridge Mobile QPS/$: QPS/$ improvement (-10% to 10%) vs. App Mix (1 to 35). Sandy Bridge Server ED²: ED² improvement (-40% to 40%) vs. App Mix (1 to 82). Series: Naïve, Auto, Static Oracle. Annotations: "Gains over never using TurboMode", "Gains over always using TurboMode".]

  18. Autoturbo Analysis
      - Autoturbo gets the best of both worlds
        - Reduces cases where TM causes efficiency degradation
        - Keeps cases where TM leads to benefits
      - autoturbo often disables TM even though it is beneficial
        - Cause: the interference predictor assumes worst-case interference
      - autoturbo beats the static oracle
        - Cause: autoturbo can take advantage of dynamism during the run

  19. Conclusions
      - TurboMode is useful but must be managed dynamically
      - This work: dynamic TurboMode control
        - Predictive model for memory interference
        - Dynamic control with no hand-tuning needed
        - Eliminates efficiency drops, maintains efficiency gains of TurboMode
      - Future work
        - Apply a similar approach to manage advanced power settings

  20. autoturbo dealing with a phase change
      [Figure: autoturbo dynamic adjustment on Sandy Bridge Mobile. Frequency (GHz, 2.40 to 3.00) vs. Time (s, 215 to 295). Annotation: memory interference occurs mid-workload.]
