
Performance, Energy, and Thermal Considerations for SMT and CMP Architectures

Yingmin Li†, David Brooks‡, Zhigang Hu††, Kevin Skadron†

† Dept. of Computer Science, University of Virginia †† IBM T.J. Watson Research Center ‡ Division of Engineering and Applied Sciences, Harvard University

{yingmin,skadron}@cs.virginia.edu, zhigangh@us.ibm.com, dbrooks@eecs.harvard.edu

Abstract

Simultaneous multithreading (SMT) and chip multiprocessing (CMP) both allow a chip to achieve greater throughput, but their relative energy efficiency and thermal properties are still poorly understood. This paper uses Turandot, PowerTimer, and HotSpot to explore this design space for a POWER4/POWER5-like core. For an equal-area comparison with this style of core, we find CMP to be superior in terms of performance and energy efficiency for CPU-bound benchmarks, but SMT to be superior for memory-bound benchmarks due to a larger L2 cache. Although both exhibit similar peak operating temperatures and thermal management overheads, the mechanisms by which SMT and CMP heat up are quite different: SMT heating is primarily caused by localized heating in certain key structures, while CMP heating is mainly caused by the global impact of increased energy output. Because of this difference in heat-up mechanism, we find that the best thermal management technique also differs between SMT and CMP. Indeed, non-DVS localized thermal management can outperform DVS for SMT. Finally, we show that CMP and SMT will scale differently as the contribution of leakage power grows, with CMP suffering from higher leakage due to the second core's higher temperature and the exponential temperature dependence of subthreshold leakage.
1. Introduction

Simultaneous multithreading (SMT) [27] is a recent microarchitectural paradigm that has found industrial application [12, 18]. SMT allows instructions from multiple threads to be simultaneously fetched and executed in the same pipeline, thus amortizing the cost of many microarchitectural structures across more instructions per cycle. The promise of SMT is area-efficient throughput enhancement. But even though SMT has been shown to be energy efficient for most workloads [17, 21], the significant boost in instructions per cycle (IPC) means increased power dissipation and possibly increased power density. Since the area increase reported for SMT execution is relatively small (10-20%), thermal behavior and cooling costs are major concerns.

Chip multiprocessing (CMP) [7] is another relatively new microarchitectural paradigm that has found industrial application [12, 14]. CMP instantiates multiple processor "cores" on a single die. Typically the cores each have private branch predictors and first-level caches and share a second-level, on-chip cache. For multi-threaded or multi-programmed workloads, CMP architectures amortize the cost of a die across two or more processors and allow data sharing within a common L2 cache. Like SMT, the promise of CMP is a boost in throughput. The replication of cores means that the area and power overhead to support extra threads is much greater with CMP than SMT. For a given die size, a single-core SMT chip will therefore support a larger L2 than a multi-core chip. Yet the lack of execution contention between threads typically yields much greater throughput for CMP than SMT [4, 7, 20]. A side effect is that each additional core on a chip dramatically increases its power dissipation, so thermal behavior and cooling costs are also major concerns for CMP.

Because both paradigms target increased throughput for multi-threaded and multi-programmed workloads, it is natural to compare them. This paper provides a thorough analysis of the performance benefits, energy efficiency, and thermal behavior of SMT and CMP in the context of a POWER4-like microarchitecture. In this research we assume POWER4-like cores of similar complexity for both SMT and CMP, except for the necessary SMT-related hardware enhancements. Although reducing the CMP core complexity might improve CMP's energy and thermal efficiency, it is cost-effective to design a CMP processor by reusing an existing core; the POWER5 dual-SMT-core processor is an example of this design philosophy. We combine IBM's cycle-accurate Turandot [19] and PowerTimer [3, 9] performance and power modeling tools, modified to support both SMT and CMP, with the University of Virginia's HotSpot thermal model [25].


Validation strategies for these tools have been discussed in [10, 17].

In general, for an SMT/CMP approach like IBM's, where the same base CPU organization is used, we find that CMP and SMT architectures perform quite differently for CPU-bound and memory-bound applications. For CPU-bound applications, CMP outperforms SMT in terms of throughput and energy efficiency, but also tends to run hotter, because the higher rate of work results in a higher rate of heat generation. The primary reason for CMP's greater throughput is that it provides two entire processors' worth of resources and the only contention is for the L2. In contrast, SMT only increases the sizes of key pipeline structures, and threads contend for these resources throughout the pipeline. On the other hand, for memory-bound applications on an equal-area processor die, this situation is reversed and SMT performs better, as the CMP processor suffers from a smaller L2 cache.

We also find that the thermal profiles are quite different between CMP and SMT architectures. With the CMP architecture, the heating is primarily due to the global impact of higher energy output. For the SMT architecture, the heating is very localized, in part because of the higher utilization of certain key structures such as the register file. These different heating patterns are critical when we consider dynamic thermal management (DTM) strategies that seek to use runtime control to reduce hotspots. In general, we find that DTM strategies that target local structures are superior for SMT architectures and that global DTM strategies work better with CMP architectures.

The rest of the paper is organized as follows. In Section 2, we discuss related work comparing SMT and CMP processors from an energy-efficiency standpoint. Section 3 details the performance, power, and temperature methodology that we use in this work, including our choice of L2 sizes to study. Section 4 discusses the baseline results for SMT and CMP architectures without DTM. Section 5 explores the more realistic case when microprocessors are DTM-constrained and examines which strategies are best for CMP and SMT under performance- and energy-constrained designs. Section 6 concludes the paper and discusses avenues for future research.

2. Related Work

There has been a burst of work in recent years to understand the energy efficiency of SMT processors. Li et al. [17] study the area overhead and energy efficiency of SMT in the context of a POWER4-like microarchitecture, and Seng et al. [21] study energy efficiency and several power-aware optimizations for a multithreaded Alpha processor. Sasanka et al. consider the energy efficiency of SMT and CMP for multimedia workloads [20], and Kaxiras et al. [13] do the same for mobile phone workloads on a digital signal processor. Like us, these studies find that SMT boosts performance substantially (by about 10–40% for SPEC workloads), and that the increase in throughput more than makes up for the higher rate of power dissipation, with a substantial net gain in energy efficiency.

For multithreaded and multiprogrammed workloads, CMP offers clear performance benefits. If contention for the second-level cache is not a problem, speedups are close to linear in the number of cores. Although the energy efficiency of CMP organizations has been considered for specific embedded-system workloads, to our knowledge the energy efficiency of CMP for high-performance cores and workloads has not been well explored. Sasanka et al. consider the energy efficiency of SMT and CMP for multimedia workloads [20], and Kumar et al. [15] consider energy efficiency for a heterogeneous CMP core, but only for single-threaded workloads. Like us, both of these studies find substantial energy benefits.

Other researchers have compared SMT and CMP. Sasanka et al., Kaxiras et al., Kumar et al. [16], Burns et al. [4], and Hammond et al. [7] all find that CMP offers a substantial performance advantage when there are enough independent threads to keep all cores occupied. This is generally true even when the CMP cores are simpler than the SMT core—assuming enough thread-level parallelism to take advantage of the CMP capability. Several authors [4, 16, 20] also consider hybrids of SMT and CMP (e.g., two CMP cores, each supporting 2-way SMT), but with conflicting conclusions. They generally find a hybrid organization with N thread contexts inferior to CMP with N full cores, but to differing degrees, and it is unclear to what extent these conclusions hold specifically for memory-bound workloads. Since CMP seems superior to a hybrid organization, this work considers only purely 2-way SMT (one core) and 2-way CMP systems (one thread per core), in order to focus on the intrinsic advantages of each approach. While a study of the combined energy and thermal efficiency of hybrid CMP/SMT systems is interesting, we feel it is beyond the scope of this paper: the incredibly complex design space described by [4, 16, 20] means that analyzing this configuration could easily occupy an entire paper by itself. In any case, understanding the combined energy and thermal efficiency of plain SMT and CMP systems is a prerequisite, and except for the work by Sasanka et al. and Kaxiras et al. for specialized workloads, we are not aware of any other work comparing the energy efficiency of SMT and CMP. Sasanka et al. find CMP to be much more energy efficient than SMT, while Kaxiras et al. find the reverse. The reason is that the Sasanka work uses separate programs that scale well with an increasing number of processors and can keep all processors occupied.


In contrast, with the mobile phone workload of Kaxiras et al., not all threads are active all the time, and idle cores waste some energy; their SMT processor, moreover, is based on a VLIW architecture and is wide enough to easily accommodate multiple threads when needed.

We are aware of only two other papers exploring the thermal behavior of SMT and/or CMP. Heo et al. [8] look at a variety of ways to use redundant resources, including multiple cores, for migrating computation of a single thread to control hot spots, but find that the overhead of core swapping is high. Donald and Martonosi [6] compare SMT and CMP and find that SMT produces more thermal stress than CMP. But, like many other studies comparing SMT and CMP, their analysis assumes that the cores of the CMP system are simpler and have lower bandwidth than the single-threaded and SMT processors, while we follow the pattern of the IBM POWER4/POWER5 series and assume that all three organizations offer the same issue bandwidth per core. Donald and Martonosi also consider a novel mechanism to cope with hotspots, adding "white space" to hot structures in a checkerboard fashion to increase their size and hopefully spread out the heat, but found that even a very fine-grained partitioning did not achieve the desired heat spreading. We adopt a similar idea for the register file, our key hotspot, but rather than increase its size, we throttle its occupancy. Simulations using an improved version of HotSpot in [11] suggest that sufficiently small structures will spread heat effectively.

3. Modeling Methodology

3.1. Microarchitecture & Performance Modeling

We use Turandot/PowerTimer to model an out-of-order, superscalar processor with a resource configuration similar to current-generation microprocessors. Table 1 describes the configuration of our baseline processor for the single-threaded design point (ST baseline). Note that, for efficient simulation of CMP systems, we currently need to use Turandot in trace-driven mode, so we cannot yet account for the impact of mis-speculated execution.

Processor Core
  Dispatch rate:        5 instructions per cycle
  Reservation stations: mem/fix queue (2x20), fpq (2x5)
  Functional units:     2 FXU, 2 FPU, 2 LSU, 1 BRU
  Physical registers:   80 GPR, 72 FPR
  Branch predictor:     16K-entry bimodal, 16K-entry gshare, 16K-entry selector, all with 1-bit entries

Memory Hierarchy
  L1 D-cache:           32KB, 2-way, 128B blocks, 1-cycle latency
  L1 I-cache:           64KB, 2-way, 128B blocks, 1-cycle latency
  L2 I/D:               2MB, 4-way LRU, 128B blocks, 9-cycle latency
  Memory latency:       77 cycles

Table 1. Configuration of the simulated processor.

SMT is modeled by duplicating the data structures that correspond to duplicated resources and by increasing the sizes of shared critical resources like the register file. A round-robin policy is used at various pipeline stages to decide which thread should go ahead; trying other scheduling policies, such as ICOUNT, in our SMT performance model is future work. More detail about these SMT enhancements can be found in [17].

We extended Turandot to model a CMP configuration. So far, only multi-programmed workloads without inter-thread synchronization are supported. This essentially consists of simulating two separate cores, except that cache and cache-bus conflicts in the shared L2 cache must be modeled, as they are important determinants of performance.
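To make the bus-conflict effect concrete, consider the toy arbiter below. This is only a minimal sketch of FIFO arbitration for a single shared L2 port, not Turandot's implementation; the BUS_CYCLES constant and the serialize helper are invented for illustration.

```python
BUS_CYCLES = 4  # assumed bus occupancy per L2 request (illustrative)

def serialize(requests):
    """requests: list of (arrival_cycle, core_id) tuples. Returns, per
    request, (core_id, total_latency) under FIFO arbitration of a single
    shared L2 bus: a request waits whenever the other core holds the bus."""
    bus_free = 0
    out = []
    for arrival, core in sorted(requests):
        start = max(arrival, bus_free)   # queueing delay from contention
        bus_free = start + BUS_CYCLES    # bus is busy for the transfer
        out.append((core, bus_free - arrival))
    return out

# Two cores issuing near-simultaneous misses: the later requests see the
# queueing delay that makes L2 conflicts an important performance factor.
print(serialize([(0, 0), (1, 1), (2, 0)]))  # [(0, 4), (1, 7), (0, 10)]
```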

Performance comparison of different SMT or CMP configurations, or comparison of an SMT or CMP configuration against a single-threaded configuration, is difficult. Snavely et al. [26] propose the weighted speedup

$$\text{SMT speedup} = \sum_{i} \frac{IPC_{SMT}[i]}{IPC_{nonSMT}[i]} \quad (1)$$

where $IPC_{SMT}[i]$ is the IPC of just the i-th thread during an SMT execution and $IPC_{nonSMT}[i]$ is its IPC during single-threaded execution. This considers how each thread performs under SMT relative to its non-SMT performance, so we choose this metric for our speedup computations. All speedups are computed relative to the IPC of each workload on the baseline, non-SMT machine. In contrast to evaluating performance, evaluating energy efficiency should use traditional, simple unweighted metrics.
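As a concrete illustration of Equation 1, here is a short sketch; the IPC values are hypothetical.

```python
def smt_speedup(ipc_smt, ipc_single):
    """Weighted speedup of Snavely et al. [26]: each thread's IPC under
    SMT is normalized to its own single-threaded IPC, then summed."""
    return sum(s / base for s, base in zip(ipc_smt, ipc_single))

# Hypothetical pair: each thread runs at 70% of its single-threaded
# rate under SMT, so the weighted speedup is 0.7 + 0.7 = 1.4.
print(smt_speedup(ipc_smt=[0.7, 1.4], ipc_single=[1.0, 2.0]))  # 1.4
```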

3.2. Benchmark Pairs

We use 15 SPEC2000 benchmarks as our single-thread benchmarks. They are compiled by the xlc compiler with the -O3 option. First we used the SimPoint toolset [22] to obtain representative simulation points for 500-million-instruction simulation windows for each benchmark; the trace-generation tool then produces the final static traces by skipping the number of instructions indicated by SimPoint and simulating and capturing the following 500 million instructions.

We use pairs of single-thread benchmarks to form dual-thread SMT and CMP benchmarks. There are many possibilities for forming pairs from these 15 benchmarks, so we use the following methodology. First, we let each single-thread benchmark combine with itself to form a pair. We also form several SMT and CMP benchmarks by combining different single-thread benchmarks. To do so, we first categorize the single-thread benchmarks into eight major categories: high IPC (> 0.9) or low IPC (< 0.9); high temperature (peak temperature > 82◦C) or low temperature (peak temperature < 82◦C); and floating-point or integer benchmark, as shown in Table 2. We then form eighteen pairs of dual-thread benchmarks by selecting various combinations of benchmarks with these characteristics.


Integer benchmarks:
                 gzip  mcf  eon  bzip2  crafty  vpr  cc1  parser
  IPC             L    L    H    H      H       H    H    L
  temperature     L    H    L    H      H       L    H    L
  L2 miss ratio   L    H    L    L      L       L    L    L

Floating-point benchmarks:
                 art  facerec  mgrid  swim  applu  mesa  ammp
  IPC             L    H        H      L     L      H     L
  temperature     H    H        H      H     H      L     H
  L2 miss ratio   H    L        L      L     H      L     L

Table 2. Categorization of benchmarks (integer benchmarks in the first table, floating-point benchmarks in the second).

Note that our choice of memory-bound benchmarks was limited; this is a serious drawback to using SPEC for studies like this, and the architecture community needs more benchmarks with a wider range of behaviors. In the rest of the paper, we discuss our workloads in terms of those with a high L2 cache miss ratio vs. those with a low L2 cache miss ratio. When either benchmark in a pair has a high L2 cache miss ratio, we categorize that pair as a high-L2-miss pair.
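The categorization and pairing rules are simple enough to sketch directly; the thresholds mirror this section and Table 2, while the helper names are ours.

```python
def category(ipc, peak_temp_c, is_fp):
    """Place a single-thread benchmark into one of the eight categories
    of Section 3.2: IPC above/below 0.9, peak temperature above/below
    82 C, and integer vs. floating point."""
    return ("H" if ipc > 0.9 else "L",
            "H" if peak_temp_c > 82.0 else "L",
            "FP" if is_fp else "INT")

def pair_is_high_l2_miss(miss_a, miss_b):
    """A dual-thread pair counts as high-L2-miss when either member has
    a high L2 miss ratio (the 'H' rows of Table 2)."""
    return "H" in (miss_a, miss_b)

print(category(ipc=1.1, peak_temp_c=85.0, is_fp=False))  # ('H', 'H', 'INT')
print(pair_is_high_l2_miss("L", "H"))                    # True
```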

3.3. Power Model

PowerTimer differs from existing academic microarchitectural power-performance simulators primarily in energy-model formation [3, 9]. The base energy models are derived from circuit-level power analysis performed on structures in a current high-performance PowerPC processor. This analysis has been performed at the macro level, and in general multiple macros combine to form the microarchitecture-level structures corresponding to units within our performance model. PowerTimer models over 60 microarchitectural structures, defined by over 400 macro-level power equations. Unless otherwise mentioned, we assume uniform leakage power density for all units on the chip at equal temperature. Leakage power is estimated from a formula derived by curve fitting to the ITRS data [23]; the leakage power of a unit depends on the area and temperature of that unit. Incorporating more accurate leakage power models would improve the accuracy of the results, especially for future technologies—an important area for future work.
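PowerTimer's fitted leakage equations are not reproduced here, but a minimal sketch of a model with the stated properties—proportional to unit area, exponential in temperature—looks like the following; all constants are fabricated for illustration, not taken from the ITRS fit.

```python
import math

def leakage_power(area_mm2, temp_c, p0=0.1, t0=85.0, beta=0.02):
    """Toy leakage model with the properties stated above: proportional
    to unit area and exponential in temperature. p0 (W/mm^2 at the
    reference temperature t0) and beta (1/K) are made-up constants; the
    paper instead fits its formula to ITRS data [23]."""
    return p0 * area_mm2 * math.exp(beta * (temp_c - t0))

# With beta = 0.02, a unit at 95 C leaks ~22% more than at 85 C.
print(leakage_power(2.0, 95.0) / leakage_power(2.0, 85.0))  # ~1.22
```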

3.4. Temperature Model

To model operating temperature, we use the newly released HotSpot 2.0 (http://lava.cs.virginia.edu/HotSpot), which accounts for the important effects of the thermal interface material (TIM) between the die and heat spreader and has been validated against a test chip [10]. HotSpot models temperature using a circuit of thermal resistances and capacitances derived from the layout of the microarchitecture units. The modeled thermal package consists of the die-to-spreader TIM (thickness 0.05mm), the heat spreader (thickness 1mm), another TIM, the heat sink (thickness 6.9mm), and a fan. Removal of heat from the package via airflow takes place by convection and is modeled using a single, equivalent thermal resistance of 0.8 K/W. This assumes the fan speed and the ambient temperature inside the computer "box" (40◦C) are constant, both of which hold for the time scales over which our benchmarks are simulated.

Due to lateral heat spreading, thermal behavior is sensitive to the layout of the microarchitecture units. We use the floorplans shown in Figure 1, which were derived by inspection from the die photo of the POWER5 in [5]. Note that Figure 1 only shows floorplans for the single-threaded and CMP chips. The SMT floorplan is identical to the single-threaded case, except that the increase in resources to accommodate SMT makes the core 12% larger. (This is small enough—a few percent of the total chip area—that we take the impact on L2 size for SMT to be negligible.)

According to [5], the POWER5 offers 24 sensors on chip. Accordingly, we assume it is reasonable to provide at least one temperature sensor for each microarchitecture block in the floorplan, and that these sensors can be placed reasonably close to each block's hot spot, or that data fusion among multiple sensors can achieve the same effect. We also assume that averaging and data fusion allow dynamic noise to be ignored, and that offset errors can be removed by calibration [1]. We sample the temperature every 100k cycles and set our DTM experiments' thermal-emergency threshold at 83◦C. This threshold is chosen carefully so that, for the single-threaded, single-core architecture, it normally leads to less than 5% performance loss from DTM control. At the beginning of each simulation, we set each unit's initial temperature to its steady-state temperature, so that the simulation's thermal output is meaningful. For DTM experiments, the initial temperature is set to the smaller of the steady-state temperature without DTM and the thermal-emergency threshold (83◦C in our DTM experiments).
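HotSpot's full RC network is beyond a few lines, but its core idea—each block is a thermal capacitance discharging through thermal resistances toward ambient—reduces to a first-order update for a single lumped node. The sketch below reuses the 0.8 K/W convection resistance and 40◦C ambient from above; the capacitance and time step are illustrative, and this is not the HotSpot 2.0 model itself.

```python
def step_temperature(t, power_w, t_amb=40.0, r_th=0.8, c_th=50.0, dt=1e-3):
    """One explicit-Euler step of a single lumped thermal node:
    C * dT/dt = P - (T - T_amb) / R. r_th and t_amb match the package
    parameters above; c_th (J/K) and dt (s) are illustrative."""
    return t + dt * (power_w - (t - t_amb) / r_th) / c_th

t = 40.0
for _ in range(500_000):        # integrate well past the RC time constant
    t = step_temperature(t, power_w=50.0)
print(round(t, 1))              # ~80.0, i.e. T_amb + P * r_th
```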

3.5. Chip Die Area and L2 Cache Size Selection

Before performing detailed equal-area comparisons between the CMP and SMT architectures, it is important to carefully select appropriate L2 cache sizes for the baseline machines. Because the core area stays fixed in our experiments, the number of cores and the L2 cache size determine the total chip die area. In particular, because the CMP machine requires additional chip area for the second core, its L2 cache must be smaller to achieve an equivalent die area. In this study, the additional CMP core occupies roughly the area of 1MB of L2 cache. In the 2004-2005 timeframe, mainstream desktop and server microprocessors include aggressive, out-of-order processor cores coupled with 512KB to 2MB of on-chip L2 cache.
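The equal-area budgeting is simple arithmetic; below is a sketch using this section's one-core-per-megabyte estimate on the fixed 210 mm² die of Section 4 (helper names are ours).

```python
MB_PER_CORE = 1.0  # this section's estimate: one core displaces ~1MB of L2
                   # on the fixed 210 mm^2 die used throughout the paper

def l2_budget_mb(n_cores, baseline_l2_mb=2.0):
    """L2 capacity left on an equal-area die, relative to a one-core,
    2MB baseline: every extra core displaces ~1MB of L2."""
    return baseline_l2_mb - (n_cores - 1) * MB_PER_CORE

print(l2_budget_mb(1), l2_budget_mb(2))  # 2.0 (ST/SMT) and 1.0 (CMP)
```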


[Figure 1. Floorplans used for thermal simulation: (a) ST & SMT, (b) CMP. Each core contains the ISU, IDU, IFU, LSU, FXU, FPU, BXU, fixed-point (fx_reg) and floating-point (fp_reg) register files, and L1 I- and D-caches, alongside the shared L2 cache; the CMP floorplan duplicates the core. The SMT core is 12% larger than the ST core shown.]

Our experiments indicate that for very large L2 caches and typical desktop and workstation applications (SPEC2000), most benchmarks fit in the cache for both the SMT and CMP machines. But for a fixed number of cores, Figure 2 shows that as die size is reduced, SMT eventually performs better than CMP for memory-bound benchmarks. This is because a core occupies about 1MB's worth of space, so SMT's L2 sizes are 1MB larger than CMP's. Given constraints on chip area, it is likely that there will always be certain memory-bound workloads that perform better with SMT than with CMP. Recognizing this tradeoff, we set the L2 cache at 1MB for CMP and 2MB for SMT for our baseline study, and we discuss where appropriate how these choices affect our conclusions.

[Figure 2. Performance of SMT and CMP for memory-bound benchmarks (categorized using a 2MB L2 for ST) as the L2 cache size varies; x-axis: SMT L2 size from 1.5MB to 3MB, y-axis: relative performance change (0-80%) compared to the ST baseline.]

4. Baseline Results

In this section, we discuss the performance, energy, and temperature implications of SMT and CMP designs without dynamic thermal management; in the next section, we consider thermally limited designs.

When we compare the three architectures (ST, SMT, and CMP), we hold the chip area constant at 210 mm², including the on-chip level-two cache. This means CMP has the smallest L2 cache, since its core area is the largest of the three: in our experiments, the L2 cache sizes for ST, SMT, and CMP are 2MB, 2MB, and 1MB respectively. (Because the SMT core is only 12% larger than the ST core, we use 2MB for both.) Because we find the conclusions are quite different for workloads with high L2 miss rates vs. those with lower miss rates, we normally report results for these categories separately.

4.1. SMT and CMP Performance and Energy

Figure 3 breaks down the performance benefits and energy efficiency of SMT and CMP for our POWER4-like microarchitecture. The results in this figure are divided into two classes of benchmarks—those with relatively low L2 miss rates (left) and those with high L2 cache miss rates (right). The figure shows that CMP dramatically outperforms SMT for workloads with low to modest L2 miss rates, with CMP boosting throughput by 87% compared to only 26% for SMT. But the CMP chip has only half the L2 cache of SMT, and for workloads with high L2 miss rates, CMP only affords a throughput benefit of 22% while SMT achieves a 42% improvement.

The power and energy overheads shown in Figure 3 are also enlightening. The power overhead of SMT is 38–46%. The main reasons for the SMT power growth are the increased resources that SMT requires (e.g., replicated architected registers), the increased resources needed to relieve new bottlenecks (e.g., additional physical registers), and the increased utilization due to the additional simultaneous instruction throughput [17]. The power increase due to CMP is even more substantial: 93% for low-L2-miss-rate workloads and 71% for high-miss-rate workloads. In this case the additional power is due to the addition of an entire second processor. The only reason the power does not double is that L2 conflicts between the two cores lead to stalls during which clock gating is engaged, which explains the lower power overhead of the L2-bound workloads.

[Figure 3. Performance and energy efficiency of SMT and CMP compared to ST, for low-L2-miss workloads (left) and high-L2-miss workloads (right). Bars show the relative change vs. the ST baseline (-80% to +200%) in IPC, power, energy, energy-delay, and energy-delay², for 2-way SMT and dual-core CMP.]

[Figure 4. Performance and energy efficiency of SMT and CMP compared to ST as L2 size changes (SMT with 2MB and 3MB L2; CMP with 1MB and 2MB L2). On the left are results for a benchmark pair (mcf+mcf) that is memory-bound for all L2 configurations shown; on the right, a pair (mcf+vpr) that ceases to be memory-bound once the CMP L2 grows from 1MB to 2MB.]

Combining these two effects with the energy-delay-squared metric (ED²) [28], we see that CMP is by far the most energy-efficient organization for benchmarks with reasonable L2 miss rates, while SMT is by far the most energy-efficient for those with high miss rates. Indeed, for L2-bound workloads, from the standpoint of ED², a single-threaded chip would be preferable to CMP, even though the single-threaded chip cannot run threads in parallel. Of course, this is at least in part due to the reduced L2 on the CMP chip.

When we increase the L2 cache size, some benchmarks that had previously been memory-bound now fit better in the L2 cache, and thus need to be re-categorized as low-L2-miss-rate benchmarks. Figure 4 illustrates the consequences. The graph on the right shows how mcf+vpr ceases to be memory-bound when we increase the L2 cache sizes by 1MB (SMT from 2MB to 3MB and CMP from 1MB to 2MB). With the smaller L2 cache and a high miss ratio, the program is memory-bound and SMT is better in terms of performance and energy efficiency; with the larger L2 and a low miss ratio, the program is no longer memory-bound and CMP is better. Of course, for any L2 size, some applications' working sets will not fit, and those benchmarks will remain memory-bound; the left-hand graph in Figure 4 illustrates that SMT is superior for such benchmarks. To summarize, once benchmarks have been categorized for the L2 size under study, the qualitative trends we report for the compute-bound and memory-bound categories seem to hold.
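The metric family used throughout Figures 3, 4, and 8-10 follows directly from throughput and power. Below is a sketch of the computation, plugging in the 87% throughput and 93% power deltas reported above for CPU-bound CMP; the function names are ours.

```python
def efficiency_metrics(ipc, power):
    """Energy-family metrics for a fixed instruction count: delay scales
    as 1/IPC and energy as power * delay, so ED^n scales as
    power / IPC^(n+1)."""
    delay = 1.0 / ipc
    energy = power * delay
    return {"energy": energy,
            "energy*delay": energy * delay,
            "energy*delay^2": energy * delay ** 2}

def relative(new, base):
    """Relative change per metric, e.g. +0.87 means +87%."""
    return {k: round((new[k] - base[k]) / base[k], 2) for k in base}

st = efficiency_metrics(ipc=1.0, power=1.0)      # normalized ST baseline
cmp2 = efficiency_metrics(ipc=1.87, power=1.93)  # CPU-bound CMP, Sec. 4.1
print(relative(cmp2, st))  # energy ~ +0.03, ED ~ -0.45, ED^2 ~ -0.70
```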

4.2. SMT and CMP Temperature

Figure 5 compares the maximum measured temperature for several different microprocessor configurations. The single-threaded core has a maximum temperature of nearly 82◦C; SMT increases the temperature by around 7 degrees, and CMP by around 8.5 degrees. With such a small difference, it is difficult to conclude that either SMT or CMP is superior from a temperature standpoint.


[Figure 5. Maximum temperature (absolute, Celsius, 50-95 range) of SMT and CMP vs. ST, for the configurations ST, ST (area enlarged), SMT, SMT (activity factor scaled only), CMP, and CMP (one core rotated).]

In fact, if we rotate one of the CMP cores by 180 degrees, so that the relatively cool IFU of core 1 is adjacent to the hot FXU of core 0, the maximum CMP temperature drops by around 2 degrees, making it slightly cooler than the SMT processor.

Although the SMT and CMP processors reach relatively similar absolute temperatures, the reasons for their hotspots are quite different. To better understand the underlying causes of the temperature increases in these machines, we performed additional experiments to isolate the important effects. We took the SMT core and scaled only the power dissipation due to increased utilization (omitting the increased power dissipation due to increased resources, and leaving the area constant). From Figure 5 we can see that the SMT temperature still rises to nearly the same level as when all three factors are included. This makes sense when we consider that the unconstrained power density of most of the scaled structures in the SMT processor (e.g., register files and queues) is likely to stay relatively constant, because power and area both increase with the SMT processor; the utilization increase then becomes the key driver of SMT hotspots. From this we conclude that for the SMT processor, the temperature hotspots are largely due to the higher utilization of certain structures like the integer register file.

The reasoning behind the temperature increase for the CMP machine is quite different. For the CMP machine, the utilization of each individual core is nearly the same as for the single-threaded architecture. However, in the same die area we have now integrated two cores, the total power of the chip nearly doubles (as we saw in Figure 3), and hence the total amount of heat being generated nearly doubles. Because of this large chip-level energy consumption, the CMP processor heats up the TIM, heat spreader, and heat sink, raising the temperature of the overall chip. Thus the increased temperature of the CMP processor is due to a global heating effect, quite the opposite of the SMT processor's localized utilization increase. This fundamental difference in thermal heating will lead to substantial differences in thermal trends as we consider future technologies and advanced dynamic thermal management techniques.

4.3. Impact of Technology Trends

[Figure 6. Average temperature difference (1-6 degrees) between CMP and SMT across three technology nodes (130nm, 90nm, 70nm), for three cases: the normal case, L2 leakage radically reduced, and no temperature effect on leakage.]

As we move toward the 65nm and 45nm technology nodes, there is universal agreement that leakage power dissipation will become a substantial fraction of overall chip power. Because of the basic difference in the causes of thermal heating between the SMT and CMP processors, we expect these processors to scale differently as leakage power becomes a more substantial portion of total chip power.

Figure 6 shows the impact of technology scaling on the temperature of SMT and CMP processors. In this figure, we show the difference in absolute temperature between the CMP and SMT core for three generations of leakage (roughly corresponding to 130nm, 90nm, and 70nm technologies). As we project toward future technologies, there are several important trends to note. The most important is that the temperature difference between the CMP machine (hotter) and SMT machine (cooler) increases from 1.5 degrees with our baseline leakage model to nearly 5 degrees with the leakiest technology. The first reason for this trend is that the increased utilization of the SMT core becomes muted by higher leakage. The second reason is that the SMT machine's larger L2 cache tends to be much cooler than the second CMP core. This, coupled with the exponential dependence of subthreshold leakage on temperature, causes the CMP processor's power to increase more than the SMT processor's, aggravating the CMP processor's global heat-up effect. From Figure 6, we can see that if we remove the temperature dependence of leakage in our model, the temperature difference between the CMP and SMT machines grows much less quickly. Figure 6 also shows how the trend is amplified when aggressive leakage control is applied to the L2 cache (perhaps through high-Vt transistors): in this case the SMT processor is favored, because a larger piece of its chip is eligible for this optimization.
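The positive feedback described here—hotter silicon leaks more, and the extra leakage heats it further—can be made concrete with a fixed-point iteration over a single lumped thermal node. All constants below are illustrative, and convergence holds only while the feedback gain stays below one.

```python
import math

def settle(p_dyn_w, r_th=0.8, t_amb=40.0, p_leak0=10.0, t0=80.0, beta=0.02):
    """Fixed-point iteration of T = T_amb + R * (P_dyn + P_leak(T)) with
    exponentially temperature-dependent leakage. All constants are
    illustrative; convergence requires the loop gain beta*R*P_leak < 1."""
    t = t_amb
    for _ in range(100):
        p_leak = p_leak0 * math.exp(beta * (t - t0))  # hotter => leakier
        t_next = t_amb + r_th * (p_dyn_w + p_leak)    # leakier => hotter
        if abs(t_next - t) < 1e-6:
            break
        t = t_next
    return round(t, 1), round(p_leak, 1)

# 50W of dynamic power settles near 90 C with ~12W of leakage, a couple
# of degrees above what a temperature-independent leakage model gives.
print(settle(50.0))  # (89.7, 12.1)
```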

5. Aggressive DTM-Constrained Designs

To reduce packaging cost, current processors are usually designed to sustain the thermal requirements of typical workloads, and to engage dynamic thermal management techniques when temperature exceeds the design set point. Because SMT and CMP dissipate more power and run hotter, a more accurate comparison of their relative benefits requires data on their cooling costs, whether those costs are monetary, in terms of more expensive packaging, or performance losses from DTM. This section explores the impact of different DTM strategies on the performance and energy efficiency of SMT and CMP, and how the DTM results explain the different thermal behavior of the two organizations.

It is important to note that peak temperature is not indicative of cooling costs. A benchmark with short periods of very high temperature, separated by long periods of cooler operation, may incur low performance overhead from DTM, while a benchmark with more moderate but sustained thermal stress may engage DTM often or continuously. To make an equal comparison of DTM performance among the single-threaded, SMT, and CMP chips, we continue to use the same thermal package for all three configurations (see Section 3).

5.1. DTM Techniques

We implemented four DTM strategies in this paper:

• Dynamic voltage scaling (DVS): DVS cuts voltage and frequency in response to thermal violations and restores the high voltage and frequency when the temperature drops below the trigger threshold. The low voltage is always the same, regardless of the severity of thermal stress; this was shown in [24] to be just as effective as using multiple V/f pairs and a controller. For these workloads, we found that a voltage of 0.87 (79% of nominal) and a frequency of 1.03GHz (77% of nominal) were always sufficient to eliminate thermal violations. Because there is not yet a consensus on the overhead associated with switching voltage and frequency, we test both 10 and 20 µs stall times for each change in the DVS setting.

• Fetch-throttling: Fetch-throttling limits how often the fetch stage is allowed to proceed, which reduces activity factors throughout the pipeline. The duty cycle is set by a feedback controller.

• Rename-throttling: Rename-throttling limits the number of instructions renamed each cycle. Depending on which register file was hotter in the previous sampling period, either floating-point or integer register renaming is throttled. This reduces the rate at which a thread can allocate new registers in whichever register file has overheated, and is thus more localized in effect than fetch-throttling. But if the throttling is severe enough, it has the side effect of slowing down the thread that is causing the hot spot. This can degenerate to fetch-throttling, but when it is the FP register file being throttled, the slowdown can be valuable for mixed FP-integer workloads by helping to regulate resource use between the two threads.

• Register-file occupancy-throttling: We find the register file is usually the hottest spot on the whole chip, and its power is proportional to its occupancy. One way to reduce the power of the register file is to limit the number of register entries to a fraction of the full size. To distribute the power density, we propose to interleave the on and off registers, so that the heat can be spread more evenly across the whole register file. It is important to note that our modeling of this technique is idealistic: it assumes that the reduction in power density across the register file is proportional to the number of registers that have been turned off. This assumes ideal interleaving and ideal heat spreading, and it neglects power dissipation in the wiring, which is not affected by occupancy-throttling. This technique is included to demonstrate the potential value of directly reducing power density in the structure that is overheating, rather than reducing activity in the whole chip.

By limiting the resources available to the processor, all of these policies cause the processor to slow down, thus consuming less power and eventually cooling down below the thermal trigger level. DVS has the added advantage that reducing voltage further reduces power density; since P ∝ V²f, DVS provides roughly a cubic reduction in heat dissipation relative to performance loss,¹ while the other techniques are linear. The other techniques, however, may be able to hide some of their performance loss with instruction-level parallelism. Of the three throttling policies, fetch-throttling has the most global effect over the whole chip, because it throttles the front end. Register-file occupancy-throttling targets the specific hot units (the integer or floating-point register file) most directly and is thus the most localized in effect; this may incur less performance loss but may also realize less cooling. Rename-throttling is typically more localized than fetch-throttling and less so than register-file throttling.

DVS's cubic advantage is appealing, but as operating voltages continue to scale down, it becomes more difficult to implement a low voltage that adequately cuts temperature while providing correct behavior and reasonable frequency. Another concern with DVS is the need to validate products for two voltages rather than one.

¹ This is only an approximate relationship; our experiments derive the actual V-f relationship from ITRS data [23].
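Plugging the DVS operating point quoted above into P ∝ V²f makes the cubic advantage concrete:

$$\frac{P_{DVS}}{P_{nom}} \approx \left(\frac{V}{V_{nom}}\right)^{2}\frac{f}{f_{nom}} = 0.79^{2} \times 0.77 \approx 0.48,$$

so a 23% frequency (and roughly performance) loss buys about a 52% reduction in dynamic power, before counting the temperature-dependent leakage savings.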


Finally, our assumption that both frequency and voltage can change in 10–20 µs may be optimistic; if voltage and frequency must change gradually to avoid circuit noise, the latency to achieve adequate temperature reduction may be prohibitively long.

Our register-occupancy throttling is limited to register files based on a latch-and-mux design. Power dissipation in SRAM-based designs is likely to be much more heavily dominated by the decoders, sense amplifiers, and word and bit lines—another interesting area for future work. Furthermore, our technique may be idealistic, because it assumes that reducing register-file occupancy uniformly reduces power density, when in fact the registers that remain active retain the same power dissipation. This does not mean that the temperature of active registers remains unchanged, because neighboring areas of lower power density can help active registers spread their heat. Whether a register is small enough to spread enough heat laterally is an open question and requires further analysis. However, results in [11] using HotSpot 2.0 suggest that, below about 0.2–0.25 mm and for a 0.5mm-thick die with a typical high-performance package, the ratio of vertical to lateral thermal resistance is so high that heat spreads out very quickly, without raising the localized temperature. This result differs from the findings of [6], who used HotSpot 1.0 to conclude that much smaller sizes are needed to spread heat; but HotSpot 1.0 omits the TIM's very high thermal resistance and performs less detailed thermal modeling of heat flow in the package. Clearly the granularity at which spreading dominates, and the alternative layouts and organizations that can reduce hotspots, are important areas requiring further research. Almost all prior DTM research has focused on global techniques like fetch gating, voltage-based techniques, or completely idling the hot unit, all of which suffer from significant overheads. What is needed are techniques that can reduce power density in situ, without introducing stalls that propagate all the way up the pipeline. Our register-occupancy throttling illustrates that such an approach offers major potential benefits, and that further research in this direction is warranted.
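The three throttling policies share the same control skeleton: sample the hottest sensor each period, then tighten or relax a duty cycle. Below is a minimal sketch of such a loop as a plain proportional controller; the paper does not specify its controller's form, so the gain, the clamp, and the function name are assumptions.

```python
def update_duty(duty, temp_c, trigger_c=83.0, gain=0.05):
    """Proportional feedback for a throttling duty cycle: throttle harder
    the further the hottest sensor sits above the 83 C trigger, and relax
    once it falls below. The gain and the 0.1 floor are assumptions."""
    duty -= gain * (temp_c - trigger_c)
    return min(1.0, max(0.1, duty))  # never stall the pipeline entirely

# Invoked once per 100k-cycle sampling period (Section 3.4).
duty = 1.0
for reading in [84.5, 84.0, 83.2, 82.8]:   # hypothetical sensor samples
    duty = update_duty(duty, reading)
    print(round(duty, 3))                  # 0.925, 0.875, 0.865, 0.875
```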

5.2. DTM Results: Performance

For many traditional computing design scenarios, performance is the most critical parameter, and designers care about power dissipation and thermal considerations primarily because of thermal limits. In these cases, designers would like to optimize performance under thermal constraints. Such systems include traditional PC desktops and certain high-performance server environments where energy utility costs are not critical.

To evaluate architectures for these situations, Figure 7 shows the performance of the SMT and CMP architectures with different DTM schemes. As in the previous section, the results depend on whether the workloads have a high L2 miss ratio. For workloads with low or moderate miss ratios, CMP always gives the best performance, regardless of which DTM technique is used. On the other hand, for workloads that are mostly memory-bound, SMT always gives better performance than CMP or ST.

When comparing the DTM techniques, we find that DVS10, the DVS scheme assuming an optimistic 10 µs voltage switch time, usually gives very good performance. This is because DVS is very efficient at reducing chip-wide power consumption, bringing the chip-wide temperature down very quickly and allowing the chip to revert quickly to the highest frequency. When assuming the more pessimistic switching time of 20 µs, the performance of DVS degrades considerably, but is still among the best of the DTM schemes. However, in a system where energy consumption is not a primary concern, DVS may not be available due to its high implementation cost, while the relatively easier-to-implement throttling mechanisms are. In the rest of this section, we therefore focus mainly on the behavior of the non-DVS techniques.

Looking at the low-L2-miss workloads (Figure 7, left) and the high-L2-miss workloads (Figure 7, right), we find that SMT and CMP diverge with regard to the optimal throttling scheme. For CMP, fetch-throttling and register-occupancy throttling work equally well, and both outperform local rename-throttling. For SMT, register throttling is the best-performing throttling scheme, followed by rename-throttling and global fetch-throttling. In fact, for SMT running high-L2-miss workloads, local register-occupancy throttling performs better than all of the other DTM techniques, including DVS.

The relative effectiveness of the DTM techniques illustrates the different heating mechanisms of CMP and SMT, with heating in the CMP chip a more global phenomenon, and heating in the SMT chip localized to key hotspot structures. For example, by directly resizing the occupancy of the register file, register-throttling is very effective at reducing the localized power density of the register file and bringing down its temperature. In other words, the match between the mechanism of register-throttling and the inherent heat-up mechanism makes register-throttling the most effective DTM scheme for SMT. On the other hand, CMP mainly suffers from global heat-up effects due to the increased power consumption of the two cores; thus global DTM schemes that quickly reduce the total power of the whole chip perform best for CMP. We have found that this conclusion remains unchanged when increasing the L2 cache size to 2MB for CMP.

5.3. DTM Results: Energy

In many emerging high-performance computing environments, designers must optimize for raw performance under thermal packaging constraints, but energy consumption is also a critical design criterion, whether for battery life or for energy utility costs. Examples of such systems are high-performance mobile laptops and servers designed for throughput-oriented data centers like the Google cluster architecture [2].

[Figure 7. Performance of SMT, CMP, and ST under different DTM policies (no DTM, global fetch-throttling, local rename-throttling, register-file throttling, DVS10, DVS20), all with a threshold temperature of 83◦C, shown as relative change vs. the ST baseline without DTM. Workloads with low L2 cache miss rates are on the left; workloads with high L2 cache miss rates are on the right.]

[Figure 8. Energy-efficiency metrics (power, energy, energy-delay, energy-delay²) of ST with DTM (fetch-throttling, rename-throttling, register-file throttling, DVS10, DVS20), compared to the ST baseline without DTM, for low-L2-miss-rate workloads (left) and high-L2-miss-rate workloads (right).]

In this scenario, designers often care about joint power-performance system metrics after DTM techniques have been applied. Figures 8 through 10 show the power and power-performance metrics (energy, energy-delay, and energy-delay²) for the ST, SMT, and CMP architectures after applying the DTM techniques. All of the results in these figures are compared against the baseline ST machine without DTM. From these figures, the dominant trend is that global DTM techniques, in particular DVS, tend to have superior energy efficiency compared to the local techniques for most configurations. This is true because the global nature of the DTM mechanism means that a larger portion of the chip is cooled, resulting in larger savings. This is especially obvious for DVS, because DVS's cubic power savings is significantly higher than the savings the throttling techniques provide. The two local thermal management techniques, rename- and register-file throttling, do not contribute large power savings while enabled, as they are designed to target specific temperature hotspots and thus have very little impact on global power dissipation. However, from an energy-efficiency point of view, local techniques can be competitive because in some cases they offer better performance than global schemes.

Figure 8 shows the results for the ST machine. Because DTM is rarely engaged for the ST architecture, there is a relatively small power overhead for these benchmarks. These ST results provide a baseline for deciding whether SMT and CMP are still energy-efficient after DTM techniques are applied.

From Figure 9 we can see that the SMT architecture is superior to the ST architecture for all DTM techniques except rename-throttling. As expected, the DVS techniques perform quite well, although for high-L2-miss-rate benchmarks register-file throttling, due to its performance advantages, does nearly as well as DVS on ED².

[Figure 9. Energy-efficiency metrics (power, energy, energy-delay, energy-delay²) of SMT with DTM (no DTM, fetch-throttling, rename-throttling, register-file throttling, DVS10, DVS20), compared to the ST baseline without DTM, for low-L2-miss-rate benchmarks (left) and high-L2-miss-rate benchmarks (right).]

[Figure 10. Energy-efficiency metrics (power, energy, energy-delay, energy-delay²) of CMP with DTM (no DTM, fetch-throttling, rename-throttling, register-file throttling, DVS10, DVS20), compared to the ST baseline without DTM, for low-L2-miss-rate benchmarks (left) and high-L2-miss-rate benchmarks (right).]

Figure 10 allows us to compare CMP to the ST and SMT machines for energy efficiency after applying DTM. For the low-L2-miss-rate benchmarks, the CMP architecture is superior to the SMT architecture for all DTM configurations; in general, the local DTM techniques do not perform as well for CMP as they did for SMT. We see the exact opposite behavior for the high-L2-miss-rate benchmarks: there, CMP is not energy-efficient relative to either the baseline ST machine or the SMT machine—even with the DVS thermal management technique.

In conclusion, we find that for many, but not all, configurations, global DVS schemes tend to have the advantage when energy efficiency is an important metric. The results do suggest that there is room for more intelligent localized DTM schemes that eliminate individual hotspots in SMT processors, because in some cases the performance benefits could be significant enough to beat out global DVS schemes.
6. Future Work and Conclusions

This paper provides an in-depth analysis of the performance, energy, and thermal issues associated with simultaneous multithreading and chip multiprocessing. Our broad conclusions can be summarized as follows:

• CMP and SMT exhibit similar operating temperatures in current-generation process technologies, but their heating behaviors are quite different. SMT heating is primarily caused by localized heating within certain key microarchitectural structures, such as the register file, due to increased utilization. CMP heating is primarily caused by the global impact of increased energy output.

• In future process technologies, in which leakage power is a significant percentage of overall chip power, CMP machines will generally be hotter than SMT machines. For the SMT architecture, this is primarily because the increased SMT utilization is overshadowed by additional leakage power. For the CMP machine, replacing the relatively cool L2 cache with a second core adds leakage power through the temperature-dependent component of subthreshold leakage.

• For the organizations we studied, CMP machines offer significantly more throughput than SMT machines for CPU-bound applications, and this leads to significant energy-efficiency gains despite a substantial (80%+) increase in power dissipation. However, in our equal-area comparisons between SMT and CMP, the loss of L2 cache hurts the performance of CMP for L2-bound applications, while SMT is still able to exploit significant thread-level parallelism. From an energy standpoint, the CMP machine's additional performance is then no longer able to make up for its increased power, and its energy efficiency falls below that of ST and SMT.

• CMP and SMT cores tend to perform better with different DTM techniques. In general, in performance-oriented systems, localized DTM techniques work better for SMT cores and global DTM techniques work better for CMP cores. For energy-oriented systems, global DVS thermal management techniques offer significant energy savings; however, the performance benefits of localized DTM make those techniques competitive for energy-oriented SMT machines.

In future work, we hope to tackle the challenging problem of considering significantly larger amounts of thread-level parallelism, and of considering hybrids between CMP and SMT cores. We will also explore the impact of varying core complexity on the performance of SMT and CMP, and a wider range of design options, such as SMT fetch policies. There is also significant opportunity to explore tradeoffs between exploiting TLP and core-level ILP from energy and thermal standpoints. Finally, we would like to explore server-oriented workloads, which are likely to have characteristics most similar to the memory-bound benchmarks in this study.

Acknowledgments

This work was funded in part by the National Science Foundation under grant nos. CCR-0133634 and EIA-0224434, a Faculty Partnership Award from IBM T.J. Watson, and an Excellence Award from the University of Virginia Fund for Excellence in Science and Technology. This work was partly done at the IBM T.J. Watson Research Center during Yingmin Li's summer internship there. We also wish to thank Michael Huang and the reviewers for their helpful comments on this work.

References

[1] A. Bakker and J. Huijsing. High-Accuracy CMOS Smart Temperature Sensors. Kluwer Academic, Boston, 2000.
[2] L. A. Barroso, J. Dean, and U. Hölzle. Web search for a planet: The Google cluster architecture. IEEE Micro, 23(2):22–28, April 2003.
[3] D. Brooks, P. Bose, V. Srinivasan, M. Gschwind, P. G. Emma, and M. G. Rosenfield. New methodology for early-stage, microarchitecture-level power-performance analysis of microprocessors. IBM Journal of Research and Development, 47(5/6), 2003.
[4] J. Burns and J.-L. Gaudiot. Area and system clock effects on SMT/CMP processors. In Proc. PACT 2001, pages 211–18, Sep. 2001.
[5] J. Clabes et al. Design and implementation of the POWER5 microprocessor. In ISSCC Digest of Technical Papers, pages 56–57, Feb. 2004.
[6] J. Donald and M. Martonosi. Temperature-aware design issues for SMT and CMP architectures. In Proceedings of the 2004 Workshop on Complexity-Effective Design, June 2004.
[7] L. Hammond, B. A. Nayfeh, and K. Olukotun. A single-chip multiprocessor. IEEE Computer, 30(9):79–85, Sep. 1997.
[8] S. Heo, K. Barr, and K. Asanovic. Reducing power density through activity migration. In Proc. ISLPED'03, Aug. 2003.
[9] Z. Hu, D. Brooks, V. Zyuban, and P. Bose. Microarchitecture-level power-performance simulators: Modeling, validation, and impact on design. Tutorial at MICRO-36, Dec. 2003.
[10] W. Huang, M. R. Stan, K. Skadron, S. Ghosh, K. Sankaranarayanan, and S. Velusamy. Compact thermal modeling for temperature-aware design. In Proceedings of the 41st Design Automation Conference, June 2004.
[11] W. Huang, M. R. Stan, K. Skadron, K. Sankaranarayanan, S. Ghosh, and S. Velusamy. Compact thermal modeling for temperature-aware design. Technical Report CS-2004-13, Univ. of Virginia Dept. of Computer Science, Apr. 2004.
[12] R. Kalla, B. Sinharoy, and J. Tendler. POWER5: IBM's next generation POWER microprocessor. In Proc. 15th Hot Chips Symp., pages 292–303, August 2003.
[13] S. Kaxiras, G. Narlikar, A. D. Berenbaum, and Z. Hu. Comparing power consumption of SMT DSPs and CMP DSPs for mobile phone workloads. In International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES 2001), Nov. 2001.
[14] K. Krewell. UltraSPARC IV mirrors predecessor: Sun builds dual-core chip in 130nm. Microprocessor Report, pages 1, 5–6, Nov. 2003.
[15] R. Kumar, K. I. Farkas, N. P. Jouppi, P. Ranganathan, and D. M. Tullsen. Single-ISA heterogeneous multi-core architectures: The potential for processor power reduction. In Proc. MICRO-36, pages 81–92, Dec. 2003.
[16] R. Kumar, D. M. Tullsen, P. Ranganathan, N. P. Jouppi, and K. I. Farkas. Single-ISA heterogeneous multi-core architectures for multithreaded workload performance. In Proc. ISCA-31, pages 64–75, June 2004.
[17] Y. Li, D. Brooks, Z. Hu, K. Skadron, and P. Bose. Understanding the energy efficiency of simultaneous multithreading. In Proc. ISLPED'04, Aug. 2004.
[18] D. T. Marr, F. Binns, D. L. Hill, G. Hinton, D. A. Koufaty, J. A. Miller, and M. Upton. Hyper-Threading technology architecture and microarchitecture. Intel Technology Journal, 6(1):4–15, Feb. 2002.
[19] M. Moudgill, J.-D. Wellman, and J. H. Moreno. Environment for PowerPC microarchitecture exploration. IEEE Micro, 19(3):15–25, 1999.
[20] R. Sasanka, S. V. Adve, Y. K. Chen, and E. Debes. The energy efficiency of CMP vs. SMT for multimedia workloads. In Proc. 18th ICS, June 2004.
[21] J. Seng, D. Tullsen, and G. Cai. Power-sensitive multithreaded architecture. In Proc. ICCD 2000, pages 199–208, 2000.
[22] T. Sherwood, E. Perelman, G. Hamerly, and B. Calder. Automatically characterizing large scale program behavior. In Proc. ASPLOS-X, Oct. 2002.
[23] SIA. International Technology Roadmap for Semiconductors, 2001.
[24] K. Skadron. Hybrid architectural dynamic thermal management. In Proc. DATE'04, Feb. 2004.
[25] K. Skadron, M. R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan. Temperature-aware microarchitecture. In Proc. ISCA-30, pages 2–13, Apr. 2003.
[26] A. Snavely and L. Carter. Multi-processor performance on the Tera MTA. In Proc. 12th ICS, May 1998.
[27] D. M. Tullsen, S. Eggers, and H. M. Levy. Simultaneous multithreading: Maximizing on-chip parallelism. In Proc. ISCA-22, 1995.
[28] V. Zyuban and P. Strenski. Unified methodology for resolving power-performance tradeoffs at the microarchitectural and circuit levels. In Proc. ISLPED'02, August 2002.