CSE775: Computer Architecture
Chapter 1: Fundamentals of Computer Design

Computer Architecture Topics
• Input/Output and Storage: disks, WORM, tape; RAID
• Memory Hierarchy: DRAM (emerging technologies, interleaving, memories); L2 cache; L1 cache; coherence, bandwidth, latency
• VLSI
• Instruction Set Architecture: addressing, protection, exception handling
• Pipelining and Instruction Level Parallelism: pipelining, hazard resolution, superscalar, reordering, prediction, speculation, vector, DSP

Computer Architecture Topics
• Multiprocessors: shared memory, message passing, data parallelism
• Processor-Memory-Switch: processor (P) and memory (M) nodes attached through network interfaces to a switch (S) and an interconnection network
• Networks and Interconnections: topologies, routing, bandwidth, latency, reliability

Measurement and Evaluation
Architecture is an iterative process:
• Searching the space of possible designs
• At all levels of computer systems
[Figure: loop between Design and Analysis. Creativity produces candidate designs; Cost / Performance Analysis sorts them into good ideas, mediocre ideas, and bad ideas.]

Issues for a Computer Designer
• Functional Requirements Analysis (Target)
  – Scientific computing: high-performance floating point
  – Business: transactional support, decimal arithmetic
  – General purpose: balanced performance for a range of tasks
• Level of software compatibility
  – PL level
    • Flexible, needs a new compiler, portability an issue
  – Binary level (x86 architecture)
    • Little flexibility, portability requirements minimal
• OS requirements
  – Address space issues, memory management, protection
• Conformance to Standards
  – Languages, OS, Networks, I/O, IEEE floating point

Computer Systems: Technology Trends
• 1988
  – Supercomputers
  – Massively Parallel Processors
  – Mini-supercomputers
  – Minicomputers
  – Workstations
  – PCs
• 2008
  – Powerful PCs and laptops
  – Clusters delivering Petaflop performance
  – Embedded Computers
  – PDAs, iPhones, ...

Technology Trends
• Integrated circuit logic technology: transistor count on a chip grows by about 40% to 55% per year.
• Semiconductor RAM: capacity increases by 40% per year, while cycle time has improved very slowly, decreasing by about one-third in 10 years. Cost has decreased at about the same rate at which capacity increases.
• Magnetic disk technology: in the 1990s disk density improved by 60% to 100% per year, compared with about 30% per year before 1990. Since 2004, it has dropped back to 30% per year.
• Network technology: latency and bandwidth are both important. Internet infrastructure in the U.S. has been doubling in bandwidth every year. High-performance System Area Networks (such as InfiniBand) deliver continuously reduced latency.
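The yearly rates quoted on this slide are easier to compare when converted to doubling times or to an equivalent per-year rate. The sketch below is an illustration, not part of the original deck: the function names are made up here, and it assumes constant compound growth and reads "decreasing by about one-third in 10 years" as a factor of 2/3 over the decade.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def annualized_factor(total_factor: float, years: float) -> float:
    """Constant yearly factor that compounds to total_factor over `years`."""
    return total_factor ** (1 / years)


if __name__ == "__main__":
    # Transistor count: 40% to 55% per year (figures from the slide).
    for growth in (0.40, 0.55):
        years = doubling_time_years(growth)
        print(f"{growth:.0%} per year doubles in about {years:.2f} years")

    # RAM cycle time: assumed to shrink to 2/3 of its value over 10 years.
    yearly = annualized_factor(2 / 3, 10)
    print(f"cycle time shrinks by about {1 - yearly:.1%} per year")
```

Under these assumptions, a 40% yearly growth rate doubles capacity roughly every two years, while the RAM cycle-time improvement works out to only about 4% per year.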

Why Such Change in 20 years?
• Performance
  – Technology advances
    • CMOS (complementary metal oxide semiconductor) VLSI dominates older technologies like TTL (Transistor-Transistor Logic) in cost AND performance
  – Computer architecture advances improve the low end
    • RISC, pipelining, superscalar, RAID, …
• Price: lower costs due to …
  – Simpler development
    • CMOS VLSI: smaller systems, fewer components
  – Higher volumes
  – Lower margins by class of computer, due to fewer services

Growth in Microprocessor Performance
[Figure 1.1]
In the 1990s, the main source of innovation in computer design was RISC-style pipelined processors. In the last several years, the annual growth rate has been (only) 10-20%.

Growth in Performance of RAM & CPU
[Figure 5.2]
• Mismatch between CPU performance growth and memory performance growth!
• And almost unchanged memory latency
• Little instruction-level parallelism left to exploit efficiently
• Maximum power dissipation of air-cooled chips reached

Cost of Six Generations of DRAMs

Cost of Microprocessors

Components of Price for a $1000 PC

Integrated Circuits Costs

$$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$

$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$

$$\text{Dies per wafer} = \frac{\pi \, (\text{Wafer\_diam}/2)^2}{\text{Die\_Area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_Area}}} - \text{Test dies}$$

$$\text{Die yield} = \text{Wafer yield} \times \left(1 + \frac{\text{Defects\_per\_unit\_area} \times \text{Die\_Area}}{\alpha}\right)^{-\alpha}$$

Die cost goes roughly with (die area)^4.
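The four cost formulas on this slide chain together, so a small calculator makes the dependence on die area easier to see. This is a sketch only: the function and parameter names are chosen here for illustration, the units (cm and cm²) are an assumption, and the sample wafer cost, defect density, and α in the usage section are hypothetical values, not data from the slides.

```python
import math


def dies_per_wafer(wafer_diam_cm: float, die_area_cm2: float,
                   test_dies: float = 0) -> float:
    """Wafer area over die area, minus dies lost along the edge and test dies."""
    wafer_area = math.pi * (wafer_diam_cm / 2) ** 2
    edge_loss = math.pi * wafer_diam_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss - test_dies


def die_yield(wafer_yield: float, defects_per_cm2: float,
              die_area_cm2: float, alpha: float) -> float:
    """Fraction of good dies: wafer_yield * (1 + D * A / alpha) ** -alpha."""
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost: float, wafer_diam_cm: float, die_area_cm2: float,
             wafer_yield: float, defects_per_cm2: float, alpha: float,
             test_dies: float = 0) -> float:
    """Wafer cost divided by the number of good dies per wafer."""
    dies = dies_per_wafer(wafer_diam_cm, die_area_cm2, test_dies)
    good_fraction = die_yield(wafer_yield, defects_per_cm2, die_area_cm2, alpha)
    return wafer_cost / (dies * good_fraction)


def ic_cost(die_cost_value: float, testing_cost: float,
            packaging_cost: float, final_test_yield: float) -> float:
    """Total cost per working packaged part."""
    return (die_cost_value + testing_cost + packaging_cost) / final_test_yield


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    common = dict(wafer_cost=5000.0, wafer_diam_cm=30.0, wafer_yield=1.0,
                  defects_per_cm2=0.6, alpha=4.0)
    for area in (1.0, 2.0):
        cost = die_cost(die_area_cm2=area, **common)
        print(f"die area {area} cm^2 -> die cost {cost:.2f}")
```

With these made-up inputs, doubling the die area more than triples the die cost, because a larger die both reduces the number of dies per wafer and lowers the yield.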

Failures and Dependability
• Failures at any level cost money
  – Integrated circuits (processor, memory)
  – Disks
  – Networks
• One hour of downtime costs millions of dollars (Amazon, Google, ...)
• No concept of downtime in the middle of the night
• Systems need to be designed with fault tolerance
  – Hardware
  – Software

Performance and Cost

Plane               DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747          6.5 hours     610 mph    470          286,700
BAC/Sud Concorde    3 hours       1350 mph   132          178,200

• Time to run the task (ExTime)
  – Execution time, response time, latency
• Tasks per day, hour, week, sec, ns … (Performance)
  – Throughput, bandwidth
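The throughput column of the flight table is the product of speed and passenger count, which gives passenger-miles per hour (pmph). A minimal sketch that reproduces the two table values follows; the dictionary layout and function name are illustrative choices, not part of the slides.

```python
# Speed and capacity figures taken from the table on this slide.
planes = {
    "Boeing 747": {"speed_mph": 610, "passengers": 470},
    "BAC/Sud Concorde": {"speed_mph": 1350, "passengers": 132},
}


def throughput_pmph(speed_mph: int, passengers: int) -> int:
    """Passenger-miles per hour: passengers carried times miles covered per hour."""
    return speed_mph * passengers


if __name__ == "__main__":
    for name, spec in planes.items():
        print(name, throughput_pmph(spec["speed_mph"], spec["passengers"]))
```

The Concorde wins on latency (time per trip) while the 747 wins on throughput, which is why the two metrics have to be reported separately.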

The Bottom Line: Performance (and Cost)

"X is n times faster than Y" means

$$n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$

• Speed of Concorde vs. Boeing 747
• Throughput of Boeing 747 vs. Concorde
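Both comparisons named on this slide can be worked out from the flight data on the preceding slide (6.5 hours vs. 3 hours, and 286,700 vs. 178,200 pmph). The sketch below applies the definition of "n times faster" in both of its forms; the function names are illustrative.

```python
def times_faster_by_time(extime_x: float, extime_y: float) -> float:
    """n such that X is n times faster than Y, from execution times."""
    return extime_y / extime_x


def times_faster_by_rate(perf_x: float, perf_y: float) -> float:
    """The same n, computed from performance (rate) figures instead of times."""
    return perf_x / perf_y


if __name__ == "__main__":
    # Speed: Concorde (3 hours) vs. Boeing 747 (6.5 hours), DC to Paris.
    n_speed = times_faster_by_time(extime_x=3.0, extime_y=6.5)
    print(f"Concorde is {n_speed:.2f} times faster than the 747")

    # Throughput: Boeing 747 (286,700 pmph) vs. Concorde (178,200 pmph).
    n_throughput = times_faster_by_rate(perf_x=286_700, perf_y=178_200)
    print(f"747 has {n_throughput:.2f} times the throughput of the Concorde")
```

So the Concorde is roughly 2.2 times faster in flight time, while the 747 delivers roughly 1.6 times the throughput.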

Metrics of Performance
[Figure: system layers, top to bottom, each with the metric typically used at that level]
• Application: answers per month, operations per second
• Programming Language, Compiler: (millions of) instructions per second (MIPS); (millions of) floating-point operations per second (MFLOP/s)
• ISA
• Datapath, Control: megabytes per second
• Function Units: cycles per second (clock rate)
• Transistors, Wires, Pins

Computer Engineering Methodology
[Figure: a design cycle with three stages]
• Evaluate Existing Systems for Bottlenecks
• Simulate New Designs and Organizations
• Implement Next Generation System
Inputs shown feeding the cycle: Benchmarks, Workloads, Technology Trends, Implementation Complexity

Measurement Tools
• Benchmarks, Traces, Mixes
• Hardware: cost, delay, area, power estimation
• Simulation (many levels)
  – ISA, RT, Gate, Circuit
• Queuing Theory
• Rules of Thumb
• Fundamental "Laws"/Principles
• Understanding the limitations of any measurement tool is crucial.

Issues with Benchmark Engineering
• Motivated by the bottom dollar: good performance on classic suites → more customers, better sales.
• Benchmark engineering → limits the longevity of benchmark suites.
• Technology and applications → limit the longevity of benchmark suites.

SPEC: System Performance Evaluation Cooperative
• First Round 1989
  – 10 programs yielding a single number ("SPECmarks")
• Second Round 1992
  – SPECInt92 (6 integer programs) and SPECfp92 (14 floating point programs)
  – "benchmarks useful for 3 years"
• SPEC CPU2000 (11 integer benchmarks, CINT2000, and 14 floating-point benchmarks, CFP2000)
• SPEC 2006 (CINT2006, CFP2006)
• Server Benchmarks
  – SPECWeb
  – SPECFS
• TPC (TPC-A, TPC-C, TPC-H, TPC-W, …)

SPEC 2000 (CINT 2000) Results

SPEC 2000 (CFP 2000) Results

Reporting Performance Results
• Reproducibility
• → Apply them on publicly available benchmarks. Pecking/picking order:
  – Real Programs
  – Real Kernels
  – Toy Benchmarks
  – Synthetic Benchmarks

How to Summarize Performance
• Arithmetic mean (weighted arithmetic mean) tracks execution time: sum(T_i)/n or sum(W_i * T_i)
• Harmonic mean (weighted harmonic mean) of rates (e.g., MFLOPS) tracks execution time: n/sum(1/R_i) or 1/sum(W_i/R_i)
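The four summary formulas on this slide can be written directly as short functions. The sketch below is illustrative: the function names are chosen here, the weights are assumed to sum to 1, and the two-program workload in the usage section is a made-up example in which each program performs the same amount of work (100 MFLOP).

```python
from typing import Sequence


def arithmetic_mean(times: Sequence[float]) -> float:
    """sum(T_i) / n"""
    return sum(times) / len(times)


def weighted_arithmetic_mean(times: Sequence[float],
                             weights: Sequence[float]) -> float:
    """sum(W_i * T_i); weights are assumed to sum to 1."""
    return sum(w * t for w, t in zip(weights, times))


def harmonic_mean(rates: Sequence[float]) -> float:
    """n / sum(1 / R_i)"""
    return len(rates) / sum(1 / r for r in rates)


def weighted_harmonic_mean(rates: Sequence[float],
                           weights: Sequence[float]) -> float:
    """1 / sum(W_i / R_i); weights are assumed to sum to 1."""
    return 1 / sum(w / r for w, r in zip(weights, rates))


if __name__ == "__main__":
    # Hypothetical workload: two programs of 100 MFLOP each.
    work_mflop = 100.0
    rates = [10.0, 40.0]                    # MFLOPS achieved on each program
    times = [work_mflop / r for r in rates]  # 10 s and 2.5 s

    overall_rate = (work_mflop * len(rates)) / sum(times)
    print("true overall rate:", overall_rate)             # total work / total time
    print("harmonic mean of rates:", harmonic_mean(rates))
    print("arithmetic mean of rates:", arithmetic_mean(rates))
    print("arithmetic mean of times:", arithmetic_mean(times))
```

In this example the harmonic mean of the rates equals the true overall rate (total work divided by total time), whereas the arithmetic mean of the rates overstates it. That is the sense in which the harmonic mean of rates "tracks execution time".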
