
Performance: Measure, Report, and Summarize Performances



  1. CSE 675.02: Introduction to Computer Architecture
     Presentation C
     Gojko Babić, 06/27/2005

     Performance
     • Measure, report, and summarize performance
     • Make intelligent choices
     • See through the marketing hype of computer systems
     • Key to understanding underlying organizational motivation
     Why is some hardware better than others for different programs?
     What factors of system performance are hardware related? (e.g., do we need a new machine, or a new operating system?)
     How does the machine's instruction set affect performance?

     Which of these airplanes has the best performance?

     Airplane            Passengers   Range (mi)   Speed (mph)
     Boeing 737-100             101          630           598
     Boeing 747                 470         4150           610
     BAC/Sud Concorde           132         4000          1350
     Douglas DC-8-50            146         8720           544

     • How much faster is the Concorde compared to the 747?
     • How much bigger is the 747 than the Douglas DC-8?

     Basic Performance Metrics
     • Response time: the time between the start and the completion of a task (in time units)
     • Throughput: the total number of tasks done in a given time period (in tasks per unit of time)
     • Example: car assembly factory:
       – 4 hours to produce a car (response time),
       – 6 cars produced per hour (throughput).
     In general, there is no relationship between those two metrics:
       – the throughput of the car assembly factory may increase to 18 cars per hour without changing the time to produce one car.
       – How?
     g. babic, Presentation C, slide 4
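The two airplane questions above come down to simple ratios. The sketch below works them out from the table. The `passenger_mph` measure (passengers times speed) is an assumption added here as one possible throughput figure; the slides do not define it.

```python
# Airplane data copied from the table in the slides.
AIRPLANES = {
    "Boeing 737-100":   {"passengers": 101, "range_mi": 630,  "speed_mph": 598},
    "Boeing 747":       {"passengers": 470, "range_mi": 4150, "speed_mph": 610},
    "BAC/Sud Concorde": {"passengers": 132, "range_mi": 4000, "speed_mph": 1350},
    "Douglas DC-8-50":  {"passengers": 146, "range_mi": 8720, "speed_mph": 544},
}


def ratio(plane_a, plane_b, field):
    """How many times larger plane_a's value of `field` is than plane_b's."""
    return AIRPLANES[plane_a][field] / AIRPLANES[plane_b][field]


def passenger_mph(name):
    """Assumed throughput measure (not from the slides): passengers x speed."""
    plane = AIRPLANES[name]
    return plane["passengers"] * plane["speed_mph"]


if __name__ == "__main__":
    speed = ratio("BAC/Sud Concorde", "Boeing 747", "speed_mph")
    size = ratio("Boeing 747", "Douglas DC-8-50", "passengers")
    print(f"Concorde is {speed:.2f}x as fast as the 747")
    print(f"747 carries {size:.2f}x as many passengers as the DC-8-50")
    for name in AIRPLANES:
        print(f"{name}: {passenger_mph(name)} passenger-mph")
```

Under these assumptions the Concorde wins on speed per trip (the response-time view), while the 747 moves more passenger-miles per hour (the throughput view), so "best performance" depends on which metric is meant.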

  2. Computer Performance: Introduction
     • The computer user is interested in response time (or execution time): the time between the start and completion of a given task (program).
     • The manager of a data processing center is interested in throughput: the total amount of work done in a given time.
     • The computer user wants response time to decrease, while the manager wants throughput to increase.
     • The main factors influencing the performance of a computer system are:
       – processor and memory,
       – input/output controllers and peripherals,
       – compilers, and
       – operating system.

     Computer Performance: TIME, TIME, TIME
     • Response time (latency)
       – How long does it take for my job to run?
       – How long does it take to execute a job?
       – How long must I wait for the database query?
     • Throughput
       – How many jobs can the machine run at once?
       – What is the average execution rate?
       – How much work is getting done?
     • If we upgrade a machine with a new processor, what do we increase?
     • If we add a new machine to the lab, what do we increase?

     Execution Time
     • Elapsed time
       – counts everything (disk and memory accesses, I/O, etc.)
       – a useful number, but often not good for comparison purposes
     • CPU time
       – doesn't count I/O or time spent running other programs
       – can be broken up into system time and user time
     • Our focus: user CPU time
       – time spent executing the lines of code that are "in" our program

     Analysis of CPU Time
     CPU time depends on the program being executed, including:
       – the number of instructions executed,
       – the types of instructions executed and their frequency of usage.
     Computers are constructed in such a way that events in hardware are synchronized using a clock.
     Clock rate is given in Hz (= 1/sec).
     A clock rate defines the duration of discrete time intervals called clock cycle times or clock cycle periods:
       $\text{clock cycle time} = 1 / \text{clock rate}$ (in seconds)
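The distinction between elapsed time and CPU time can be observed directly. The following is a minimal sketch using Python's standard `time` module: `time.perf_counter()` reads a wall clock, and `time.process_time()` reads the user plus system CPU time of the current process. The `sample_workload` function is hypothetical, with a sleep standing in for time spent waiting on I/O.

```python
import time


def measure(workload):
    """Return (elapsed_seconds, cpu_seconds) for one call of workload()."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    workload()
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu


def sample_workload():
    # Hypothetical workload: a little computation, then a sleep that
    # stands in for waiting on I/O (it uses elapsed time but no CPU time).
    total = sum(i * i for i in range(200_000))
    time.sleep(0.2)
    return total


if __name__ == "__main__":
    elapsed, cpu = measure(sample_workload)
    print(f"elapsed: {elapsed:.3f} s, CPU: {cpu:.3f} s")
```

The elapsed figure includes the sleep while the CPU figure does not, which is why elapsed time is often a poor basis for comparing processors.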

  3. Book's Definition of Performance
     • For some program running on machine X:
       $\text{Performance}_X = 1 / \text{Execution time}_X$
     • "X is n times faster than Y":
       $n = \text{Performance}_X / \text{Performance}_Y$
     • Problem:
       – machine A runs a program in 20 seconds
       – machine B runs the same program in 25 seconds

     Clock Cycles
     • Instead of reporting execution time in seconds, we often use cycles:
       $\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$
     • Clock “ticks” indicate when to start activities (one abstraction):
       [Figure: clock ticks along a time axis]
     • cycle time = time between ticks = seconds per cycle
     • clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec)
     A 4 GHz clock has a cycle time of
       $\frac{1}{4 \times 10^{9}} \times 10^{12} = 250$ picoseconds (ps)

     How to Improve Performance
       $\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$
     So, to improve performance (everything else being equal) you can either (increase or decrease?)
       ________ the # of required cycles for a program, or
       ________ the clock cycle time or, said another way,
       ________ the clock rate.

     How many cycles are required for a program?
     • Could assume that the number of cycles equals the number of instructions
       [Figure: 1st, 2nd, 3rd, 4th, 5th, 6th, ... instructions along a time axis]
     This assumption is incorrect: different instructions take different amounts of time on different machines.
     Why? Hint: remember that these are machine instructions, not lines of C code.
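The definitions in this part of the deck translate directly into a few lines of arithmetic. The sketch below applies them to the 20-second versus 25-second problem and to the 4 GHz cycle-time figure; the function names are illustrative and do not come from the slides.

```python
def performance(execution_time_s):
    """Performance = 1 / execution time, as defined in the slides."""
    return 1.0 / execution_time_s


def times_faster(time_x_s, time_y_s):
    """n such that 'X is n times faster than Y'.

    Performance_X / Performance_Y simplifies to time_Y / time_X.
    """
    return time_y_s / time_x_s


def cycle_time_ps(clock_rate_hz):
    """Clock cycle time in picoseconds: (1 / clock rate) * 10^12."""
    return 1e12 / clock_rate_hz


if __name__ == "__main__":
    # Machine A: 20 s, machine B: 25 s for the same program.
    print(f"A is {times_faster(20, 25):.2f} times faster than B")
    # A 4 GHz clock.
    print(f"4 GHz cycle time: {cycle_time_ps(4e9):.0f} ps")
```

The second function relies on the ratio of performances being the inverse ratio of execution times, so a shorter execution time always means a larger n.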

  4. Different numbers of cycles for different instructions
     [Figure: instructions of differing lengths along a time axis]
     • Multiplication takes more time than addition
     • Floating point operations take longer than integer ones
     • Accessing memory takes more time than accessing registers
     • Important point: changing the cycle time often changes the number of cycles required for various instructions (more later)

     Example
     • Our favorite program runs in 10 seconds on computer A, which has a 4 GHz clock. We are trying to help a computer designer build a new machine B that will run this program in 6 seconds. The designer can use new (or perhaps more expensive) technology to substantially increase the clock rate, but has informed us that this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for the same program. What clock rate should we tell the designer to target?

     Now that we understand cycles
     • A given program will require
       – some number of instructions (machine instructions)
       – some number of cycles
       – some number of seconds
     • We have a vocabulary that relates these quantities:
       – cycle time (seconds per cycle)
       – clock rate (cycles per second)
       – CPI (cycles per instruction): a floating point intensive application might have a higher CPI
       – MIPS (millions of instructions per second): this would be higher for a program using simple instructions

     Performance
     • Performance is determined by execution time
     • Do any of the other variables equal performance?
       – # of cycles to execute program?
       – # of instructions in program?
       – # of cycles per second?
       – average # of cycles per instruction?
       – average # of instructions per second?
     • Common pitfall: thinking one of the variables is indicative of performance when it really isn’t.
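The machine B example can be worked through with the relation seconds = cycles / clock rate. The sketch below is one way to set up the arithmetic; the function and parameter names are illustrative and not taken from the slides.

```python
def target_clock_rate_hz(time_a_s, clock_a_hz, cycle_factor, time_b_s):
    """Clock rate machine B needs to finish the program in time_b_s seconds.

    cycles_A = time_A * clock_rate_A
    cycles_B = cycle_factor * cycles_A
    clock_rate_B = cycles_B / time_B
    """
    cycles_a = time_a_s * clock_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / time_b_s


if __name__ == "__main__":
    # Values from the example: 10 s on a 4 GHz machine A,
    # 1.2x as many cycles on machine B, target of 6 s.
    rate = target_clock_rate_hz(10, 4e9, 1.2, 6)
    print(f"Target clock rate for B: {rate / 1e9:.2f} GHz")
```

With the numbers from the slide this comes to about 8 GHz: B must double A's clock rate to gain a 10/6 speedup, because it also needs 20% more cycles.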

  5. CPU Time Equation
     • CPU time = Clock cycles for a program × Clock cycle time
                = Clock cycles for a program / Clock rate
       Clock cycles for a program is the total number of clock cycles needed to execute all instructions of a given program.
     • CPU time = Instruction count × CPI / Clock rate
       Instruction count is the number of instructions executed, sometimes referred to as the instruction path length.
       CPI, the average number of clock cycles per instruction (for a given execution of a given program), is an important parameter given as:
       CPI = Clock cycles for a program / Instruction count

     CPI Example
     • Suppose we have two implementations of the same instruction set architecture (ISA). For some program,
       Machine A has a clock cycle time of 250 ps and a CPI of 2.0
       Machine B has a clock cycle time of 500 ps and a CPI of 1.2
       Which machine is faster for this program, and by how much?
     • If two machines have the same ISA, which of our quantities (e.g., clock rate, CPI, execution time, # of instructions, MIPS) will always be identical?

     # of Instructions Example
     • A compiler designer is trying to decide between two code sequences for a particular machine. Based on the hardware implementation, there are three different classes of instructions: Class A, Class B, and Class C, and they require one, two, and three cycles (respectively).
       The first code sequence has 5 instructions: 2 of A, 1 of B, and 2 of C.
       The second sequence has 6 instructions: 4 of A, 1 of B, and 1 of C.
       Which sequence will be faster? By how much? What is the CPI for each sequence?

     MIPS Example
     • Two different compilers are being tested for a 4 GHz machine with three different classes of instructions: Class A, Class B, and Class C, which require one, two, and three cycles (respectively). Both compilers are used to produce code for a large piece of software.
       The first compiler's code uses 5 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
       The second compiler's code uses 10 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
     • Which sequence will be faster according to MIPS?
     • Which sequence will be faster according to execution time?
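The three examples above all follow from CPU time = instruction count × CPI / clock rate. The sketch below is one possible way to work them out, not an official solution. The helper names are illustrative, and instruction mixes are represented as dictionaries mapping a class name to an instruction count.

```python
# Cycles per instruction class, as given in the examples.
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(mix):
    """Total clock cycles for an instruction mix {class: count}."""
    return sum(CYCLES_PER_CLASS[cls] * count for cls, count in mix.items())


def cpi(mix):
    """Average cycles per instruction = total cycles / instruction count."""
    return total_cycles(mix) / sum(mix.values())


def cpu_time_s(mix, clock_rate_hz):
    """CPU time = clock cycles for the program / clock rate."""
    return total_cycles(mix) / clock_rate_hz


def mips(mix, clock_rate_hz):
    """Millions of instructions per second for this mix and clock rate."""
    return sum(mix.values()) / (cpu_time_s(mix, clock_rate_hz) * 1e6)


if __name__ == "__main__":
    # CPI example: time per instruction = CPI * cycle time (in ps).
    time_a_ps, time_b_ps = 2.0 * 250, 1.2 * 500
    print(f"A: {time_a_ps:.0f} ps/instr, B: {time_b_ps:.0f} ps/instr, "
          f"ratio B/A = {time_b_ps / time_a_ps:.2f}")

    # Number-of-instructions example.
    seq1 = {"A": 2, "B": 1, "C": 2}
    seq2 = {"A": 4, "B": 1, "C": 1}
    for name, seq in (("sequence 1", seq1), ("sequence 2", seq2)):
        print(f"{name}: {total_cycles(seq)} cycles, CPI = {cpi(seq):.2f}")

    # MIPS example on a 4 GHz machine.
    comp1 = {"A": 5_000_000, "B": 1_000_000, "C": 1_000_000}
    comp2 = {"A": 10_000_000, "B": 1_000_000, "C": 1_000_000}
    for name, mix in (("compiler 1", comp1), ("compiler 2", comp2)):
        print(f"{name}: {cpu_time_s(mix, 4e9) * 1e3:.2f} ms, "
              f"{mips(mix, 4e9):.0f} MIPS")
```

Under this reading, the second compiler's code scores higher on MIPS yet takes longer to run, which illustrates the pitfall of treating MIPS as a stand-in for execution time.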
