
4. Performance Analysis of Parallel Programs

4. Performance Analysis of Parallel Programs. 4.1 Performance Evaluation of Computer Systems. User criteria: small response times. Computing center criteria: high throughput. 4.1.1 Evaluation of CPU Performance.


  1. 4. Performance Analysis of Parallel Programs

  2. 4.1 Performance Evaluation of Computer Systems
     - User criteria: small response times
     - Computing center criteria: high throughput

  3. 4.1.1 Evaluation of CPU Performance

  8. 4.1.1 Evaluation of CPU Performance. The response time of a program A can be split into:
     - User CPU time of A
     - System CPU time of A
     - Waiting time of A

  10. 4.1.1 Evaluation of CPU Performance. User CPU time of A: $T_{U\_CPU}(A) = n_{cycle}(A) \cdot t_{cycle}$, where
     - $t_{cycle}$ is the cycle time, the reciprocal of the clock rate $f$ ($t_{cycle} = 1/f$); for example, a 2 GHz processor has $t_{cycle} = 1/(2 \cdot 10^9)\,\mathrm{s} = 0.5\,\mathrm{ns}$
     - $n_{cycle}(A)$ is the total number of CPU cycles needed for all instructions of A
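
The two quantities on this slide can be combined in a few lines. The sketch below takes the user CPU time as the cycle count multiplied by the cycle time, which is the usual relation between them; the cycle count `N_CYCLE` is a hypothetical value chosen for illustration and does not come from the slides.

```python
def cycle_time(clock_rate_hz: float) -> float:
    """Cycle time t_cycle in seconds, the reciprocal of the clock rate f."""
    return 1.0 / clock_rate_hz


def user_cpu_time(n_cycle: int, clock_rate_hz: float) -> float:
    """User CPU time in seconds, taken as n_cycle * t_cycle (assumed relation)."""
    return n_cycle * cycle_time(clock_rate_hz)


CLOCK_RATE_HZ = 2e9        # 2 GHz, the clock rate used on the slide
N_CYCLE = 4_000_000_000    # hypothetical cycle count for program A

t_cycle_ns = cycle_time(CLOCK_RATE_HZ) * 1e9
print(f"t_cycle = {t_cycle_ns:.2f} ns")
print(f"user CPU time = {user_cpu_time(N_CYCLE, CLOCK_RATE_HZ):.2f} s")
```

At 2 GHz this reproduces the 0.5 ns cycle time from the slide; with the assumed four billion cycles, the user CPU time comes out at about 2 s.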

  13. 4.1.1 Evaluation of CPU Performance. CPI (Clock cycles Per Instruction): the average number of CPU cycles per instruction of A, $CPI(A) = n_{cycle}(A) / n_{instr}(A)$, where $n_{instr}(A)$ is the total number of instructions executed for A. The user CPU time of A is then $n_{instr}(A) \cdot CPI(A) \cdot t_{cycle}$.

  14. 4.1.1 Evaluation of CPU Performance. CPI (Clock cycles Per Instruction) with instruction types $I_1, \dots, I_n$: $n_{cycle}(A) = \sum_{i=1}^{n} n_i(A) \cdot CPI_i$, where
     - $n_i(A)$ is the number of instructions of type $I_i$ executed for the program A
     - $CPI_i$ is the number of CPU cycles needed for an instruction of type $I_i$

  15. 4.1.1 Evaluation of CPU Performance. CPI (Clock cycles Per Instruction). Example: consider a processor with three instruction classes $I_1$, $I_2$, $I_3$ whose instructions require 1, 2, and 3 cycles, respectively. Assume there are two different possibilities for translating a programming language construct, each using a different mix of instructions. The first translation needs 10 cycles for 5 instructions, the second 9 cycles for 6 instructions, so $CPI_1 = 10/5 = 2$ and $CPI_2 = 9/6 = 1.5$. The second translation executes more instructions but needs fewer cycles.
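
The example can be checked with a short sketch of the weighted-sum computation. The per-class instruction counts below are an assumption: the table from the slide is not reproduced in this transcript, so the counts were chosen only to match the stated totals (10 cycles for 5 instructions, 9 cycles for 6 instructions).

```python
# Cycles needed per instruction of class I1, I2, I3 (from the slide).
CYCLES_PER_CLASS = (1, 2, 3)


def total_cycles(counts, cycles_per_class=CYCLES_PER_CLASS):
    """n_cycle = sum over instruction classes of n_i * CPI_i."""
    return sum(n * c for n, c in zip(counts, cycles_per_class))


def average_cpi(counts, cycles_per_class=CYCLES_PER_CLASS):
    """CPI = n_cycle / n_instr for a given mix of instructions."""
    return total_cycles(counts, cycles_per_class) / sum(counts)


# Hypothetical instruction counts (n_1, n_2, n_3) for the two translations,
# chosen to be consistent with the totals quoted in the example.
translation_1 = (2, 1, 2)
translation_2 = (4, 1, 1)

for name, counts in (("translation 1", translation_1),
                     ("translation 2", translation_2)):
    print(name, sum(counts), "instructions,",
          total_cycles(counts), "cycles, CPI =", average_cpi(counts))
```

With these counts, the second translation has the lower CPI and the lower cycle count even though it executes one more instruction.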

  16. 4.1.2 MIPS and MFLOPS

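  18. 4.1.2 MIPS and MFLOPS. MIPS (Million Instructions Per Second): $MIPS(A) = n_{instr}(A) / (T_{U\_CPU}(A) \cdot 10^6)$, where $n_{instr}(A)$ is the number of instructions executed for A and $T_{U\_CPU}(A)$ is the user CPU time of A in seconds. Drawbacks/limitations:
     - Only the number of instructions is considered, not the work each instruction performs.
     - The MIPS rate does not necessarily correspond to the execution time.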
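
The second drawback can be illustrated with a small sketch. It assumes the MIPS rate is the instruction count divided by the user CPU time in seconds times 10^6, and that the user CPU time is the cycle count divided by the clock rate. Both program versions and all their numbers are hypothetical.

```python
def user_cpu_time(n_cycle: int, clock_rate_hz: float) -> float:
    """User CPU time in seconds for n_cycle cycles at the given clock rate."""
    return n_cycle / clock_rate_hz


def mips_rate(n_instr: int, n_cycle: int, clock_rate_hz: float) -> float:
    """Million instructions per second: n_instr / (user CPU time * 10^6)."""
    return n_instr / (user_cpu_time(n_cycle, clock_rate_hz) * 1e6)


CLOCK_RATE_HZ = 2e9  # hypothetical 2 GHz processor

# Hypothetical versions of the same computation:
# X uses many cheap instructions, Y uses fewer but more powerful ones.
version_x = {"n_instr": 10_000_000, "n_cycle": 10_000_000}
version_y = {"n_instr": 4_000_000, "n_cycle": 8_000_000}

mips_x = mips_rate(version_x["n_instr"], version_x["n_cycle"], CLOCK_RATE_HZ)
mips_y = mips_rate(version_y["n_instr"], version_y["n_cycle"], CLOCK_RATE_HZ)
time_x = user_cpu_time(version_x["n_cycle"], CLOCK_RATE_HZ)
time_y = user_cpu_time(version_y["n_cycle"], CLOCK_RATE_HZ)

print(f"X: {mips_x:.0f} MIPS, {time_x * 1e3:.1f} ms")
print(f"Y: {mips_y:.0f} MIPS, {time_y * 1e3:.1f} ms")
```

Under these assumptions version X reaches the higher MIPS rate yet takes longer than version Y, so ranking the two by MIPS would pick the slower one.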

  21. 4.1.2 MIPS and MFLOPS. MFLOPS (Million Floating-point Operations Per Second): $MFLOPS(A) = n_{flp\_op}(A) / (T_{U\_CPU}(A) \cdot 10^6)$, where $n_{flp\_op}(A)$ is the number of floating-point operations executed for A and $T_{U\_CPU}(A)$ is the user CPU time of A in seconds. Drawbacks/limitations:
     - It does not distinguish between the types of floating-point operations performed.
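
A minimal sketch of the MFLOPS computation, assuming the rate is the number of floating-point operations divided by the user CPU time in seconds times 10^6. The operation mixes and the run time are hypothetical; they are there only to show that the metric sees the total count and nothing else.

```python
def mflops_rate(n_flp_op: int, user_cpu_time_s: float) -> float:
    """Million floating-point operations per second."""
    return n_flp_op / (user_cpu_time_s * 1e6)


# Hypothetical operation mixes for two programs with the same total count.
flp_ops_a = {"add": 900_000, "div": 100_000}
flp_ops_b = {"add": 100_000, "div": 900_000}

USER_CPU_TIME_S = 0.004  # hypothetical run time, taken as equal for both

rate_a = mflops_rate(sum(flp_ops_a.values()), USER_CPU_TIME_S)
rate_b = mflops_rate(sum(flp_ops_b.values()), USER_CPU_TIME_S)

print(f"A: {rate_a:.0f} MFLOPS, B: {rate_b:.0f} MFLOPS")
```

Both programs report the same rate although one is dominated by additions and the other by divisions: the operation type never enters the formula.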

  22. 4.1.3 Performance of Processors with a Memory Hierarchy

  24. 4.1.3 Performance of Processors with a Memory Hierarchy. Memory accesses add to the cycle count, so the user CPU time of A becomes $(n_{cycle}(A) + n_{mm\_cycles}(A)) \cdot t_{cycle}$, where $n_{mm\_cycles}(A)$ is the number of additional machine cycles caused by memory accesses of A.
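
The effect of the additional memory cycles can be sketched as follows, assuming they are simply added to the cycle count before converting to time. The cycle counts and the clock rate are hypothetical values.

```python
def user_cpu_time(n_cycle: int, n_mm_cycles: int, clock_rate_hz: float) -> float:
    """User CPU time in seconds, including cycles lost to memory accesses.

    Assumed relation: (n_cycle + n_mm_cycles) * t_cycle, with t_cycle = 1 / f.
    """
    return (n_cycle + n_mm_cycles) / clock_rate_hz


CLOCK_RATE_HZ = 2e9            # hypothetical 2 GHz processor
N_CYCLE = 4_000_000_000        # hypothetical cycles for executing instructions
N_MM_CYCLES = 1_000_000_000    # hypothetical extra cycles for memory accesses

ideal = user_cpu_time(N_CYCLE, 0, CLOCK_RATE_HZ)
actual = user_cpu_time(N_CYCLE, N_MM_CYCLES, CLOCK_RATE_HZ)

print(f"without memory stalls: {ideal:.2f} s")
print(f"with memory stalls:    {actual:.2f} s")
```

With these numbers, the memory accesses raise the user CPU time from 2 s to 2.5 s, a 25% increase that the instruction-only cycle count would hide.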
