Programming for Performance 1

  1. Programming for Performance 1

  2. Textbook Definition of Real-time • A real-time system responds in a (timely) predictable way to unpredictable external stimuli arrivals. • A system is a real-time system when it can support the execution of applications with time constraints on that execution. - Dedicated Systems Encyclopedia

  3. Real time systems • Games are not ‘really’ real time systems, but face many of the same challenges on a smaller scale • “Hard” – Any lateness of results unacceptable • “Firm” – Occasional lateness is not a total system failure • Could be significant quality degradation • Results cannot be used past deadline • “Soft” – Rising cost of lateness • Quality degrades the later you get
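
The three categories above can be read as three different answers to "what is a late result worth?". The sketch below is one way to express that, not a standard definition: `result_value`, the linear fade for soft deadlines, and the `decay_ms` value are all assumptions made for illustration.

```python
def result_value(kind: str, lateness_ms: float, decay_ms: float = 100.0) -> float:
    """Usefulness of a result in [0, 1], given how late it arrived.

    decay_ms is a hypothetical tuning value, not something from the slides.
    """
    if lateness_ms <= 0:
        return 1.0  # on time: full value in every category
    if kind == "hard":
        # Any lateness is unacceptable, so it is modelled as a failure.
        raise RuntimeError("hard deadline missed")
    if kind == "firm":
        return 0.0  # the late result is thrown away, but the system carries on
    if kind == "soft":
        # Quality degrades the later the result arrives (linear fade assumed).
        return max(0.0, 1.0 - lateness_ms / decay_ms)
    raise ValueError(f"unknown kind: {kind!r}")


print(result_value("firm", 5.0))
print(result_value("soft", 25.0))
```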

  4. Real Time Systems in Video Games • Video games have a variety of real-time systems – No system in a video game is hard real time – Failures obviously aren’t as bad as in many real-time systems • Sound has firm constraints – Hardware consumes data at 44 kHz (stereo) – Any amount of dropout is very bad – Can’t extrapolate to fill in the missing sound data • Sound also has soft constraints – Sound must correlate with visual or input events

  5. Real Time Systems in Video Games • Rendering is a soft real-time system – 60 fps (frames per second) is ideal – 20 fps is okay – 5 fps is no fun at all – Some games are more sensitive (FPS, fighters)

  6. Characterizing performance • Four important measures – Latency (individual operation) – Throughput (individual operation) – Framerate – CPU/GPU utilization

  7. Latency • Total time for an operation to take place • Example: – Time from initiation of DVD read to time the head is placed over the correct track: up to 200 ms • When latency is high, systems need to be asynchronous • Operations off CPU often have very high latency: – Display, sound, input: 10-50 ms – Network: 300 ms • Latency differences can cause dissociation • Some latency elements are outside our control – Wireless controllers, wireless headphones, motion smoothing on TVs
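
As a minimal sketch of why high latency pushes systems toward asynchronous designs, the example below starts a slow operation on a worker thread and keeps running "frames" until the result is ready. The function names, the file name, and the sleep durations are made up for illustration; a real engine would use its own I/O and job system.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read(name: str) -> str:
    """Stand-in for a high-latency operation such as a disc read."""
    time.sleep(0.05)  # simulated latency, an arbitrary value
    return f"contents of {name}"


def run_frames_while_loading(name: str):
    """Start the read, keep running frames, and collect the result when ready."""
    frames = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_read, name)  # returns immediately
        while True:
            time.sleep(0.005)  # stand-in for one frame of game work
            frames += 1
            if pending.done():
                return pending.result(), frames


data, frames = run_frames_while_loading("level1.dat")
print(f"loaded {data!r} after {frames} frames")
```

A blocking call in the same place would stall every frame for the full duration of the read.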

  8. Throughput • Number of operations that can be completed in a given time • Example: – Most standard computing performance measures (TFLOPS, etc.) – Amount of data that can be read from an Xbox 360 DVD in one second: 6-15 MB – Vertex or pixel processing rate

  9. Latency and Throughput Together • Latency and throughput must be considered together when measuring performance • Often one can be traded for the other – CPU example: deep pipelines to increase clock rate – GPU example: triangle throughput vs. state change latency – Don’t concentrate solely on one to the detriment of the other • e.g. adding display latency can increase the frame rate of the renderer, but it may make the controls feel sluggish
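
The trade-off can be made concrete with an idealised pipeline model. This is a simplified sketch, assuming every stage takes the same time and all stages overlap perfectly; `pipeline_metrics` and the stage timings are hypothetical.

```python
def pipeline_metrics(stages: int, stage_ms: float):
    """Return (latency_ms, throughput_per_second) for an idealised pipeline.

    Assumes every stage takes stage_ms and all stages overlap perfectly.
    """
    latency_ms = stages * stage_ms        # one item must pass through every stage
    throughput_per_s = 1000.0 / stage_ms  # one item completes per stage interval
    return latency_ms, throughput_per_s


# Hypothetical numbers: one 30 ms stage vs. three overlapped 12 ms stages.
for stages, stage_ms in ((1, 30.0), (3, 12.0)):
    latency, throughput = pipeline_metrics(stages, stage_ms)
    print(f"{stages} stage(s): latency {latency:.0f} ms, {throughput:.1f} items/s")
```

With these numbers the three-stage version completes more items per second, yet each individual item takes longer from start to finish, which is the pattern behind higher frame rates paired with sluggish controls.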

  10. Framerate • Based on the total time from completion of one frame to completion of the next • Good general measure of performance • Often expressed as frames per second (e.g. 30 fps) or as milliseconds per frame (e.g. 33 ms)
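
The two units are reciprocals of each other, which the short sketch below makes explicit. The helper names are illustrative only.

```python
def fps_to_ms(fps: float) -> float:
    """Milliseconds per frame for a given frames-per-second rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def ms_to_fps(ms: float) -> float:
    """Frames per second for a given frame time in milliseconds."""
    if ms <= 0:
        raise ValueError("ms must be positive")
    return 1000.0 / ms


for fps in (60, 30, 20, 5):
    print(f"{fps:>2} fps = {fps_to_ms(fps):.1f} ms per frame")
```

Milliseconds per frame is usually the easier unit to budget with because costs add up linearly: dropping from 60 to 30 fps and dropping from 30 to 20 fps both correspond to roughly 16.7 ms of extra work per frame.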

  11. Utilization • Because systems are asynchronous, and may have external constraints (e.g. vsync), different systems may be running for different portions of the frame • A game where the CPU is running flat out for 30 ms but the GPU is only running for 10 ms has ‘worse’ performance than one where both are running for 30 ms – You are leaving quality on the table, could get either better performance or more stuff by balancing better • Also applies to multi-core – Want to balance utilization of cores as well as possible
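
The CPU/GPU example above can be turned into a small calculation. This is a simplified sketch: it assumes the processors run fully in parallel, so the frame lasts as long as the busiest one, and it ignores vsync. The `utilization` helper is hypothetical.

```python
def utilization(busy_ms: dict) -> dict:
    """Fraction of the frame each processor is busy.

    Assumes the processors run fully in parallel, so the frame takes as long
    as the busiest one, and ignores vsync and other external waits.
    """
    frame_ms = max(busy_ms.values())
    return {name: ms / frame_ms for name, ms in busy_ms.items()}


print(utilization({"cpu": 30.0, "gpu": 10.0}))  # GPU idle for most of the frame
print(utilization({"cpu": 30.0, "gpu": 30.0}))  # balanced
```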

  12. What Should You Measure? • Best case – Good for selling things, but not useful for optimisation • Worst case – Must use this to ensure the application always performs better than its lower bounds • Average – Good indicator, but can be misleading if the performance can spike • Overall – Record the frame rate for each frame over many frames, plot the results in a spreadsheet to look for trouble areas or areas of high visibility – Helps if the gameplay session is repeatable (journaling) • Easiest situation: Best = Worst = Average
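
A minimal sketch of these measures, computed from a list of recorded frame times. The `budget_ms` parameter and the sample data are assumptions chosen to show how an average can hide spikes.

```python
def summarize_frame_times(frame_ms, budget_ms=33.3):
    """Best, worst and average frame time, plus the count of frames over budget.

    budget_ms is a hypothetical per-frame target (roughly 30 fps here).
    """
    if not frame_ms:
        raise ValueError("need at least one frame time")
    return {
        "best_ms": min(frame_ms),
        "worst_ms": max(frame_ms),
        "average_ms": sum(frame_ms) / len(frame_ms),
        "over_budget": sum(1 for ms in frame_ms if ms > budget_ms),
    }


# Made-up capture: 98 smooth frames and 2 large spikes.
sample = [16.0] * 98 + [100.0] * 2
print(summarize_frame_times(sample))
```

In this made-up capture the average stays under 18 ms, which looks healthy, while the worst case and the over-budget count reveal the two hitches a player would actually notice.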

  13. Balanced Performance • Player experience is balanced when it is: – Smooth • Throughput handles workload – Responsive • Always achieve better than maximum allowable latency – Consistent • No peaks or valleys • A solid 30 fps is more playable than 5-to-60

  14. Optimisation Criteria • Games have stringent performance constraints – Display rate – Sound latency – Controller response – Load time – Network latency • A laggy, slow, choppy game is not fun – Online FPS with a 1000 ms ping • Hardware constraints – Memory optimisation

  15. Optimisation Pressures • Content demands outstrip capabilities of code – Designers always want more than you can provide – Puts positive pressure on programmer to improve system • Hardware remains fixed, quality bar is rising – Must out-do previous title, competition

  16. Why Optimise? • Appeal to a wider spectrum of hardware (PC) – A game that only works on today’s state-of-the-art hardware may shut out a large portion of your audience (and sales) • Facilitates better gameplay experience – Richer content – Faster, tighter controls – Higher game reviews • Fun & challenging – Optimising promotes understanding

  17. When not to Optimise • Optimised code has drawbacks – Takes more time to develop • Assembly takes more than 10 times as long as C++ – Compilers can and will beat you some (most?) of the time – Maintainability / readability suffers (even without Assembly) – Portability sacrificed – Hard to debug – Easy to be fooled • Wild goose chases • Lots of effort for small gain – Lost opportunity • Choose your battles carefully!

  18. Common Wisdom: The 90/10 Rule • 10% of the code takes 90% of the time • When you find the 10% you can dramatically increase your speed just by fixing it • The speed of most of the code doesn't matter, so you don't need to worry about it – Can waste a lot of time optimizing things that don't matter • You need to make sure that you find the right 10% • This is where good profiling techniques are essential • But...

  19. Death by a Thousand Cuts • Sometimes the 90/10 rule doesn't hold • Pervasive architectural problems and inefficient techniques can hide performance issues where you can't find them – Language features and hardware quirks are common culprits here, since they are resistant to many profiling techniques – So are over-designed and needlessly abstract systems • The only way to fight against this is to be aware of the costs of design choices up front • You can't generally find and fix these problems once things are nearing completion
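
Both situations can be illustrated with Amdahl's law, which the slides do not name but which captures the arithmetic behind them: the overall gain depends on what fraction of the run time the optimised code accounts for. The fractions below are made-up examples.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: overall speedup when `fraction` of the run time
    is made `local_speedup` times faster."""
    if not 0.0 <= fraction <= 1.0 or local_speedup <= 0:
        raise ValueError("fraction must be in [0, 1] and local_speedup positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# 90/10 case: a hot spot taking 90% of the time is made twice as fast.
print(f"{overall_speedup(0.90, 2.0):.3f}x")

# Thousand-cuts case: time is spread evenly over 1000 functions,
# and one of them (0.1% of the time) is made twice as fast.
print(f"{overall_speedup(0.001, 2.0):.4f}x")
```

When the time is concentrated, one fix pays off handsomely; when it is spread thinly, no single fix is visible, which is why such costs need to be considered at design time.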

  20. How to Optimise • Three steps: – Find performance bottlenecks – Fix them – Repeat

  21. How to Optimise • Good optimisation is a combination of knowledge, intuition and measurement • From Michael Abrash's “Zen of Code Optimization”: – Have an overall understanding of the problem to be solved – Carefully consider algorithms and data structures – Understand how the compiler translates your code, and how the computer executes it – Identify performance bottlenecks – Eliminate them using the appropriate level of optimisation

  22. Understanding the Problem • Some questions to ask: – How long do I have to work on this? – Has this been solved before? (yes!) • What are the differences? – What are the characteristics of the data? • Are there special cases? • Where is the coherency? – What can be computed offline? – Is there a simpler problem lurking within? – Can the hardware help me? • Discuss the problem with your colleagues • Don’t start coding yet

  23. Algorithms and Data Structures • The most important aspect of fast code – A bubble-sort in hand-tweaked assembly is still slow – Have a toolkit of good general purpose algorithms developed by smart people • Quicksort, A*, hashing, etc. • “Big O” analysis is useful – In practice, we are less formal about it – Remember that ‘n’ and ‘c’ matter in real code! – We care more about the particularities of compilers and hardware
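
To illustrate why 'n' and 'c' matter, the sketch below compares two invented cost models: a quadratic algorithm with a small constant and an n log n algorithm with a large one. The constants 1 and 20 are arbitrary assumptions, not measurements of any real sort.

```python
import math


def quadratic_cost(n: int, c: float = 1.0) -> float:
    """Hypothetical cost of an O(n^2) algorithm with a small constant."""
    return c * n * n


def nlogn_cost(n: int, c: float = 20.0) -> float:
    """Hypothetical cost of an O(n log n) algorithm with a large constant."""
    return c * n * math.log2(n)


def crossover(limit: int = 100_000):
    """Smallest n at which the n log n model becomes cheaper, or None."""
    for n in range(2, limit):
        if nlogn_cost(n) < quadratic_cost(n):
            return n
    return None


print("n log n model wins from n =", crossover())
```

Under these assumed constants the "slower" quadratic algorithm is cheaper for small inputs, so the typical size of the data decides which one to use.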

  24. Finding Bottlenecks • Intuition (guessing) – Helps if you are familiar with the algorithm/code – Don’t trust it alone though! • Can be misleading, or just plain wrong • Profiling – Measure performance to find hot spots – Many tools available: • Algorithm analysis • Counters • Timers • Profiler programs – Profiling exhibits some quantum uncertainty. Can’t always observe without affecting performance.
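
A timer can be as simple as a scoped block that accumulates elapsed time per label. The sketch below is one possible shape for such a helper; `scoped_timer`, `timings_ms` and `call_counts` are illustrative names, and the workload is a placeholder.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings_ms = defaultdict(float)  # total time spent per label
call_counts = defaultdict(int)   # number of timed blocks per label


@contextmanager
def scoped_timer(label: str):
    """Accumulate the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings_ms[label] += (time.perf_counter() - start) * 1000.0
        call_counts[label] += 1


with scoped_timer("physics"):
    total = sum(i * i for i in range(100_000))  # placeholder workload

for label, ms in timings_ms.items():
    print(f"{label}: {ms:.2f} ms over {call_counts[label]} call(s)")
```

The timer itself costs something each time it runs, so wrapping very small, very frequent blocks can distort the measurement it is meant to take.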

  25. Profiling: Counters and Metrics • Various counters and metrics should be built into the game: – Frame rate counter – Rendering statistics • Triangle count, textures used, etc. – Memory used per pool – Network ping time – Collision tests per frame – Anything else that is interesting
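
One lightweight way to build such counters in is a per-frame tally that is snapshotted and reset at the end of each frame. This is a sketch under that assumption; the class and counter names are hypothetical.

```python
from collections import Counter


class FrameCounters:
    """Per-frame event counters that reset at the end of every frame."""

    def __init__(self):
        self.current = Counter()

    def add(self, name: str, amount: int = 1) -> None:
        """Record `amount` events of the given kind in the current frame."""
        self.current[name] += amount

    def end_frame(self) -> dict:
        """Return this frame's totals and start counting the next frame."""
        snapshot = dict(self.current)
        self.current = Counter()
        return snapshot


counters = FrameCounters()
counters.add("triangles", 1200)
counters.add("collision_tests")
counters.add("collision_tests")
print(counters.end_frame())
```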

  26. Profiling: Isolation • Isolate components in a running game to determine their contribution to the frame rate: – Disable parts of the renderer • World • Characters • Special effects – Turn off sound – Turn off collision • May be misleading if components interact • Being able to do this easily is an example of good architecture paying off
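
The sketch below simulates isolation with invented costs and shows how interaction between components can mislead. The subsystem names follow the slide; the millisecond values and the rule that effects cost more when characters are enabled are assumptions for illustration.

```python
SUBSYSTEMS = ("world", "characters", "effects")


def simulate_frame_ms(enabled: set) -> float:
    """Made-up frame cost model with one interaction between components."""
    cost = 0.0
    if "world" in enabled:
        cost += 10.0
    if "characters" in enabled:
        cost += 6.0
    if "effects" in enabled:
        cost += 2.0
        if "characters" in enabled:
            cost += 3.0  # effects attached to characters (assumed interaction)
    return cost


def isolation_report() -> dict:
    """Frame time saved by disabling each subsystem on its own."""
    everything = set(SUBSYSTEMS)
    baseline = simulate_frame_ms(everything)
    return {
        name: baseline - simulate_frame_ms(everything - {name})
        for name in SUBSYSTEMS
    }


print("full frame:", simulate_frame_ms(set(SUBSYSTEMS)), "ms")
print("saved by disabling:", isolation_report())
```

In this model the individual savings add up to more than the full frame time, because the shared 3 ms is credited to both characters and effects.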
