Programming for Performance 1 - PowerPoint PPT Presentation

Textbook Definition of Real-time • A Real-time System responds in a (timely) predictable way to unpredictable external stimuli arrivals. • A system is a real-time system when it can support the execution of applications with time constraints on that execution. - Dedicated Systems Encyclopedia

Real time systems • Games are not ‘really’ real time systems, but face many of the same challenges on a smaller scale • “Hard” – Any lateness of results unacceptable • “Firm” – Occasional lateness is not a total system failure • Could be significant quality degradation • Results cannot be used past deadline • “Soft” – Rising cost of lateness • Quality degrades the later you get
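The three deadline classes can be sketched as a function from lateness to the remaining value of a result. This is a minimal illustration only: the name `result_value`, the linear fall-off, and the 100 ms window for the soft case are assumptions made for the example, not something the slides specify.

```python
def result_value(lateness_ms, kind, soft_window_ms=100.0):
    """Usefulness (0.0 to 1.0) of a result delivered lateness_ms after its deadline."""
    if lateness_ms <= 0:
        return 1.0  # on time: full value for every class
    if kind == "hard":
        # any lateness is unacceptable
        raise RuntimeError("hard deadline missed: system failure")
    if kind == "firm":
        return 0.0  # the late result cannot be used, but the system carries on
    if kind == "soft":
        # quality degrades the later the result arrives
        # (a linear fall-off is an illustrative assumption)
        return max(0.0, 1.0 - lateness_ms / soft_window_ms)
    raise ValueError(f"unknown deadline kind: {kind}")


print(result_value(50.0, "firm"))
print(result_value(50.0, "soft"))
```

In this model a late audio buffer (firm) is worth nothing, while a late rendered frame (soft) is still worth showing, just less so.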

Real Time Systems in Video Games • Video games have a variety of real-time systems – No system in video games is hard real time – Failures obviously aren’t as bad as in many real-time systems • Sound has firm constraints – Hardware consumes data at 44 kHz (stereo) – Any amount of dropout is very bad – Can’t extrapolate to fill in the missing sound data • Sound also has soft constraints – Sound must correlate with visual or input events

Real Time Systems in Video Games • Rendering is a soft real-time system – 60 fps (frames per second) is ideal – 20 fps is okay – 5 fps is no fun at all – Some games are more sensitive (FPS, fighters)

Characterizing performance • Four important measures – Latency (individual operation) – Throughput (individual operation) – Framerate – CPU/GPU utilization

Latency • Total time for an operation to take place • Example: – Time from initiation of DVD read to time the head is placed over the correct track: up to 200 ms • When latency is high, systems need to be asynchronous • Operations off the CPU often have very high latency: – Display, sound, input: 10-50 ms – Network: 300 ms • Latency differences can cause dissociation • Some latency elements are outside our control – Wireless controllers, wireless headphones, motion smoothing on TVs

Throughput • Number of operations that can be completed in a given time • Examples: – Most standard computing performance measures (TFLOPS, etc.) – Amount of data that can be read from an Xbox 360 DVD in one second: 6-15 MB – Vertex or pixel processing rate

Latency and Throughput Together • Latency and throughput must be considered together when measuring performance • Often one can be traded for the other – CPU example: deep pipelines to increase clock rate – GPU example: triangle throughput vs. state change latency – Don’t concentrate solely on one to the detriment of the other • e.g. adding display latency can increase the frame rate of the renderer, but it may make the controls feel sluggish
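A rough way to see why both numbers matter is to model one read as a fixed start-up latency plus streaming time. The sketch below reuses the DVD figures from the earlier slides (up to 200 ms per seek, 6 MB/s at the low end). The function name `transfer_time_s` and the simple additive model are assumptions made for illustration.

```python
def transfer_time_s(size_mb, latency_s, throughput_mb_per_s):
    """Simple model (an assumption): fixed start-up latency plus streaming time."""
    return latency_s + size_mb / throughput_mb_per_s


SEEK_S = 0.2        # up to 200 ms per seek, from the latency slide
DVD_MB_PER_S = 6.0  # low end of the 6-15 MB range from the throughput slide

# The same 12 MB of data, read in one request or in 100 separate requests.
one_big_read = transfer_time_s(12.0, SEEK_S, DVD_MB_PER_S)
many_small_reads = 100 * transfer_time_s(0.12, SEEK_S, DVD_MB_PER_S)

print(f"one 12 MB read:      {one_big_read:.1f} s")
print(f"100 reads of 120 KB: {many_small_reads:.1f} s")
```

Under these assumptions the single read is dominated by throughput and the many small reads are dominated by latency, which is roughly a tenfold difference for the same amount of data.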

Framerate • Number of frames completed per unit of time; the frame time is the total time from completion of one frame to completion of the next • Good general measure of performance • Often expressed as frames per second (e.g. 30 fps) or as milliseconds per frame (e.g. 33 ms)
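The two ways of expressing the same measurement are reciprocals of each other, as this small sketch shows. The helper names `fps_to_ms` and `ms_to_fps` are made up for the example.

```python
def fps_to_ms(fps):
    """Milliseconds available per frame at a given frame rate."""
    return 1000.0 / fps


def ms_to_fps(frame_ms):
    """Frame rate achieved if every frame takes frame_ms milliseconds."""
    return 1000.0 / frame_ms


for fps in (60, 30, 20, 5):
    print(f"{fps:>2} fps = {fps_to_ms(fps):6.1f} ms per frame")
```

Because the relationship is reciprocal, equal steps in fps are not equal steps in cost: going from 60 to 30 fps adds about 16.7 ms per frame, and so does going from 30 to 20 fps. Milliseconds per frame add up linearly, which tends to make them the more convenient unit when budgeting work.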

Utilization • Because systems are asynchronous and may have external constraints (e.g. vsync), different systems may be running for different portions of the frame • A game where the CPU is running flat out for 30 ms but the GPU is only running for 10 ms has ‘worse’ performance than one where both are running for 30 ms – You are leaving quality on the table; you could get either better performance or more stuff by balancing better • Also applies to multi-core – Want to balance utilization of cores as well as possible
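The CPU/GPU example can be put into numbers with a small sketch. It assumes the processors run in parallel, so the frame lasts as long as the busiest one. That is a simplification, and the function name `utilization` is hypothetical.

```python
def utilization(busy_ms):
    """Fraction of the frame each processor spends busy.

    Assumes the processors work in parallel, so the frame takes as long
    as the busiest one (an illustrative simplification).
    """
    frame_ms = max(busy_ms.values())
    return {name: t / frame_ms for name, t in busy_ms.items()}


unbalanced = utilization({"cpu": 30.0, "gpu": 10.0})
balanced = utilization({"cpu": 30.0, "gpu": 30.0})

print(unbalanced)
print(balanced)
```

Both cases take 30 ms per frame, but in the unbalanced one the GPU sits idle for about two thirds of it. That idle time is the quality left on the table.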

What Should You Measure? • Best case – Good for selling things, but not useful for optimisation • Worst case – Must use this to ensure the application never drops below its minimum acceptable performance • Average – Good indicator, but can be misleading if the performance can spike • Overall – Record the per-frame time over many frames and plot the results in a spreadsheet to look for trouble areas or areas of high visibility – Helps if the gameplay session is repeatable (journaling) • Easiest situation: Best=Worst=Average
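A short sketch shows why the average alone can mislead. The frame times below are invented for illustration, and `summarize` and `over_budget` are hypothetical helper names.

```python
from statistics import mean


def summarize(frame_ms):
    """Best, worst and average frame time of a recorded session."""
    return {"best": min(frame_ms), "worst": max(frame_ms), "average": mean(frame_ms)}


def over_budget(frame_ms, budget_ms):
    """(frame index, frame time) for every frame that exceeded the budget."""
    return [(i, t) for i, t in enumerate(frame_ms) if t > budget_ms]


# Made-up recording: nine smooth frames and one spike.
frame_ms = [16.0] * 9 + [100.0]

print(summarize(frame_ms))
print(over_budget(frame_ms, budget_ms=33.3))
```

The average here is 24.4 ms, which looks comfortably better than 30 fps, yet one frame took 100 ms. Recording every frame and listing the ones over budget is what exposes the spike.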

Balanced Performance • Player experience is balanced when it is: – Smooth • Throughput handles workload – Responsive • Always achieve better than maximum allowable latency – Consistent • No peaks or valleys • A solid 30 fps is more playable than 5-to-60

Optimisation Criteria • Games have stringent performance constraints – Display rate – Sound latency – Controller response – Load time – Network latency • A laggy, slow, choppy game is not fun – Online FPS with a 1000 ms ping • Hardware constraints – Memory optimisation

Optimisation Pressures • Content demands outstrip capabilities of code – Designers always want more than you can provide – Puts positive pressure on programmer to improve system • Hardware remains fixed, quality bar is rising – Must out-do previous title, competition

Why Optimise? • Appeal to a wider spectrum of hardware (PC) – A game that only works on today’s state-of-the-art hardware may shut out a large portion of your audience (and sales) • Facilitates better gameplay experience – Richer content – Faster, tighter controls – Higher game reviews • Fun & challenging – Optimising promotes understanding

When not to Optimise • Optimised code has drawbacks – Takes more time to develop • Assembly takes more than 10 times as long as C++ – Compilers can and will beat you some (most?) of the time – Maintainability / readability suffers (even without Assembly) – Portability sacrificed – Hard to debug – Easy to be fooled • Wild goose chases • Lots of effort for small gain – Lost opportunity • Choose your battles carefully!

Common Wisdom: The 90/10 Rule • 10% of the code takes 90% of the time • When you find the 10% you can dramatically increase your speed just by fixing it • The speed of most of the code doesn't matter, so you don't need to worry about it – Can waste a lot of time optimizing things that don't matter • You need to make sure that you find the right 10% • This is where good profiling techniques are essential • But...
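The payoff of finding the right 10% can be estimated with Amdahl's law. The slides do not name it, so it is brought in here as a supporting formula. The sketch below is an estimate under that model, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(fraction_of_time, local_speedup):
    """Amdahl's law: whole-program speedup when only part of the run time gets faster.

    fraction_of_time: share of total run time spent in the code being optimised
    local_speedup:    how many times faster that code becomes
    """
    return 1.0 / ((1.0 - fraction_of_time) + fraction_of_time / local_speedup)


# Make the hot code (90% of the time) twice as fast...
hot = overall_speedup(0.9, 2.0)
# ...or make everything else (10% of the time) twice as fast.
cold = overall_speedup(0.1, 2.0)

print(f"optimising the hot code:  {hot:.2f}x")
print(f"optimising the cold code: {cold:.2f}x")
```

Doubling the speed of the hot code gives roughly a 1.8x gain overall, while the same effort spent on the remaining code gives roughly 1.05x.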

Death by a Thousand Cuts • Sometimes the 90/10 rule doesn't hold • Pervasive architectural problems and inefficient techniques can hide performance issues where you can't find them – Language features and hardware quirks are common culprits here, since they are resistant to many profiling techniques – So are over-designed and needlessly abstract systems • The only way to fight against this is to be aware of the costs of design choices up front • You can't generally find and fix these problems once things are nearing completion

How to Optimise • Three steps: – Find performance bottlenecks – Fix them – Repeat

How to Optimise • Good optimisation is a combination of knowledge, intuition and measurement • From Michael Abrash's “Zen of Code Optimization”: – Have an overall understanding of the problem to be solved – Carefully consider algorithms and data structures – Understand how the compiler translates your code, and how the computer executes it – Identify performance bottlenecks – Eliminate them using the appropriate level of optimisation

Understanding the Problem • Some questions to ask: – How long do I have to work on this? – Has this been solved before? (yes!) • What are the differences? – What are the characteristics of the data? • Are there special cases? • Where is the coherency? – What can be computed offline? – Is there a simpler problem lurking within? – Can the hardware help me? • Discuss the problem with your colleagues • Don’t start coding yet

Algorithms and Data Structures • The most important aspect of fast code – A bubble-sort in hand-tweaked assembly is still slow – Have a toolkit of good general purpose algorithms developed by smart people • Quicksort, A*, hashing, etc. • “Big O” analysis is useful – In practice, we are less formal about it – Remember that ‘n’ and ‘c’ matter in real code! – We care more about the particularities of compilers and hardware
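The point that ‘n’ and ‘c’ matter can be illustrated with two toy cost models. The constants are invented for the example (a cheap quadratic algorithm against an n log n algorithm with twenty times the per-step cost) and do not describe any particular implementation.

```python
import math


def quadratic_cost(n, c=1.0):
    """Toy cost of an O(n^2) algorithm with a small constant factor."""
    return c * n * n


def nlogn_cost(n, c=20.0):
    """Toy cost of an O(n log n) algorithm with a large constant factor."""
    return c * n * math.log2(n)


def cheaper(n):
    """Which of the two toy models predicts the lower cost for n items."""
    return "quadratic" if quadratic_cost(n) < nlogn_cost(n) else "n log n"


for n in (16, 128, 1024):
    print(n, cheaper(n))
```

With these made-up constants the asymptotically worse algorithm is cheaper for small inputs, and the better one only pulls ahead somewhere between 128 and 1024 items. Real crossover points have to be measured.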

Finding Bottlenecks • Intuition (guessing) – Helps if you are familiar with the algorithm/code – Don’t trust it alone though! • Can be misleading, or just plain wrong • Profiling – Measure performance to find hot spots – Many tools available: • Algorithm analysis • Counters • Timers • Profiler programs – Profiling exhibits some quantum uncertainty: you can’t always observe without affecting performance.
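A timer-based profiler of the kind listed above might look like the following sketch. The `Profiler` class and its method names are assumptions made for illustration; only `time.perf_counter` and `contextlib.contextmanager` are real standard-library calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Profiler:
    """Accumulates wall-clock time per named section of code."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock                  # injectable so it can be faked in tests
        self.total_s = defaultdict(float)    # section name -> seconds spent
        self.calls = defaultdict(int)        # section name -> times entered

    @contextmanager
    def section(self, name):
        start = self._clock()
        try:
            yield
        finally:
            self.total_s[name] += self._clock() - start
            self.calls[name] += 1

    def hot_spots(self):
        """Sections sorted from most to least total time."""
        return sorted(self.total_s.items(), key=lambda kv: kv[1], reverse=True)


profiler = Profiler()
with profiler.section("physics"):
    sum(i * i for i in range(10_000))   # stand-in for real work
with profiler.section("ai"):
    sum(i for i in range(1_000))        # stand-in for real work

for name, seconds in profiler.hot_spots():
    print(f"{name}: {seconds * 1000.0:.3f} ms")
```

The timer calls themselves take time, so very short sections are inflated by the measurement. This is the observation problem mentioned above.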

Profiling: Counters and Metrics • Various counters and metrics should be built into the game: – Frame rate counter – Rendering statistics • Triangle count, textures used, etc. – Memory used per pool – Network ping time – Collision tests per frame – Anything else that is interesting
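Built-in counters can be as simple as a per-frame tally that is snapshotted and reset at the end of each frame. The sketch below is a minimal version; `FrameCounters` and the counter names are hypothetical.

```python
from collections import Counter


class FrameCounters:
    """Per-frame event counters, reset at the end of every frame."""

    def __init__(self):
        self.current = Counter()      # counts for the frame in progress
        self.last_frame = Counter()   # snapshot of the previous frame, for display

    def add(self, name, amount=1):
        self.current[name] += amount

    def end_frame(self):
        """Publish this frame's counts and start a fresh tally."""
        self.last_frame = self.current
        self.current = Counter()
        return self.last_frame


counters = FrameCounters()
counters.add("triangles", 1200)
counters.add("triangles", 800)
counters.add("collision_tests")
print(dict(counters.end_frame()))
```

Keeping the previous frame's snapshot separate lets an on-screen display read stable numbers while the next frame is still being counted.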

Profiling: Isolation • Isolate components in a running game to determine their contribution to the frame rate: – Disable parts of the renderer • World • Characters • Special effects – Turn off sound – Turn off collision • May be misleading if components interact • Being able to do this easily is an example of good architecture paying off
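Isolation testing can be sketched as measuring the frame with everything on, then with one component switched off at a time. In the example below, `contributions` is a hypothetical helper and `fake_measure` is a stand-in for running the real game. Its costs are invented, including an extra cost that only appears when characters and effects are both enabled.

```python
def contributions(measure_frame_ms, components):
    """Estimate each component's cost as the frame time saved by disabling it."""
    everything = frozenset(components)
    baseline = measure_frame_ms(everything)
    return {
        name: baseline - measure_frame_ms(everything - {name})
        for name in components
    }


def fake_measure(enabled):
    """Stand-in for a real measurement; all costs here are made up."""
    cost = 2.0                                   # fixed overhead
    if "world" in enabled:
        cost += 10.0
    if "characters" in enabled:
        cost += 6.0
    if "effects" in enabled:
        cost += 4.0
    if "characters" in enabled and "effects" in enabled:
        cost += 3.0                              # interaction between the two
    return cost


result = contributions(fake_measure, ["world", "characters", "effects"])
print(result)
print("sum of estimates:", sum(result.values()))
```

The estimates add up to more than the time the three components actually take together, because the shared interaction cost is attributed to both characters and effects. This is the kind of misleading result that appears when components interact.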
