A low latency GPU engine based reset mechanism for a more robust UI experience



SLIDE 1

A low latency GPU engine based reset mechanism for a more robust UI experience

Carlos Santa, Intel/Chrome OS

SLIDE 2

Problem statement: Stability and Robustness

  • Looking at a specific stability problem affecting the UI experience under Chrome OS on Intel Architecture when running a gfx/video use case (a video-streaming type of app).

  • The behavior was a frozen UI, followed by a black screen, followed by a system reboot, after a random time interval (from hours to many hours).

  • Spent some time understanding the GFX architecture in Chrome OS as well as a possible solution that could help here.

SLIDE 3

Current limitation

[Diagram: Renderer Process (Client) with Compositor and Video App; Shared Memory; GPU Process (Server) with Compositor Context and Video App Context over GL / D3D; GPU Driver; GPU H/W with 3D Render Engine, Media Engine and Video Codec Engine. Callouts: (1) crash/hang, (2) full GPU reset.]

  • 1. If a 3D client app “hangs” the GPU, the GPU process may get killed, followed by a full GPU reset.

  • 2. For a complex use case such as video decode, many frames and objects are in flight, so killing the GPU Process and resetting the GPU causes undesirable effects for the user, including a frozen UI, a black screen or a full system reboot.

SLIDE 4

Proposed solution: TDR

  • New feature for Intel GPUs (still not in upstream) that can increase both stability and robustness for any h/w accelerated compositor.

  • Timeout Detection and Recovery (TDR) allows the individual engines in the GPU to be reset independently (as opposed to a full GPU reset).

  • The complete solution, though, still requires changes to be adopted by a “qualified” user space media driver.
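
To make the difference between a full GPU reset and a per-engine reset concrete, here is a minimal sketch in Python. It is a toy model only, not driver code: the `Gpu` class, its methods and the batch names are hypothetical, and only the engine names come from the slide diagram.

```python
from dataclasses import dataclass, field

# Engine names follow the slide diagram; everything else is hypothetical.
ENGINES = ("render", "media", "video_codec")


@dataclass
class Gpu:
    """Toy model: each engine holds a list of in-flight batch names."""
    in_flight: dict = field(default_factory=lambda: {e: [] for e in ENGINES})

    def submit(self, engine: str, batch: str) -> None:
        self.in_flight[engine].append(batch)

    def full_reset(self) -> list:
        """Full GPU reset: every engine loses its in-flight work."""
        lost = [b for e in ENGINES for b in self.in_flight[e]]
        for e in ENGINES:
            self.in_flight[e].clear()
        return lost

    def engine_reset(self, engine: str) -> list:
        """Per-engine reset: only the named engine loses its work."""
        lost = list(self.in_flight[engine])
        self.in_flight[engine].clear()
        return lost


def make_busy_gpu() -> Gpu:
    """Build a GPU with compositor work and a (assumed hung) media batch."""
    gpu = Gpu()
    gpu.submit("render", "compositor-frame")
    gpu.submit("media", "video-decode-batch")  # assume this batch hangs
    return gpu
```

In this model, `make_busy_gpu().full_reset()` discards the compositor frame along with the hung decode batch, while `engine_reset("media")` discards only the decode batch and leaves the render engine's work untouched, which is the property the slide attributes to TDR.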

SLIDE 5

Proposed solution:

[Diagram: GPU Process (Server) with Compositor Context and Video App Context over GL / D3D; UMD Media Driver; GPU Driver; GPU H/W with 3D Render Engine, Media Engine and Video Codec Engine. Callouts: (1) at the UMD Media Driver, (2) and (3) media engine GPU reset.]

  • 1. The UMD Media Driver starts the watchdog timer after sending batch buffers.
  • 2. At some later time, the media engine is detected to be in a hung state after the watchdog timer has expired.
  • 3. The GPU driver resets only the affected media engine.
  • 4. Because the UMD Media Driver knows when the faulty batch was submitted, it could take action during the time it takes the media driver to come back from the reset.
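
The four steps above can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the i915 or media driver interface: `MediaEngineWatchdog`, `Batch`, the explicit `now` timestamps and the reset counter are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Batch:
    name: str
    submitted_at: float  # time at which the UMD submitted the batch


class MediaEngineWatchdog:
    """Hypothetical model of steps 1-4; not a real driver API."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.pending: Optional[Batch] = None
        self.resets = 0  # number of media-engine-only resets performed

    def submit(self, name: str, now: float) -> Batch:
        # Step 1: arm the watchdog once the batch buffer has been sent.
        self.pending = Batch(name, submitted_at=now)
        return self.pending

    def complete(self) -> None:
        # The batch finished in time, so the watchdog is disarmed.
        self.pending = None

    def poll(self, now: float) -> Optional[Batch]:
        # Step 2: a hang is detected once the timer expires on a pending batch.
        batch = self.pending
        if batch is None or now - batch.submitted_at < self.timeout:
            return None
        # Step 3: reset only the media engine (modelled here as a counter).
        self.resets += 1
        self.pending = None
        # Step 4: hand back the faulty batch so the caller can react to it.
        return batch
```

A caller would invoke `poll` periodically; a non-`None` result identifies the faulty batch, which the user space driver could then skip or resubmit while the engine recovers. Passing `now` explicitly instead of reading a real clock is a design choice that keeps the sketch deterministic.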

SLIDE 6

Status of TDR in upstream:

Feature              | Accepted in upstream | Comments
TDR – Reset Engine   | Yes                  |
TDR – with GuC       | No                   |
TDR - Watchdog       | No                   | Requires a qualified UI client
IGT – TDR Watchdog   | No                   | Submitted last week

Prototype                                  | Comments
TDR - Watchdog, Ubuntu OS w/ drm-tip       | iHD Media Stack
IGT – TDR Watchdog, Ubuntu OS w/ drm-tip   | Passes validation
IGT – TDR Watchdog, Chrome OS – cros-4.14  | Passes validation
TDR - Watchdog, Chrome OS – cros-4.14      | i965 Media Stack Support WIP