SLIDE 1

Network, Storage, and Workflow Design for GPU Centric Film Production

(or how we make the ******* chimichangas)

SLIDE 2
SLIDE 3

Simplified Workflow

Camera → Dailies → Editorial → Color Grading / Deliverables

SLIDE 4

Deadpool Armory

  • Open Drives Velocity 36 TB Array
  • Open Drives Exos (SSD Disk hybrid) 216 TB
  • Mellanox SX1012
  • 5x Mac Pro 2013
  • 3x HP Z840
  • 64 GB Memory
  • Intel V3 2687 W
  • 5x Nvidia M6000
  • Solarflare 5622

SLIDE 5

Example Facility Diagram

SLIDE 6

Interesting Facts

  • 1. The entire opening sequence for Deadpool was look-dev'd and pitched using Vray RT renders created on M6000s.
  • 2. With an Nvidia M6000 and relatively unlimited bandwidth, the entire offline of the Deadpool film can be rendered in 12 minutes.
  • 3. If 6 editors all hit play at the same time in Adobe Premiere to fully load the GPU cache (for real-time effects absorption), total bandwidth draw can reach a peak of 4 GB/s, even with offline-style codecs.
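To put the 4 GB/s figure in context, the short sketch below works out the average per-editor demand. The 10 Gb link per client is an assumption borrowed from the network cards discussed later in the deck; it is not stated on this slide.

```python
# Back-of-the-envelope check of the peak draw quoted on the slide.
PEAK_TOTAL_GB_PER_S = 4.0   # from the slide: peak aggregate draw in gigabytes/s
EDITORS = 6                 # from the slide
LINK_GBIT_PER_S = 10.0      # assumption: one 10 Gb link per client


def per_client_gbit(total_gb_per_s: float, clients: int) -> float:
    """Average per-client demand in gigabits per second."""
    return total_gb_per_s * 8 / clients


demand = per_client_gbit(PEAK_TOTAL_GB_PER_S, EDITORS)
utilisation = demand / LINK_GBIT_PER_S
print(f"per-editor demand: {demand:.2f} Gb/s ({utilisation:.0%} of the link)")
```

Under that assumption each editor would be using a little over half of a 10 Gb link at peak, before any other traffic is counted.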

SLIDE 7

The Big Problem

  • GPU speed has resulted in unpredictable artist environments with extremely spiky IO patterns.
  • At any given moment, editors, compositors, and now 3D lighting artists can completely saturate their link to central storage or create an IOPS storm.
  • This creates unacceptable lag, reducing many of the benefits of a GPU-centered workflow.
  • GPU applications are now strongly affected not only by lack of throughput but also by latency.

SLIDE 8

Methods For Dealing

Choose a low-latency network card. Return response time for a 1 MB NFS call over 10 Gb:

  • 1. Solarflare 1.03 ms
  • 2. Intel X540 1.15 ms
  • 3. Atto 1.9 ms
  • 4. Promise 2.4 ms

In other words, a card that costs the same price can easily double your scene load times. Or, if the application is poorly written, we have seen up to a 5x reduction in certain latency-specific tasks.
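The "double your scene load times" remark follows from the numbers in the list. The sketch below compares each card with the fastest one and estimates a strictly serialized load. The call count is a hypothetical figure chosen for illustration, and real applications overlap requests, so treat the totals as a worst case rather than a measurement.

```python
# Response times from the slide (1 MB NFS call over 10 Gb), in milliseconds.
LATENCY_MS = {
    "Solarflare": 1.03,
    "Intel X540": 1.15,
    "Atto": 1.9,
    "Promise": 2.4,
}

HYPOTHETICAL_CALLS = 10_000  # assumption: number of sequential calls in a scene load


def relative_to_fastest(latencies: dict) -> dict:
    """Ratio of each card's response time to the fastest card's."""
    fastest = min(latencies.values())
    return {name: ms / fastest for name, ms in latencies.items()}


def serial_load_seconds(calls: int, latency_ms: float) -> float:
    """Total time if every call waits for the previous one to return."""
    return calls * latency_ms / 1000.0


ratios = relative_to_fastest(LATENCY_MS)
for name, ms in LATENCY_MS.items():
    total = serial_load_seconds(HYPOTHETICAL_CALLS, ms)
    print(f"{name:<11} {ms:>5.2f} ms  x{ratios[name]:.2f}  ~{total:.1f} s serialized")
```

The slowest card in the list comes out at more than twice the response time of the fastest, which is where a doubling of load time for a latency-bound application would come from.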

SLIDE 9

Storage Side

  • Intense predictive memory caching. We can achieve speeds up to 40 GB/s, with latency measured in microseconds, out of our highest caching tier. This is currently limited by network interconnects.
  • Block-level mechanism, particularly effective with wavelet or mip-map textures: we load what the engine needs.
  • Deep low-latency analytics allow us to control bandwidth and IOPS, preventing a system lock scenario and maintaining a consistent level of service for all clients.
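The slides do not say how bandwidth and IOPS are controlled. One common general-purpose technique for this kind of per-client limiting is a token bucket, sketched below purely as an illustration. The class, its parameters, and the numbers are all hypothetical and are not taken from the Open Drives product.

```python
class TokenBucket:
    """Hypothetical per-client limiter: `rate` units/s, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate          # sustained allowance per second (IOPS or bytes)
        self.capacity = capacity  # largest burst a client may issue at once
        self.tokens = capacity    # start full so an idle client can burst
        self.last = now           # timestamp of the previous refill

    def allow(self, cost: float, now: float) -> bool:
        """Admit a request of size `cost` at time `now` if enough tokens remain."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


# Example: 1000 IOPS sustained, bursts of up to 2000 (illustrative values only).
bucket = TokenBucket(rate=1000, capacity=2000)
burst_ok = bucket.allow(2000, now=0.0)    # initial burst fits in the full bucket
storm = bucket.allow(1500, now=1.0)       # only 1000 tokens refilled: rejected
recovered = bucket.allow(1500, now=1.5)   # 1500 tokens available: admitted
print(burst_ok, storm, recovered)
```

The clock is passed in explicitly so the behaviour is easy to reason about; a real limiter would read a monotonic clock and would queue or delay rejected requests rather than drop them.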

SLIDE 10

Storage Side

  • Open Drives currently has optimization profiles for Adobe Premiere, Nuke, Vray, DaVinci, and Baselight.
  • These profiles have let us, by way of example, load a 47,000-object project in under 75 seconds, a roughly 4x improvement over last year's project, Gone Girl.
  • Further optimization on Deadpool let us average an access pattern with a cache hit ratio of over 96%, served out of the layer that can deliver up to 1.1 million 4 KB read IOPS.
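As a rough sanity check, the IOPS figure can be converted to throughput and the hit ratio to a miss count. The sketch assumes "4 KB" means 4096 bytes and treats 96% as a lower bound, since the slide says "over 96%".

```python
READ_IOPS = 1_100_000     # from the slide
IO_SIZE_BYTES = 4 * 1024  # assumption: "4 KB" means 4096 bytes
HIT_RATIO = 0.96          # from the slide, used here as a lower bound


def throughput_gb_per_s(iops: int, io_size_bytes: int) -> float:
    """Aggregate read throughput in decimal gigabytes per second."""
    return iops * io_size_bytes / 1e9


def misses_per_million(hit_ratio: float) -> int:
    """Requests per million that fall through to a slower tier."""
    return round((1 - hit_ratio) * 1_000_000)


cache_tier_gbps = throughput_gb_per_s(READ_IOPS, IO_SIZE_BYTES)
misses = misses_per_million(HIT_RATIO)
print(f"cache tier at 4 KB reads: ~{cache_tier_gbps:.2f} GB/s")
print(f"at most {misses} of every 1,000,000 reads miss the cache")
```

So small-block reads from that layer correspond to roughly 4.5 GB/s, and at a 96% hit ratio no more than about one read in 25 has to go to a slower tier.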

SLIDE 11

Switch Side

  • L2 LACP bonding is your friend.
  • For video workflow, look at switch providers that can maintain low latency. Also, completely segregate your production workflow network from your internet gigabit network.
  • If you're dealing with 4K or larger image sequences, go to jumbo frames. Client interrupt scheduling can be painful. RDMA methods on most client OSs are still immature.
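The link between jumbo frames and interrupt load can be seen by counting packets. The sketch below compares a standard 1500-byte MTU with a 9000-byte jumbo MTU. The 50 MB frame size is a hypothetical value for one 4K image, and the 40-byte header figure assumes IPv4 and TCP without options.

```python
IP_TCP_HEADER_BYTES = 40   # assumption: IPv4 + TCP headers, no options
FRAME_BYTES = 50_000_000   # hypothetical size of one 4K image-sequence frame


def packets_needed(payload_bytes: int, mtu: int) -> int:
    """Packets required to carry the payload at a given MTU (integer ceiling)."""
    per_packet = mtu - IP_TCP_HEADER_BYTES
    return (payload_bytes + per_packet - 1) // per_packet


standard = packets_needed(FRAME_BYTES, 1500)
jumbo = packets_needed(FRAME_BYTES, 9000)
print(f"MTU 1500: {standard} packets per frame")
print(f"MTU 9000: {jumbo} packets per frame ({standard / jumbo:.1f}x fewer)")
```

Fewer packets per image means fewer receive events for the client to handle, although interrupt coalescing and offload features on the network card also affect the real interrupt rate.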

SLIDE 12

Thank You