BUILDING A SUPER RESOLUTION VIDEO COMPOSITOR Thomas True, March 18. - PowerPoint PPT Presentation

BUILDING A SUPER RESOLUTION VIDEO COMPOSITOR Thomas True, March 18. 2019

Motivation Building Blocks Putting the Pieces Together AGENDA Case Study Results Q & A 2

MOTIVATION Create Large High-Resolution Displays Photo Courtesy of Cinnemassive: http://www.cinnemassive.com/ 3

Single GPU Limit!! MOTIVATION 4x 3840x2160@120 More and More Pixels or 4x 5120x2880@60 32 Displays!! 32x 3840x2160 @ 120 Hz 996 MP/s GPU or 32x 5120 x 2880 @ 60 Hz 885 MP/s GPU GPU GPU GPU GPU GPU GPU GPU 4

MOTIVATION Render Video + Graphics Video GPU GPU GPU GPU GPU GPU GPU GPU S8205- Multi-GPU Methods for Real-Time Graphics S7352-See the Big Picture: How to Build Large Display Walls Using NVIDIA APIs/Tools 5

BUILDING BLOCKS A Four Legged Stool DISPLAY SYNCHRONIZATION GPU VIDEO PROCESSING Mosaic NVIDIA Codec SDK • Quadro Sync • EFFICIENT RENDERING LOW-LATENCY VIDEO INGEST • GPU Direct for Video GPU Direct RDMA • 6

MOSAIC Create a Seamless Desktop Supported on all Quadro GPUs Supported in single and multi-GPU configurations Without Mosaic With Mosaic Drive 32 4K Displays at 60 Hz 7

MOSAIC Creates a Single Logical GPU With Mosaic Without Mosaic 8 Physical GPUs 8 Physical GPUs 8 Logical GPUs 1 Logical GPU 8

QUADRO SYNC II Hardware Features Provide Tear-Free Mosaic Display FRAMELOCK MULTIPLE DISPLAYS EXTERNAL/HOUSE SYNC SWAP SYNCHRONIZATION MOSAIC WITH SYNC 9

EFFICIENT RENDERING Explicit GPU Addressing With Directed Rendering Without Directed Rendering 10

NVIDIA VIDEO CODEC SDK cuDNN, TensorRT , Easy access to GPU DeepStream SDK cuBLAS, cuSPARSE video acceleration SOFTWARE VIDEO CODEC SDK CUDA TOOLKIT Video Encode and Decode for Windows and Linux APIs, libraries, tools, CUDA, DirectX, OpenGL interoperability samples NVIDIA DRIVER NVENC NVDEC CUDA HARDWARE Video decode Video encode High-performance computing on GPU S9331 – NVIDIA GPU Video Technologies: Overview, Applications and Optimization Techniques Wednesday March 20, 2:00-2:50PM, Room 230C 11

GPU DIRECT FOR VIDEO Video Transfers Through a Shareable System Memory Buffer SYSTEM Memory CPU Shared 3 rd Party Quadro/Tesla Video Input/Output GPU Card http://on-demand.gputechconf.com/siggraph/2016/video/sig1602-thomas-true-gpu-video-processing.mp4 12

GPU DIRECT FOR VIDEO Application Usage But This: Not This: Application Application 3 rd Party Video I/O SDK GPU OpenGL CUDA DX Vulkan OpenGL CUDA DX GPU Vulkan Direct Direct for for Video Video 3 rd Party Video I/O 3 rd Party Video I/O NVIDIA Driver NVIDIA Driver Device driver Device driver 13

GPU DIRECT FOR VIDEO Video Capture to OpenGL Texture main() { ….. GLuint glTex; glGenTextures (1, &glTex); \\ Create OpenGL texture obect glBindTexture ( GL_TEXTURE_2D , glTex); glTexImage2D ( GL_TEXTURE_2D , 0, GL_RGB , bufferWidth, bufferHeight, 0, 0, 0, 0); glBindTexture ( GL_TEXTURE_2D , 0); EXTRegisterGPUTextureGL (glTexIn); \\ Register texture with 3 rd party Video I/O SDK while(!quit) { EXTBegin (glTexIn); \\ Release texture from Video I/O SDK Render(glTexIn); \\ Use the texture EXTEnd(glTexIn); \\ Release texture back to Video I/O SDK } EXTUnregisterGPUTextureGL(glTexIn); \\ Unregister texture with 3 rd party Video I/O SDK } 14

GPU DIRECT RDMA Peer-to-Peer Video Transfers SYSTEM Memory CPU Shared 3 rd Party Quadro/Tesla Video Input/Output GPU Card https://docs.nvidia.com/cuda/gpudirect-rdma/index.html 15

PUTTING THE PIECES TOGETHER 16

PUTTING THE PIECES TOGETHER Application Steps to Success Design GPU-Display Topology to Optimize Locality 1. Single Full Screen Window with Multiple Viewports 2. Enumerate GPUs 3. Map GPUs to Displays 4. Perform Spatial Decomposition of Scene 5. Program Directed Compute 6. Program Directed Rendering 7. Swap / Present 8. 17

DESIGN TOPOLOGY TO OPTIMIZE LOCALITY Quadrants For Rectangular Content Stripes For Horizontal Content Columns For Vertical Content 18

APPLICATION ARCHITECTURE Full Screen Window with Content Regions Video 19

EXAMPLE SOFTWARE ARCHITECTURE Mixed 3D and Video Content A Content Region uses … Content its 2D Rectangle to Canvas Content compute the GPU Mask Region Content Region Region OGL Context GPU mask GPU spatial index 2D Rectangle Content Regions[] One Decoder per GPU GPU Mask Inherits The Canvas lives in the … main process and manages multiple Decoder Content Regions Decoder 3D Renderer Video Player Decoder CUDA Context CUDA Context Thread Demuxer CUDA Context Thread Decoders[] Thread Thread 20

MAPPING CONTENT REGIONS TO GPUS Spatial Indexing 0x04 0x08 0x01 0x02 1. Query each GPU’s pixel region 0x40 2. Store the regions in an index, e.g.: 0x10 0x20 0x80 a) Flat list b) Quadtree c) R-Tree 3. For each content region a) Use the index to determine which GPUs are intersected b) Decode only on these GPUs 0x01 | 0x02 = 0x03 c) Render only on these GPUs d) If the content region moves, re-query the index 21

GPU ENUMERATION Windows NVAPI // Enumerate Physical GPUs NvU32 numPhysGpus = 0; NvPhysicalGpuHandle nvGpuHandles[NVAPI_MAX_PHYSICAL_GPUS]; NvAPI_EnumPhysicalGPUs ( numPhysGpus, &nvGpuHandles ); // Enumerate Logical GPUs NvU32 numLogiGpus = 0; NvLogicalGpuHandle nvGpuHandles[NVAPI_MAX_LOGICAL_GPUS]; NvAPI_EnumLogicalGPUs ( numLogiGpus, &nvGpuHandles ); https://developer.nvidia.com/nvapi 22

MAPPING LOGICAL GPUS TO PHYSICAL GPUS Windows NVAPI // Enumerate Logical GPUs NvU32 numLogiGpus = 0; NvLogicalGpuHandle nvGpuHandles[NVAPI_MAX_LOGICAL_GPUS]; NvAPI_EnumLogicalGPUs ( numLogiGpus, &nvGpuHandles ) // Map Logical GPUs to Physical GPUs New in for (NvU32 index = 0; index < numLogiGPUs; index++) R421!!! { NV_LOGICAL_GPU_DATA logiGPUData = { 0 }; logiGPUData.version = NV_LOGICAL_GPU_DATA_VER; logiGPUData.pOSAdapterId = malloc(sizeof(LUID)); NvAPI_GPU_GetLogicalGpuInfo (nvGpuHandles[index], &logiGPUData); } https://developer.nvidia.com/nvapi 23

MAPPING PHYSICAL GPUS TO DISPLAYS Windows NVAPI // Get connected display IDs for each GPU NvU32 conDispIdCnt[NVAPI_MAX_PHYSICAL_GPUS] = { 0 }; NV_GPU_DISPLAYIDS *pConDispIds[NVAPI_MAX_PHYSICAL_GPUS]; NvU32 flags = NV_GPU_CONNECTED_IDS_FLAG_UNCACHED | NV_GPU_CONNECTED_IDS_FLAG_SLI | NV_GPU_CONNECTED_IDS_FLAG_FAKE; for (NvU32 index = 0; index < numPhysGpus; index++) { NvAPI_GPU_GetConnectedDisplayIds (nvGPUHandle[index], NULL, &conDispIdCnt[index], flags); if (conDispIdCnt[index]) { pConDispIds[index] = (NV_GPU_DISPLAYIDS*)calloc(conDispIdCnt[index], sizeof(NV_GPU_DISPLAYIDS)); pConnectedDisplayIds[index]->version = NV_GPU_DISPLAYIDS_VER; NvAPI_GPU_GetConnectedDisplayIds (nvGPUHandle[index], pConDispIds[index], &conDispIdCnt[index], flags); } } https://developer.nvidia.com/nvapi 24

MAPPING DISPLAYS TO SCREEN AREA Windows NVAPI // Get screen coordinates for each connected display for each GPU for (NvU32 index = 0; index < numPhysGpus; index++) { for (NvU32 display = 0; display < nvConnectedDisplayIdCount[index]; display++) { NvSBox dRect = { 0 }; // Desktop rect NvSBox sRect = { 0 }; // Scanout rect NvAPI_GPU_GetScanoutConfiguration (pConnectedDisplayIds[index][display].displayID, &dRect, &sRect); } } https://developer.nvidia.com/nvapi 25

MAPPING PHYSICAL GPUS TO DISPLAYS Windows NVAPI 1A00 1800 1C00 1900 6700 6A00 6800 6900 26

SPATIAL MAPPING Dividing the Workload Among the Physical GPUs 7 1 3 5 6 8 2 4 GPU 1 GPU 2 GPU 3 GPU 4 GPU 5 GPU 6 GPU 7 GPU 8 27

DIRECTED COMPUTE Explicit GPU Programming // Enumerate CUDA GPUs int numGPUs; CK_CUDA( cudaGetDeviceCount (&numGPUs)); // Get PCI bus ID and device ID for each GPU std::vector<int> busIDList(numGPUs); // Bus IDs std::vector<int> devIDList(numGPUs); // Device IDs for (int i = 0; i < numGPUs; i++) { CK_CUDA( cudaDeviceGetAttribute (&busIDList[i], cudaDevAttrPciBusId, i)); CK_CUDA( cudaDeviceGetAttribute (&devIDList[i], cudaDevAttrPciDevId, i)); } // Match PCI bus ID and device ID to those returned from NVAPI // Set CUDA device to matched GPU CK_CUDA( cudaSetDevice (matchedGPU)); 28

DIRECTED RENDERING Application must: 1. Manage multiple GPU OpenGL: Don’t Use GPU Affinity Context 2. Multi-pump the API Enumerate GPUs: wglEnumGpusNV ( UINT iGPUIndex, HGPUNV* phGPU ); Context Enumerate displays per GPU: Context wglEnumGpuDevicesNV ( HGPUNV hGPU, UINT iDeviceIndex, Context PGPU_DEVICE lpGpuDevice ); Render App Context Create an OpenGL context for a specific GPU: Context HGPUNV gpuMask[2] = {phGPU, nullptr}; Context HDC affinityDc = wglCreateAffinityDCNV ( gpuMask ); SetPixelFormat( affinityDc, ... ); Context HGLRC affinityGlrc = wglCreateContext( affinityDc ); Context https://www.khronos.org/registry/OpenGL/extensions/NV/WGL_NV_gpu_affinity.txt 29

BUILDING A SUPER RESOLUTION VIDEO COMPOSITOR Thomas True, March 18. - PowerPoint PPT Presentation

BUILDING A SUPER RESOLUTION VIDEO COMPOSITOR Thomas True, March 18. 2019 Motivation Building Blocks Putting the Pieces Together AGENDA Case Study Results Q & A 2 MOTIVATION Create Large High-Resolution Displays Photo Courtesy of

UNVEILING THE SUPER ORBITAL UNVEILING THE SUPER ORBITAL UNVEILING THE SUPER-ORBITAL UNVEILING

SYNTHESIS OF SUPER SYNTHESIS OF SUPER NANOPOROUS SYNTHESIS OF SUPER SYNTHESIS OF

SIGBI Limited General Meeting 2019 Resolutions 1-6 Resolution 1 Resolution 2 Resolution 3

Patagonia Gold Plc 2009 Patagonia Gold VOTING ORDINARY SPECIAL Resolution 1 Resolution 2

The Single Resolution Mechanism Elke Knig Chair of the Single Resolution Board FDIC Systemic

Residuals in Deep Super-Resolution Ruofan Zhou, Fayez Lahoud , Majed EI Helou, and Sabine

Super GPU & Super Kernels: Make programming of multi-GPU systems easy Michael Frumkin, May 8,

THE FALL 2018 NFL PRIMETIME SEASON & THE SUPER BOWL KTG CONTENT STRATEGY SUPER BOWL

Bigger is Better Trends in super computers, super software, and super data Michael L. Norman,

Super- -Kamiokande Kamiokande s s Solar Neutrino results Solar Neutrino results Super

CCP Resolution: proposal for an EU Regulation and FSB Guidance on CCP Resolution 2ND EUROPEAN

Patagonia Gold Plc g 2010 Cap-Oeste updated June 2010 Patagonia Gold AGM VOTING 2010 g

Machine learning meets super-resolution H. N. Mhaskar Claremont Graduate University, Claremont.

Super-Resolution via Image Recapture and Bayesian Effect Modeling Neil Toronto Oral Thesis

Representation Learning and Super-Resolution Generation for Scientific Visualization Chaoli Wang

Spectral super-resolution DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis

Using HMM to Blur the Lines between CPU and GPU Programming John Hubbard, May 10, 2017

Citywide Survey Informational presentation February 20 th , 2019 Historic Preservation Commission

Office of Planning Andrew Trueblood, Director Comprehensive Plan: Public Draft Amendment Release

Characterization of Indianas Coal Resource: Availability of the Reserves, Physical and

vulnerability in urban area Olivier SANTONI FERDI AREQUIPA - May 2017 Risk assessment

Community Mapping Creating the Evidence Base Citizen Science & Participatory Mapping DR

Physical Security Briefing for Senate IT November 25, 2019 Presented by: Joe Souza, Director

States Presentation using the 5 Themes of Geography You and your partner will work to complete the

BUILDING A SUPER RESOLUTION VIDEO COMPOSITOR Thomas True, March 18. - PowerPoint PPT Presentation

BUILDING A SUPER RESOLUTION VIDEO COMPOSITOR Thomas True, March 18. 2019 Motivation Building Blocks Putting the Pieces Together AGENDA Case Study Results Q & A 2 MOTIVATION Create Large High-Resolution Displays Photo Courtesy of

UNVEILING THE SUPER ORBITAL UNVEILING THE SUPER ORBITAL UNVEILING THE SUPER-ORBITAL UNVEILING

SYNTHESIS OF SUPER SYNTHESIS OF SUPER NANOPOROUS SYNTHESIS OF SUPER SYNTHESIS OF

SIGBI Limited General Meeting 2019 Resolutions 1-6 Resolution 1 Resolution 2 Resolution 3

Patagonia Gold Plc 2009 Patagonia Gold VOTING ORDINARY SPECIAL Resolution 1 Resolution 2

The Single Resolution Mechanism Elke Knig Chair of the Single Resolution Board FDIC Systemic

Residuals in Deep Super-Resolution Ruofan Zhou, Fayez Lahoud , Majed EI Helou, and Sabine

Super GPU &amp; Super Kernels: Make programming of multi-GPU systems easy Michael Frumkin, May 8,

THE FALL 2018 NFL PRIMETIME SEASON &amp; THE SUPER BOWL KTG CONTENT STRATEGY SUPER BOWL

Bigger is Better Trends in super computers, super software, and super data Michael L. Norman,

Super- -Kamiokande Kamiokande s s Solar Neutrino results Solar Neutrino results Super

CCP Resolution: proposal for an EU Regulation and FSB Guidance on CCP Resolution 2ND EUROPEAN

Patagonia Gold Plc g 2010 Cap-Oeste updated June 2010 Patagonia Gold AGM VOTING 2010 g

Machine learning meets super-resolution H. N. Mhaskar Claremont Graduate University, Claremont.

Super-Resolution via Image Recapture and Bayesian Effect Modeling Neil Toronto Oral Thesis

Representation Learning and Super-Resolution Generation for Scientific Visualization Chaoli Wang

Spectral super-resolution DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis

Using HMM to Blur the Lines between CPU and GPU Programming John Hubbard, May 10, 2017

Citywide Survey Informational presentation February 20 th , 2019 Historic Preservation Commission

Office of Planning Andrew Trueblood, Director Comprehensive Plan: Public Draft Amendment Release

Characterization of Indianas Coal Resource: Availability of the Reserves, Physical and

vulnerability in urban area Olivier SANTONI FERDI AREQUIPA - May 2017 Risk assessment

Community Mapping Creating the Evidence Base Citizen Science &amp; Participatory Mapping DR

Physical Security Briefing for Senate IT November 25, 2019 Presented by: Joe Souza, Director

States Presentation using the 5 Themes of Geography You and your partner will work to complete the

Super GPU & Super Kernels: Make programming of multi-GPU systems easy Michael Frumkin, May 8,

THE FALL 2018 NFL PRIMETIME SEASON & THE SUPER BOWL KTG CONTENT STRATEGY SUPER BOWL

Community Mapping Creating the Evidence Base Citizen Science & Participatory Mapping DR