53023 - EGLSTREAMS: INTEROPERABILITY FOR CAMERA, CUDA AND OPENGL

Debalina Bhattacharjee, Sharan Ashwathnarayan

Agenda: Tegra SOC and typical use-cases; Why Interops; EGLStream and Its Key Features; Examples on EGLStream


SLIDE 1

Debalina Bhattacharjee, Sharan Ashwathnarayan

53023 - EGLSTREAMS: INTEROPERABILITY FOR CAMERA, CUDA AND OPENGL

SLIDE 2

Agenda

  • Tegra SOC and typical use-cases
  • Why Interops
  • EGLStream and Its Key Features
  • Examples on EGLStream
  • Connect EGLStream to NvMedia and CUDA
  • Perform CUDA processing on Camera inputs
  • Connect EGLStream to NvMedia and OpenGL
  • Display with OpenGL
  • Future Scope

SLIDE 3

TRY IT OUT!!!

Download:
  scp -r nvidia_1b@10.30.25.175:GTCEGL12OCT /home/nvidia

Open the pdf:
  /home/nvidia/GTCEGL12OCT/GTCEGL12OCT.pdf

Run:
  cd /home/nvidia/GTCEGL12OCT/
  export DISPLAY=:0
  chmod +x ./x11/egl_stream_demo
  ./x11/egl_stream_demo A    (if the monitor has A on it)
  ./x11/egl_stream_demo B    (if the monitor has B on it)

SLIDE 4

TEGRA SOC

Tegra SOC engines

  • Armv8 CPU
  • GeForce GPU
  • ISP
  • Video Encode
  • Video Decode
  • And more

[Block diagram: CPU COMPLEX, GEFORCE GPU, SECURITY ENGINE, 2D ENGINE (VIC), VIDEO ENCODER, VIDEO DECODER, AUDIO ENGINE (APE), SAFETY ENGINE (SCE), IMAGE PROC (ISP), SAFETY MANAGER (HSM), BOOT PROC (BPMP), CAN PROC (SPE), I/O]

SLIDE 5

WHY IS INTEROP NEEDED?

Different API and libraries for different engines

Processor   API
GPU         CUDA, OpenGL
ISP         Argus, NvMedia

SLIDE 6

INTEROP

  • The same physical memory is shared between different APIs, with no copy involved

[Diagram, With Interop: ISP (NvMedia) and GPU map and synchronize on a single buffer in memory]
[Diagram, No Interop: ISP (NvMedia) and GPU each use their own buffer in memory, with a memcpy between them]

  • Avoid memcpy = perf gain
  • Single buffer = less memory footprint
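
To make the difference concrete, here is a minimal C sketch of the two hand-over styles. `Frame`, `hand_over_by_copy` and `hand_over_by_mapping` are hypothetical names for illustration, not NvMedia or CUDA types, and plain host memory stands in for the buffer shared between ISP and GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical frame descriptor; not an NvMedia or CUDA type. */
typedef struct {
    uint8_t *pixels;   /* start of the image data */
    size_t   size;     /* size of the image data in bytes */
} Frame;

/* No interop: the consumer gets a private copy of the producer's buffer.
 * This costs one memcpy per frame and doubles the memory footprint.
 * The caller must free() the returned pixels. */
Frame hand_over_by_copy(const Frame *src)
{
    Frame dst = { malloc(src->size), src->size };
    assert(dst.pixels != NULL);
    memcpy(dst.pixels, src->pixels, src->size);
    return dst;
}

/* With interop: the consumer maps the very same buffer. No bytes move. */
Frame hand_over_by_mapping(const Frame *src)
{
    return *src;
}
```

With mapping, a write made through the producer's pointer is visible through the consumer's view because both refer to the same bytes, which is also why the two sides have to synchronize.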

SLIDE 7

AUTOMOTIVE USE CASES

Typical use cases utilize all resources on SOC

  • ISP Capture -> GPU compute -> GPU Display
  • Video Decode -> GPU compute -> GPU Display
  • ISP Capture -> GPU compute -> Video Encode

SLIDE 8

INTEROP BETWEEN APIS

Engines Need To Talk With Each Other

[Diagram: NvMedia (ISP), CUDA (GPU), OpenGL (GPU), Argus (ISP) and EGLDisplay (Display) arranged around a central "?"]

SLIDE 9

EGLSTREAM

[Diagram: NvMedia (ISP), CUDA (GPU), OpenGL (GPU), Argus (ISP) and EGLDisplay (Display) all connected through EGLStream]

Unified interface to communicate between multiple APIs

SLIDE 10

EGLSTREAM – HOW IT WORKS

Transfer a sequence of image frames from one API to another.

Producer-Consumer Model

  • Producer: produces frames
  • EGLStream: enables and hides details of buffer transport
  • Consumer: accepts frames

[Diagram: buffers flow from Producer through EGLStream to Consumer]

SLIDE 11

EGLSTREAMS

Usage Semantics:

  • Producer: PresentFrame(), ReturnFrame()
  • Consumer: Acquire(), Release()

[Diagram: buffer B passes from Producer through the EGL stream to Consumer and back; B marks buffer/allocation ownership]
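
The ownership hand-off can be sketched as a small state machine in C. This is a toy model under assumptions: the function names mirror the four calls on the slide, but `Buffer`, `BufferState` and the exact set of states are made up for illustration and are not the EGLStream API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ownership states for one buffer (illustration only). */
typedef enum {
    HELD_BY_PRODUCER,   /* producer may write into the buffer */
    PRESENTED,          /* queued in the stream, waiting for the consumer */
    HELD_BY_CONSUMER,   /* consumer may read the buffer */
    RELEASED            /* consumer is done; waiting to go back to the producer */
} BufferState;

typedef struct { BufferState state; } Buffer;

/* Move the buffer from one state to the next; refuse any other transition. */
static bool transition(Buffer *b, BufferState from, BufferState to)
{
    assert(b != NULL);
    if (b->state != from)
        return false;
    b->state = to;
    return true;
}

/* Producer side */
bool present_frame(Buffer *b) { return transition(b, HELD_BY_PRODUCER, PRESENTED); }
bool return_frame(Buffer *b)  { return transition(b, RELEASED, HELD_BY_PRODUCER); }

/* Consumer side */
bool acquire(Buffer *b) { return transition(b, PRESENTED, HELD_BY_CONSUMER); }
bool release(Buffer *b) { return transition(b, HELD_BY_CONSUMER, RELEASED); }
```

In this model, exactly one party owns the buffer at any time, so an out-of-order call such as acquiring before anything was presented is rejected rather than racing on the memory.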

SLIDE 12

NVMEDIA_CUDA API SEQUENCE

Producer                     Consumer
NvMediaProducerConnect       CUDAConsumerConnect
NvMediaProducerPostImage     CUDAConsumerAcquireFrame
NvMediaProducerGetImage      CUDAConsumerReleaseFrame
NvMediaProducerDisconnect    CUDAEGLStreamConsumerDisconnect

  • Consumer connects first
  • Consumer waits for a frame to be presented
  • Run CUDA kernel on acquired frame
  • Release the frame
  • Released frame is returned to producer
  • A returned frame is safe to be presented again
  • Implicit synchronization

[Diagram labels: NVMEDIA BLIT(), CUDA Kernel]
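
The ordering notes on this slide can be modelled as a stream state machine. The sketch below is a simplified illustration in C: the state names follow the EGL_STREAM_STATE_*_KHR values from the EGL_KHR_stream extension, while the functions are hypothetical stand-ins rather than NvMedia or CUDA calls. As a further simplification, a failed acquire just returns false where a real consumer would wait or time out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stream states, named after EGL_STREAM_STATE_*_KHR. */
typedef enum {
    STREAM_CREATED,              /* no endpoint connected yet */
    STREAM_CONNECTING,           /* consumer connected, waiting for the producer */
    STREAM_EMPTY,                /* both connected, no frame posted yet */
    STREAM_NEW_FRAME_AVAILABLE,  /* a posted frame has not been acquired */
    STREAM_OLD_FRAME_AVAILABLE,  /* the latest frame was already acquired */
    STREAM_DISCONNECTED
} StreamState;

/* The consumer must connect first: only valid on a freshly created stream. */
bool consumer_connect(StreamState *s)
{
    assert(s != NULL);
    if (*s != STREAM_CREATED) return false;
    *s = STREAM_CONNECTING;
    return true;
}

/* The producer can connect only after the consumer has. */
bool producer_connect(StreamState *s)
{
    assert(s != NULL);
    if (*s != STREAM_CONNECTING) return false;
    *s = STREAM_EMPTY;
    return true;
}

/* Posting a frame makes a new frame available on a connected stream. */
bool producer_post(StreamState *s)
{
    assert(s != NULL);
    if (*s != STREAM_EMPTY && *s != STREAM_NEW_FRAME_AVAILABLE &&
        *s != STREAM_OLD_FRAME_AVAILABLE)
        return false;
    *s = STREAM_NEW_FRAME_AVAILABLE;
    return true;
}

/* Model simplification: acquire succeeds only when a new frame is waiting. */
bool consumer_acquire(StreamState *s)
{
    assert(s != NULL);
    if (*s != STREAM_NEW_FRAME_AVAILABLE) return false;
    *s = STREAM_OLD_FRAME_AVAILABLE;
    return true;
}

/* Either endpoint disconnecting ends the stream for good. */
void stream_disconnect(StreamState *s)
{
    assert(s != NULL);
    *s = STREAM_DISCONNECTED;
}
```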

SLIDE 13

CUDA_EGLSTREAM INTEROP

Key Advantages

  • Performance improvement - no memcpy needed with iGPU
  • Less memory footprint - a single buffer is shared via mapping
  • Ease of use - support for implicit synchronization
  • Cross-process support - producer and consumer can be in different processes (IPC)
  • No need for individual interops - unified interface
  • Portable across cameras - support for both interleaved and multi-planar formats
  • Supported Platforms
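
As a reminder of what "interleaved" and "multi-planar" mean for a consumer, the sketch below computes byte offsets of the Y, U and V samples for one pixel in two common layouts. The deck does not name specific formats, so I420 (multi-planar 4:2:0) and YUYV (interleaved 4:2:2) are assumptions chosen for illustration. The function names are hypothetical, and the buffers are assumed to be tightly packed with even width and height.

```c
#include <assert.h>
#include <stddef.h>

/* I420: a full-size Y plane, followed by quarter-size U and V planes. */
size_t i420_y_offset(size_t w, size_t x, size_t y)
{
    return y * w + x;
}

size_t i420_u_offset(size_t w, size_t h, size_t x, size_t y)
{
    assert(w % 2 == 0 && h % 2 == 0);
    return w * h + (y / 2) * (w / 2) + x / 2;
}

size_t i420_v_offset(size_t w, size_t h, size_t x, size_t y)
{
    assert(w % 2 == 0 && h % 2 == 0);
    return w * h + (w / 2) * (h / 2) + (y / 2) * (w / 2) + x / 2;
}

/* YUYV: one plane, each pixel pair stored as the bytes Y0 U Y1 V. */
size_t yuyv_y_offset(size_t w, size_t x, size_t y)
{
    return y * w * 2 + x * 2;
}

size_t yuyv_u_offset(size_t w, size_t x, size_t y)
{
    assert(w % 2 == 0);
    return y * w * 2 + (x / 2) * 4 + 1;
}

size_t yuyv_v_offset(size_t w, size_t x, size_t y)
{
    assert(w % 2 == 0);
    return y * w * 2 + (x / 2) * 4 + 3;
}
```

A consumer that handles both layouts only needs to switch the addressing; the processing that follows can stay the same.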

SLIDE 14

EGLSTREAMS

CUDA 9.0 support

  • Support for discrete GPU on DrivePX 2
  • Transfer buffers from camera to iGPU or dGPU efficiently
  • Support on x86/x86_64 Linux
  • Support added for easier development
  • Additional YUV multiplanar color formats

SLIDE 15

DEMO APPLICATION

  • Built on Vibrante 4.1.8.0
  • Uses NvMedia for Producer
  • CUDA used for compute processing
  • OpenGL used for Display
  • One EGLStream per Camera

SLIDE 16

IMPORTANT

  • Don’t pull out the camera!
  • If you are unsure which file to edit, call a TA.
  • Call a TA for help if something goes wrong.
  • Refer to the README file for details (especially for killing the app).

SLIDE 17

DEMO APPLICATION

Workflow: NVMEDIA -> CUDA

  • NvMedia APIs capture an image and present it to CUDA.
  • CUDA acquires the image, then runs fisheye correction and YUV to RGB conversion.
  • CUDA hands the image over to the GoogLeNet inference engine.
  • Inference runs on the image and the result is reported.
  • CUDA releases the acquired image.
  • NvMedia accepts the returned image.
  • The cycle continues.
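
The deck does not say which conversion matrix the demo uses for the YUV to RGB step. As one plausible illustration, here is the common full-range BT.601 formula for a single pixel, written as plain C on the CPU; the demo itself performs this step in CUDA. `Rgb`, `clamp_u8` and `yuv_to_rgb` are names chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;

/* Round to nearest and clamp into the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(float v)
{
    if (v < 0.0f)   return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Full-range BT.601 YUV -> RGB for one pixel (assumed matrix, see above). */
Rgb yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v)
{
    float cb = (float)u - 128.0f;   /* blue-difference chroma, centred on 0 */
    float cr = (float)v - 128.0f;   /* red-difference chroma, centred on 0 */
    Rgb out = {
        clamp_u8((float)y + 1.402f * cr),
        clamp_u8((float)y - 0.344136f * cb - 0.714136f * cr),
        clamp_u8((float)y + 1.772f * cb)
    };
    return out;
}
```

In a GPU version each thread would typically run this per-pixel function on its own coordinates, reading directly from the acquired frame.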

SLIDE 18

NVMEDIA_CUDA API SEQUENCE

Producer (img_producer.c)            Consumer (cuda_consumer.c)
NvMediaEglStreamProducerCreate       cuEGLStreamConsumerConnect
NvMediaEglStreamProducerPostImage    cuEGLStreamConsumerAcquireFrame
NvMediaEglStreamProducerGetImage     cuEGLStreamConsumerReleaseFrame
NvMediaEglStreamProducerDestroy      cuEGLStreamConsumerDisconnect

[Slide callouts: inferSingleFrame; Line: 256; Line: 163, 309, 152, 178, 352, 319, 133, 180]

SLIDE 19

NVMEDIA – CUDA

Build:
  cd /home/nvidia/GTCEGL12OCT/
  make

Run:
  export DISPLAY=:0
  ./x11/egl_stream_demo A    (if the monitor has A on it)
  ./x11/egl_stream_demo B    (if the monitor has B on it)

SLIDE 20

OTHER CONSUMERS

SLIDE 21

DEMO APPLICATION

Workflow: NVMEDIA -> GL

  • NvMedia APIs capture an image and present it to GL.
  • GL acquires the image and renders it to the display.
  • GL releases the acquired image.
  • NvMedia accepts the returned image.
  • The cycle continues.

SLIDE 22

APPLICATION

Main/Camera Producer Thread
  1. Initialize Camera resources, create an EglStream
  2. Launch OpenGL Consumer thread and pass the EglStream to it
  3. Connect NvMediaProducer to EGL stream
  4. Loop
     1. Post frame on NvMediaProducer

GL Consumer Thread
  1. Initialize helper GL resources
  2. Create GLConsumer
  3. Connect GLConsumer to the EglStream
  4. Loop
     1. Acquire frame from the EGLStream. Wait if frame is not available
     2. Render the acquired frame
     3. Release the acquired frame
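
The two-thread structure above can be sketched with POSIX threads. This is a toy model under assumptions: `ToyStream` is a hypothetical single-slot stand-in for the EGLStream, integers stand in for camera frames, and "rendering" just adds up the frame ids. No camera, EGL or OpenGL calls are involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-slot stand-in for an EGLStream. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool            full;          /* true from post until the consumer releases */
    int             frame;         /* stand-in for an image buffer */
    int             to_consume;    /* number of frames the consumer will handle */
    long            rendered_sum;  /* written only by the consumer thread */
} ToyStream;

/* Producer: post a frame, waiting until the previous one was released. */
static void post_frame(ToyStream *s, int frame)
{
    pthread_mutex_lock(&s->lock);
    while (s->full)
        pthread_cond_wait(&s->changed, &s->lock);
    s->frame = frame;
    s->full = true;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Consumer: acquire a frame, waiting if none is available yet. */
static int acquire_frame(ToyStream *s)
{
    pthread_mutex_lock(&s->lock);
    while (!s->full)
        pthread_cond_wait(&s->changed, &s->lock);
    int frame = s->frame;          /* slot stays full until release_frame() */
    pthread_mutex_unlock(&s->lock);
    return frame;
}

/* Consumer: release the acquired frame so the producer may post again. */
static void release_frame(ToyStream *s)
{
    pthread_mutex_lock(&s->lock);
    s->full = false;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Consumer thread: acquire, "render", release, in a loop. */
static void *consumer_thread(void *arg)
{
    ToyStream *s = arg;
    for (int i = 0; i < s->to_consume; i++) {
        int frame = acquire_frame(s);
        s->rendered_sum += frame;  /* stand-in for rendering */
        release_frame(s);
    }
    return NULL;
}

/* Main/producer thread: create the stream, launch the consumer, post frames. */
long run_pipeline(int frame_count)
{
    ToyStream s = { .full = false, .to_consume = frame_count, .rendered_sum = 0 };
    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.changed, NULL);

    pthread_t consumer;
    int rc = pthread_create(&consumer, NULL, consumer_thread, &s);
    assert(rc == 0);

    for (int i = 0; i < frame_count; i++)
        post_frame(&s, i);

    pthread_join(consumer, NULL);
    pthread_cond_destroy(&s.changed);
    pthread_mutex_destroy(&s.lock);
    return s.rendered_sum;
}
```

Because the slot stays occupied until the consumer releases it, the producer in this model never overwrites a frame that is still being rendered.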

SLIDE 23

REPLACE CUDA CONSUMER WITH GL CONSUMER

  • cd /home/nvidia/GTCEGL12OCT/
  • Open interop.c with an editor
  • Comment out cuda_consumer.h and uncomment gl_consumer.h
  • Search and replace CudaConsumer with GlConsumer

SLIDE 24

NVMEDIA - GL

Build:
  cd /home/nvidia/GTCEGL12OCT/
  make

Run:
  export DISPLAY=:0
  ./x11/egl_stream_demo A    (if the monitor has A on it)
  ./x11/egl_stream_demo B    (if the monitor has B on it)

SLIDE 25

THINGS TO TRY

More complex pipelines

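[Diagram: a Camera Producer feeds an EGL Stream into a CUDA Consumer, CUDA Processing and CUDA Producer stage on the integrated GPU; a second EGL Stream feeds a CUDA Consumer, CUDA inference and CUDA Producer stage on the discrete GPU; a third EGL Stream feeds an OpenGL Consumer]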

SLIDE 26

REFERENCE

  • EGL_KHR_stream: https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_stream.txt
  • EGL_KHR_stream_consumer_gltexture: https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_stream_consumer_gltexture.txt
  • CUDA: http://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__EGL.html#group__CUDA__EGL
  • NvMedia: https://developer.nvidia.com/embedded

SLIDE 27

THANK YOU