April 4-7, 2016 | Silicon Valley
S6783 Elif Albuz, April 4, 2016
VISIONWORKS: A CUDA ACCELERATED COMPUTER VISION LIBRARY
AGENDA
Motivation
Introduction to VisionWorks
VisionWorks Software Stack
VisionWorks Programming Model
Intelligent Video Analytics, Drones, Autonomous Driving, Robotics, Augmented Reality
Concept, Reference Implementation, Product, Port to target & optimize
Depth Map, Optical Flow, Corner Detection
CUDA accelerated library (OpenVX primitives + NVIDIA extensions + Plus Algorithms)
Flexible framework for seamlessly adding user-defined primitives
Interoperability with OpenCV
Thread-safe API
Documentation, tutorials, sample software pipelines that teach use
Automotive, Embedded, Desktop
CUDA Acceleration Framework
OpenVX™ Framework & Primitives
NVIDIA VisionWorks Framework & Primitive Extensions
VisionWorks SfM
NVIDIA Khronos
VisionWorks Core Library Source Samples
VisionWorks Source Samples
Feature Tracking, Hough Transform, Stereo Depth Extraction, Camera Hist Equalization..
NVXIO Multimedia Abstraction
VisionWorks-Plus
VisionWorks Object Tracker
VisionWorks CUDA API
IMAGE ARITHMETIC
Absolute Difference, Accumulate Image, Accumulate Squared, Accumulate Weighted, Add/Subtract/Multiply +, Channel Combine, Channel Extract, Color Convert +, CopyImage, Convert Depth, Magnitude, MultiplyByScalar, Not/Or/And/Xor, Phase, Table Lookup, Threshold
FLOW & DEPTH
Median Flow, Optical Flow (LK) +, Semi-Global Matching, Stereo Block Matching, IME Create Motion Field, IME Refine Motion Field, IME Partition Motion Field
GEOMETRIC TRANSFORMS
Affine Warp +, Warp Perspective +, Flip Image, Remap, Scale Image +
FILTERS
BoxFilter, Convolution, Dilation Filter, Erosion Filter, Gaussian Filter, Gaussian Pyramid, Laplacian3x3, Median Filter, Scharr3x3, Sobel 3x3
FEATURES
Canny Edge Detector, FAST Corners +, FAST Track, Harris Corners +, Harris Track, Hough Circles, Hough Lines
ANALYSIS
Histogram, Histogram Equalization, Integral Image, Mean Std Deviation, Min Max Locations
Legend: All OpenVX primitives; NVIDIA extension primitives; + = type/mode extension by NVIDIA
(except MedianFlow & FindHomography extensions)
(Go to "VisionWorks API" -> "NVIDIA Extensions API" -> "Vision Primitives API")
(Measured on Drive PX: OS='V4L', Linux Kernel='3.18.21-tegra-g06aec38', CPU Rate='1632 MHz', GPU Rate='844 MHz', EMC Rate='1600 MHz')
Feature Tracker, Stereo Depth Extraction, OpenCV-NPP-OpenVX Interop, Hough Lines & Circles
+ Video stabilization
+ Iterative Motion Estimation/Flow and other platform-specific samples (available only on certain platforms)
Camera Capture, OpenGL interop, Video playback
Camera input
ISP & Camera Processing CUDA
CSI
Vision processing GFX Render
Video/image file input Streamed video/image input
Image/Video Encode . . . Image/Video Decode
Interop/EGLStreams
NVXIO
CPU COMPLEX (Multi-core ARM v8)
SECURITY ENGINE, 2D ENGINE (VIC), VIDEO ENCODER, VIDEO DECODER, AUDIO ENGINE (APE), SAFETY ENGINE (SCE), IMAGE PROC (ISP), SAFETY MANAGER (HSM), BOOT PROC (BPMP), CAN PROC (SPE), I/O
GPU
Structure From Motion, Object Tracker
Standard specified heterogeneous compute API with individual function calls
Heterogeneous compute API with graph
Extensible with user-defined nodes
Direct CUDA API for advanced CUDA developers
The OpenVX immediate mode API enables developers to easily port their applications.
OpenVX immediate mode calls are prefixed with “vxu”.
Example: a video stabilization algorithm written with OpenCV was ported to VisionWorks immediate mode.
Pipeline diagram labels: OpenCV image source; cv::Mat to vx_image; Color Conversion; Image Pyramid; Feature detection; Optical Flow; Process pts & Find Homography; Warp Perspective; Stabilized frames
Performance boost: Video stabilization application is accelerated by 2.6x (including the overhead for Mat to vx_image conversions)
Speedup factors annotated on the pipeline diagram: 0.6x, 1.4x, 1.7x, 4.9x, 2.3x, 4.6x
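The per-stage factors and the overall 2.6x figure are linked through each stage's share of the original runtime, which the slide does not give, and the extracted text no longer says which factor belongs to which stage. The following is a minimal sketch under those limits: the `shares` values are hypothetical, invented purely for illustration, so the printed number is not expected to reproduce 2.6x. It only shows how a stage that slows down (0.6x) can still sit inside a pipeline that speeds up overall.

```python
def overall_speedup(shares, speedups):
    """Combine per-stage speedups into one pipeline speedup.

    shares[i]   : fraction of the ORIGINAL runtime spent in stage i (sums to 1)
    speedups[i] : old_time / new_time for stage i (values < 1 mean a slowdown)
    """
    if len(shares) != len(speedups):
        raise ValueError("need one speedup per stage")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    # The new runtime, as a fraction of the old runtime, is the sum of
    # each stage's share divided by that stage's speedup.
    new_time = sum(f / s for f, s in zip(shares, speedups))
    return 1.0 / new_time


# Per-stage factors as listed on the slide (stage assignment unknown).
slide_speedups = [0.6, 1.4, 1.7, 4.9, 2.3, 4.6]
# HYPOTHETICAL runtime shares, invented for this example only.
shares = [0.05, 0.10, 0.15, 0.30, 0.15, 0.25]

print(f"overall speedup: {overall_speedup(shares, slide_speedups):.2f}x")
```

Stages that dominate the original runtime dominate the result, so a slow conversion step matters little when it accounts for a small share of the time.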
OpenVX graph mode calls are prefixed with “vx”.
The OpenVX graph enables advanced optimizations.
Pipeline diagram labels: Image source; Color Conversion; Image Pyramid; Feature detection; Optical Flow; Process pts & Find Homography; Warp Perspective; Stabilized frames
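The difference between the two modes can be illustrated without the OpenVX headers. What follows is a toy Python model, not the OpenVX or VisionWorks API: the class `ToyGraph`, its methods, and the two stage functions are hypothetical names invented for this sketch. It contrasts calling stages one by one (the immediate style) with describing the pipeline once, verifying it once, and then running it per frame. That one-time verification step is where a real graph manager would have room to optimize.

```python
class ToyGraph:
    """Toy processing graph. Illustrative only; NOT the OpenVX/VisionWorks API."""

    def __init__(self):
        self._nodes = []      # (name, func, input_names, output_name)
        self._order = None    # execution order, fixed by verify()

    def add_node(self, name, func, inputs, output):
        self._nodes.append((name, func, tuple(inputs), output))
        self._order = None    # graph changed, so it must be verified again

    def verify(self, external_inputs):
        """One-time check: order the nodes so every input exists before use."""
        available = set(external_inputs)
        pending = list(self._nodes)
        order = []
        while pending:
            ready = [n for n in pending if all(i in available for i in n[2])]
            if not ready:
                missing = [n[0] for n in pending]
                raise ValueError(f"nodes with unsatisfied inputs: {missing}")
            for node in ready:
                order.append(node)
                available.add(node[3])
                pending.remove(node)
        self._order = order

    def process(self, **inputs):
        """Run the verified graph on one set of inputs."""
        if self._order is None:
            raise RuntimeError("call verify() before process()")
        data = dict(inputs)
        for _name, func, in_names, out_name in self._order:
            data[out_name] = func(*(data[k] for k in in_names))
        return data


def to_gray(frame):
    """Average the channels of each (r, g, b) pixel."""
    return [(r + g + b) // 3 for r, g, b in frame]


def downsample(gray):
    """Keep every second sample, a crude stand-in for one pyramid level."""
    return gray[::2]


frames = [
    [(30, 60, 90), (0, 0, 0), (3, 3, 3), (9, 9, 9)],
    [(6, 6, 6), (1, 2, 3), (12, 12, 12), (0, 3, 6)],
]

# Immediate style: each call runs as soon as it is made.
immediate = [downsample(to_gray(f)) for f in frames]

# Graph style: describe the pipeline once, verify once, run per frame.
graph = ToyGraph()
graph.add_node("pyramid", downsample, ["gray"], "half")  # insertion order is
graph.add_node("gray", to_gray, ["frame"], "gray")       # resolved by verify()
graph.verify(external_inputs=["frame"])
graphed = [graph.process(frame=f)["half"] for f in frames]

print(immediate == graphed)
```

Both styles compute the same result; the graph style simply gives the runtime the whole pipeline up front instead of one call at a time.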
Performance boost: Video stabilization application is further accelerated compared to immediate mode.
The VisionWorks CUDA API gives developers low-level access. Developer manages
YUV frame, Gray frame, Camera/image/video input data, Rendering/Output
nvxcuColorConvert, nvxcuChannelExtract, nvxcuOpticalFlowPyrLK, nvxcuHarrisTrack, nvxcuGaussianPyramid
RGB frame (CUDA buffer), Array of keypoints
Quick port from other libraries
Let the graph manager hide overheads, optimize, and manage data
Reassign CPU and GPU tasks based on performance
Low-level CUDA API access for advanced CUDA developers
Enable VisionWorks debug markers with “export NVX_PROF=nvtx”
Installed location: /usr/share/visionWorks/docs
Weekly VisionWorks downloads for various platforms
S6739 - VisionWorks™ Toolkit Programming Tutorial, Room LL20A
L6129 - VisionWorks™ Toolkit LAB Session, Room 210C
H6115 - Designing Computer Vision Applications with VisionWorks™, Pod B
NVXIO (Multimedia Abstraction)
Source samples with multimedia I/O: Histogram Eq w/ camera input, Feature tracking with compressed images, Hough Lines with decoded video, . . .
Platform Software Stack (Multimedia, Interop, GL, UI, System)
Platform | Camera | Decode | Interop | Render | Encode
Android | Android Camera HAL v3.0 | Android API | CUDA-OpenGL interop? | OpenGLES 3.0 (?) | -
Vibrante | NvMedia capture | NvMedia + Gst, NvMedia h264 ES | EGLStreams | OpenGLES (GLFW) | Gst
Linux4Tegra | Gst-capture | Gst+OpenMAX | EGLStreams | OpenGLES | Gst+OpenMAX
Ubuntu Linux 14.04 | V4L through OpenCV4Tegra | Gst+VDPAU | CUDA-OpenGL Interop | OpenGL | Gst
Windows x64 | V4W/OpenCV | NVCUVID (Gst?) | CUDA-OpenGL Interop | OpenGL | FFmpeg/OpenCV
Gst = GStreamer