VISIONWORKS™: A CUDA ACCELERATED COMPUTER VISION LIBRARY
S6783, Elif Albuz, April 4, 2016
April 4-7, 2016 | Silicon Valley

AGENDA
• Motivation
• Introduction to VisionWorks™
• VisionWorks™ Software Stack
• VisionWorks™ Programming Model
• Conclusion
• Demo

COMPUTER VISION
• Intelligent Video Analytics
• Autonomous Driving
• Robotics
• Drones
• Augmented Reality

COMPUTER VISION APP DEVELOPMENT
Concept -> Reference Implementation -> Port to target & optimize -> Product

VISIONWORKS™ MOTIVATION
• Deliver high performance, robust computer vision primitives
• Ease development of computer vision applications on Tegra platforms
• Accelerate prototype to product cycle
Example images: Depth Map, Optical Flow, Corner detection

VISIONWORKS™ AT A GLANCE
• CUDA accelerated library (OpenVX primitives + NVIDIA extensions + Plus Algorithms)
• Flexible framework for seamlessly adding user-defined primitives
• Interoperability with OpenCV
• Thread-safe API
• Documentation, tutorials, sample software pipelines that teach use of primitives and framework

VISIONWORKS™ SUPPORTED PLATFORMS
• Automotive: Drive PX, Drive PX2, Jetson TK1 Pro
• Embedded: Jetson TX1, Jetson TK1
• Desktop: Ubuntu Linux 14.04, Windows 8

VISIONWORKS™ TOOLKIT SOFTWARE STACK
• VisionWorks-Plus: Object Tracker, SfM, ...
• VisionWorks Source Samples: Feature Tracking, Hough Transform, Stereo Depth Extraction, Camera Hist Equalization..
• VisionWorks Source Samples: NVXIO Multimedia Abstraction
• VisionWorks Core Library
  – NVIDIA: VisionWorks Framework & Primitive Extensions
  – Khronos: OpenVX™ Framework & Primitives
  – VisionWorks CUDA API
• NVIDIA CUDA Acceleration Framework

VISIONWORKS™ PRIMITIVES
• IMAGE ARITHMETIC: Absolute Difference, Accumulate Image, Accumulate Squared, Accumulate Weighted, Add/Subtract/Multiply +, Channel Combine, Channel Extract, Color Convert +, CopyImage, Convert Depth, Magnitude, MultiplyByScalar, Not/Or/And/Xor, Phase, Table Lookup, Threshold
• FEATURES: Canny Edge Detector, FAST Corners +, FAST Track, Harris Corners +, Harris Track, Hough Circles, Hough Lines
• GEOMETRIC TRANSFORMS: Affine Warp +, Warp Perspective +, Flip Image, Remap, Scale Image +
• ANALYSIS: Histogram, Histogram Equalization, Integral Image, Mean Std Deviation, Min Max Locations
• FILTERS: BoxFilter, Convolution, Dilation Filter, Erosion Filter, Gaussian Filter, Gaussian Pyramid, Laplacian3x3, Median Filter, Scharr3x3, Sobel 3x3
• FLOW & DEPTH: Median Flow, Optical Flow (LK) +, Semi-Global Matching, Stereo Block Matching, IME Create Motion Field, IME Refine Motion Field, IME Partition Motion Field
Legend: All OpenVX Primitives; NVIDIA Extensions (NVIDIA extension primitives); + = type/mode extension by NVIDIA

VISIONWORKS™ PRIMITIVES
• VisionWorks primitives are CUDA optimized (except MedianFlow & FindHomography extensions)
• 85% of the VisionWorks OpenVX API is also accelerated with NEON. The NEON-optimized primitives are listed in a table in the VisionWorks Toolkit Reference (go to "VisionWorks API" -> "NVIDIA Extensions API" -> "Vision Primitives API")
• Primitive acceleration with VisionWorks
  – Up to 92x speedup compared to OpenCV CPU kernels on Drive PX (average 8x)
  – Up to 13x speedup compared to OpenCV CUDA kernels on Drive PX (average 2x)
(Measured on Drive PX, OS='V4L' Linux Kernel='3.18.21-tegra-g06aec38' CPU Rate='1632 MHz' GPU Rate='844 MHz' EMC Rate='1600 MHz')
Legend: All OpenVX Primitives; NVIDIA Extensions

VISIONWORKS™ SAMPLE APPLICATIONS
• Feature Tracker
• Stereo Depth Extraction
• OpenCV-NPP-OpenVX Interop
• Hough Lines & Circles
• + Video stabilization
• + Iterative Motion Estimation/Flow
• and other platform specific samples (available only on certain platforms): Camera Capture, OpenGL interop, Video playback

VISIONWORKS SAMPLE APPLICATIONS
NVXIO MULTIMEDIA ABSTRACTION
• NVXIO inputs: camera input (CSI, ISP & camera processing), video/image file input (image/video decode), streamed video/image input
• Vision processing: CUDA, with Interop/EGLStreams between stages
• Outputs: render (GFX), image/video encode
[Diagram: Tegra SoC block diagram showing the CPU complex (multi-core ARM v8), GPU, video encoder and decoder, 2D engine (VIC), image processor (ISP), I/O, and other engines]

VISIONWORKS™ PLUS ALGORITHMS
• Object Tracker
• Structure From Motion

Programming with VisionWorks Library

VISIONWORKS™ PROGRAMMING MODEL
• VisionWorks OpenVX™ Immediate Mode: standard specified heterogeneous compute API with individual function calls
• VisionWorks OpenVX™ Graph Mode: heterogeneous compute API with graph optimizations; extensible with user defined nodes
• VisionWorks CUDA API: direct CUDA API for advanced CUDA developers

VISIONWORKS OPENVX™ IMMEDIATE MODE
VIDEO STABILIZATION SAMPLE
• The OpenVX Immediate mode API enables developers to easily port their applications.
• OpenVX API Immediate mode calls are prefixed with "vxu".
• The Video Stabilization algorithm in OpenCV was ported to VisionWorks Immediate Mode.
Pipeline stages shown: OpenCV image source, cv::Mat to vx_image, Color Conversion, Image Pyramid, Feature detection, Optical Flow, Process pts & Find Homography, Warp Perspective, Stabilized frames

VISIONWORKS OPENVX™ IMMEDIATE MODE
VIDEO STABILIZATION SAMPLE
Performance boost: the video stabilization application is accelerated by 2.6x (including the overhead for cv::Mat to vx_image conversions).
Per-stage speedup labels on the diagram: 1.4x, 0.6x, 4.9x, 2.3x, 4.6x, 1.7x
Pipeline stages shown: OpenCV image source, cv::Mat to vx_image, Color Conversion, Image Pyramid, Feature detection, Optical Flow, Process pts & Find Homography, Warp Perspective, Stabilized frames
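
An end-to-end figure such as 2.6x can coexist with one stage that runs slower (0.6x) because the overall speedup is a time-weighted combination of the per-stage speedups: overall = sum(f_i) / sum(f_i / s_i), where f_i is the share of baseline time spent in stage i and s_i is that stage's speedup. The following is a minimal C++ sketch of that arithmetic. The slides do not give the per-stage time shares, so `time_fraction` values must be supplied by the caller; the `Stage` type and `overall_speedup` helper are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <vector>

// One pipeline stage (hypothetical type for this sketch).
struct Stage {
    double time_fraction;  // share of baseline run time; NOT given in the slides
    double speedup;        // baseline stage time divided by accelerated stage time
};

// Overall speedup = total baseline time / total accelerated time
//                 = sum(f_i) / sum(f_i / s_i).
double overall_speedup(const std::vector<Stage>& stages) {
    double baseline_time = 0.0;
    double accelerated_time = 0.0;
    for (const Stage& s : stages) {
        assert(s.time_fraction >= 0.0 && s.speedup > 0.0);
        baseline_time += s.time_fraction;
        accelerated_time += s.time_fraction / s.speedup;
    }
    assert(accelerated_time > 0.0);
    return baseline_time / accelerated_time;
}
```

A stage with a speedup below 1 only hurts in proportion to its time share, which is why a cheap but slower conversion step can be outweighed by large gains in the expensive stages.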

VISIONWORKS OPENVX™ GRAPH MODE
VIDEO STABILIZATION SAMPLE
• OpenVX API graph mode calls are prefixed with "vx"
• OpenVX Graph enables advanced optimizations
  – Buffer reuse, kernel fusion
  – Efficient use of streaming and CUDA textures
  – Automatic scheduling across processing units based on various factors (safety, perf, ...)
  – Tiling and pipelining vision functions at sub-frame level
Pipeline stages shown: Image Source, Color Conversion, Image Pyramid, Feature detection, Optical Flow, Process pts & Find Homography, Warp Perspective, Stabilized frames
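
Kernel fusion and buffer reuse can be illustrated without any OpenVX calls. The sketch below is plain C++, not the OpenVX or VisionWorks API, and it does not claim to show how VisionWorks implements these optimizations. It contrasts two per-pixel stages run one call at a time, which materializes an intermediate gray image, with a fused single pass that produces the same output without that buffer. All names and the integer luma weights are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Integer luma approximation; the weights (sum = 256) are illustrative only.
inline std::uint8_t to_gray(Rgb p) {
    return static_cast<std::uint8_t>((77 * p.r + 150 * p.g + 29 * p.b) >> 8);
}

inline std::uint8_t threshold(std::uint8_t v, std::uint8_t t) {
    return static_cast<std::uint8_t>(v > t ? 255 : 0);
}

// Call-at-a-time style: each stage writes a full output image,
// so an intermediate gray buffer is allocated and traversed.
std::vector<std::uint8_t> convert_then_threshold(const std::vector<Rgb>& in,
                                                 std::uint8_t t) {
    std::vector<std::uint8_t> gray(in.size());  // intermediate buffer
    for (std::size_t i = 0; i < in.size(); ++i) gray[i] = to_gray(in[i]);
    std::vector<std::uint8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = threshold(gray[i], t);
    return out;
}

// Fused style: both stages are applied per pixel in one pass,
// so the intermediate buffer and the second traversal disappear.
std::vector<std::uint8_t> fused_convert_threshold(const std::vector<Rgb>& in,
                                                  std::uint8_t t) {
    std::vector<std::uint8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = threshold(to_gray(in[i]), t);
    return out;
}
```

A framework that sees the whole pipeline before execution is in a position to apply this kind of rewrite automatically, whereas independent function calls each have to produce their full output.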

VISIONWORKS OPENVX™ GRAPH MODE
VIDEO STABILIZATION SAMPLE
Performance boost: the video stabilization application is further accelerated compared to immediate mode.
Pipeline stages shown: Image Source, Color Conversion, Image Pyramid, Feature detection, Optical Flow, Process pts & Find Homography, Warp Perspective, Stabilized frames

VISIONWORKS CUDA API
FEATURE TRACKING SAMPLE
The VisionWorks CUDA API gives the developer low-level access. The developer manages:
• Data allocations and transfer
• Scheduling and pipelining
Diagram labels: Camera/image/video input data, RGB frame (CUDA buffer), nvxcuColorConvert, YUV frame, nvxcuChannelExtract, Gray frame, nvxcuGaussianPyramid, nvxcuHarrisTrack, nvxcuOpticalFlowPyrLK, Array of keypoints, Rendering/Output

VISIONWORKS™ API SELECTION
• VisionWorks OpenVX™ Immediate Mode
  – Quick port from other libraries
  – To be able to reassign CPU and GPU tasks based on perf.
• VisionWorks OpenVX™ Graph Mode
  – Let the graph manager hide overheads, optimize and manage data
  – To be able to reassign CPU and GPU tasks based on perf.
• VisionWorks CUDA API
  – Low level CUDA API access for advanced CUDA developers

DEBUGGING WITH VISIONWORKS™
Enable VisionWorks debug markers with "export NVX_PROF=nvtx"

VISIONWORKS™ DOCUMENTATION
Installed location: /usr/share/visionWorks/docs

VISIONWORKS™ FACTS
• First Khronos OpenVX™ 1.0 compliant library (Jan 2015)
• VisionWorks enables key demos (CES'16 and more at GTC)
• 27K downloads (embedded) since release in Nov 2015
• + Installed by default on all automotive platforms
[Chart: Weekly VisionWorks downloads for various platforms]

CONCLUSION
• VisionWorks Toolkit delivers multiple levels of API
  – OpenVX Immediate Mode, OpenVX Graph Mode, VisionWorks CUDA API
• Heterogeneous API enables switching from GPU to CPU – this is very powerful, reducing productization time
• Delivers high performance
  – Offers significant speedup over CUDA optimized OpenCV functions
• Adopts native media APIs on Tegra platforms and delivers ready to use code samples
Related sessions:
• H6115 - Designing Computer Vision Applications with VisionWorks™ – Pod B
• S6739 - VisionWorks™ Toolkit Programming Tutorial, Room LL20A
• L6129 - VisionWorks™ Toolkit LAB Session, Room 210C

RESOURCES & USEFUL LINKS
• http://www.embedded-vision.com/
• https://www.khronos.org/openvx/
• https://developer.nvidia.com/embedded/visionworks
• VisionWorks Webinars - https://developer.nvidia.com/embedded/learn/tutorials

VISIONWORKS WITH DEEP LEARNING DEMO
FULLY CONVOLUTIONAL NETWORK
[1] Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[2] "Efficient Convolutional Patch Networks for Scene Understanding." CVPR Workshop on Scene Understanding (CVPR-WS).
[3] M. Cordts, M. Omran, S. Ramos, T. Scharwächter, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, "The Cityscapes Dataset," in CVPR Workshop on The Future of Datasets in Vision, 2015.
