Visual Perception for Autonomous Driving on the NVIDIA DrivePX2 and using SYNTHIA - PowerPoint PPT Presentation



SLIDE 1

Visual Perception for Autonomous Driving on the NVIDIA DrivePX2 and using SYNTHIA

http://adas.cvc.uab.es/elektra/
http://www.synthia-dataset.net

  • Dr. Juan C. Moure
  • Dr. Antonio Espinosa

http://grupsderecerca.uab.cat/hpca4se/en/content/gpu

SLIDE 2

Our Background & Current Research Work

Computer Architecture Group: GPU acceleration for bioinformatics, computer vision, and image compression.
Computer Vision Group: CV algorithms + deep learning for camera-based ADAS.

GOAL: camera-based perception for autonomous driving

  • Robotized car
  • GPU-accelerated algorithms
  • Deep Learning & Simulation Infrastructure (SYNTHIA)

Elektra Car + DrivePX2

SLIDE 3

Overview of Presentation

  • GPU-accelerated perception
  • Depth computation
  • Semantic & slanted stixels (collaboration with Daimler)
  • Speeding up the MAP estimation problem, solved by DP, using CNNs
  • SYNTHIA toolkit: new datasets, new ground-truth data, LIDARs …

SLIDE 4

Stereo Vision for Depth Computation

Disparity: the offset between the same point in the left and right images. Higher disparity means the object is closer.

[Figure: disparity map; an example object at 10 meters]
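As a sketch of how disparity maps to metric depth via triangulation (the focal length and baseline below are illustrative values, not the actual rig calibration):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Stereo triangulation: depth Z = f * B / d.

    disparity_px: offset (pixels) of the same point between left and right images
    focal_px:     camera focal length, in pixels
    baseline_m:   distance between the two cameras, in meters
    """
    if disparity_px <= 0:
        return float('inf')  # zero disparity: point at infinity
    return focal_px * baseline_m / disparity_px

# Higher disparity -> closer object: with f = 1000 px and B = 0.3 m,
# a 30 px disparity corresponds to a depth of 10 m, 60 px to 5 m.
```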

SLIDE 5

SemiGlobal Matching (SGM) on GPU: Parallelism

Matching cost → smoothed cost (aggregated along path directions)

  • Large-grain parallelism
  • Medium-grain parallelism
  • Fine-grain parallelism

[Hernández ICCS‐2016]
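A minimal single-path sketch of the SGM smoothing recurrence over one image row (NumPy; the penalty defaults P1, P2 are illustrative). The GPU implementation parallelizes this across rows, disparities, and the path directions:

```python
import numpy as np

def sgm_aggregate_path(cost, P1=10, P2=120):
    """Aggregate a matching cost along one path direction (left-to-right).

    cost: (W, D) matching-cost slice for one image row
          (W pixels, D disparity hypotheses).
    Returns the path-aggregated ("smoothed") cost, same shape.
    """
    W, D = cost.shape
    L = np.empty((W, D), dtype=np.float64)
    L[0] = cost[0]
    for x in range(1, W):                      # sequential along the path
        prev = L[x - 1]
        m = prev.min()
        candidates = np.minimum.reduce([
            prev,                                          # same disparity
            np.concatenate(([np.inf], prev[:-1])) + P1,    # disparity +1
            np.concatenate((prev[1:], [np.inf])) + P1,     # disparity -1
            np.full(D, m + P2),                            # larger jump
        ])
        L[x] = cost[x] + candidates - m        # -m keeps values bounded
    return L
```

The fine-grain parallelism on the slide corresponds to the per-disparity vector operations inside the loop; the loop itself is the sequential part of each path.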

SLIDE 6

SGM on GPU: Results

[Chart: performance (frames/second, fps) on Tegra X1 (DrivePX) vs Tegra Parker (DrivePX2) for 960x360, 1280x480, and 1920x720 images; maximum disparity = image height / 4; SGM with 4 path directions; real-time threshold marked]

Tegra Parker improves performance ≈ 4x vs Tegra X1:

  • 3.5x Higher Effective Memory Bandwidth
  • Higher execution overlap among kernels
SLIDE 7

Stixel World: Compact representation of the world

[Figure: stereo images → stereo disparity → stixels, labeled Object, Sky, and Ground, with slope and horizon marked]

Stixel = Stick + Pixel. Fixed width; a variable number of stixels per column. First proposed by a research group at Daimler.

[Pfeiffer BMVC‐2011]

SLIDE 8

Semantic Stixels: Unified approach

[Figure: stereo images + semantic segmentation → stereo disparity → semantic stixels, labeled Building, Sky, Pedestrian, Sidewalk, and Road, with slope and horizon marked]

[Schneider IV‐2016]

SLIDE 9

Enhanced model: Slanted Stixels


  • MAP estimation problem joining semantic & depth cues in a Bayesian model (converted to energy minimization)
  • The stixel disparity model includes a slant b: disparity is a linear function of the image row
  • The energy function (negative log-likelihood) is redefined accordingly
  • Priors enforce structural assumptions: no sky below the horizon, objects stand on the road

[Figure: comparison of standard stixels vs. slanted stixels]

[Hernández BMVC‐2017]

Best Industrial Paper
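The slanted-stixel model above can be sketched as follows (notation is illustrative, loosely following [Hernández BMVC-2017]; the exact terms are defined in the paper):

```latex
% Disparity inside stixel s is linear in the image row v; the slant b_s
% generalizes the classic constant-disparity stixel (b_s = 0):
d_s(v) = a_s + b_s \, v
% MAP estimation over stixel segmentations s, given disparity d and
% semantic labels l, rewritten as energy minimization:
\hat{\mathbf{s}} = \arg\min_{\mathbf{s}} E(\mathbf{s}), \qquad
E(\mathbf{s}) = -\log P(\mathbf{s} \mid \mathbf{d}, \mathbf{l})
```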

SLIDE 10

New SYNTHIA-San Francisco dataset


  • SF city designed with the SYNTHIA toolkit
  • 2224 photorealistic images featuring slanted roads, with pixel-level depth & semantic ground truth
  • Generating equivalent real-data images would be very expensive
SLIDE 11

Results: Quantitative & Visual


[Figure: left image, original stixels, slanted stixels, and 3D representation]

Accuracy results on SYNTHIA-SF:

  • Disparity error (%): from 30.9 down to 12.9
  • IoU (%): from 46 up to 48.5
  • Accuracy remained the same for the other datasets

SLIDE 12

Computation Complexity: Dynamic Programming


Work complexity (per column): O(h²), h = image height

[Figure: semantic segmentation and disparity image; a column segmented into Ground, Object, and Sky stixels]

Each column is processed independently; a dynamic-programming strategy efficiently evaluates all the possible configurations.
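The column-wise dynamic program can be sketched as follows (pure Python; the cost table `unary` is an illustrative stand-in for the model's combined depth and semantic terms):

```python
def best_segmentation(unary, seg_penalty=1.0):
    """Optimal segmentation of one column into stixels, by DP.

    unary[i][j]: cost of making rows i..j (inclusive) a single stixel
                 (stand-in for the depth + semantic likelihood terms).
    Work is O(h^2): for every end row j, all start rows i <= j are scanned.
    Returns (total cost, list of (first_row, last_row) stixels).
    """
    h = len(unary)
    best = [0.0] + [float('inf')] * h   # best[j]: cost of segmenting rows 0..j-1
    back = [0] * (h + 1)
    for j in range(1, h + 1):
        for i in range(j):              # last stixel covers rows i..j-1
            c = best[i] + unary[i][j - 1] + seg_penalty
            if c < best[j]:
                best[j], back[j] = c, i
    stixels, j = [], h
    while j > 0:                        # backtrack the chosen cuts
        i = back[j]
        stixels.append((i, j - 1))
        j = i
    return best[h], stixels[::-1]
```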

SLIDE 13

Stixel (DP) Algorithm on GPU: Parallelism


[Figure: each column of the stereo disparity is mapped to a CTA (large-grain parallelism); within each CTA, the DP recurrence runs sequentially over steps 1..h, with medium- and fine-grain parallelism that decreases as the steps proceed]

SLIDE 14

Performance Results


Performance (frames/second, fps)

Original Stixel Model:
  960x360:  53 fps on Tegra X1 (DrivePX),  369 fps on Tegra Parker (DrivePX2)
  1280x480: 24 fps on Tegra X1 (DrivePX),  164 fps on Tegra Parker (DrivePX2)
  1920x720:  7 fps on Tegra X1 (DrivePX),   49 fps on Tegra Parker (DrivePX2)

Slanted + Semantic Stixel Model (includes time for semantic inference):
  960x360:  17 fps on Tegra X1 (DrivePX),  107 fps on Tegra Parker (DrivePX2)
  1280x480:  8 fps on Tegra X1 (DrivePX),   49 fps on Tegra Parker (DrivePX2)
  1920x720:  3 fps on Tegra X1 (DrivePX),   17 fps on Tegra Parker (DrivePX2)

  • Real-time performance on DrivePX2 for all image sizes (≈6x-7x on DrivePX2 vs DrivePX)
  • Complex stixel model: 60-70% of the time for the stixel algorithm + 30-40% for semantic inference

SLIDE 15

Improving Computation Complexity: Pre-segmentation


[Figure: semantic segmentation and disparity inputs; the h rows of a column are reduced to h′ candidate cut positions]

  • Infer possible stixel cuts (pre-segmentation) from the inputs
  • Avoid checking all possible stixel combinations

Naïve pre-segmentation: O(h)
Work complexity (per column): O(h′×h′), with h′ ≪ h
Accuracy degrades 10-20% with the naïve pre-segmentation
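A sketch of how pre-segmentation reduces the DP's work: the recurrence only visits a reduced, sorted set of candidate cut rows, so the per-column cost drops from O(h²) to O(h′²) (the `unary` cost table is the same illustrative stand-in as before):

```python
def best_segmentation_pruned(unary, cuts, seg_penalty=1.0):
    """DP over candidate cut rows only (pre-segmentation).

    cuts:  sorted candidate stixel boundaries, including 0 and h;
           stixels may only start or end at these rows.
    unary[i][j]: cost of rows i..j (inclusive) as one stixel (illustrative).
    Work is O(h'^2) with h' = len(cuts), instead of O(h^2).
    Returns (total cost, list of (first_row, last_row) stixels).
    """
    n = len(cuts)
    best = [0.0] + [float('inf')] * (n - 1)
    back = [0] * n
    for j in range(1, n):
        for i in range(j):              # last stixel: rows cuts[i]..cuts[j]-1
            c = best[i] + unary[cuts[i]][cuts[j] - 1] + seg_penalty
            if c < best[j]:
                best[j], back[j] = c, i
    stixels, j = [], n - 1
    while j > 0:                        # backtrack the chosen cuts
        i = back[j]
        stixels.append((cuts[i], cuts[j] - 1))
        j = i
    return best[-1], stixels[::-1]
```

If the candidate cuts miss a true boundary, the optimum is lost, which is why the naïve variant degrades accuracy; the DNN-based variant on the next slide picks better candidates.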

SLIDE 16

Pre-segmentation using a DNN


[Figure: semantic segmentation and disparity inputs; a DNN reduces the h rows to h′ candidate cuts]

  • Infer possible stixel cuts from the inputs by exploiting general data relations (among columns)

DNN-based pre-segmentation
Now accuracy improves slightly when using pre-segmentation

SLIDE 17

Improved Performance Results


Performance (frames/second, fps)

Slanted + Semantic Stixel Model + Pre-segmentation:
  960x360:  37 fps on Tegra X1 (DrivePX),  193 fps on Tegra Parker (DrivePX2)
  1280x480: 17 fps on Tegra X1 (DrivePX),   87 fps on Tegra Parker (DrivePX2)
  1920x720:  7 fps on Tegra X1 (DrivePX),   35 fps on Tegra Parker (DrivePX2)

Slanted + Semantic Stixel Model (includes time for semantic inference):
  960x360:  17 fps on Tegra X1 (DrivePX),  107 fps on Tegra Parker (DrivePX2)
  1280x480:  8 fps on Tegra X1 (DrivePX),   49 fps on Tegra Parker (DrivePX2)
  1920x720:  3 fps on Tegra X1 (DrivePX),   17 fps on Tegra Parker (DrivePX2)

  • Improves performance on both DrivePX and DrivePX2 (≈2x)
  • Now 15-30% of the time for the stixel algorithm + 70-85% for semantic inference
  • The increase in inference time is almost negligible (<10%)
  • Most of the CNN for pre-segmentation is shared with the CNN for semantic segmentation

SLIDE 18

SYNTHIA Dataset Toolkit


Image generator of precisely annotated data for training DNNs on autonomous-driving tasks.

Ground-truth data:

  • RGB & per-pixel: depth, semantic class, optical flow, 3D bounding boxes
  • Fully compatible with Cityscapes classes
  • Generation of LIDAR data
  • Problem customization: SYNTHIA-SanFrancisco

www.synthia-dataset.net

SLIDE 19

Summary: Real sequence video


SLIDE 20

Thank you

Autonomous University of Barcelona

  • Dr. Juan C. Moure

juancarlos.moure@uab.es

http://grupsderecerca.uab.cat/hpca4se/en/content/gpu