Vijay John, Yuquan Xu, Seiichi Mita — Smart Vehicle Research Center; Hossein Tehrani, Tomoyoki Oishi, Masataka Konishi, Hakusho Chin — Advanced Mobility Development; Kazuhisa Ishimaru, Sakiko Nishino — Research Division — PowerPoint PPT Presentation


SLIDE 1

Vijay John, Yuquan Xu, Seiichi Mita — Smart Vehicle Research Center

Hossein Tehrani, Tomoyoki Oishi, Masataka Konishi, Hakusho Chin — Advanced Mobility Development
Kazuhisa Ishimaru, Sakiko Nishino — Research Division

SLIDE 2
  • ADAS and Automated Driving
  • World 3D Reconstruction
  • 3D Deep Sensor Fusion
  • Future Plan
  • Conclusion
SLIDE 3

ADAS Applications are booming

  • Adaptive Cruise Control (ACC)
  • Adaptive Front Lights (AFL)
  • Driver Monitoring System (DMS)
  • Forward Collision Warning (FCW)
  • Intelligent Speed Adaptation (ISA)
  • Lane Departure Warning (LDW)
  • Pedestrian Detection System (PDS)
  • Surround-View Cameras (SVC)
  • Autonomous Emergency Braking (AEB)

Vehicle Platform Sensor Configuration

SLIDE 4

Sensor Fusion & Perception (360 deg Scene Understanding)

Path Planning / Behavior Generation

Control

Cameras, Stereo, Laser Sensor, RADAR, Sonar, GPS/IMU

Deep Understanding of Environment

Sensors

SLIDE 5
SLIDE 6

<Stereo Vision>

[Stereo vision diagram: left and right images; shift = disparity; far objects show a small shift, close objects a big shift; the disparity map distinguishes road, 3D objects, cars, and barriers from close to far]

Calibration

<Process>

Disparity Calculation (Stereo Matching)

Detection

Disparity map

distance = B · f / disparity   (B: stereo baseline, f: focal length)
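The distance relation can be sketched in a few lines of Python (a minimal sketch; the baseline and focal-length values below are illustrative, not from the deck):

```python
import numpy as np

# Hypothetical stereo parameters (illustrative values, not from the slides)
B = 0.30   # baseline between the two cameras, in meters
f = 800.0  # focal length, in pixels

def disparity_to_distance(disparity):
    """distance = B * f / disparity; a bigger shift means a closer object."""
    disparity = np.asarray(disparity, dtype=float)
    return np.where(disparity > 0, B * f / np.maximum(disparity, 1e-6), np.inf)

# A close object has a big shift, a far object a small shift:
print(disparity_to_distance([60.0, 10.0]))  # -> [ 4. 24.]  (meters)
```

Zero disparity maps to infinite distance, matching the intuition that an object at the horizon shows no shift between the two views.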

SLIDE 7

Each pixel has such a matching cost curve, which constructs the Matching Cost Space

< Cost Space >

Finding the true disparity value for every pixel from the Matching Cost Space

[Figure: left and right images with example disparities 24, 40, 55 across the image height; < Matching Cost > curves; ground truth]

SLIDE 8

[Figure: matching cost curve for the red pixel in the left image, plotted against disparity (horizontal axis); neighboring pixels also have low matching cost near the true disparity]
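A matching cost curve like the one above can be sketched with a sum-of-absolute-differences cost on toy 1-D scanlines (a minimal sketch; the window size and the synthetic scanlines are illustrative assumptions):

```python
import numpy as np

def matching_cost_curve(left_row, right_row, x, max_disp, win=2):
    """SAD matching cost for pixel x of the left scanline against each
    candidate disparity d (i.e. pixel x - d of the right scanline)."""
    costs = []
    patch_l = left_row[x - win:x + win + 1]
    for d in range(max_disp + 1):
        xr = x - d
        if xr - win < 0:                    # candidate falls off the image
            costs.append(np.inf)
            continue
        patch_r = right_row[xr - win:xr + win + 1]
        costs.append(float(np.abs(patch_l - patch_r).sum()))
    return np.array(costs)

# Toy scanlines: the right image is the left shifted by 3 pixels,
# so the cost curve should dip to zero at disparity 3.
left = np.arange(30, dtype=float) ** 1.5
right = np.empty_like(left)
right[:-3] = left[3:]
right[-3:] = left[-1]
curve = matching_cost_curve(left, right, x=15, max_disp=8)
print(int(np.argmin(curve)))  # -> 3
```

Real curves are noisier than this toy example, which is exactly why the per-pixel minimum alone is not reliable and the neighbors' costs must be exploited.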

SLIDE 9

The Viterbi algorithm can find the global optimum

[Figure: Viterbi nodes over pixels (430–490) and disparities (10–60) with matching costs and path costs; block matching picks a wrong disparity, Viterbi the optimal one]

Exploiting the neighbors' matching costs can be formulated as a mathematical optimization: a shortest-path problem.
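That shortest-path view can be sketched as a small dynamic program over one scanline (a minimal sketch, not the presented implementation; the linear smoothness penalty and the toy cost space are illustrative assumptions):

```python
import numpy as np

def viterbi_disparity(cost, smooth=1.0):
    """Choose one disparity per pixel along a scanline by dynamic programming.

    cost: (num_pixels, num_disparities) matching-cost space.
    A transition penalty smooth * |d - d_prev| discourages disparity jumps,
    so the result is the shortest path through the cost space.
    """
    n, m = cost.shape
    disp = np.arange(m)
    trans = smooth * np.abs(disp[:, None] - disp[None, :])  # (curr, prev)
    acc = cost[0].astype(float).copy()
    back = np.zeros((n, m), dtype=int)
    for i in range(1, n):
        total = acc[None, :] + trans          # candidate path costs
        back[i] = np.argmin(total, axis=1)    # best previous disparity
        acc = cost[i] + total.min(axis=1)
    path = np.empty(n, dtype=int)             # backtrack the optimum
    path[-1] = int(np.argmin(acc))
    for i in range(n - 1, 0, -1):
        path[i - 1] = back[i, path[i]]
    return path

# Toy cost space: the true disparity is 2, but pixel 2 has a spurious
# zero-cost match at disparity 0 that fools per-pixel block matching.
cost = np.ones((5, 4))
cost[:, 2] = 0.0
cost[2, 2] = 0.5
cost[2, 0] = 0.0
print(np.argmin(cost, axis=1))  # block matching -> [2 2 0 2 2] (wrong)
print(viterbi_disparity(cost))  # Viterbi        -> [2 2 2 2 2] (optimal)
```

The jump to disparity 0 would save 0.5 in matching cost but pay 4.0 in transition penalties, so the globally shortest path stays on the true disparity.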

SLIDE 10

[Diagram: costs from the right-down-to-left-up and left-up-to-right-down diagonal Viterbi directions (DSL_RDLUk, k = 1, …, K) are regrouped and their averages merged into the vertical scanline costs VSLj, j = 1, …, 640]

Huge Networks with Parallel Optimization

SLIDE 11

< SGBM >
The directional Viterbi passes over the cost space (←, →, ↑, ↓, ↖, ↘, ↗, ↙) are merged independently in a single step to output the disparity.

< Proposed Method (Multi-Path Viterbi) >
The directional Viterbi passes are merged hierarchically, with the diagonal directions merged in later steps. MPV optimizes on the accumulated information step by step.

SLIDE 12

Block Matching
Conventional method: SGBM
Multi-Path Viterbi — dense & smooth

SLIDE 13

Image Size: 1280×960
Calculation Time: 15 ms/frame
GPU: GeForce GTX 1080

Nagoya Urban Road
SLIDE 14

Tokyo Metropolitan Highway

Calculation Time: 15 ms/frame
GPU: GeForce GTX 1080

SLIDE 15

Calculation Time: 15 ms/frame
GPU: GeForce GTX 1080

SLIDE 16
SLIDE 17
SLIDE 18

Electromagnetic Wave Range: Camera, Laser, Radar

[Diagram: wavelength axis from 0.1 μm to 10 mm — laser, visible-light camera, infrared camera, and car radar use electromagnetic waves; car sonar uses sound waves. Shorter wavelengths can see detail; longer wavelengths can see far and penetrate fog & dust (~100/cm³), rain (~100/m³), sunlight, and snow]

SLIDE 19

Perception (Learning Framework)

Cameras, Stereo, Laser Sensor

Training a learning framework for perception tasks

Sensors → Free Space, Objects

Traditional Learning: Feature Extraction (HOG, DPM, etc.) + Feature Classification (SVM, Random Forest, etc.)

Deep Learning: Feature Extraction + Feature Classification learned jointly

SLIDE 20
  • Single sensor-based learning is not robust or descriptive enough
  • Challenges
    – Environmental Variation (occlusion, illumination variation, etc.)
    – High Inter-Class and Intra-Class Variability

Perception (Learning Framework)

Single Sensor (Camera, Lidar, Stereo) → Labels

SLIDE 21

There are many vehicle varieties with different orientations

SLIDE 22

We have a large number of on-road objects

We have a lot of variety of on-road objects!

SLIDE 23

We have different types of road boundaries

We have a lot of variety of free-space boundaries!

Concrete Curb, Guardrail, Wall, Pylon, Divider

SLIDE 24

Illumination variation as observed by a monocular camera image with appearance features

SLIDE 25
  • Sensor fusion-based learning with complementary sensors addresses these issues
  • Monocular camera appearance features and depth features are complementary features

Sensor Fusion and Perception (Learning Framework)

Cameras, Stereo, Laser Sensor

Sensor Fusion → Labels

SLIDE 26

Monocular Camera ⇒ rich appearance information; inexpensive; affected by illumination variation
Depth Camera ⇒ depth information (3D data); inexpensive when stereo-based; illumination invariant due to a robust stereo algorithm [1]

Depth information from the stereo camera is robust to illumination variation.

[1] Xu et al., "Real-time Stereo Disparity Quality Improvement for Challenging Traffic Environments", IV 2018.

SLIDE 27

Appearance and Depth Features are Fused within a Deep Learning Framework for Environment Perception

Deep learning framework
  • Appearance (Monocular camera): descriptive appearance features
  • Depth (Stereo Camera/Laser): illumination-invariant depth features
  • Sensor fusion with complementary features

3D Environment Perception

SLIDE 28

Image & Depth

Sensor Fusion: Raw Data Level Fusion — image and depth are stacked and passed through a single feature extraction stage, then free space detection and object detection. (Bad)

Sensor Fusion: Feature Level Fusion — image feature extraction and depth feature extraction run separately, their deep features are integrated, then free space detection and object detection. (Good)

SLIDE 29

Intensity Input and Depth Input pass through parallel encoders (C1_Int, C2_Int and C1_Dep, C2_Dep), with concatenation of the intensity and depth feature maps at each stage. Decoders with upsampling (Upsampling_DC1_Int, Upsampling_DC2_Int, Upsampling_DC1_Dep, Upsampling_DC2_Dep) produce the Free Space Output and the Object Output.

Skip Connections: the entire intensity and depth encoder feature maps (m, n, n) are transferred to the free space and object decoder feature maps (o, n, n) for concatenation (m+o, n, n).
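The skip-connection concatenation is channel-wise stacking of feature maps; a minimal NumPy sketch of the shape bookkeeping (the channel counts m = 64, o = 32 and resolution n = 16 are illustrative assumptions, not ChiNet's actual sizes):

```python
import numpy as np

def skip_concat(encoder_map, decoder_map):
    """Concatenate an (m, n, n) encoder feature map onto an
    (o, n, n) decoder feature map -> (m + o, n, n)."""
    assert encoder_map.shape[1:] == decoder_map.shape[1:], "spatial sizes must match"
    return np.concatenate([encoder_map, decoder_map], axis=0)

# Illustrative channel counts: m = 64 intensity-encoder channels,
# o = 32 free-space-decoder channels, spatial resolution n = 16.
enc = np.random.rand(64, 16, 16)   # encoder feature map (m, n, n)
dec = np.random.rand(32, 16, 16)   # decoder feature map (o, n, n)
fused = skip_concat(enc, dec)
print(fused.shape)  # -> (96, 16, 16)
```

In the actual network the same operation is done by a concatenation layer, so the decoder sees both its upsampled features and the encoder's full-resolution detail.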

SLIDE 30

Depth and Intensity inputs → Depth Feature Extraction and Image Feature Extraction → Feature Integration → Objects Detection and Free Space Detection

(Outputs: Objects, Free Space)

SLIDE 31

ChiNet

Intensity Image and Disparity Image → Free Space and Objects

  • Trained with 9000 samples from a Japanese highway dataset
  • Manually annotated free space and objects
  • Trained on Keras with the Theano backend
  • Trained with an Nvidia Titan X GPU

SLIDE 32

Free Space, Objects

SLIDE 33
SLIDE 34
SLIDE 35
SLIDE 36

Implemented on GeForce Titan X using Keras with Theano backend

SLIDE 37

Comparison: "Intensity" vs. "Intensity and Depth"

Intensity image only: wrong boundary; pylon not detected; car detection not accurate; car not detected
Intensity and Disparity fusion: accurate boundary; pylon detected; car detected; better car detection

SLIDE 38

Evaluation Result

Comparison: "Intensity" vs. "Intensity and Depth"

Intensity image only: pylon not detected; false object; wrong boundary
Intensity and Depth Fusion: pylon detected; no false object; better boundary

SLIDE 39

Some of the Learned Image Features

(Columns: Depth, Intensity, Image)

  • Vehicle Lower Part
  • Free Space
  • Sky
  • Driving Lane
  • Edge
  • Free Space

(Activation: Strong → Weak)

SLIDE 40

Some of the Learned Depth Features

(Columns: Depth, Intensity, Image)

  • Close Distance Objects
  • Close Free Space
  • Edges
  • Far Distance Objects
  • Far Free Space

(Activation: Strong → Weak)

SLIDE 41
SLIDE 42

Pixel Data After Mean Centering

We have different distributions even after mean centering

(Day Time vs. Night Time)
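The day/night observation can be illustrated with synthetic pixel statistics (a minimal sketch; the means and spreads below are made-up illustrative numbers, not measurements from the deck): mean centering aligns the means but leaves the spreads, and hence the distributions, different.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic pixel intensities (illustrative): bright, high-contrast
# day frames vs. dark, low-contrast night frames.
day = rng.normal(loc=170.0, scale=40.0, size=100_000)
night = rng.normal(loc=40.0, scale=10.0, size=100_000)

day_c = day - day.mean()       # mean centering
night_c = night - night.mean()

# Both means are now ~0, but the spreads still differ by ~4x,
# so the two distributions remain different after centering.
print(round(float(day_c.std() / night_c.std()), 1))  # -> 4.0
```

This is why centering alone does not make a day-trained network see night data as in-distribution.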

SLIDE 43

Electromagnetic Wave Range: Camera, Laser, Radar

[Diagram repeated from SLIDE 18: wavelength axis from 0.1 μm to 10 mm — laser, visible-light camera, infrared camera, and car radar (electromagnetic waves), car sonar (sound waves); shorter wavelengths see detail, longer wavelengths see far through fog & dust (~100/cm³), rain (~100/m³), sunlight, and snow — now highlighting the Robust and Reliable Area]

SLIDE 44

Thermal Camera

Normal Camera

SLIDE 45

Thermal Camera

Normal Camera

SLIDE 46
SLIDE 47

Stability against a variety of light conditions

SLIDE 48

32 cm

SLIDE 49

[Detections labeled: Pedestrian ×4]

SLIDE 50
SLIDE 51

RGB Camera, Thermal Camera

3D Dense Data

RGB Feature Extraction, Depth Feature Extraction, Thermal Feature Extraction → Feature Integration → Free Space Detection, Object Detection

SLIDE 52

Automated Driving Unit: DRIVE PX PEGASUS (320 TOPS)

Supports the high-speed processing requirements for Level 5

Processes and integrates: Laser, Stereo, Sonar, Far Infrared Camera, Visible Camera, Milliwave Radar, IMU & GNSS & Map

SLIDE 53
  • Sensor fusion of appearance and depth features for environment perception
  • Increased robustness and perception accuracy
  • ChiNet advantages
    – Precise object boundary detection
    – Detection of small objects in the road
    – Detection of far-away objects
  • Computational time
    – Reduction of computational time to ~15 ms possible with optimized CUDA libraries and advances in GPU computing

SLIDE 54