S7348: Deep Learning in Ford's Autonomous Vehicles
Bryan Goodman Argo AI 9 May 2017
Ford’s 12 Year History in Autonomous Driving
Today: examples from
− Stereo image processing
− Object detection
− Using RNNs
− Motorsports
− Compare pixels on the same epipolar line in two images
− Choose the best match
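The two steps above can be sketched as classical block matching. This is a minimal illustration, not the method from the talk: it assumes rectified images (so the epipolar line is the same row in both images), uses a sum-of-absolute-differences cost, and the function name, window size, and sample rows are all hypothetical.

```python
def match_along_epipolar_line(left_row, right_row, x, half_window=1, max_disparity=4):
    """Return the disparity whose right-image window best matches the left window at column x.

    Assumes rectified images, so both rows lie on the same epipolar line.
    """
    def window(row, center):
        return row[center - half_window:center + half_window + 1]

    target = window(left_row, x)
    best_disparity, best_cost = 0, float("inf")
    for disparity in range(max_disparity + 1):
        xr = x - disparity  # candidate column in the right image
        if xr - half_window < 0:
            break  # the window would fall off the left edge of the image
        candidate = window(right_row, xr)
        # Compare pixels: sum of absolute differences over the window.
        cost = sum(abs(a - b) for a, b in zip(target, candidate))
        # Choose the best match: keep the lowest cost seen so far.
        if cost < best_cost:
            best_disparity, best_cost = disparity, cost
    return best_disparity


# Hypothetical intensity rows: the same bright feature appears two pixels
# further left in the right image.
left_row = [10, 10, 10, 10, 50, 90, 50, 10, 10, 10]
right_row = [10, 10, 50, 90, 50, 10, 10, 10, 10, 10]
disparity = match_along_epipolar_line(left_row, right_row, x=5)
```

Restricting the search to one row is what makes the epipolar constraint useful: the match is a 1D search rather than a 2D one.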
Stereo vision: estimating the distance to an object using the visual information from two eyes.
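For a rectified camera pair, the standard triangulation relation turns the horizontal offset (disparity) between the two views into a distance. The sketch below shows that textbook relation only; the parameter values are made up for illustration and are not taken from the talk.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Distance along the optical axis, in metres, for a rectified stereo pair.

    Uses the textbook relation depth = focal_length * baseline / disparity.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 1000 px focal length, 0.5 m between the two cameras.
depth_m = depth_from_disparity(disparity_px=20.0, focal_length_px=1000.0, baseline_m=0.5)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects than for close ones.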
Distance Map Estimation: Left Stereo Camera + Right Stereo Camera → Deep Convolutional Neural Networks → Post-Processing
− General network
− Encoding and decoding layers
− Retain objects of interest in the training data sets
[Figure: encoder-decoder network diagram with layers Conv1 to Conv10 and Deconv6 to Deconv10, followed by the loss function]
− Specialized network
− Encoding and decoding layers
− The cross correlation layers force the network to look for correspondence on the epipolar line
− The weights in the encoding layers are shared
[Figure: specialized encoder-decoder network diagram with left and right encoder branches (Conv1L to Conv4L, Conv1R to Conv4R), cross correlation layers CC5 to CC7, layers Conv5 to Conv9 and Deconv6 to Deconv9, and the loss function]
− Computes CC values between each pair of patches
− Outputs the CC values for each pair of patches
− Does not lose any information
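A cross correlation layer of this kind can be sketched as follows. This is an assumption-laden illustration rather than the layer from the talk: patches are reduced to single feature vectors, the correlation is a plain dot product, displacements are searched only along the row (the epipolar line), and out-of-range positions are zero padded. The function name and sample features are hypothetical.

```python
def cross_correlation_row(left_features, right_features, max_displacement):
    """Correlate each left feature vector with right vectors on the same row.

    left_features, right_features: one feature vector (list of floats) per column.
    Returns, for every column, the CC value for each displacement 0..max_displacement.
    """
    output = []
    for x, left_vector in enumerate(left_features):
        scores = []
        for d in range(max_displacement + 1):
            xr = x - d  # matching column in the right feature map
            if xr < 0:
                scores.append(0.0)  # zero padding outside the image
            else:
                right_vector = right_features[xr]
                scores.append(sum(a * b for a, b in zip(left_vector, right_vector)))
        # Every CC value is kept; no max or argmax is taken here.
        output.append(scores)
    return output


# Hypothetical features: the pattern [1, 2] sits at column 2 on the left
# and at column 0 on the right, i.e. a displacement of 2.
left = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [0.0, 0.0]]
right = [[1.0, 2.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
cc = cross_correlation_row(left, right, max_displacement=2)
```

Passing all scores forward, instead of only the best one, leaves the choice of correspondence to the later layers.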
− In AV driving, closer objects are more important than distant ones
− Assigns more weight to the closer objects
− The closer object distance is estimated more accurately
[Figure: plot of α versus d, with axis ticks from 0.2 to 1]
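One way to weight closer objects more heavily is to scale each pixel's error by a factor α that decreases with distance d. The exact form of α is not recoverable from the slide, so the linear ramp below is purely an assumption, as are the normalized distance range [0, 1], the L1 error, and the function names.

```python
def distance_weight(d, alpha_min=0.2):
    """Hypothetical weight: 1.0 at d = 0, falling linearly to alpha_min at d = 1."""
    d = min(max(d, 0.0), 1.0)  # clamp the normalized distance to [0, 1]
    return 1.0 - (1.0 - alpha_min) * d


def weighted_l1_loss(predicted, target):
    """Mean absolute distance error, weighted by the true (target) distance."""
    terms = [distance_weight(t) * abs(p - t) for p, t in zip(predicted, target)]
    return sum(terms) / len(terms)


# The same absolute error (0.1) costs more on a near pixel than on a far one.
near_loss = weighted_l1_loss([0.2], [0.1])
far_loss = weighted_l1_loss([1.0], [0.9])
```

With this weighting, the optimizer has more incentive to reduce errors on nearby objects, which is consistent with the closer distances being estimated more accurately.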
− Generate 14,000 pairs of RGB stereo images
− Synthetic distance maps are only generated for the objects of interest, e.g. cars or pedestrians
− Gaussian noise added to the stereo images
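The noise step can be sketched as below. The noise level used in the talk is not stated, so `sigma` is a placeholder, and the example uses a tiny single-channel image for brevity; for RGB the same function would be applied to each channel.

```python
import random


def add_gaussian_noise(image, sigma, seed=None):
    """Add zero-mean Gaussian noise to an 8-bit image given as rows of pixel values."""
    rng = random.Random(seed)  # seeded for reproducible augmentation
    return [
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


# Hypothetical 2x3 grayscale image and noise level.
clean = [[0, 128, 255], [64, 64, 64]]
noisy = add_gaussian_noise(clean, sigma=5.0, seed=0)
```

Clipping to the 0 to 255 range keeps the noisy image valid as 8-bit data.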
− Project LIDAR point clouds onto the camera images
− The baseline and optic axes are not the same as in the synthetic data
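Projecting LIDAR points onto a camera image to obtain sparse ground-truth distances can be sketched with a pinhole camera model. The sketch assumes the points have already been transformed into the camera frame (x right, y down, z forward) and ignores lens distortion; the intrinsics and sample points are hypothetical.

```python
def project_points(points_camera, fx, fy, cx, cy, width, height):
    """Project 3D points in the camera frame to a sparse {(u, v): depth} map."""
    depth = {}
    for x, y, z in points_camera:
        if z <= 0:
            continue  # point is behind the camera
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            # If several points hit the same pixel, keep the nearest one.
            if (u, v) not in depth or z < depth[(u, v)]:
                depth[(u, v)] = z
    return depth


# Hypothetical points in metres: two visible, one behind the camera,
# one far outside the field of view.
points = [(0.0, 0.0, 10.0), (1.0, 1.0, 20.0), (0.0, 0.0, -5.0), (100.0, 0.0, 1.0)]
sparse_depth = project_points(
    points, fx=500.0, fy=500.0, cx=320.0, cy=240.0, width=640, height=480
)
```

The resulting map is sparse: only pixels hit by a LIDAR return carry a distance, so a training loss would typically be evaluated on those pixels only.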
[Figure: result panels labeled Left camera, Right camera, Network I, Network II]
[Figure: panels labeled Detection Result, Original Image, Enhanced Contrast]
Network’s detection outperforms human labeler in low-contrast areas
Legend: Pedestrian detection; Pedestrian misdetection; Detected, but not labeled
[Figure: RNN-based detection diagram with three frames (Image 0, Image 1, Image 2), each with Conv, Feature Map, RNN, and Detector blocks]
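The idea of carrying information between frames can be illustrated with a very small recurrent update over per-frame feature maps. The slide does not specify the recurrent cell, so the element-wise tanh update, the scalar weights, and the flattened one-value feature maps below are all assumptions made for illustration.

```python
import math


def recurrent_features(feature_maps, w_in=0.5, w_rec=0.5):
    """Blend each frame's features with the state carried over from earlier frames.

    feature_maps: one flattened feature map (list of floats) per frame.
    Returns the recurrent state after each frame; a detector would read these.
    """
    state = [0.0] * len(feature_maps[0])
    states = []
    for features in feature_maps:
        state = [math.tanh(w_in * f + w_rec * h) for f, h in zip(features, state)]
        states.append(state)
    return states


# Hypothetical sequence: an object gives a strong response in frames 0 and 2
# but is missed (for example, occluded) in frame 1.
states = recurrent_features([[2.0], [0.0], [2.0]])
```

In this toy sequence the state stays positive in frame 1 even though the frame itself has no response, which is the kind of temporal memory a per-frame detector lacks.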
The Ford team reviews pictures during the race
Looking for damage and
Gap
Using ~2k images labeled with boxes around the vehicles, the model does well detecting cars
Next – determine car number: labeled ~30k images
Outliers easy to find in review
Human: ??? Model: 78 Confidence: 0.999
Human: ??? Model: 42 Confidence: 0.985
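A review pass of this kind could be sketched as a filter that surfaces images where the model is highly confident but the human label is missing or different. The record layout, image names, threshold, and the third and fourth records are hypothetical; only the two model readings and confidences above come from the slides.

```python
def find_review_candidates(records, min_confidence=0.98):
    """Return records where a confident model prediction disagrees with the human label.

    A human label of None means the labeler could not read the number.
    """
    return [
        r for r in records
        if r["confidence"] >= min_confidence and r["human"] != r["model"]
    ]


records = [
    {"image": "frame_a", "human": None, "model": "78", "confidence": 0.999},
    {"image": "frame_b", "human": None, "model": "42", "confidence": 0.985},
    {"image": "frame_c", "human": "66", "model": "66", "confidence": 0.990},  # agreement
    {"image": "frame_d", "human": "67", "model": "61", "confidence": 0.400},  # low confidence
]
flagged = find_review_candidates(records)
```

Sorting the flagged records by confidence would put the most likely labeling gaps at the top of the review queue.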
[Figure: panels labeled Input Image and Activated Filter]
The Model is not a black box. We can see that it is detecting the numbers – important for robustness when the paint changes
and artificial intelligence: self-driving vehicles
Area of California
me at GTC or visit: www.argo.ai