

SLIDE 1


RGBD Occlusion Detection via Deep Convolutional Neural Networks

Soumik Sarkar (1,2), Vivek Venugopalan (1), Kishore Reddy (1), Michael Giering (1), Julian Ryde (3), Navdeep Jaitly (4,5)

(1) United Technologies Research Center, East Hartford, CT; (2) Currently with Iowa State University, Ames, IA; (3) United Technologies Research Center, Berkeley, CA; (4) University of Toronto, ON, Canada; (5) Currently with Google Inc., Mountain View, CA


SLIDE 2

United Technologies


Otis · Sikorsky · UTC Climate, Controls & Security · UTC Aerospace Systems · Pratt & Whitney · UTC Propulsion & Aerospace Systems


SLIDE 3

Occlusion detection

  • Occlusion edges help image feature selection: once occlusion boundaries are established, the depth of the region can be determined
  • This is very useful for simultaneous localization and mapping (SLAM) problems in indoor robotics, as well as for object recognition, grasping, and obstacle avoidance in UAV applications


[Figure: a voxel map and the corresponding geometric edges for a hallway]


SLIDE 4


Occlusion detection

  • Occlusion edges depend on the gradient of the depth image, which is very sensitive to noise in the depth map
  • A depth map derived from a single image is very noisy and has large errors
  • In our work, we estimate occlusion edges directly rather than estimating depth first and then calculating occlusion edges; our technique also takes advantage of cues beyond depth that contribute to establishing occlusion edges (a gradient-based labeling sketch follows the figure below)

[Figure: original image, image edge, and range edge]
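For illustration, a minimal sketch of gradient-based occlusion edge labeling from a depth map; the function name and the threshold value (in meters) are assumptions, not the authors' exact procedure:

```python
import numpy as np

def occlusion_edges_from_depth(depth, threshold=0.05):
    """Mark pixels whose local depth gradient exceeds a threshold.

    depth: 2-D array of depth values in meters (e.g., 480x640).
    threshold: assumed depth jump (meters) that counts as an occlusion edge.
    """
    # Gradients along image rows (axis 0) and columns (axis 1).
    gy, gx = np.gradient(depth.astype(np.float64))
    # An occlusion edge is a large discontinuity in depth.
    return np.hypot(gx, gy) > threshold
```

Thresholding the raw gradient like this is exactly what makes the labels noise-sensitive, which motivates learning the edges directly.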


SLIDE 5

Deep Neural Nets and Convolutional Neural Nets

  • Convolutional filters generate feature maps from the data
  • Subsampling or pooling provides dimension reduction and higher-order feature generation

[Figure: deep network schematic showing the input, hidden layers h^m and h^(m+1) with shared weights W, a target layer, and predicted output feature vectors; convolutions as sliding filters followed by max pooling and subsampling]
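A minimal sketch of the convolution + pooling stack described above, written in PyTorch; the layer sizes and filter counts are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class PatchCNN(nn.Module):
    """Toy convolutional network for 32x32 patches with C input channels."""

    def __init__(self, in_channels=4, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            # Convolutional filters generate feature maps from the input.
            nn.Conv2d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            # Max pooling subsamples for dimension reduction.
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Example: a batch of 8 RGBD patches (4 channels, 32x32 pixels).
logits = PatchCNN(in_channels=4)(torch.randn(8, 4, 32, 32))
```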


SLIDE 6

Occlusion detection from Freiburg dataset

  • Use a readily available dataset from the Computer Vision Group at Technische Universität München (TUM) to demonstrate occlusion edge detection
  • Partition the trajectory into training and test datasets for the neural nets (a sketch of one such split follows below)


Reference: http://vision.in.tum.de/data/datasets/rgbd-dataset/download
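A minimal sketch of one way to partition a recorded trajectory; the contiguous split and the fraction are assumptions, since the paper does not state its exact split, but consecutive video frames are highly correlated, so a random frame-level split would leak near-duplicates between sets:

```python
def split_trajectory(frames, train_fraction=0.7):
    """Split an ordered list of frames into contiguous train/test segments.

    Keeping the split contiguous avoids near-duplicate frames appearing
    in both sets. train_fraction is an assumed value, not from the paper.
    """
    cut = int(len(frames) * train_fraction)
    return frames[:cut], frames[cut:]
```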


SLIDE 7

Problem setup


Input: RGB+D information from consecutive video frames (640x480) captured by a mobile sensor → Deep Convolutional Neural Network → Output: occlusion edges


SLIDE 8

Training and Testing processes

Training process:
  • Partition each frame into 32x32 patch examples
  • Assign center-pixel based labels: occlusion patch vs. no-occlusion patch
  • Train the network on the labeled patches

Testing process:
  • Generate 32x32 patches with a fixed stride
  • Predict the patch (center-pixel) label with the trained network (see the patch-extraction sketch below)
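A minimal sketch of the patch extraction with center-pixel labels; the function name and the exact labeling rule are assumptions, and stride=8 matches one of the settings in the results table:

```python
import numpy as np

def extract_patches(frame, label_map, patch=32, stride=8):
    """Cut a frame into patch examples with center-pixel labels.

    frame: HxWxC array (e.g., 480x640x4 for RGBD).
    label_map: HxW binary array marking occlusion-edge pixels.
    Returns (patches, labels).
    """
    patches, labels = [], []
    h, w = label_map.shape
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            patches.append(frame[y:y + patch, x:x + patch])
            # Label each patch by the class of its center pixel.
            labels.append(int(label_map[y + patch // 2, x + patch // 2]))
    return np.stack(patches), np.array(labels)
```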


SLIDE 9


Post-processing for occlusion edge reconstruction

Testing and post-processing

  • Generate 32x32 patches with a fixed stride and predict each patch (center-pixel) label with the trained network
  • Take the prediction confidence from the softmax posterior
  • Convert each prediction confidence to a patch-wide label using a Gaussian kernel, parameterized by its Full Width at Half Maximum (FWHM)
  • Fuse the Gaussian labels in a mixture model to generate smooth occlusion edges (see the sketch below)
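A minimal sketch of one plausible Gaussian fusion step; the FWHM value and the normalized-mixture form are assumptions, since the slide only states that Gaussian labels are fused in a mixture model:

```python
import numpy as np

def fuse_patch_confidences(confidences, centers, shape, fwhm=16.0):
    """Spread per-patch confidences over the frame with Gaussian kernels.

    confidences: softmax posterior for the occlusion class, one per patch.
    centers: (row, col) of each patch center.
    shape: output frame shape, e.g. (480, 640).
    fwhm: assumed kernel width in pixels; sigma = FWHM / (2*sqrt(2*ln 2)).
    """
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    acc = np.zeros(shape)
    norm = np.zeros(shape)
    for c, (r0, c0) in zip(confidences, centers):
        # Gaussian weight of every pixel w.r.t. this patch center.
        w = np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2 * sigma ** 2))
        acc += c * w
        norm += w
    # Normalized mixture of Gaussian-weighted confidences.
    return acc / np.maximum(norm, 1e-12)
```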


SLIDE 10

Experimental setup

  • Nvidia Tesla K40 GPU with 2880 cores and 12 GB device RAM
  • Initial pre-processing divides the dataset into training and test sets and extracts small images (32x32) from the large frames (480x640)
  • Image size fixed at 32x32, with the number of channels depending on the experiment (see the stacking sketch after this list):
      • 4 channels for RGBD
      • 3 channels for RGB
      • 6 channels for RGBD + optical flow (UV)
  • Ground truth consists of edges labelled using only the depth sensor
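A minimal sketch of how the 3-, 4-, and 6-channel inputs could be assembled; the function and argument names are illustrative:

```python
import numpy as np

def stack_channels(rgb, depth=None, flow_uv=None):
    """Assemble the per-experiment input tensor for a 480x640 frame.

    rgb:     HxWx3 color image.
    depth:   HxW depth map, or None for the RGB-only experiment.
    flow_uv: HxWx2 optical flow (U, V), or None.
    Returns HxWxC with C = 3 (RGB), 4 (RGBD), or 6 (RGBDUV).
    """
    channels = [rgb.astype(np.float32)]
    if depth is not None:
        channels.append(depth.astype(np.float32)[..., None])
    if flow_uv is not None:
        channels.append(flow_uv.astype(np.float32))
    return np.concatenate(channels, axis=-1)
```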



SLIDE 11

Optical flow pre-processing
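A minimal sketch of computing the U and V flow channels from consecutive frames with OpenCV's Farneback method; the parameter values are common defaults, not taken from the paper:

```python
import cv2

def flow_uv(prev_bgr, next_bgr):
    """Dense optical flow (U, V) between consecutive BGR frames."""
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_bgr, cv2.COLOR_BGR2GRAY)
    # Returns an HxWx2 array: horizontal (U) and vertical (V) flow.
    return cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
```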



SLIDE 12

Results



Data           | Channels | Patch stride | Training dataset | Testing dataset | Test error (%, averaged over epochs 80-100) | Computation time/epoch
RGBD (1 frame) | 4        | 4            | 56354            | 500000          | 15.35                                       | 1m 21s
RGBD (1 frame) | 4        | 8            | 14278            | 316167          | 18.76                                       | 2m 17s
RGB (1 frame)  | 3        | 4            | 56354            | 500000          | 16.43                                       | 1m 2s
RGB (1 frame)  | 3        | 8            | 14278            | 316167          | 18.72                                       | 1m 42s
RGBDUV         | 6        | 4            | 56354            | 500000          | 15.18                                       | 1m 22s

SLIDE 13

Post-processing Results

  • Input: RGBD image (32x32x4), stride 8
  • Input: RGBD image (32x32x4), stride 4

Performance improves with higher granularity of fusion (i.e., smaller stride).


SLIDE 14

Post-processing Results

  • Input: RGB image (32x32x3), stride 8
  • Input: RGB image (32x32x3), stride 4

Performance again improves with higher granularity of fusion, but overall detection confidence deteriorates without the D channel.


SLIDE 15

RGBD and optical flow (RGBDUV) Results



SLIDE 16

Conclusion

  • A deep CNN can extract significant occlusion edge features from the RGB channels alone (i.e., without the depth sensor information); occlusion detection accuracy increases when we introduce optical flow
  • Deep Convolutional Neural Nets (Deep CNNs) were applied for multi-modal fusion in occlusion detection
  • The trade-off between high-resolution patch analysis and frame-level computation time is critical for real-time robotics applications
  • We are currently investigating multiple time-frames of RGB input in order to extract structure from motion



SLIDE 17

Questions
