Visual Scene Understanding for Autonomous Driving
Raquel Urtasun, University of Toronto, Oct 3, 2014


  1. Visual Scene Understanding for Autonomous Driving. Raquel Urtasun, University of Toronto, Oct 3, 2014. [Slide 1/34]

  2. Autonomous Driving. State of the art: localization, path planning, obstacle avoidance, with heavy usage of the Velodyne 3D laser scanner and detailed (pre-recorded) maps. Goal: autonomous driving with cheap sensors and little prior knowledge. Problems for computer vision: stereo, optical flow, visual odometry, structure-from-motion; object detection, recognition and tracking; 3D scene understanding.

  9. Benchmarks: KITTI Data Collection. Two stereo rigs (1392 × 512 px, 54 cm baseline, 90° opening), a Velodyne laser scanner, and GPS+IMU localization; 6 hours of data at 10 frames per second!
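The rig geometry above already fixes the depth resolution available to stereo matching. As a rough sketch under a simple pinhole-camera assumption (not KITTI's exact calibration), a 90° horizontal opening over 1392 px implies a focal length of about (1392/2)/tan(45°) ≈ 696 px, and depth then follows Z = f·B/d:

```python
import math

# Stereo rig parameters from the slide: 1392 px wide image,
# 90 degree horizontal opening, 54 cm baseline.
WIDTH_PX = 1392
OPENING_DEG = 90.0
BASELINE_M = 0.54

# Pinhole assumption: half the image width maps to half the opening angle.
focal_px = (WIDTH_PX / 2.0) / math.tan(math.radians(OPENING_DEG / 2.0))

def depth_from_disparity(d_px: float) -> float:
    """Depth in meters for a disparity of d_px pixels: Z = f * B / d."""
    return focal_px * BASELINE_M / d_px

# A few pixels of disparity error translate into large depth differences
# far from the car: compare depths at 10 px vs 13 px disparity.
z10 = depth_from_disparity(10.0)   # ~37.6 m
z13 = depth_from_disparity(13.0)   # ~28.9 m
```

This is why the benchmark's "error > 3 pixels" threshold (used in the evaluation later in the talk) is a meaningful bar: a 3 px disparity error at 10 px disparity shifts the estimated depth by many meters.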

  10. The KITTI Vision Benchmark Suite

  11. First Difficulty: Sensor Calibration. Camera calibration [Geiger et al., ICRA 2012]; Velodyne ↔ camera registration; GPS+IMU ↔ Velodyne registration. [Figure: chain of rigid transforms between the GPS/IMU, Velodyne, and camera frames]
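The registrations on this slide compose: once the GPS/IMU → Velodyne and Velodyne → camera rigid transforms are calibrated, chaining them maps points straight into the camera frame. A minimal sketch with made-up (hypothetical) calibration values, not KITTI's published matrices:

```python
import numpy as np

def rigid(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical calibration results (identity rotations, made-up offsets),
# following the chain on the slide: GPS/IMU -> Velodyne -> camera.
T_velo_from_gps = rigid(np.eye(3), [0.8, 0.0, 1.7])   # GPS/IMU to Velodyne
T_cam_from_velo = rigid(np.eye(3), [0.0, -0.1, 0.3])  # Velodyne to camera

# Composing the two registrations maps a point from the GPS/IMU frame
# directly into the camera frame.
T_cam_from_gps = T_cam_from_velo @ T_velo_from_gps

p_gps = np.array([1.0, 2.0, 0.0, 1.0])   # homogeneous point, GPS frame
p_cam = T_cam_from_gps @ p_gps           # same point, camera frame
```

In practice each transform is estimated jointly from calibration targets and sensor overlap; the point here is only that registration errors compound multiplicatively along the chain, which is why this is listed as the "first difficulty".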

  12. Second Difficulty: Object Annotation. 3D object labels: annotators (undergraduate students from KIT, working for months). Occlusion labels: Mechanical Turk.

  13. One More Difficulty: Evaluation. More than 200 submissions and 8,000 downloads since CVPR 2012!

  14. An autonomous system has to sense the environment.

  15. 3D Reconstruction. Goal: given two cameras mounted on top of the car, reconstruct the environment in 3D.

  16. Joint Stereo, Flow, Occlusion and Segmentation. A slanted-plane MRF with explicit occlusion handling that also computes an over-segmentation of the image into superpixels. The MRF has continuous variables (slanted planes) and discrete variables (boundary labels, superpixel assignments, outliers); boundary labels between segments take one of the states occlusion, hinge, or coplanar. The energy looks at shape, compatibility and boundary length. [Figure: slanted-plane superpixels and boundary relations between segments; inset text garbled in transcription]
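In a slanted-plane model, each superpixel's disparity is an affine function of image position, d(x, y) = a·x + b·y + c. As a sketch of that basic primitive (not the talk's joint MRF inference, which alternates over planes, boundary labels and segment assignments), here is a least-squares plane fit to one superpixel's disparities:

```python
import numpy as np

def fit_slanted_plane(xs, ys, ds):
    """Least-squares fit of d(x, y) = a*x + b*y + c to a superpixel's
    pixel coordinates (xs, ys) and observed disparities ds."""
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    coeffs, *_ = np.linalg.lstsq(A, ds, rcond=None)
    return coeffs  # (a, b, c)

# Synthetic superpixel: disparities generated from a known plane.
rng = np.random.default_rng(0)
xs = rng.uniform(0, 100, 200)
ys = rng.uniform(0, 50, 200)
true_plane = (0.02, -0.01, 30.0)   # a gently slanted surface
ds = true_plane[0] * xs + true_plane[1] * ys + true_plane[2]

a, b, c = fit_slanted_plane(xs, ys, ds)   # recovers the plane
```

The boundary labels then relate neighboring planes: coplanar (same plane), hinge (planes meet at the shared boundary), or occlusion (one plane is in front of the other).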

  17. Comparison to the State of the Art on KITTI (error > 3 pixels, non-occluded pixels).
      Stereo: wSGM 4.97 % [Spangenberg et al., 2013]; AARBM 4.86 % [Einecke et al., 2014]; PR-Sceneflow 4.36 % [Vogel et al., 2013]; PCBP 4.04 % [Yamaguchi et al., 2012]; PR-Sf+E 4.02 % [Vogel et al., 2013]; StereoSLIC 3.92 % [Yamaguchi et al., 2013]; PCBP-SS 3.40 % [Yamaguchi et al., 2013]; Ours (Stereo) 3.39 %; VC-SF 3.05 % [Vogel et al., 2014]; Ours (Joint) 2.82 %.
      Flow: BTF-ILLUM 6.52 % [Demetz et al., 2014]; TGV2ADCSIFT 6.20 % [Braux-Zin et al., 2013]; NLTGV-SC 5.93 % [Ranftl et al., 2014]; MotionSLIC 3.91 % [Yamaguchi et al., 2013]; PR-Sceneflow 3.76 % [Vogel et al., 2013]; PCBP-Flow 3.64 % [Yamaguchi et al., 2013]; PR-Sf+E 3.57 % [Vogel et al., 2013]; Ours (Flow) 3.38 %; VC-SF 2.72 % [Vogel et al., 2014]; Ours (Joint) 2.83 %.
      Runtime on 1 core @ 3.5 GHz at an average resolution of 1237.1 × 374.1 pixels: Joint 26.3 sec; Stereo only 4.8 sec; Flow only 11.0 sec.
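The leaderboard metric above counts the fraction of valid (non-occluded) pixels whose disparity or flow error exceeds 3 pixels. A minimal sketch of that computation (the official KITTI development kit is the authoritative implementation):

```python
import numpy as np

def outlier_rate(estimate, ground_truth, valid_mask, tau=3.0):
    """Fraction of valid (e.g. non-occluded) pixels whose absolute error
    exceeds tau pixels -- the KITTI-style 'error > 3 px' metric."""
    err = np.abs(estimate - ground_truth)
    return float(np.mean(err[valid_mask] > tau))

# Toy example: a 4x4 disparity map with one row off by 5 px.
gt = np.zeros((4, 4))
est = gt.copy()
est[0, :] = 5.0                     # 4 of 16 pixels exceed the threshold
mask = np.ones((4, 4), dtype=bool)  # pretend all pixels are non-occluded
rate = outlier_rate(est, gt, mask)  # 4 / 16 = 0.25
```

For flow the same rate is computed on the endpoint error (the Euclidean norm of the 2D flow error) rather than a scalar difference.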

  18. Results on KITTI [K. Yamaguchi, D. McAllester and R. Urtasun, ECCV 2014]. [Figure: boundary labels (occlusion, hinge, coplanar), disparity image, and flow image]

  19. An autonomous system has to understand the scene in 3D.

  20. 3D Scene Understanding. Goal: infer from a short (≈ 10 s) video sequence: geometric properties, e.g., street orientation; topological properties, e.g., number of intersecting streets; semantic activities, e.g., traffic situations at an intersection; 3D objects, e.g., cars.

  21. Geometric Model. [Figure: model topology and geometric parameters]

  22. Static and Dynamic Observations. Observations: 3D tracklets, generated in 3D from 2D detections by employing the orientation as well as the size of the bounding boxes; segmentation of the scene into semantic labels; lines that follow the dominant orientations in the scene (i.e., reasoning about vanishing points). Representation: we reason about dynamic objects in a bird's-eye view and about the static scene in the image.
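Lifting a 2D detection into the bird's-eye view can be sketched with a flat-ground assumption: the bottom of the bounding box is a viewing ray intersected with the road plane. The intrinsics and camera height below are hypothetical illustration values, not KITTI's calibration, and this is only the position part of a tracklet (the talk's system also uses box orientation and size):

```python
import numpy as np

# Hypothetical pinhole intrinsics and camera mounting height.
FX = FY = 700.0
CX, CY = 640.0, 240.0
CAM_HEIGHT = 1.65   # meters above a flat ground plane

def box_bottom_to_ground(u, v):
    """Intersect the viewing ray through pixel (u, v) with the ground
    plane y = CAM_HEIGHT (camera looks along +z, y points down)."""
    ray = np.array([(u - CX) / FX, (v - CY) / FY, 1.0])
    if ray[1] <= 0:
        raise ValueError("pixel is at or above the horizon")
    scale = CAM_HEIGHT / ray[1]   # how far along the ray the ground lies
    return ray * scale            # 3D point (x, y, z) in the camera frame

# Bottom-center pixel of a detected car's bounding box:
p = box_bottom_to_ground(u=800.0, v=300.0)   # x right, z forward
```

Repeating this per frame and associating nearby ground points over time yields the 3D tracklets the model reasons about in the bird's-eye view.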

  26. Why high-order semantics? Certain behaviors are not possible given the traffic "patterns", and we learn those patterns from data. [Figure: 11 traffic patterns learned from data for 4-way intersections; the arrows represent our concept of a lane]
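The constraint that "certain behaviors are not possible" under a pattern can be sketched as a compatibility test between a tracklet's heading and the pattern's lane directions. Everything below is a hypothetical simplification of the learned patterns, with lanes abstracted to direction vectors in the bird's-eye view:

```python
import numpy as np

def consistent_with_pattern(heading, lane_directions, max_angle_deg=30.0):
    """True if the tracklet heading (2D vector, bird's-eye view) lies
    within max_angle_deg of at least one lane direction of the pattern."""
    h = np.asarray(heading, dtype=float)
    h = h / np.linalg.norm(h)
    cos_thresh = np.cos(np.radians(max_angle_deg))
    for d in lane_directions:
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        if h @ d >= cos_thresh:   # cosine similarity against one lane
            return True
    return False

# Hypothetical 4-way-intersection pattern allowing only straight-through
# traffic on the two axes (the slide's arrows, abstracted):
pattern = [(1, 0), (-1, 0), (0, 1), (0, -1)]

ok = consistent_with_pattern((0.95, 0.1), pattern)    # roughly eastbound
bad = consistent_with_pattern((0.7, 0.7), pattern)    # 45 deg: not allowed
```

In the full model such compatibility scores enter the energy jointly with the estimated topology, so an implausible tracklet can also vote against a candidate pattern rather than just being rejected.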
