3D Vision, Torsten Sattler and Martin Oswald, Spring 2018 (PowerPoint PPT Presentation)



SLIDE 1

3D Vision

Torsten Sattler and Martin Oswald

Spring 2018

SLIDE 2

3D Vision

  • Understanding geometric relations
      • between images and the 3D world
      • between images
  • Obtaining 3D information describing our 3D world
      • from images
      • from dedicated sensors
SLIDE 3

3D Vision

  • Extremely important in robotics and AR / VR
  • Visual navigation
  • Sensing / mapping the environment
  • Obstacle detection, …
  • Many further application areas
  • A few examples …
SLIDE 4

Google Tango

(officially discontinued, lives on as ARCore)

SLIDE 5

Google Tango

SLIDE 6

Image-Based Localization

SLIDE 7

Geo-Tagging Holiday Photos

(Li et al. ECCV 2012)

SLIDE 8

Augmented Reality

(Middelberg et al. ECCV 2014)

SLIDE 9

Video credit: Johannes Schönberger

Large-Scale Structure-from-Motion

SLIDE 10

Virtual Tourism

SLIDE 11

UNC/UKY UrbanScape project

3D Urban Modeling

SLIDE 12

3D Urban Modeling

SLIDE 13

Mobile Phone 3D Scanner

SLIDE 14

Mobile Phone 3D Scanner

SLIDE 15

Self-Driving Cars

SLIDE 16

Self-Driving Cars

SLIDE 17

Self-Driving Cars

SLIDE 18

Micro Aerial Vehicles

SLIDE 19

Microsoft HoloLens

Mixed Reality

SLIDE 20

Virtual Reality

SLIDE 21

Raw Kinect Output: Color + Depth

http://grouplab.cpsc.ucalgary.ca/cookbook/index.php/Technologies/Kinect

SLIDE 22

Human-Machine Interface

SLIDE 23

3D Video with Kinect

SLIDE 24

Autonomous Micro-Helicopter Navigation

Use Kinect to map out obstacles and avoid collisions

SLIDE 25

Dynamic Reconstruction

SLIDE 26

Performance Capture

SLIDE 27

Performance Capture

(Oswald et al. ECCV 14)

SLIDE 28

Performance Capture

SLIDE 29

Motion Capture

SLIDE 30

Interactive 3D Modeling

(Sinha et al. Siggraph Asia 08) collaboration with Microsoft Research (and licensed to MS)

SLIDE 31

Scanning Industrial Sites

as-built 3D model of an off-shore oil platform

SLIDE 32

Scanning Cultural Heritage

SLIDE 33

Cultural Heritage

Stanford’s Digital Michelangelo

Digital archive, art-historical studies

SLIDE 34

accuracy ~1/500 from DV video (i.e., 140 kB JPEGs, 576x720)

Archaeology

SLIDE 35

Forensics

  • Crime scene recording and analysis
SLIDE 36

Forensics

SLIDE 37

Sports

SLIDE 38

Surgery

SLIDE 39

Johannes Schönberger CAB G 85.1

jsch@inf.ethz.ch

Martin Oswald CNB G103.2

martin.oswald@inf.ethz.ch

Torsten Sattler CNB 104

torsten.sattler@inf.ethz.ch

Federico Camposeco CAB G 86.3

federico.camposeco@inf.ethz.ch

Peidong Liu CAB G 84.2

peidong.liu@inf.ethz.ch

Nikolay Savinov CAB G 81.1

nikolay.savinov@inf.ethz.ch

3D Vision Course Team

Katarina Tóthóva CAB G 102.2

katarina.tothova@inf.ethz.ch

SLIDE 40
  • To understand the concepts that relate images to the 3D world and images to other images
  • Explore the state of the art in 3D vision
  • Implement a 3D vision system/algorithm

Course Objectives

SLIDE 41

Learning Approach

  • Introductory lectures:
  • Cover basic 3D vision concepts and approaches.
  • Further lectures:
  • Short introduction to topic
  • Paper presentations (you)

(seminal papers and state of the art, related to your projects)

  • 3D vision project:
  • Choose topic, define scope (by week 4)
  • Implement algorithm/system
  • Presentation/demo and paper report

Grade distribution

  • Paper presentation & discussions: 25%
  • 3D vision project & report: 75%
SLIDE 42

Slides and more

http://www.cvg.ethz.ch/teaching/3dvision/

Also check out on-line “shape-from-video” tutorial:

http://www.cs.unc.edu/~marc/tutorial.pdf http://www.cs.unc.edu/~marc/tutorial/

Textbooks:

  • Hartley & Zisserman, Multiple View Geometry
  • Szeliski, Computer Vision: Algorithms and Applications

Materials

SLIDE 43

Feb 19  Introduction
Feb 26  Geometry, Camera Model, Calibration
Mar 5   Features, Tracking / Matching
Mar 12  Project Proposals by Students
Mar 19  Structure from Motion (SfM) + papers
Mar 26  Dense Correspondence (stereo / optical flow) + papers
Apr 2   Bundle Adjustment & SLAM + papers
Apr 9   Student Midterm Presentations
Apr 16  Easter break
Apr 23  Multi-View Stereo & Volumetric Modeling + papers
Apr 30  Whitsuntide
May 7   3D Modeling with Depth Sensors + papers
May 14  3D Scene Understanding + papers
May 21  4D Video & Dynamic Scenes + papers
May 28  Student Project Demo Day = Final Presentations

Schedule

SLIDE 44

Fast Forward

  • Quick overview of what is coming…
SLIDE 45

  • Pinhole camera
  • Geometric transformations in 2D and 3D

Camera Models and Geometry
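In formulas, the pinhole model maps a world point X to pixel coordinates via x ~ K [R | t] X. A minimal numpy sketch of that mapping (the focal length and principal point below are made-up example values, not course code):

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of Nx3 world points X to Nx2 pixel coordinates."""
    Xc = X @ R.T + t              # world frame -> camera frame
    x = Xc @ K.T                  # apply intrinsic matrix K
    return x[:, :2] / x[:, 2:3]   # perspective divide by depth

# Example intrinsics: focal length 500 px, principal point (320, 240)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)
pix = project(K, R, t, np.array([[0.0, 0.0, 2.0]]))
# a point on the optical axis projects to the principal point
```

The perspective divide in the last line is what makes the model projective rather than linear in pixel space.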

SLIDE 46
  • Known 2D/3D correspondences: compute the projection matrix
  • Also radial distortion (non-linear)

Camera Calibration
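The step this slide describes (2D/3D correspondences to projection matrix) is classically solved with the Direct Linear Transform; a hedged sketch that ignores the non-linear radial distortion part:

```python
import numpy as np

def dlt_projection_matrix(X, x):
    """Direct Linear Transform: estimate the 3x4 projection matrix P from
    n >= 6 2D/3D correspondences (radial distortion is ignored here; the
    non-linear part needs iterative refinement)."""
    A = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        Xh = [Xw, Yw, Zw, 1.0]
        A.append([0.0] * 4 + [-c for c in Xh] + [v * c for c in Xh])
        A.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)   # null vector of A, reshaped to P
```

The recovered P is only defined up to scale, so results are usually compared via reprojections rather than matrix entries.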

SLIDE 47

Harris corners, KLT features, SIFT features

Key concepts: invariance of extraction and descriptors to viewpoint, exposure and illumination changes

Feature Tracking and Matching
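One standard ingredient of descriptor matching is Lowe's ratio test; a deliberately brute-force toy sketch (illustrative only, not an efficient matcher):

```python
import numpy as np

def ratio_test_match(desc1, desc2, ratio=0.8):
    """Brute-force nearest-neighbour matching with Lowe's ratio test:
    keep a match only if the best distance is clearly smaller than the
    second best, which prunes ambiguous correspondences."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches
```

Real systems replace the linear scan with a k-d tree or GPU matcher, but the acceptance criterion is the same.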

SLIDE 48

Triangulation (figure: cameras C1 and C2 observe image points m1, m2 of 3D point M along rays L1, L2)

  • calibration
  • correspondences

3D from Images
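Given calibration and correspondences, the point M in the figure can be recovered by linear (DLT) triangulation; a minimal sketch assuming two known projection matrices:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.
    P1, P2: 3x4 projection matrices; x1, x2: 2D image coordinates."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null vector of A = homogeneous point
    Xh = Vt[-1]
    return Xh[:3] / Xh[3]
```

With noisy correspondences the rays no longer intersect exactly; the SVD solution is the algebraic least-squares compromise.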

SLIDE 49

  • Fundamental matrix
  • Essential matrix

Also: how to robustly compute them from images

Epipolar Geometry
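The fundamental matrix can be estimated from point correspondences with the classic 8-point algorithm; a simplified sketch (no coordinate normalization, no RANSAC, both of which matter in practice):

```python
import numpy as np

def eight_point(x1, x2):
    """Plain 8-point estimate of the fundamental matrix F from >= 8
    correspondences, so that x2^T F x1 = 0 (real pipelines normalize
    coordinates and wrap this in RANSAC for robustness)."""
    A = np.array([[u2 * u1, u2 * v1, u2, v2 * u1, v2 * v1, v2, u1, v1, 1.0]
                  for (u1, v1), (u2, v2) in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, s, Vt = np.linalg.svd(F)   # enforce the rank-2 constraint
    s[2] = 0.0
    return U @ np.diag(s) @ Vt
```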

SLIDE 50

  • Initialize motion (P1, P2 compatible with F)
  • Initialize structure (minimize reprojection error)
  • Extend motion (compute pose through matches seen in 2 or more previous views)
  • Extend structure (initialize new structure, refine existing structure)

Structure from Motion
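The initialization and refinement steps above all revolve around the reprojection error; a minimal sketch of that objective (the canonical camera and point are made-up example values):

```python
import numpy as np

def mean_reprojection_error(P, X, x):
    """Mean 2D distance between observed points x (Nx2) and projections
    of homogeneous 3D points X (Nx4) under the 3x4 matrix P; this is the
    quantity SfM initialization and bundle adjustment minimize."""
    proj = X @ P.T
    proj = proj[:, :2] / proj[:, 2:3]
    return float(np.mean(np.linalg.norm(proj - x, axis=1)))

P = np.hstack([np.eye(3), np.zeros((3, 1))])   # canonical camera [I | 0]
X = np.array([[0.0, 0.0, 2.0, 1.0]])           # one point in front of it
```

Bundle adjustment minimizes this error jointly over all camera poses and 3D points, typically with Levenberg-Marquardt.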

SLIDE 51
  • Visual Simultaneous Localization and Mapping

Visual SLAM

(Clipp et al. ICCV’09)

SLIDE 52

Stereo and Rectification

  • Warp images to simplify the epipolar geometry
  • Compute correspondences for all pixels
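After rectification, correspondence search reduces to a 1D scan along each image row; a deliberately naive SAD block-matching sketch (toy code, far from a production stereo method):

```python
import numpy as np

def disparity_sad(left, right, max_disp, win=2):
    """Winner-takes-all disparity for a rectified pair: for each pixel in
    the left image, slide a (2*win+1)^2 window along the same row of the
    right image and keep the offset with the smallest sum of absolute
    differences (SAD)."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for y in range(win, h - win):
        for x in range(win + max_disp, w - win):
            patch = left[y - win:y + win + 1, x - win:x + win + 1]
            costs = [np.abs(patch - right[y - win:y + win + 1,
                                          x - d - win:x - d + win + 1]).sum()
                     for d in range(max_disp + 1)]
            disp[y, x] = int(np.argmin(costs))
    return disp
```

Real methods add sub-pixel refinement, smoothness terms (e.g. semi-global matching), and occlusion handling.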

SLIDE 53

Multi-View Stereo

SLIDE 54

Joint 3D Reconstruction and Class Segmentation

(Häne et al. CVPR 2013)

joint reconstruction and segmentation

(ground, building, vegetation, stuff)

reconstruction only

(isotropic smoothness prior)

■ Building ■ Ground ■ Vegetation ■ Clutter

SLIDE 55

Structured Light

  • Projector = camera
  • Use specific patterns to obtain correspondences
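Gray-code stripe sequences are one classic choice of such patterns: each projector column receives a unique bit sequence that a camera pixel can decode back to a column index. A small sketch of the encode/decode step:

```python
def binary_to_gray(n, width):
    """Bit pattern (MSB first) that the projector shows for column n."""
    g = n ^ (n >> 1)
    return [(g >> i) & 1 for i in range(width - 1, -1, -1)]

def gray_to_binary(gray_bits):
    """Decode the bit sequence observed at a camera pixel back to the
    projector column index (each binary bit is the XOR-prefix of the
    Gray bits seen so far)."""
    value, bit = 0, 0
    for g in gray_bits:
        bit ^= g
        value = (value << 1) | bit
    return value
```

Gray codes are preferred over plain binary because adjacent columns differ in only one bit, so a decoding error at a stripe boundary costs at most one column.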

SLIDE 56

Papers and Discussion

  • Will cover recent state of the art
  • Each student team will present a paper (5 min per team member), followed by a discussion

  • “Adversary” to lead the discussion
  • Papers will be related to projects/topics
  • Will distribute papers later

(depending on chosen projects)

SLIDE 57

Projects and reports

  • Project on 3D Vision-related topic
  • Implement algorithm / system
  • Evaluate it
  • Write a report about it
  • 3 Presentations / Demos:
  • Project Proposal Presentation (week 4)
  • Midterm Presentation (week 8)
  • Project Demos (week 15)
  • Ideally: Groups of 3 students
SLIDE 58

Course project example: Build your own 3D scanner!

Example: Bouguet ICCV’98

SLIDE 59

Project Topics

SLIDE 60

Goal: Description:

DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks

The goal is to implement a deep recurrent convolutional neural network for end-to-end visual odometry [1].

Most existing VO algorithms are developed under a standard pipeline including feature extraction, feature matching, motion estimation, local optimization, etc. Although some of them have demonstrated superior performance, they usually need to be carefully designed and specifically fine-tuned to work well in different environments. Some prior knowledge is also required to recover an absolute scale for monocular VO. This project is to implement a novel end-to-end framework for monocular VO by using deep Recurrent Convolutional Neural Networks (RCNNs). Since it is trained and deployed in an end-to-end manner, it infers poses directly from a sequence of raw RGB images (videos) without adopting any module of the conventional VO pipeline. Based on the RCNNs, it not only automatically learns effective feature representations for the VO problem through Convolutional Neural Networks, but also implicitly models sequential dynamics and relations using deep Recurrent Neural Networks. Extensive experiments on the KITTI VO dataset show competitive performance to state-of-the-art methods, verifying that the end-to-end deep learning technique can be a viable complement to traditional VO systems.

[1] Wang et al., DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks, ICRA 2017

Recommended: Python and prior knowledge in machine learning

Peidong Liu, CNB D102 peidong.liu@inf.ethz.ch

SLIDE 61

Goal: Description:

Deep Relative Pose Estimation for Stereo Camera

Design a neural network to estimate the relative pose between two frames for a stereo camera.

Recently there has been some work on neural-network-based relative pose estimation between two images/frames, aimed at autonomous driving applications. However, compared to traditional geometric methods (e.g., the 5-point algorithm), these methods have much worse accuracy. With a stereo camera we can obtain two frames captured at the same time and recover the depth for each frame without scale ambiguity, which helps the pose estimation. This project aims to design a neural network to estimate the relative pose between two frames of a stereo camera. The students will start by studying existing neural networks for disparity/depth estimation and pose estimation for monocular cameras. Then they will focus on the design of a neural network for the stereo camera.

[1] Zhou T, Brown M, Snavely N, Lowe DG. Unsupervised learning of depth and ego-motion from video. In CVPR 2017.
[2] Ummenhofer B, Zhou H, Uhrig J, Mayer N, Ilg E, Dosovitskiy A, Brox T. DeMoN: Depth and motion network for learning monocular stereo. In CVPR 2017.
[3] Mayer N, Ilg E, Hausser P, Fischer P, Cremers D, Dosovitskiy A, Brox T. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In CVPR 2016.

Required: Python, Linux Recommended: Experience with TensorFlow, PyTorch or other deep learning frameworks

Zhaopeng Cui zhaopeng.cui@inf.ethz.ch CNB G104


SLIDE 63

Goal: Description:

Differential Rolling-Shutter SfM

Model the Rolling Shutter (RS) to create RS artifact free images

It is well known that moving RS cameras create distorted images. The effect is typically visible when vertical structures appear slanted. In this work we want to model the RS effect and compensate for it. The input to the algorithm is a short image burst, from which we will first compute the optical flow, then estimate the camera pose and camera motion parameters. Finally we want to create a global shutter image by warping the RS image over the estimated depth into a global shutter reference frame.

[1] Zhuang et al., “Rolling-Shutter-Aware Differential SfM and Image Rectification”, ICCV 2017
Required: C++, some experience with image processing
Recommended: Experience with OpenCV

Olivier Saurer <saurero@inf.ethz.ch>

SLIDE 64

Goal: Description:

3DScanBox

Build a multi-camera 3D scan box

Implement a simple 3D scanner by using an aluminum frame and a bunch of cameras. The necessary material is provided. Depending on the group size and interest the focus of the project can be put on different aspects. Ranging from multi-camera online calibration to multi-view stereo or fusion.

Required: C++, some experience with image processing
Recommended: OpenCV, maybe Google Ceres

Petri Tanskanen, petri.tanskanen@inf.ethz.ch

SLIDE 65

Goal: Description:

Camera Pose Estimation For Artistic Purposes

Create a Blender plugin that finds poses and focal lengths (extrinsic and intrinsic parameters) of a set of reference images. 3D artists take reference images of scenes and objects they want to model. For modelling it is helpful to align the virtual cameras in the modelling tool of choice (Blender, Maya, …) to the reference images. There exists a Blender plugin [1] that utilizes two-point perspective and user input to find the pose and focal length of a single reference image. This project aims to implement an SfM plugin to find relative poses and focal lengths of a set of reference images. Open questions: rely on user input for point matches or use SIFT + feature matching? Images might need to be undistorted, since Blender’s camera does not model lens distortion.

[1] Per Gantelius, “BLAM”, https://github.com/stuffmatic/blam
Required: Python, C++
Recommended: Blender, OpenCV or COLMAP
Daniel Thul daniel.thul@inf.ethz.ch

SLIDE 66

Goal: Description:

Transfer from Recognition to Optical Flow by Matching Neural Paths

Implementation of Optical Flow Method by Matching Neural Paths

The goal is to extend the stereo method of Savinov et al. [1] to optical flow. The main challenge is handling the large memory requirements by passing only a restricted subset of the most probable labels during the back-propagation phase of the label likelihoods. The method can be implemented in any deep learning framework.

[1] Savinov et al., “Matching Neural Paths: Transfer from Recognition to Correspondence Search”, NIPS 2017
Required: C++, CUDA, any deep learning framework

Lubor Ladicky, lubor.ladicky@inf.ethz.ch

SLIDE 67

Goal: Description:

Navigation by Reinforcement Learning

Benchmark different RL algorithms on their ability to learn to navigate to a goal

You will take one of the popular RL libraries like Tensorforce [2] or OpenAI Baselines [3] and benchmark them on the 3D navigation tasks proposed in [1]. Those tasks are implemented as maps in a VizDoom environment. The agent is given a high reward for reaching the image-specified goal and a small reward for collecting items like healthkits (which should ignite its curiosity and make it explore). Its goal is to maximize rewards. You will compare the following RL methods: A3C, A2C, PPO.

[1] Savinov et al., "Semi-parametric topological memory for navigation", ICLR 2018, https://openreview.net/pdf?id=SygwwGbRW
[2] https://github.com/reinforceio/tensorforce
[3] https://github.com/openai/baselines
Required: Python
Recommended: knowledge in machine learning, experience with TensorFlow and RL

Nikolay Savinov nikolay.savinov@inf.ethz.ch

SLIDE 68

Goal: Description:

Appearance Representation based on Auto-Encoders

Improve the appearance model with a deep auto-encoder

This project aims to build efficient appearance representations of shapes observed from multiple viewpoints and over time. Recent work [1] has addressed this using Principal Component Analysis (PCA). The goal of this project is to explore, as an alternative, deep auto-encoders for dimensionality reduction. The students will build on existing tools using MATLAB and Python / TensorFlow to explore appearance representations obtained from auto-encoders and compare the results to [1].

[1] Boukhayma et al., “Eigen appearance maps of dynamic shapes”, ECCV 2016
Required: Python and MATLAB, some experience with image processing
Recommended: Some experience with deep learning / TensorFlow

  • Dr. Vagia Tsiminaki (vagia.tsiminaki@inf.ethz.ch)
  • Dr. Lisa Koch (lisa.koch@inf.ethz.ch)

Auto-encoder for dimensionality reduction

SLIDE 69

Goal: Description:

SuperPoint: Self-Supervised Interest Point Detection and Description

The goal is to implement a self-supervised fully convolutional neural network for interest point detection and description [1]. This project is to implement a self-supervised framework for training interest point detectors and descriptors suitable for a large number of multiple-view geometry problems in computer vision. As opposed to patch-based neural networks, this fully-convolutional model operates on full-sized images and jointly computes pixel-level interest point locations and associated descriptors in one forward pass. The authors introduce Homographic Adaptation, a multi-scale, multi-homography approach for boosting interest point detection repeatability and performing cross-domain adaptation (e.g., synthetic-to-real). Their model, when trained on the MS-COCO generic image dataset using Homographic Adaptation, is able to repeatedly detect a much richer set of interest points than the initial pre-adapted deep model and any other traditional corner detector. The final system gives rise to state-of-the-art homography estimation results on HPatches when compared to LIFT, SIFT and ORB.

[1] DeTone et al., SuperPoint: Self-Supervised Interest Point Detection and Description, arXiv 2017

Recommended: Python and prior knowledge in machine learning

Peidong Liu, CNB D102 peidong.liu@inf.ethz.ch

SLIDE 70

Goal: Description:

Real-Time Surface Reconstruction

Perform depth-map fusion directly into the mesh.

Traditional approaches rely on volumetric representations or point clouds to represent the environment and fuse the different depth measurements. In this work we want to use a mesh representation to model the environment. The goal is to fuse new depth estimates directly into the mesh. Adaptive tessellation is used to represent different levels of geometric detail in the scene. The input to the algorithm is a set of calibrated RGBD images. The focus of the project is the implementation of the fusion algorithm. For this we will closely follow Zienkiewicz et al. [1].

[1] Zienkiewicz et al., “Monocular, Real-Time Surface Reconstruction using Dynamic Level of Detail”, 3DV 2016
Required: C++, some experience with image processing
Recommended: Experience with OpenCV

Olivier Saurer <saurero@inf.ethz.ch>

SLIDE 71

Goal: Description:

Data Generation with a Virtual Simulator for Autonomous Driving

Generate 3D training data with a recent open urban driving simulator for autonomous driving.

Recently an open-source simulator for autonomous driving research, named CARLA [1], has been released. This simulator supports real-time data acquisition of RGB images, semantic segmentation, and depth maps, which can be used as training data for deep learning methods. This project aims to utilize the virtual simulator to generate more kinds of training data, including 2D/3D instance-level segmentation, 3D bounding boxes, and 3D shapes and poses of vehicles, etc., which will be used for the training of deep 3D detection methods.

[1] Dosovitskiy A, Ros G, Codevilla F, López A, Koltun V. CARLA: An open urban driving simulator. Conference on Robot Learning (CoRL), 2017
Required: C++, Python
Recommended: Familiarity with UE4 programming

Zhaopeng Cui zhaopeng.cui@inf.ethz.ch CNB G104

SLIDE 72

Goal: Description:

Data fusion for semantic 3D reconstruction

Improve the data fusion pipeline for semantic 3D reconstruction using a learning approach

Semantic 3D reconstruction is the task of jointly reconstructing and segmenting a 3D model. It has been shown in [1] that both tasks can benefit from each other: the 3D structure offers ground for regularizing the segmentation, while the semantic information gives access to shape priors (e.g. ground is flat and horizontal…). Methods presented in [1] or [2] take as input multiple depth maps and corresponding 2D semantic segmentations, and fuse them into a modified Truncated Signed Distance Function (TSDF) [3]. Though very efficient, this fusion could be improved in order to obtain better input for the methods.

In this project we propose to leverage the availability of semantic 3D data, and machine learning libraries such as TensorFlow, in order to learn a method to fuse the data used for semantic 3D segmentation.

[1] Dense Semantic 3D Reconstruction, Häne et al., TPAMI 2017
[2] Learning Priors for Semantic 3D Reconstruction, Cherabier et al., unpublished 2018
[3] A Volumetric Method for Building Complex Models from Range Images, Curless and Levoy, SIGGRAPH 1996
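The classic TSDF fusion of Curless and Levoy [3] is, per voxel, a truncation followed by a weighted running average; a minimal sketch with made-up truncation band and weights:

```python
import numpy as np

def fuse_tsdf(tsdf, weight, new_sdf, new_w=1.0, trunc=0.1):
    """One Curless-Levoy fusion step per voxel: clamp the new signed
    distances to the truncation band [-trunc, trunc], then blend them
    into the grid with a weighted running average."""
    d = np.clip(new_sdf, -trunc, trunc)
    fused = (tsdf * weight + d * new_w) / (weight + new_w)
    return fused, weight + new_w

# two example voxels, initially empty (weight 0)
grid, w = np.zeros(2), np.zeros(2)
grid, w = fuse_tsdf(grid, w, np.array([0.05, -0.5]))
```

The semantic variants cited above replace the scalar distance per voxel with per-class quantities, but the per-voxel averaging structure is the part a learned fusion would substitute.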

Required: Python, TensorFlow
Recommended: Optimization (maths), C++

Ian Cherabier (ian.cherabier@inf.ethz.ch) Martin Oswald (martin.oswald@inf.ethz.ch)


Goal: Description:

Surface Reconstruction in Medical Imaging: Data and CNNs

Creation of synthetic MR datasets and their use in testing of various surface reconstruction architectures

Required: Python, Matlab Recommended: Experience with machine learning and TensorFlow

Katarina Tothova katarina.tothova@inf.ethz.ch

Reconstruction of organ surfaces is an important task in medical image analysis, especially in cardiac and neuro-imaging. Besides their significance in diagnosis and surgical planning, high-quality organ surface models provide powerful measures for statistical analysis or disease tracking. Thanks to recent advances in machine learning, we are devising a deep neural network–based approach for direct organ surface reconstruction from MRI data. To test the efficacy of the proposed network architectures, it is necessary to design and produce relevant synthetic MR data

SLIDE 77

Goal: Description:

3D Appearance Super-resolution Benchmark

Generate appearance super-resolution benchmark datasets

This project aims to generate a Super-Resolution Appearance dataset and provide a systematic benchmark for evaluation. Previous work [1] presents a framework for synthetic generation of realistic benchmarks for 3D reconstruction from images. The ETH3D benchmark [2] covers a variety of indoor and outdoor scenes. The goal of this project is to build on these works and generate a super-resolved appearance dataset for the multi-view case.

[1] A. Ley et al. “SyB3R: A Realistic Synthetic Benchmark for 3D Reconstruction from Images” ECCV 2016
[2] T. Schöps et al. “A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos” CVPR 2017

Required: Matlab/Python, some experience with image processing
Recommended: Experience with C++, scripting languages

  • Dr. Vagia Tsiminaki (vagia.tsiminaki@inf.ethz.ch)
SLIDE 78

Goal: Description:

Super-resolving Appearance of 3D Faces for Dermatology App

Super-resolve appearance for mobile phone applications

This project aims to implement a super-resolution algorithm for appearance representations of 3D faces. Previous work [1] presented a method to retrieve high resolution textures of objects observed in multiple videos under small object deformations. The goal of this project is to implement the proposed method for mobile phone applications where performance in terms of time and memory is important. The students will implement the super-resolution framework [1] using C++. The project can be built upon an existing C++/CUDA implementation of [2].

[1] Tsiminaki et al. “High resolution 3D shape texture from multiple videos” CVPR 2014
[2] D. Mitzel, T. Pock, T. Schoenemann and D. Cremers, Video Super Resolution using Duality based TV-L1 Optical Flow, DAGM, pages 432-441, 2009

Required: C++, some experience with image processing
Recommended: Experience with Matlab

  • Dr. Vagia Tsiminaki (vagia.tsiminaki@inf.ethz.ch)
  • Dr. Martin Oswald (martin.oswald@inf.ethz.ch)


SLIDE 79

Goal: Description:

Motion blur aware camera pose tracking

The goal is to implement a camera pose tracker for motion blurred images. A camera pose tracker is usually the front-end of a visual odometry (VO) algorithm. Most existing works assume the input images to VO are sharp. However, images can easily be blurred, which would make the VO fail, if the camera moves too fast with a long exposure time. In this project, we plan to investigate and implement a motion blur aware camera pose tracker. To make the problem tractable, we assume the reference image is sharp and only the current image is motion blurred. Furthermore, we assume the depth map corresponding to the reference image is already known. All the required data can be generated from a simulation tool, which is already set up for you.

Required: Good programming skills in C++

Peidong Liu and Vagia Tsiminaki, CNB D102 peidong.liu@inf.ethz.ch vagia.tsiminaki@inf.ethz.ch

SLIDE 80

3D Vision, Spring Semester 2018

Your Own Project

Goal: Learn about the techniques presented in the lecture
Description: Choose your own topic!
Available hardware: Google Tango Tablets, Microsoft HoloLens, GoPro Cameras, Intel RealSense Sensor
Supervisor: We find one for you
Requirements / Tools: Related to 3D Vision / topics of the lecture

SLIDE 81

Your Next Steps

  • Find a group (ideally: groups of 3)
  • Find a project (one of ours or your own)
  • Topic subscription via Doodle in a few days
  • For questions contact us via the lecture Moodle (preferred) or contact Nikolay via email

  • First come first serve!
  • Do not contact supervisors directly!
  • After topic assignment: talk with your supervisor
  • Write a project proposal
  • Don’t worry: You’ll get reminders!
SLIDE 82

Feb 19  Introduction
Feb 26  Geometry, Camera Model, Calibration
Mar 5   Features, Tracking / Matching
Mar 12  Project Proposals by Students
Mar 19  Structure from Motion (SfM) + papers
Mar 26  Dense Correspondence (stereo / optical flow) + papers
Apr 2   Easter break
Apr 9   Bundle Adjustment & SLAM + papers
Apr 16  Student Midterm Presentations
Apr 23  Multi-View Stereo & Volumetric Modeling + papers
Apr 30  3D Modeling with Depth Sensors + papers
May 7   3D Scene Understanding + papers
May 14  4D Video & Dynamic Scenes + papers
May 21  Whitsuntide
May 28  Student Project Demo Day = Final Presentations

Schedule