3D Vision
Andreas Geiger and Torsten Sattler
Spring 2017
3D Vision
- Understanding geometric relations
- between images and the 3D world
- between images
- Obtaining 3D information describing our 3D world
- from images
- from dedicated sensors
3D Vision
- Extremely important in robotics and AR / VR
- Visual navigation
- Sensing / mapping the environment
- Obstacle detection, …
- Many further application areas
- A few examples …
Google Tango
Google Tango
Image-Based Localization
Geo-Tagging Holiday Photos
(Li et al. ECCV 2012)
Augmented Reality
(Middelberg et al. ECCV 2014)
Video credit: Johannes Schönberger
Large-Scale Structure-from-Motion
Virtual Tourism
UNC/UKY UrbanScape project
3D Urban Modeling
3D Urban Modeling
Mobile Phone 3D Scanner
Mobile Phone 3D Scanner
Self-Driving Cars
Self-Driving Cars
Self-Driving Cars
Micro Aerial Vehicles
Microsoft HoloLens
Mixed Reality
Virtual Reality
Raw Kinect Output: Color + Depth
http://grouplab.cpsc.ucalgary.ca/cookbook/index.php/Technologies/Kinect
Human-Machine Interface
3D Video with Kinect
Autonomous Micro-Helicopter Navigation
Use Kinect to map out obstacles and avoid collisions
Dynamic Reconstruction
Performance Capture
Performance Capture
(Oswald et al. ECCV 14)
Performance Capture
Motion Capture
Interactive 3D Modeling
(Sinha et al., SIGGRAPH Asia 08), collaboration with Microsoft Research (and licensed to …)
Scanning Industrial Sites
as-built 3D model of an off-shore oil platform
Scanning Cultural Heritage
Cultural Heritage
Stanford’s Digital Michelangelo
Digital archive, art-historical studies
accuracy ~1/500 from DV video (i.e., 140 kB JPEGs, 576×720)
Archaeology
Forensics
- Crime scene recording and analysis
Forensics
Sports
Surgery
Johannes Schönberger CAB G 85.1
jsch@inf.ethz.ch
Andreas Geiger CNB G105
andreas.geiger@inf.ethz.ch
Torsten Sattler CNB G104
torsten.sattler@inf.ethz.ch
Federico Camposeco CAB G 86.3
federico.camposeco@inf.ethz.ch
Peidong Liu CAB G 84.2
peidong.liu@inf.ethz.ch
Pablo Speciale CAB G 86.3
pablo.speciale@inf.ethz.ch
3D Vision Course Team
- To understand the concepts that relate images to the 3D world and images to other images
- Explore the state of the art in 3D vision
- Implement a 3D vision system/algorithm
Course Objectives
Learning Approach
- Introductory lectures:
- Cover basic 3D vision concepts and approaches.
- Further lectures:
- Short introduction to topic
- Paper presentations (you)
(seminal papers and state-of-the-art, related to your projects)
- 3D vision project:
- Choose topic, define scope (by week 4)
- Implement algorithm/system
- Presentation/demo and paper report
Grade distribution
- Paper presentation & discussions: 25%
- 3D vision project & report: 75%
Slides and more
http://www.cvg.ethz.ch/teaching/3dvision/
Also check out the online “shape-from-video” tutorial:
http://www.cs.unc.edu/~marc/tutorial.pdf
http://www.cs.unc.edu/~marc/tutorial/
Textbooks:
- Hartley & Zisserman, Multiple View Geometry
- Szeliski, Computer Vision: Algorithms and Applications
Materials
Feb 20  Introduction
Feb 27  Geometry, Camera Model, Calibration
Mar 6   Features, Tracking / Matching
Mar 13  Project Proposals by Students
Mar 20  Structure from Motion (SfM) + papers
Mar 27  Dense Correspondence (stereo / optical flow) + papers
Apr 3   Bundle Adjustment & SLAM + papers
Apr 10  Student Midterm Presentations
Apr 17  Easter break
Apr 24  Multi-View Stereo & Volumetric Modeling + papers
May 1   Labour Day
May 8   3D Modeling with Depth Sensors + papers
May 15  3D Scene Understanding + papers
May 22  4D Video & Dynamic Scenes + papers
May 29  Student Project Demo Day = Final Presentations
Schedule
Fast Forward
- Quick overview of what is coming…
- Pinhole camera
- Geometric transformations in 2D and 3D
Camera Models and Geometry
- Know 2D/3D correspondences, compute projection matrix
- Also radial distortion (non-linear)
Camera Calibration
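To make the calibration step concrete, here is a minimal sketch of the linear (DLT) part: with n >= 6 known 2D/3D correspondences, the 3x4 projection matrix is the null vector of a stacked constraint matrix. The function name and array shapes are assumptions for illustration; the non-linear radial distortion refinement is omitted.

```python
import numpy as np

def estimate_projection_matrix(X, x):
    """Direct Linear Transform: 3x4 projection matrix P from n >= 6
    correspondences. X: (n, 3) world points, x: (n, 2) image points."""
    A = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        Ph = [Xw, Yw, Zw, 1.0]  # homogeneous 3D point
        A.append([0.0] * 4 + [-c for c in Ph] + [v * c for c in Ph])
        A.append(Ph + [0.0] * 4 + [-u * c for c in Ph])
    # P is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)
```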
Harris corners, KLT features, SIFT features
Key concepts: invariance of feature extraction and descriptors to viewpoint, exposure and illumination changes
Feature Tracking and Matching
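As a concrete preview, a short OpenCV sketch of SIFT detection and matching with Lowe's ratio test (file names are placeholders; older OpenCV builds expose SIFT as cv2.xfeatures2d.SIFT_create):

```python
import cv2

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file names
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching with Lowe's ratio test to reject ambiguous matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in knn if m.distance < 0.75 * n.distance]
```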
Triangulation
[Figure: a 3D point M triangulated from its projections m1 and m2 in cameras C1 and C2]
- calibration
- correspondences
3D from Images
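A minimal sketch of the linear triangulation the figure illustrates: given the two projection matrices and a correspondence (m1, m2), the point M is the null vector of a stacked 4x4 system. Names and shapes are illustrative assumptions.

```python
import numpy as np

def triangulate(P1, P2, m1, m2):
    """Linear triangulation of one 3D point M from two views.
    P1, P2: 3x4 projection matrices; m1, m2: (u, v) pixel coordinates."""
    A = np.vstack([
        m1[0] * P1[2] - P1[0],
        m1[1] * P1[2] - P1[1],
        m2[0] * P2[2] - P2[0],
        m2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    M = Vt[-1]
    return M[:3] / M[3]  # de-homogenize
```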
Fundamental matrix, essential matrix
Also how to robustly compute them from images
Epipolar Geometry
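In practice the robust computation pairs a minimal solver with RANSAC; a hedged OpenCV sketch, assuming pts1/pts2 are Nx2 arrays of matched pixels and K the calibrated intrinsics:

```python
import cv2
import numpy as np

# Fundamental matrix from noisy matches, robustified with RANSAC.
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC,
                                        ransacReprojThreshold=1.0,
                                        confidence=0.999)

# With calibrated cameras, the essential matrix follows as E = K^T F K.
E = K.T @ F @ K
```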
- Initialize motion (P1, P2 compatible with F)
- Initialize structure (minimize reprojection error)
- Extend motion (compute pose through matches seen in 2 or more previous views)
- Extend structure (initialize new structure, refine existing structure)
Structure from Motion
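The "extend motion" step amounts to a robust PnP solve against already-triangulated points; a minimal OpenCV sketch (pts3d/pts2d are assumed 2D-3D matches, K the intrinsics):

```python
import cv2

# Camera pose of a new view from 2D-3D matches (RANSAC-wrapped PnP).
ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, None,
                                             reprojectionError=4.0)
R, _ = cv2.Rodrigues(rvec)  # pose of the new view: x ~ K [R | t] X
```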
- Visual Simultaneous Localization and Mapping
Visual SLAM
(Clipp et al. ICCV’09)
Stereo and Rectification
- Warp images to simplify epipolar geometry
- Compute correspondences for all pixels
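In OpenCV terms, these two steps map to stereoRectify/remap followed by a semi-global matcher; a sketch assuming a calibrated pair (K1, d1, K2, d2, R, T) and grayscale images imgL, imgR:

```python
import cv2

size = imgL.shape[::-1]  # (width, height) of the grayscale images
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, size, R, T)
mapLx, mapLy = cv2.initUndistortRectifyMap(K1, d1, R1, P1, size, cv2.CV_32FC1)
mapRx, mapRy = cv2.initUndistortRectifyMap(K2, d2, R2, P2, size, cv2.CV_32FC1)
rectL = cv2.remap(imgL, mapLx, mapLy, cv2.INTER_LINEAR)
rectR = cv2.remap(imgR, mapRx, mapRy, cv2.INTER_LINEAR)

# Dense correspondences: semi-global matching along horizontal scanlines.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = sgbm.compute(rectL, rectR).astype("float32") / 16.0  # fixed-point output
```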
Multi-View Stereo
Joint 3D Reconstruction and Class Segmentation (Häne et al., CVPR 2013)
joint reconstruction and segmentation
(ground, building, vegetation, stuff)
reconstruction only
(isotropic smoothness prior)
[Figure legend: building, ground, vegetation, clutter]
Structured Light
- Projector = camera
- Use specific patterns to obtain correspondences
Papers and Discussion
- Will cover recent state of the art
- Each student team will present a paper (5 min per team member), followed by discussion
- “Adversary” to lead the discussion
- Papers will be related to projects/topics
- Will distribute papers later
(depending on chosen projects)
Projects and reports
- Project on 3D Vision-related topic
- Implement algorithm / system
- Evaluate it
- Write a report about it
- 3 Presentations / Demos:
- Project Proposal Presentation (week 4)
- Midterm Presentation (week 8)
- Project Demos (week 15)
- Ideally: Groups of 3 students
Course project example: Build your own 3D scanner!
Example: Bouguet ICCV’98
Benchmarking Relative Pose Solvers
Implement two pose estimation methods and find which is the best algorithm for relative motion with a known vertical direction.
Many algorithms for upright relative motion (camera-to-camera motion with a known direction) exist; the most important ones are [1], [2] and [3] (code online). However, these papers do not compare thoroughly against each other, and no survey that compares them fairly currently exists. In this project you will implement [1,2,3] and compare their time performance and accuracy using simulated data (provided by the supervisor). You will also compare them against the state of the art in relative motion without known vertical direction [4] (code online).
Federico Camposeco fede@inf.ethz.ch
Required: MATLAB
[1] Kalantari et al., “A new solution to the relative orientation problem using only 3 points and the vertical direction”, J. of Math. Imaging and Vision 2011
[2] Naroditsky et al., “Two efficient solutions for visual odometry using directional correspondence”, PAMI 2012
[3] Sweeney et al., “Efficient Computation of Absolute Pose for Gravity-Aware Augmented Reality”, ISMAR 2015
[4] Hartley & Li, “An efficient hidden variable approach to minimal-case camera motion estimation”, PAMI 2012
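For the accuracy comparison, a common metric is the angular error of the rotation plus the angle between unit translation directions (translation scale is unobservable in relative pose); a minimal sketch with hypothetical function and argument names:

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Angular errors (degrees) between estimated and ground-truth relative pose."""
    cos_r = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    rot_err = np.degrees(np.arccos(np.clip(cos_r, -1.0, 1.0)))
    # Compare translation *directions* only; sign is also ambiguous.
    t_e = t_est / np.linalg.norm(t_est)
    t_g = t_gt / np.linalg.norm(t_gt)
    trans_err = np.degrees(np.arccos(np.clip(abs(t_e @ t_g), -1.0, 1.0)))
    return rot_err, trans_err
```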
Simulating Structure from Motion
Build a synthetic SfM generator in MATLAB.
Many applications, such as localization, rely on using a Structure-from-Motion (SfM) dataset as input. The characteristics of the SfM data affect how the algorithms built to work on them behave. Currently, only real-data SfM reconstructions are a valid way of validating such applications; however, such data is limited and expensive to reproduce. In this project, you will implement a MATLAB simulator that can build a faithful SfM reconstruction.
Federico Camposeco fede@inf.ethz.ch
Required: MATLAB
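A minimal Python sketch of the simulator core (the project itself asks for MATLAB): sample 3D structure, place cameras on an assumed linear trajectory, project, and add pixel noise. All scene parameters here are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pts, n_cams = 500, 10
X = rng.uniform(-5, 5, (n_pts, 3)) + np.array([0.0, 0.0, 10.0])  # points in front of rig
K = np.array([[500.0, 0, 320], [0, 500, 240], [0, 0, 1]])

observations = []  # per-camera noisy 2D measurements of the same structure
for i in range(n_cams):
    t = np.array([0.5 * i, 0.0, 0.0])            # cameras along a straight line
    P = K @ np.hstack([np.eye(3), -t[:, None]])  # identity rotation for simplicity
    xh = (P @ np.vstack([X.T, np.ones(n_pts)])).T
    x = xh[:, :2] / xh[:, 2:3] + rng.normal(0, 0.5, (n_pts, 2))  # 0.5 px noise
    observations.append(x)
```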
Semantic 3D reconstruction using a label hierarchy
Goal: leverage a semantic label hierarchy to accelerate semantic 3D reconstruction and reduce its memory consumption
Description:
Semantic 3D reconstruction is the task of jointly inferring the 3D geometry and the semantic nature of objects in a scene from images. This can be formulated as a volumetric labeling problem as in [1]. However, [1] is limited in the number of labels that it can reconstruct. Solutions like [2] have been proposed to tackle this issue. Exploiting the hierarchy that exists between labels in a coarse-to-fine way could improve the efficiency of [1]. For instance, wall is a coarse label, refined by windows, themselves refined by frames, and so on. We propose to explore this idea in order to reduce the memory footprint and speed up the algorithm, leading to an increase in the number of labels one can reconstruct at the same time, and to greater efficiency.
[1] Häne et al., “Joint 3D scene reconstruction and class segmentation”, CVPR 2013
[2] Cherabier et al., “Multi-label semantic 3D reconstruction using voxel blocks”, 3DV 2016
Requirements / Tools: Required: C++, Linux; Recommended: Convex Optimization
Supervisors: Ian Cherabier (ian.cherabier@inf.ethz.ch), Martin Oswald (martin.oswald@inf.ethz.ch)
[Figure: example label hierarchy (Furniture, Bed, Night table, Pillow, Sheet, Drawer)]
Detection of repeated structures in SfM
- Detect large repeated structures in urban sparse point clouds obtained through SfM.
Sparse point clouds of urban scenes obtained through Structure-from-Motion are used in a large number of applications, such as virtual tourism, autonomous navigation, urban scene modelling, etc. However, many of these applications require the localization of a new image against this previously computed point cloud. This task becomes very hard in the presence of repetitive structures on the building. In this project, we aim to use the original images and the sparse point cloud to detect all of these repetitive structures and identify them in the point cloud, such that subsequent applications can use this information to improve their performance. We will start by using the SIFT matching-based algorithm described in [1] to detect matching 3D points, and then use guided matching and plane extraction to find the full area of support of each detected symmetry/repetition.
[1] Cohen et al., “Discovering and Exploiting 3D Symmetries in Structure from Motion”, CVPR 2012
Andrea Cohen http://people.inf.ethz.ch/acohen/
Required: C++, OpenCV
Multi Camera System Calibration
- Automatic calibration routine for a multi-camera stereo setup
An FPGA-based board includes 10 cameras to support wide field of view applications. As the camera arrangement is not restricted to a single use case, a calibration routine is needed. The cameras' intrinsic and extrinsic parameters need to be estimated in order to perform stereo matching. Based on these parameters, the FPGA board performs real-time correction. Tasks:
- Calibration routine for intrinsic/extrinsic parameters of one camera pair.
- Extend routine to support multiple camera pairs.
- Upload parameters to the FPGA board.
Dominik Honegger / Peidong Liu dominik.honegger@inf.ethz.ch peidong.liu@inf.ethz.ch
Required: OpenCV / C++
[Figure: 8-camera system; output: left, right, disparity; FPGA board]
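Task 1 could follow the classic checkerboard route; a hedged OpenCV sketch for one camera pair (directory names and board size are placeholder assumptions), with the multi-pair extension chaining such pairwise extrinsics:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # assumed inner-corner count of the checkerboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, left_pts, right_pts = [], [], []
for fl, fr in zip(sorted(glob.glob("left/*.png")), sorted(glob.glob("right/*.png"))):
    gl = cv2.imread(fl, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(fr, cv2.IMREAD_GRAYSCALE)
    okl, cl = cv2.findChessboardCorners(gl, pattern)
    okr, cr = cv2.findChessboardCorners(gr, pattern)
    if okl and okr:  # keep frames where the board is seen by both cameras
        obj_pts.append(objp); left_pts.append(cl); right_pts.append(cr)

# Intrinsics per camera, then extrinsics (R, T) of the pair.
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, gl.shape[::-1], None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, gr.shape[::-1], None, None)
_, _, _, _, _, R, T, _, _ = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, d1, K2, d2, gl.shape[::-1],
    flags=cv2.CALIB_FIX_INTRINSIC)
```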
Segmentation of the Heart in 3D MR Images
Use shape priors to improve state-of-the-art cardiac segmentation on 3D MR images
Automated segmentation of cardiac structures in volumetric MR images can be challenging due to poor image contrast. Current approaches [1] may fail to reconstruct anatomically correct shapes. The goal of this project is to incorporate shape priors into the problem formulation, which have been successfully applied to 3D reconstruction of street scenes [2], heads [3], and other objects.
[1] Koch et al., “Multi-atlas segmentation using partially annotated data: Methods and annotation strategies”, arXiv preprint arXiv:1605.00029, 2016
[2] Häne et al., “Joint 3D Scene Reconstruction and Class Segmentation”, CVPR 2013
[3] Maninchedda et al., “Semantic 3D Reconstruction of Heads”, ECCV 2016
Lisa Koch (l.koch@imperial.ac.uk), Katarina Tóthová (katarina.tothova@inf.ethz.ch)
Required: C++, some experience with image processing
Recommended: Experience with medical images
[Figure: 2D cross-sections at different locations: MRI (left), labels (right)]
Augmented Reality with Real‐time Physics Simulation
- To build an augmented reality framework
The goal is to build an augmented reality framework, capable of building the model of the environment (Kinect fusion or any alternative with additional regularization to fill the holes), human interaction (pose estimation from Kinect) and physics (fluid, rigid bodies, smoke) simulation using our real-time data-driven method [1], and rendering the simulation result back to the input image from the camera.
[1] Ladický et al., “Data-Driven Fluid Simulation using Regression Forests”, SIGGRAPH Asia 2015
Ľubor Ladický lubor.ladicky@inf.ethz.ch
Required: C++, CUDA
High performance and tunable stereo reconstruction
- Traditional stereo algorithms have focused on reconstruction quality and have largely avoided prioritizing run-time performance. Robots, on the other hand, require quick maneuverability and effective computation to observe their immediate environment and perform tasks within it. In this project, the students are expected to implement an existing high-performance and tunable stereo disparity estimation method [1]. The pipeline can potentially enable robots to quickly reconstruct their immediate surroundings and maneuver at high speeds. Please find more details in [1].
Peidong Liu peidong.liu@inf.ethz.ch
Required: C++, OpenCV
[1] Sudeep Pillai et al., “High-Performance and Tunable Stereo Reconstruction”, ICRA 2016
High speed visual odometry with embedded stereo camera
- For aggressive maneuvers of mobile robots, fast localization and mapping are required. Many existing pipelines require intense computational power, which limits their applications for resource-constrained mobile robots, e.g., micro aerial vehicles (MAVs). In this project, the students are expected to explore the possibility of porting one of those pipelines to a resource-constrained MAV by exploiting an embedded stereo camera. In particular, the students are expected to extend an existing semi-dense/dense visual odometry pipeline to an FPGA-accelerated stereo camera, which can provide high speed dense depth maps.
Peidong Liu peidong.liu@inf.ethz.ch
Required: C++, OpenCV
[1] J. Engel et al., “LSD-SLAM: Large-scale direct monocular SLAM”, ECCV 2014
[2] D. Honegger et al., “Real-time and low latency embedded computer vision hardware based on a combination of FPGA and mobile CPU”, IROS 2014
Depth Map and Normal Direction Fusion
Include surface normal direction likelihood estimates in volumetric depth map fusion
Volumetric depth map fusion takes a set of input depth maps and fuses them into a consistent 3D model. The formulation is posed as a segmentation of the voxel space into free space (exterior of the object) and occupied space (interior of the object). Traditionally, the depth maps give an input about the surface position. The goal of this project is to include likelihoods of the surface normal direction in the fusion. Code to produce the input data (depth maps, surface direction likelihoods) and a baseline fusion approach will be provided.
Fabio Maninchedda (fabiom@inf.ethz.ch)
Required: C++, Linux
Recommended: Convex Optimization
[Figure: depth map, surface normals, baseline fusion]
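The provided baseline is based on convex optimization; as a simpler mental model of volumetric fusion, here is a hedged numpy sketch of a plain TSDF running average (no normal-direction likelihoods yet; adding those is the project). All names and parameters are illustrative assumptions.

```python
import numpy as np

def integrate_depth(tsdf, weight, depth, K, T_cw, origin, voxel_size, trunc=0.08):
    """Fuse one depth map into a truncated signed distance volume.
    tsdf, weight: contiguous (X, Y, Z) float arrays, updated in place;
    depth: (H, W) metric depth map; T_cw: 4x4 world-to-camera transform;
    origin: world position of voxel (0, 0, 0)."""
    nx, ny, nz = tsdf.shape
    ii, jj, kk = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz),
                             indexing="ij")
    pts_w = origin + voxel_size * np.stack([ii, jj, kk], axis=-1).reshape(-1, 3)
    pts_c = (T_cw[:3, :3] @ pts_w.T + T_cw[:3, 3:4]).T  # voxel centers in camera frame
    z = pts_c[:, 2]
    uv = (K @ pts_c.T).T
    H, W = depth.shape
    zsafe = np.where(z != 0, z, 1.0)
    u = np.round(uv[:, 0] / zsafe).astype(int)
    v = np.round(uv[:, 1] / zsafe).astype(int)
    valid = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    d = np.zeros_like(z)
    d[valid] = depth[v[valid], u[valid]]
    sdf = np.clip((d - z) / trunc, -1.0, 1.0)            # truncated signed distance
    upd = valid & (d > 0) & ((d - z) / trunc > -1.0)     # skip voxels far behind surface
    t, w = tsdf.reshape(-1), weight.reshape(-1)          # views into the volumes
    t[upd] = (w[upd] * t[upd] + sdf[upd]) / (w[upd] + 1.0)
    w[upd] += 1.0
```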
Orthogonal / Trifocal Stereo
- Robust stereo for man‐made environments
- Stereo is a 1D matching algorithm along a line
- Line-shaped texture and geometry can lead to unresolvable failures
- Mostly visible in man-made environments (houses, powerlines, fences)
- The ambiguity in 1D can be resolved with linearly independent matching in orthogonal directions using 3 or more cameras
- The scope of the project is to use the CVG stereo / SfM benchmark dataset to assess the robustness of an orthogonal stereo matching setup in comparison with traditional 2-camera stereo
Lorenz Meier, lm@inf.ethz.ch http://people.inf.ethz.ch/lomeier/
Required: C++ / Matlab, some experience with image processing
Recommended: Efficient C++ coding, familiarity with stereo or optical flow and structure-from-motion
Real-time 3D Model Difference Reasoning with HoloLens
Interactive visualization of model differences (current scan vs. pre-loaded model) on the HoloLens.
After loading a reference model to the HoloLens device, the internal mapping of the HoloLens can be used to directly compare the currently scanned model against the pre-loaded reference model. In a first step, the reference model needs to be aligned with the current scan (using either ICP [1] or global methods [2]). Significant differences between the models shall then be highlighted in real time with intuitive augmented reality overlays.
References:
[1] Besl, P. J. & McKay, N. D., “A Method for Registration of 3-D Shapes”, TPAMI 1992
[2] Zhou QY., Park J., Koltun V., “Fast Global Registration”, ECCV 2016
Martin Oswald <martin.oswald@inf.ethz.ch> Pablo Speciale <pablo@inf.ethz.ch>
Required: C++/Visual Studio on Windows 10 Recommended: Unity3D and graphics experience
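A desktop prototype of the align-then-compare step could look like the following Open3D sketch (file names and thresholds are placeholder assumptions; the HoloLens integration is the actual project work):

```python
import numpy as np
import open3d as o3d

scan = o3d.io.read_point_cloud("current_scan.ply")      # placeholder file names
ref = o3d.io.read_point_cloud("reference_model.ply")

# Step 1: align the scan to the reference with point-to-point ICP [1].
icp = o3d.pipelines.registration.registration_icp(
    scan, ref, max_correspondence_distance=0.05, init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
scan.transform(icp.transformation)

# Step 2: per-point distance to the reference; large values mark differences.
dist = np.asarray(scan.compute_point_cloud_distance(ref))
changed = dist > 0.03  # threshold in meters (arbitrary choice)
```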
Fixing Structure‐from‐Motion (SfM) Reconstructions
- Structure-from-Motion (SfM) algorithms such as [1] reconstruct the 3D structure of a scene from images. Due to ambiguities, caused e.g. by repetitive structures, SfM algorithms might connect unrelated parts, resulting in duplicated or collapsed structures (see images).
- The goal of this project is to use free-space constraints [2] and semantic cues [3] to detect these errors and, if possible, correct them.
- [1] Schönberger & Frahm, “Structure‐from‐Motion Revisited”, CVPR 2016
- [2] Cohen et al., “Indoor‐Outdoor 3D Reconstruction Alignment“, ECCV 2016
- [3] Cohen et al., “Merging the Unmatchable: Stitching Visually Disconnected SfM Models”, ICCV 2015
Torsten Sattler torsten.sattler@inf.ethz.ch CNB G104
Required: C++
Automatically detect (and resolve) errors in SfM models
Detailed Object Reconstruction Using HoloLens
- Detailed reconstruction of objects on a HoloLens
Microsoft's HoloLens is able to track the movement of a user through the scene and can also capture a 3D mesh of the surrounding environment with its cameras. However, the mesh provided by HoloLens is rather coarse and does not capture fine details. As such, it is not sufficient when scanning fine objects. The goal of this project is to build a 3D object scanner using HoloLens that can also capture fine details. Using the coarse mesh provided by HoloLens as initialization, multi-view stereo refinement techniques such as [1] should be employed to refine the 3D model (of a selected subset of the scene).
[1] Vu et al., “High Accuracy and Visibility-Consistent Dense Multiview Stereo”, PAMI 2012
Torsten Sattler torsten.sattler@inf.ethz.ch CNB G104
Required: C++ / C#, Windows 10, experience with Unity
Learning to navigate without a map
- Learn a reinforcement learning policy to navigate from the start to the goal
The goal of the project is to determine how well deep learning is suited for planning under incomplete information. You will use the policy gradients algorithm [1] to learn to navigate the grid world presented in [2] (best paper at NIPS’16). While [2] assumes a fully-observable grid, you will work with the case in which you don’t know where the obstacles are in advance and need to discover them. You will compare this partially-observed scenario with the fully-observed one from [2] to discover how much more difficult it is. You will also compare the deep net performance with the A* algorithm [3].
[1] http://karpathy.github.io/2016/05/31/rl/
[2] https://papers.nips.cc/paper/6046-value-iteration-networks.pdf
[3] https://en.wikipedia.org/wiki/A*_search_algorithm
Nikolay Savinov nikolay.savinov@inf.ethz.ch
Required: python, machine learning knowledge
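Reference [3] is the classical baseline and is compact enough to sketch; a minimal A* on a 4-connected grid, assuming obstacles are encoded as grid[r][c] == 1:

```python
import heapq

def astar(grid, start, goal):
    """A* shortest path on a 4-connected grid; grid[r][c] == 1 marks an obstacle."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start, None)]                  # (f, g, node, parent)
    parents, best_g = {}, {start: 0}
    while frontier:
        _, g, cur, par = heapq.heappop(frontier)
        if cur in parents:
            continue                                         # already expanded
        parents[cur] = par
        if cur == goal:                                      # reconstruct the path
            path = []
            while cur is not None:
                path.append(cur)
                cur = parents[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < best_g.get(nxt, 1 << 30)):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, cur))
    return None  # goal unreachable
```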
3D Selfie for Game Avatar
Goal: High-quality reconstruction of face and eyes. Eyes are challenging to reconstruct due to their reflectiveness. We seek a high-quality, lightweight eye capture system using eye priors. Challenges:
- Use weak priors for face reconstruction
- Detection and tracking of eyes
- Guided reconstruction based on pre-captured eyes
- Implementation on mobile device
- P. Bérard et al.
[Figure: mobile phone reconstruction, pre-captured eyes, guided reconstruction]
Supervisor: Olivier Saurer <saurero@inf.ethz.ch>
Dense Reconstruction via Mesh Sweeping
Goal: Design a fast stereo reconstruction algorithm that uses a rough mesh as initialization. The mesh is then refined by sweeping along the normals of the individual faces. Challenges:
- Mesh sweep
- Photometric refinement
- GPU implementation
Supervisor: Olivier Saurer <saurero@inf.ethz.ch> Reference:
https://arxiv.org/pdf/1604.06258.pdf
Micro Baseline Stereo
Goal: While a user frames an object to capture an image, parallax is created from the small motion. We want to exploit this “accidental” motion to infer depth and create a 3D reconstruction of the scene. Challenges:
- Rolling Shutter imagery
- Small baseline stereo (high uncertainty in depth estimates)
- Implementation on mobile phone (optional)
Supervisor: Olivier Saurer <saurero@inf.ethz.ch>
Reference:
- High Quality Structure from Small Motion for Rolling Shutter Cameras
- 3D Reconstruction from accidental motion
Courtesy: Sunghoon et al. (High Quality Structure from Small Motion for Rolling Shutter Cameras)
Efficient Texturing
Goal: Implement an efficient texturing algorithm that can deal with changing lighting conditions. The texture exhibits artifacts if lighting changes during image capture, for example when the head turns during scanning. Challenges:
- Color tone mapping
- Seam-leveling
- Fast processing on the phone
Pan & Taubin, 2015
Supervisor: Petri Tanskanen <tpetri@inf.ethz.ch>
Goal: The existing live Delaunay reconstruction pipeline should be improved towards AR apps by using geometric constraints for man-made scenes (à la Atlanta world) to capture walls and floors. The result could be further refined with a quick (<1 min) photometric optimization.
Delaunay Scene Reconstruction
Supervisor: Petri Tanskanen <tpetri@inf.ethz.ch>
Wide Baseline Matching Using Novel View Synthesis
- Solve the problem of wide baseline matching using novel view synthesis from deep networks
Establishing feature correspondences between image pairs is an important task in image-based 3D reconstruction. Traditional approaches can robustly match between images with similar viewpoints but fail to robustly match wide baseline image pairs due to the increasing geometric and radiometric distortions. For example, this leads to disconnected 3D models around buildings, drift in SLAM, etc. Recent advances in unsupervised deep learning have led to the development of variational methods that can synthesize novel viewpoints from existing images. This project aims to leverage existing deep neural network architectures for novel view synthesis for robust wide baseline matching. The basic idea is to generate a smooth sequence of images that can be used to track the motion of keypoints from one image to the other. In the first part, the project should use existing state-of-the-art deep networks and tracking algorithms to develop this method. In the second part, the method should be evaluated against traditional wide baseline matching approaches in terms of matching performance.
[1] Flynn et al., “DeepStereo: Learning to Predict New Views from the World's Imagery”
[2] Tomasi & Kanade, “Detection and Tracking of Point Features”
[3] Matas et al., “Robust Wide Baseline Stereo from Maximally Stable Extremal Regions”
Johannes Schönberger jsch@inf.ethz.ch
Required: Python, Matlab, Ability to execute/combine existing code
Omni‐Directional Stereo for 360° 3D Virtual Reality Video
Recently, Google and Facebook presented software and hardware (Google Jump and Facebook 360) to capture and render panoramic (360°) and stereoscopic (3D) videos that can be viewed using virtual reality viewers such as the Oculus Rift or Google Cardboard. The underlying technology is based on a ring of synchronized cameras and 3D computer vision methods, including stereo, optical flow, etc. This project aims to build a video renderer that takes images from a ring of calibrated and synchronized cameras and outputs a 360° 3D video that can be viewed using, e.g., Google Cardboard on YouTube VR.
Johannes Schönberger jsch@inf.ethz.ch
Required: Python or Matlab or C++
- Implement a video renderer for 360° 3D virtual reality video using a camera ring with omni-directional stereo
Construction Progress Tracking
Compute the difference between two stages in the construction process of a building.
A building is 3D-scanned by drones at different times/stages of the construction process. These scans represent the current state of the building construction. The provided data [1] (for weeks 27, 28, 29 and 30) is:
- CAD models (as planned).
- Point Cloud (3D reconstruction [2] from images obtained by drones).
The goal is to obtain the difference between two different stages, which will allow tracking the progress of the construction process. Certain assumptions will be made in order to simplify the problem: all the data will be pre-aligned; some code will be provided to obtain a volumetric representation from the 3D reconstruction and from the CAD models.
[1] Schependomlaan Dataset: https://github.com/openBIMstandards/DataSetSchependomlaan
[2] Schönberger & Frahm, “Structure-from-Motion Revisited”, CVPR 2016. https://github.com/colmap/colmap
Pablo Speciale <pablo@inf.ethz.ch> Martin Oswald <martin.oswald@inf.ethz.ch>
Required: Matlab / C++
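Given the pre-aligned volumetric representations mentioned above, the stage-to-stage difference reduces to boolean voxel algebra; a tiny sketch with hypothetical numpy array names:

```python
# Hypothetical inputs: boolean numpy occupancy grids on a common, pre-aligned voxel grid.
# occ_w27, occ_w28: scans from construction weeks 27 and 28; occ_cad: as-planned CAD model.
built = occ_w28 & ~occ_w27                                    # newly occupied voxels
removed = occ_w27 & ~occ_w28                                  # voxels that disappeared
progress = (occ_w28 & occ_cad).sum() / max(occ_cad.sum(), 1)  # fraction of plan realized
```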
Global Model Alignment
Robust 3D Registration between scanned models and CAD models.
For the purpose of construction progress tracking we want to register 3D model scans with their corresponding CAD model for a comparison of the planned vs. the actual build progress. To this end, we want to be able to align: 1) scanned model vs. scanned model, 2) scanned model vs. CAD model, 3) CAD model vs. CAD model. For robustness we want to apply global alignment methods [1] and study which features [2] give the best performance. Goals:
- 1. Obtain a point cloud (PC) from a CAD model (by sampling) and apply PC vs. PC registration [1].
- 2. Transform to volumetric representation. Investigate which features can be applied.
- 3. Transform to mesh. Investigate which features can be applied.
[1] Zhou QY., Park J., Koltun V., “Fast Global Registration”, ECCV 2016
[2] Radu B. Rusu et al., “3D is here: Point Cloud Library (PCL)” (pointclouds.org), ICRA 2011
Pablo Speciale <pablo@inf.ethz.ch> Martin Oswald <martin.oswald@inf.ethz.ch>
Required: Matlab / C++ Recommended: PCL library (pointclouds.org)
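Goal 1 could be prototyped with Open3D: sample a point cloud from the CAD mesh, compute FPFH features [2], and run Fast Global Registration [1]. File names and parameter values below are placeholder assumptions:

```python
import open3d as o3d

def preprocess(pcd, voxel=0.05):
    """Downsample, estimate normals, and compute FPFH features."""
    down = pcd.voxel_down_sample(voxel)
    down.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down, o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
    return down, fpfh

# Goal 1: point cloud sampled from the CAD model vs. a scanned point cloud.
cad_pc = o3d.io.read_triangle_mesh("cad_model.ply").sample_points_uniformly(100000)
scan_pc = o3d.io.read_point_cloud("scan.ply")
src, src_f = preprocess(scan_pc)
dst, dst_f = preprocess(cad_pc)
result = o3d.pipelines.registration.registration_fgr_based_on_feature_matching(
    src, dst, src_f, dst_f,
    o3d.pipelines.registration.FastGlobalRegistrationOption(
        maximum_correspondence_distance=0.075))
```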
Simple AR Game on Mobile Phone
Create a simple AR game that reconstructs the surface of the table as the playfield
In this project you can build on a real-time tracker and depth map computation as a basis to create a simple AR game or app. The goal is to use the depth maps and reconstruct the rough geometry of the game scene such that the game can interact with the real world. You will get a sample Android project that demonstrates the tracking and depth map computation. Depending on the scene, a combination of Marching Cubes or plane-based scene reconstruction will lead to a dense representation.
Petri Tanskanen (tpetri@inf.ethz.ch)
Required: Java and C++, OpenGL Rendering Recommended: Experience with Android app development
Your Own Project
- Learn about the techniques presented in the lecture
- Choose your own topic!
- Available hardware: Google Tango Tablets, Microsoft HoloLens, GoPro Cameras, Intel RealSense Sensor
- Supervisor: we find one for you
Required: Related to 3D Vision / topics of the lecture
Your Next Steps
- Find a group (ideally: groups of 3)
- Find a project (one of ours or your own)
- Let us know about it:
- Contact us via the lecture Moodle (preferred)
- Contact Federico via email
- First come first serve!
- Do not contact supervisors directly!
- Talk with your supervisor
- Write a project proposal
- Don’t worry: You’ll get reminders!
Feb 20  Introduction
Feb 27  Geometry, Camera Model, Calibration
Mar 6   Features, Tracking / Matching
Mar 13  Project Proposals by Students
Mar 20  Structure from Motion (SfM) + papers
Mar 27  Dense Correspondence (stereo / optical flow) + papers
Apr 3   Bundle Adjustment & SLAM + papers
Apr 10  Student Midterm Presentations
Apr 17  Easter break
Apr 24  Multi-View Stereo & Volumetric Modeling + papers
May 1   Labour Day
May 8   3D Modeling with Depth Sensors + papers
May 15  3D Scene Understanding + papers
May 22  4D Video & Dynamic Scenes + papers
May 29  Student Project Demo Day = Final Presentations