Competitive Collaboration Joint Unsupervised Learning of Depth, - PowerPoint PPT Presentation

Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation. Anurag Ranjan, Perceiving Systems, Max Planck Institute for Intelligent Systems 1

Varun Jampani, Lukas Balles, Deqing Sun, Kihwan Kim, Jonas Wulff, Michael Black 2

Tübingen, Germany 3

Outline: Motion and Optical Flow; Deep Learning with Structure; Unsupervised Learning of Geometry; Competitive Collaboration; Everything. Supervised / Unsupervised. 4

Motion and Optical Flow 5

Optical Flow: 2D velocity for all pixels between two frames of a video sequence. $I(x, y, t-1) = I(x+u, y+v, t)$ 6

Why do we need Optical Flow? SLAM, Action Recognition, Super-resolution, Video Compression, Slomo, VFX, Unsupervised Segmentation, Motion Magnification. Unsupervised Segmentation: Mahendran et al., VFX: Black et al., Motion Magnification: Liu et al., Action Recognition: Simonyan et al. 7

Optical Flow: 2D velocity for all pixels between two frames of a video sequence. $I(x, y, t-1) = I(x+u, y+v, t)$ 8

Estimating Optical Flow: $I(x, y, t-1) = I(x+u, y+v, t)$; $\min_{u,v} \lVert I(x, y, t-1) - I(x+u, y+v, t) \rVert$; $\min_{u,v} \rho(I(t-1) - \mathrm{warp}(I(t), u, v))$ (Photometric Loss) 9

$\min_{u,v} \rho(I(t-1) - \mathrm{warp}(I(t), u, v))$ (Photometric Loss) 10
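The photometric loss on this slide can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the speaker's implementation: it assumes nearest-neighbour backward warping with border clamping, and it assumes a Charbonnier penalty for the robust function rho, which the slides do not specify. The function names `warp`, `charbonnier` and `photometric_loss` are hypothetical.

```python
import math

def warp(image, u, v):
    """Backward-warp `image` (a list of rows) by a per-pixel flow (u, v).

    Output pixel (x, y) samples the image at (x + u, y + v), rounded to the
    nearest pixel and clamped to the border (both are assumptions).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx = min(max(int(round(x + u[y][x])), 0), w - 1)
            sy = min(max(int(round(y + v[y][x])), 0), h - 1)
            out[y][x] = image[sy][sx]
    return out

def charbonnier(r, eps=1e-3):
    """Robust penalty rho(r) = sqrt(r^2 + eps^2); an assumed choice of rho."""
    return math.sqrt(r * r + eps * eps)

def photometric_loss(prev, curr, u, v):
    """Mean of rho(I(t-1) - warp(I(t), u, v)) over all pixels."""
    warped = warp(curr, u, v)
    h, w = len(prev), len(prev[0])
    total = sum(charbonnier(prev[y][x] - warped[y][x])
                for y in range(h) for x in range(w))
    return total / (h * w)
```

A flow that correctly aligns the two frames should give a lower loss than the zero flow, which is the signal an optimizer or a network would follow.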

No prior on structure 11

Can we learn from data? 12

Optical Flow Estimation $\in \mathbb{R}^{n \times n}$ (Dosovitskiy et al. 2015) 13

FlowNet Dosovitskiy et al. 2015 14

Problem: FlowNet is too big (33 M parameters). It needs to learn both large and small motions. It does not perform well. 15

Approach Image statistics are scale invariant. Use an image pyramid. Train a small network for each pyramid level. Compute residual flow at each level. Network captures small displacements. Pyramid captures large displacements. Burt and Adelson. The Laplacian pyramid as a compact image code. IEEE COM, 1983 16
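The coarse-to-fine idea on this slide can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not SPyNet itself: it works on 1D signals, estimates a single global displacement, and replaces the small trained network at each level with a brute-force search over residual displacements of at most one pixel. All function names are hypothetical.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2.0
            for i in range(0, len(signal) - 1, 2)]

def shift(signal, d):
    """Sample the signal at x + d with border clamping (a crude 1D warp)."""
    n = len(signal)
    return [signal[min(max(x + d, 0), n - 1)] for x in range(n)]

def residual_search(first, second_warped, radius=1):
    """Stand-in for the per-level network: choose the residual displacement
    in [-radius, radius] with the lowest squared error."""
    def err(d):
        cand = shift(second_warped, d)
        return sum((a - b) ** 2 for a, b in zip(first, cand))
    return min(range(-radius, radius + 1), key=err)

def coarse_to_fine(first, second, levels=3):
    """Estimate one displacement d such that first[x] ~ second[x + d]."""
    pyramid = [(first, second)]
    for _ in range(levels - 1):
        a, b = pyramid[-1]
        pyramid.append((downsample(a), downsample(b)))
    flow = 0
    for a, b in reversed(pyramid):   # start at the coarsest level
        flow *= 2                    # upsample the flow to this level
        flow += residual_search(a, shift(b, flow))  # add the residual
    return flow
```

With three levels, a search radius of one pixel per level is enough to recover a four-pixel shift, whereas the same search at a single level cannot reach it. This mirrors the slide's point that the network captures small displacements and the pyramid captures large ones.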

SPyNet Spatial Pyramid Network for Optical Flow Estimation Ranjan et al. Optical Flow estimation using a Spatial Pyramid Network. CVPR 2017. 17

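[Figure: network at one pyramid level. Labels: $I_1$, $I_2$; convolution layers 32x7x7, 64x7x7, 32x7x7, 16x7x7, 2x7x7; $v_k$.] 18

[Figure: $G_k$.] 19

[Figure: two-level pyramid with networks $G_0$, $G_1$, residual flows $v_0$, $v_1$, flows $V_0$, $V_1$, and operators $d$, $u$, $w$ applied to the image pairs at each level.] 20

[Figure: three-level pyramid with networks $G_0$, $G_1$, $G_2$, residual flows $v_0$, $v_1$, $v_2$, flows $V_0$, $V_1$, $V_2$, and operators $d$, $u$, $w$ applied to the image pairs at each level.] 21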

[Figure: Spatial and Temporal panels for SPyNet and FlowNet.] 22

[Figure: visual comparison. Columns: Frames, Ground Truth, FlowNetS, FlowNetC, SPyNet.] 23

[Plot: Average EPE on Sintel (Clean + Final) versus Number of Model Parameters (in Millions, log scale). Methods: Voxel2Voxel*, FlowNetC, FlowNetS, SPyNet. *Error metric not consistent with the benchmarks.] 24

[Plot: Average EPE on Sintel (Clean + Final) versus Number of Model Parameters (in Millions, log scale). Methods: Voxel2Voxel* [2016], SPyNet [2017], FlowNetS [2015], FlowNetC [2015], PWC-Net [2018], FlowNet2 [2017]. *Error metric not consistent with the benchmarks.] 25
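Both plots report average end-point error (EPE), the mean Euclidean distance between estimated and ground-truth flow vectors. A minimal sketch of that metric follows; the function name and the flat list-of-vectors layout are illustrative assumptions.

```python
import math

def average_epe(flow_est, flow_gt):
    """Mean end-point error between two lists of (u, v) flow vectors."""
    if len(flow_est) != len(flow_gt):
        raise ValueError("flow fields must have the same number of vectors")
    dists = [math.hypot(ue - ug, ve - vg)
             for (ue, ve), (ug, vg) in zip(flow_est, flow_gt)]
    return sum(dists) / len(dists)
```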

Columns d0-10, d10-60, d60-140: Distance from Motion Boundaries. Columns s0-10, s10-40, s40+: Average Displacement. 26

Sintel Clean    d0-10   d10-60  d60-140  s0-10   s10-40  s40+
SPyNet+ft       5.501   3.122   1.719    0.832   3.343   43.442
FlowNetS+ft     5.992   3.561   2.193    1.424   3.815   40.098
FlowNetC+ft     5.575   3.182   1.993    1.622   3.974   33.369

Sintel Final    d0-10   d10-60  d60-140  s0-10   s10-40  s40+
SPyNet+ft       6.694   4.368   3.290    1.395   5.534   49.707
FlowNetS+ft     7.252   4.610   2.993    1.873   5.826   43.236
FlowNetC+ft     7.190   4.619   3.298    2.305   6.169   40.779

Problem SPyNet [1] [1] Ranjan et al. Optical Flow estimation using a Spatial Pyramid Network. CVPR 2017. 28

Why humans? • Useful for recognition problems. Scenes contain human actions. • Two-stream architectures use fast classical optical flow methods. • Deep Networks have massive GPU memory requirements. Left Image: Delaitre et al. Recognizing human actions in still images, BMVC 2010. Right Image: Simonyan et al. Two-stream convolutional networks for action recognition in videos. NIPS 2014. 29

Problem: Flying Chairs [1], MPI Sintel [2], KITTI [3]. No dataset for human optical flow for training neural networks. [1] Dosovitskiy et al. Flownet: Learning optical flow with convolutional networks. ICCV 2015. [2] Butler et al. A naturalistic open source movie for optical flow evaluation. ECCV 2012. [3] Geiger et al. Vision meets robotics: The KITTI dataset. International Journal of Robotics Research 32.11 (2013): 1231-1237. 30

Idea Create a new dataset for human optical flow. Use it to train an existing fast and compact optical flow method. 31

Human Flow Dataset: Human Motion Capture data [1] + Human Body Model [2] + Realistic Environment [3] + Cloth texture, Lighting, Noise, Motion Blur, Camera Blur. Blender: Simulate and Extract Motion Vectors. [1] Ionescu et al. Human3.6m: Large scale datasets and predictive methods for 3d human sensing in natural environments. IEEE PAMI 2014. [2] Loper et al. MoSh: Motion and Shape Capture from Sparse Markers. SIGGRAPH Asia 2014. [3] Yu et al. Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365 (2015). 32

Human Flow Dataset 33

SPyNet [Figure: three-level pyramid with networks $G_0$, $G_1$, $G_2$, residual flows $v_0$, $v_1$, $v_2$, flows $V_0$, $V_1$, $V_2$, and operators $d$, $u$, $w$.] Ranjan et al. Optical Flow estimation using a Spatial Pyramid Network. CVPR 2017. 35

[Plot: Evaluation of Optical Flow Networks. Average EPE on the Human Flow Dataset versus Inference Time (s, log scale). Methods: SPyNet, PWC-Net, SPyNet+HF, PWC-Net+HF.] 36

[Plot: Evaluation of Optical Flow Networks. Average EPE on the Human Flow Dataset versus Inference Time (s, log scale). Methods: FlowNetS, PCA Flow, SPyNet, Epic Flow, LDOF, PWC-Net, SPyNet+HF, FlowNet2, PWC-Net+HF, Flow Fields.] 37

Visuals (Video): Ground Truth, Human Flow, SPyNet 38

Visuals (Video): Human Flow, SPyNet 41

Visuals (Video): Human Flow, SPyNet 42

Human Flow may not work on other parts of the scene. 43

Introduction to Scene Geometry 44

Motion of a Static Scene. For static scenes: Depth + Camera Motion = Optical Flow 45

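Multi-view Geometry (Pinhole Camera Matrix): $x_1 = K X$, $x_2 = K [R, t] X$. [Diagram: images $I_1$ and $I_2$, point $X$ seen at $x_1$, labels $d$ and $f$.] $\lVert I_1(x_1) - I_2(x_2) \rVert = 0$; $\min_{R,t,d} \rho(I_1 - \mathrm{warp}(I_2, R, t, d))$ (Photometric Loss) 46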
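The statement that depth plus camera motion gives optical flow in a static scene can be made concrete with the pinhole model. The sketch below is illustrative only and rests on stated assumptions: the camera matrix has a single focal length `f` and principal point `(cx, cy)`, and `(R, t)` maps points from the first camera frame to the second. The function name `rigid_flow` is hypothetical.

```python
def rigid_flow(x, y, depth, f, cx, cy, R, t):
    """Flow of pixel (x, y) induced by camera motion (R, t) in a static scene.

    R is a 3x3 rotation (list of rows) and t a 3-vector; both are assumed to
    map points from the first camera frame to the second.
    """
    # Back-project the pixel to a 3D point X using its depth.
    X = [depth * (x - cx) / f, depth * (y - cy) / f, depth]
    # Move the point into the second camera frame: X2 = R X + t.
    X2 = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Project with the pinhole model to get the pixel in the second image.
    x2 = f * X2[0] / X2[2] + cx
    y2 = f * X2[1] / X2[2] + cy
    return x2 - x, y2 - y
```

For a pure sideways translation the flow is inversely proportional to depth, so nearby points move more in the image than distant ones.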

Static Scene and Moving Objects 47

How to decompose a scene? 48

Competitive Collaboration 49

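[Diagram: $R$ with data $\mathcal{D}_r$.] 50

[Diagram: competitors $R$ and $F$ with data $\mathcal{D}_r$ and $\mathcal{D}_f$, drawn from $\mathcal{D}$.] 51

Competition [Diagram: competitors $R$ and $F$ with data $\mathcal{D}_r$ and $\mathcal{D}_f$, and moderator $M$.] 52

Collaboration [Diagram: competitors $R$ and $F$ with data $\mathcal{D}_r$ and $\mathcal{D}_f$ (two elements marked with $*$), and moderator $M$.] 53

Mixed Domain Learning [Diagram: networks $A$ and $B$, and moderator $M$.] 54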

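Competition Loss: $E_{com} = m \cdot H(A(\cdot), 5) + (1 - m) \cdot H(B(\cdot), 5)$ 55

Collaboration Loss: $E_{col} = E_{com} + \begin{cases} -\log(M_y + \epsilon) & \text{if } E_A < E_B \\ -\log(1 - M_y + \epsilon) & \text{if } E_A \geq E_B \end{cases}$, where $E_A = H(A(\cdot), 5)$ 56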
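The two losses on these slides can be sketched for a single training sample. This is a hedged illustration, not the authors' code: it assumes that H is the cross-entropy between a predicted class distribution and the true label, and that the moderator term in the collaboration loss is the same scalar output `m` that weights the competition loss. The function names are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Assumed form of H: negative log-probability of the true label."""
    return -math.log(probs[label])

def competition_loss(m, probs_a, probs_b, label):
    """E_com = m * H(A, label) + (1 - m) * H(B, label)."""
    return (m * cross_entropy(probs_a, label)
            + (1.0 - m) * cross_entropy(probs_b, label))

def collaboration_loss(m, probs_a, probs_b, label, eps=1e-8):
    """E_col = E_com - log(m + eps) if E_A < E_B, else E_com - log(1 - m + eps)."""
    e_a = cross_entropy(probs_a, label)
    e_b = cross_entropy(probs_b, label)
    e_com = m * e_a + (1.0 - m) * e_b
    if e_a < e_b:
        return e_com - math.log(m + eps)    # push the moderator towards A
    return e_com - math.log(1.0 - m + eps)  # push the moderator towards B
```

When A has the lower error on a sample, the collaboration loss is smaller if the moderator already assigns that sample to A, which is the behavior the moderator is trained towards.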

[Diagram: networks $A$ and $B$, and moderator $M$.] 57

Accuracy 58

Model           Training   MNIST Error   SVHN Error   MNIST+SVHN Error
Alice           Basic      1.34          11.88        8.96
Alice           CC         1.41          11.55        8.74
Bob             CC         1.24          11.75        8.84
Alice+Bob+Mod   CC         1.24          11.55        8.70
Alice 3x        Basic      1.33          10.86        8.22

Moderator Behavior 59

         Alice    Bob
MNIST    0 %      100 %
SVHN     100 %    0 %

Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation 60

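[Diagram: Monocular Depth Prediction ($D$) and Camera Motion Estimation ($C$), grouped as $R$. Zhou et al. CVPR 2017.] 61

[Diagram: Monocular Depth Prediction ($D$) and Camera Motion Estimation ($C$), grouped as $R$ (Zhou et al. CVPR 2017); Optical Flow Estimation ($F$) (Meister et al. AAAI '18, Janai et al. ECCV '18).] 62

[Diagram: Monocular Depth Prediction ($D$) and Camera Motion Estimation ($C$), grouped as $R$ with data $\mathcal{D}_r$; Optical Flow Estimation ($F$) with data $\mathcal{D}_f$; Motion Segmentation ($M$).] 63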

Photometric Loss: $E_R = \rho(I, \mathrm{warp}(I_+, c, d)) \cdot m$. Photometric Loss: $E_F = \rho(I, \mathrm{warp}(I_+, u_+)) \cdot (1 - m)$. Loss: $E_C = H(\mathbb{I}_{\lVert u_R - u_F \rVert < \lambda_c}, m)$. [Diagram: Monocular Depth Prediction ($D$), Camera Motion Estimation ($C$), Optical Flow Estimation ($F$), Motion Segmentation ($M$), with $R$ and the loss terms.] 64
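The three loss terms on this slide can be sketched per pixel. The code below is a minimal illustration under explicit assumptions: the warped images are already computed, rho is taken to be the absolute difference, H is taken to be binary cross-entropy, and the terms are averaged over pixels. None of these choices is confirmed by the slides, and the function name `joint_losses` is hypothetical.

```python
import math

def joint_losses(target, warped_static, warped_flow, mask,
                 flow_static, flow_net, lambda_c=1.0, eps=1e-8):
    """Return (E_R, E_F, E_C) averaged over pixels.

    target, warped_static, warped_flow, mask: flat lists of pixel values,
    with mask values m in [0, 1] (m = 1 means static scene).
    flow_static, flow_net: flat lists of (u, v) vectors, standing for the
    flow from depth and camera motion (u_R) and from the flow network (u_F).
    """
    n = len(target)
    e_r = e_f = e_c = 0.0
    for i in range(n):
        m = mask[i]
        # Static-scene photometric term, weighted by the mask.
        e_r += abs(target[i] - warped_static[i]) * m
        # Optical-flow photometric term, weighted by the complement.
        e_f += abs(target[i] - warped_flow[i]) * (1.0 - m)
        # Indicator: do the two flows agree within lambda_c?
        du = flow_static[i][0] - flow_net[i][0]
        dv = flow_static[i][1] - flow_net[i][1]
        agree = 1.0 if math.hypot(du, dv) < lambda_c else 0.0
        # Binary cross-entropy between the indicator and the mask.
        e_c -= (agree * math.log(m + eps)
                + (1.0 - agree) * math.log(1.0 - m + eps))
    return e_r / n, e_f / n, e_c / n
```

A mask that marks a pixel as static where the two flows agree and the static reconstruction fits, and as moving elsewhere, drives all three terms towards zero.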
