Learning Optical Flow with Limited Data
Jia Xu, Tencent AI Lab
2019-03-14


SLIDE 1

Learning Optical Flow with Limited Data

Jia Xu
Tencent AI Lab, 2019-03-14

SLIDE 2

Introduction

• Optical flow: a dense correspondence for each pixel between two frames (input: a pair of frames; output: a per-pixel displacement field).
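The dense correspondence above is usually represented as a per-pixel displacement field, along which one frame can be resampled ("warped") toward the other. A minimal sketch in NumPy (the function name `backward_warp` and the nearest-neighbor sampling are illustrative simplifications, not the exact operation used in the talk, which would use bilinear sampling):

```python
import numpy as np

def backward_warp(img2, flow):
    """Resample the second frame toward the first: output pixel (x, y)
    takes the value of img2 at (x + dx, y + dy), nearest-neighbor."""
    h, w = img2.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x2 = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    y2 = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img2[y2, x2]

# One bright pixel that moved one step right between the two frames:
img2 = np.zeros((4, 4)); img2[1, 2] = 1.0
flow = np.zeros((4, 4, 2)); flow[..., 0] = 1.0   # uniform +1 px motion in x
img1_hat = backward_warp(img2, flow)             # bright pixel back at (1, 1)
```

If the flow is correct, the warped frame should match the first frame, which is exactly what the photometric losses later in the talk measure.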

SLIDE 3

Why Optical Flow?

• Optical flow has a wide range of applications: object tracking, autonomous driving, video action recognition, and 3D shape reconstruction.

SLIDE 4

History of Optical Flow Estimation

[Timeline figure: milestones in optical flow estimation; the dates and labels did not survive extraction.]

SLIDE 5

DC Flow

Xu, Ranftl, Koltun. Accurate Optical Flow via Direct Cost Volume Processing. CVPR 2017

SLIDE 6

CNNs for Optical Flow

• Advantage: high performance while running in real time.
• Disadvantage: needs a large amount of labeled data, which is difficult to obtain.

FlowNet and PWC-Net:
Fischer et al. 2015, "FlowNet: Learning Optical Flow with Convolutional Networks"
Sun et al. 2018, "PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume"

SLIDE 7

CNNs for Optical Flow

• Advantage: high performance while running in real time.
• Disadvantage: needs a large amount of labeled data, which is difficult to obtain.
  - Pre-training on synthetic datasets: domain gap.
  - Unsupervised learning: performance gap; cannot predict the flow of occluded pixels.

Meister et al. 2018, "UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss"

SLIDE 8

Unsupervised Learning for Optical Flow

How can we learn the optical flow of occluded pixels in a fully unsupervised way?

SLIDE 9

Key Observation

• Unsupervised learning: detect occlusion and exclude occluded pixels.
  - The optical flow of non-occluded pixels can be estimated accurately.
  - How do we fully utilize those reliable non-occluded predictions?
  - Data distillation!

Liu, King, Lyu, Xu. DDFlow: Learning Optical Flow with Unlabeled Data Distillation. AAAI 2019

SLIDE 10

Framework

• The teacher model is trained with the photometric loss L_p for non-occluded pixels.

[Framework diagram: images I1 and I2 are fed to the teacher model, which predicts forward flow w_f and backward flow w_b; a forward-backward consistency check yields forward and backward occlusion maps O_f and O_b; image warping produces warped versions of I1 and I2, on which the photometric loss is computed.]
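The forward-backward consistency check in the diagram can be sketched as follows. This is a simplified version with illustrative names: the thresholds alpha1 = 0.01 and alpha2 = 0.5 follow values commonly used in this line of work, and nearest-neighbor sampling stands in for the bilinear sampling a real implementation would use.

```python
import numpy as np

def occlusion_mask(flow_fw, flow_bw, alpha1=0.01, alpha2=0.5):
    """Flag a pixel as occluded when the forward flow and the backward
    flow sampled at the forward-displaced location fail to cancel out."""
    h, w, _ = flow_fw.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x2 = np.clip(np.round(xs + flow_fw[..., 0]).astype(int), 0, w - 1)
    y2 = np.clip(np.round(ys + flow_fw[..., 1]).astype(int), 0, h - 1)
    flow_bw_warped = flow_bw[y2, x2]      # backward flow at p + w_f(p)
    diff = flow_fw + flow_bw_warped       # ~0 wherever flows are consistent
    mag = np.sum(flow_fw ** 2, -1) + np.sum(flow_bw_warped ** 2, -1)
    return np.sum(diff ** 2, -1) > alpha1 * mag + alpha2

# Consistent flows (+1 px / -1 px in x): nothing is marked occluded.
fw = np.zeros((4, 4, 2)); fw[..., 0] = 1.0
bw = np.zeros((4, 4, 2)); bw[..., 0] = -1.0
mask_consistent = occlusion_mask(fw, bw)
# A zero backward flow contradicts the forward flow: everything is occluded.
mask_broken = occlusion_mask(fw, np.zeros((4, 4, 2)))
```

Running the check in both directions yields the forward and backward occlusion maps O_f and O_b used by the losses.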

SLIDE 11

Framework

• The student model has the same network structure as the teacher model; it operates on cropped patches Ĩ1 and Ĩ2.

[Framework diagram: teacher and student branches run in parallel. The teacher processes the full images I1 and I2; the student processes the cropped patches Ĩ1 and Ĩ2. Each branch predicts forward and backward flow, runs a forward-backward consistency check to obtain forward/backward occlusion maps, warps the images, and computes the photometric loss.]

SLIDE 12

Framework

• The student model is trained with both L_p for non-occluded pixels and L_o for occluded pixels. Only the student model is needed during testing.
• L_o only functions on pixels that are non-occluded in the original images but occluded in the cropped patches.

[Framework diagram: as before, with the teacher's occlusion maps cropped to the patch region; valid masks M_f and M_b select pixels that are occluded in the patch yet non-occluded in the full image, and the loss for occluded pixels L_o supervises the student's flow there using the teacher's predictions.]
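The loss for occluded pixels described above can be sketched as a masked distillation term. This is a simplified sketch with hypothetical names; the actual DDFlow loss uses a robust Charbonnier-style penalty rather than the plain L1 shown here.

```python
import numpy as np

def loss_for_occluded(flow_student, flow_teacher, occ_patch, occ_full):
    """Supervise the student only on pixels occluded in the cropped patch
    yet non-occluded in the full image, where the teacher is reliable."""
    valid = occ_patch & ~occ_full              # valid mask M
    if not valid.any():
        return 0.0
    err = np.abs(flow_student - flow_teacher).sum(-1)   # L1 error per pixel
    return float(err[valid].mean())

flow_teacher = np.zeros((4, 4, 2))             # teacher pseudo-labels
flow_student = np.ones((4, 4, 2))              # student predictions
occ_patch = np.ones((4, 4), dtype=bool)        # occluded in the patch
occ_full = np.zeros((4, 4), dtype=bool)        # visible in the full image
loss = loss_for_occluded(flow_student, flow_teacher, occ_patch, occ_full)
```

Because the valid mask requires visibility in the full image, the teacher's predictions used as supervision come only from pixels it estimated reliably.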

SLIDE 13

Loss Functions

• Occlusion estimation: based on the forward-backward consistency prior.
• Photometric loss L_p.
• Loss for occluded pixels L_o.
• Teacher model: L = L_p.
• Student model: L = L_p + L_o.
• No hyperparameters!
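A minimal sketch of a robust photometric loss over non-occluded pixels. The generalized Charbonnier penalty and its parameters `eps` and `q` are common choices in this literature, not necessarily the exact ones used here, and the census-transform variant discussed in the ablation is omitted.

```python
import numpy as np

def photometric_loss(img1, img1_warped, occ_mask, eps=0.01, q=0.4):
    """Generalized Charbonnier penalty averaged over non-occluded pixels."""
    penalty = (np.abs(img1 - img1_warped) + eps) ** q
    nonocc = ~occ_mask
    return float(penalty[nonocc].sum() / max(nonocc.sum(), 1))

img1 = np.ones((4, 4))
occ = np.zeros((4, 4), dtype=bool)       # nothing occluded
lp = photometric_loss(img1, img1, occ)   # perfect warp: loss at eps**q floor
```

Masking out occluded pixels is what keeps the teacher's training signal clean: brightness constancy simply does not hold where a pixel has no correspondence.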

SLIDE 14

Evaluation Metrics

• Optical flow:
  - EPE: average endpoint error between the predicted flow and the ground-truth flow over all pixels.
  - Fl: percentage of erroneous pixels. A pixel is considered correctly estimated if its flow endpoint error is < 3 pixels or < 5% of the ground-truth magnitude.
• Occlusion estimation:
  - F-score: the harmonic mean of precision and recall.
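The two flow metrics can be sketched directly (illustrative helpers; note that a pixel counts as an outlier only when its endpoint error is both >= 3 px and >= 5% of the ground-truth magnitude, the complement of the "correct if < 3 px or < 5%" rule above):

```python
import numpy as np

def epe(flow_pred, flow_gt):
    """Average endpoint error: mean Euclidean distance over all pixels."""
    return float(np.sqrt(np.sum((flow_pred - flow_gt) ** 2, -1)).mean())

def fl(flow_pred, flow_gt):
    """Outlier rate: a pixel is erroneous when its endpoint error is
    both >= 3 px and >= 5% of the ground-truth flow magnitude."""
    err = np.sqrt(np.sum((flow_pred - flow_gt) ** 2, -1))
    mag = np.sqrt(np.sum(flow_gt ** 2, -1))
    outlier = (err >= 3.0) & (err >= 0.05 * np.maximum(mag, 1e-9))
    return float(outlier.mean())

gt = np.zeros((4, 4, 2))
perfect = epe(gt, gt), fl(gt, gt)        # exact prediction: both metrics zero
bad = gt + 4.0                           # every pixel off by (4, 4) px
scores = epe(bad, gt), fl(bad, gt)       # large EPE, all pixels outliers
```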

SLIDE 15

Quantitative Comparisons

• DDFlow outperforms all existing unsupervised flow learning methods on all datasets.

SLIDE 16

Quantitative Comparisons

• Our model pre-trained on Flying Chairs even outperforms state-of-the-art unsupervised models fine-tuned on the Sintel dataset.

SLIDE 17

Quantitative Comparisons

• 28.6% relative improvement on KITTI 2012; 37.7% relative improvement on KITTI 2015.

SLIDE 18

Quantitative Comparisons

• 28.6% relative improvement on KITTI 2012; 37.7% relative improvement on KITTI 2015.
• On KITTI 2012, DDFlow outperforms FlowNet 2.0 on the ranking metric Fl-noc.

SLIDE 19

Quantitative Comparisons

• DDFlow achieves the best occlusion estimation performance on the Sintel Clean and Sintel Final datasets.
• On the KITTI datasets, the ground-truth occlusion masks only contain pixels moving out of the image boundary; under this setting, our method achieves comparable performance.

SLIDE 20

Qualitative Comparisons

• Sample results on the Sintel datasets. The first three rows are from Sintel Clean; the last three are from Sintel Final.

SLIDE 21

Qualitative Comparisons

• Example results on the KITTI datasets. The first three rows are from KITTI 2012; the last three are from KITTI 2015.
• Note that on the KITTI datasets, the occlusion masks are sparse and only contain pixels moving out of the image boundary.

SLIDE 22

Quantitative: Ablation Study

• Comparing rows 1, 2 with rows 3, 4: occlusion handling improves flow estimation performance on all datasets.

SLIDE 23

Quantitative: Ablation Study

• Comparing rows 1, 2 with rows 3, 4: occlusion handling improves flow estimation performance on all datasets.
• Comparing rows 1, 3 with rows 2, 4: the census transform consistently improves performance.

SLIDE 24

Quantitative: Ablation Study

• Comparing rows 1, 2 with rows 3, 4: occlusion handling improves flow estimation performance on all datasets.
• Comparing rows 1, 3 with rows 2, 4: the census transform consistently improves performance.
• Comparing rows 4, 5: data distillation greatly improves performance, especially for occluded pixels: EPE-OCC decreases by 18.5% on Sintel Clean, 16.1% on Sintel Final, 58.2% on KITTI 2012, and 42.1% on KITTI 2015.

SLIDE 25

Video Flow Estimation on the Sintel Dataset

• The top part is the input frame and the bottom part is the corresponding optical flow estimated by DDFlow.

SLIDE 26

DDFlow code

• Code and models are available at https://github.com/ppliuboy/DDFlow.

SLIDE 27

What is Next?

SLIDE 28

Motivation

• Can we completely get rid of synthetic data?
• Can we win Sintel back?

Liu, King, Lyu, Xu. SelFlow: Self-Supervised Learning of Optical Flow. CVPR 2019

SLIDE 29

• Initially, p1 and p2 are non-occluded from I_t to I_{t+1}, and p1' and p2' are their corresponding pixels. The NOC-Model can accurately estimate the flow of p1 and p2 using the photometric loss.

[Diagram: frames I_t and I_{t+1}, with pixels p1, p2 in I_t and their correspondences p1', p2' in I_{t+1}; the NOC-Model predicts the flow between them.]

SLIDE 30

• We inject random noise into I_{t+1} and let the noise cover p1' and p2'; then p1 and p2 become occluded from I_t to the perturbed frame Ĩ_{t+1}. The OCC-Model cannot accurately estimate the flow of p1 and p2 using the photometric loss.

[Diagram: frames I_t, I_{t+1}, and Ĩ_{t+1}; the NOC-Model runs on (I_t, I_{t+1}) while the OCC-Model runs on (I_t, Ĩ_{t+1}).]

SLIDE 31

• We distill reliable flow estimations of p1 and p2 from the NOC-Model to guide the flow learning of the OCC-Model. The guidance is only employed on pixels that are occluded from I_t to Ĩ_{t+1} but non-occluded from I_t to I_{t+1}, such as p1 and p2 (the self-supervision mask).

[Diagram: the NOC-Model's flow on (I_t, I_{t+1}) guides the OCC-Model's flow on (I_t, Ĩ_{t+1}) within the self-supervision mask.]

SLIDE 32

Quantitative Results

SLIDE 33

Our unsupervised results outperform all existing unsupervised results on all datasets by a large margin.

SLIDE 34

Our unsupervised results even outperform several famous fully-supervised methods.

SLIDE 35

Our fine-tuned models achieve state-of-the-art results without using any external labeled data.

SLIDE 36

SLIDE 37

Qualitative Results

SLIDE 38

Effect of Self-supervision

SLIDES 39-46

[Qualitative examples, one per slide: reference image, flow estimation without self-supervision, and flow estimation with self-supervision.]

SLIDE 47

Compared with PWC-Net, our fine-tuned model estimates optical flow with more accurate details.

SLIDES 48-51

[Qualitative examples, one per slide: reference image, flow estimation using PWC-Net, and flow estimation using our fine-tuned model.]

SLIDE 52

• To demonstrate the generalization ability of our model, we further show flow estimation on real-world videos (from the DAVIS dataset).

SLIDES 53-54

[Qualitative examples, one per slide: reference image, flow from our unsupervised model, and flow from our fine-tuned model.]

SLIDE 55

Q & A

Hiring in Vision and Graphics ;) http://pages.cs.wisc.edu/~jiaxu/