SLIDE 1

Event-driven Video Frame Synthesis

Zihao Wang¹, Weixin Jiang¹, Kuan He¹, Boxin Shi², Aggelos Katsaggelos¹, Oliver Cossairt¹

¹Northwestern University  ²Peking University

2nd Int’l Workshop on Physics Based Vision meets Deep Learning (PBDL), in conjunction with ICCV 2019

SLIDE 2

Physics-based vision meets deep learning

10/29/2019 PBDL2019, ICCV Workshop

Physics-based vision: designed priors following physics laws.
[Figure: optimization landscape with an initial point, the objective, and the target.]

SLIDE 3

Physics-based vision meets deep learning

Physics-based vision: physics-based optimization proceeds from an initial point, but is subject to noise or incomplete modeling.

SLIDE 4

Physics-based vision meets deep learning

[Figure: with a noisy input, the physics-based optimization can converge away from the target.]

SLIDE 5

Physics-based vision meets deep learning

Learning-based vision: end-to-end non-linear fitting usually has superior performance, as long as you have sufficient data and GPUs.

SLIDE 6

Physics-based vision meets deep learning

Learning-based vision can also learn from simulation!

SLIDE 7

Physics-based vision meets deep learning

Learning from simulation, however, faces the gap between simulated and real data.

SLIDE 8

Physics-based vision meets deep learning

One remedy: train with data augmentation.

SLIDE 9

Physics-based vision meets deep learning

Learning-based vision requires retraining even for similar tasks.

SLIDE 10

Physics-based vision meets deep learning

E.g., video frame interpolation: 1-frame, 9-frame, and 10-frame interpolation each require a separate network (NN1, NN2, NN3).

SLIDE 11

Physics-based vision meets deep learning

Physics + learning based vision. Framework:

  • First use a unifying physics-based approach to obtain a rough estimate
  • Then use a DNN to learn the residual.

Suitable application: multi-modal video synthesis
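The two-stage framework can be sketched in a few lines; this is a minimal, hypothetical stand-in (not the paper's implementation) where the physics stage integrates signed event counts onto the start frame in log space and the learned stage is a placeholder residual predictor:

```python
import numpy as np

C = 0.1  # assumed contrast threshold (hypothetical value)

def physics_rough_estimate(start_frame, event_frames):
    # Stage 1: a unifying physics-based rough estimate. Here we simply
    # integrate signed event counts onto the start frame in log space.
    log_frame = np.log(start_frame + 1e-6)
    video = []
    for ev in event_frames:
        log_frame = log_frame + C * ev
        video.append(np.exp(log_frame))
    return np.stack(video)

def learned_residual(video):
    # Stage 2: a trained DNN would predict the residual that corrects the
    # rough estimate; a zero placeholder keeps this sketch runnable.
    return np.zeros_like(video)

def reconstruct(start_frame, event_frames):
    rough = physics_rough_estimate(start_frame, event_frames)
    return rough + learned_residual(rough)
```

The point of the split: the physics stage generalizes across tasks without retraining, while the DNN only has to correct its residual errors.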

SLIDE 12

Background: rethinking frame-based imaging

Frame-based camera pipeline: shutter on → photon integration → shutter off → A/D conversion → readout → other processing.

Exposure-time trade-off: long exposure – blurry, over-exposed; short exposure – noisy, under-exposed. The post-exposure time (ADC + synchronized readout) causes a discrepancy between frames, motivating frame interpolation.
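The exposure trade-off can be illustrated numerically. A toy sketch (all values hypothetical) where a bright dot moves one pixel per latent high-speed time step:

```python
import numpy as np

rng = np.random.default_rng(0)

# A bright dot moving one pixel per high-speed "latent" time step.
latent = np.zeros((10, 12))
for t in range(10):
    latent[t, t] = 1.0

# Long exposure: photon integration over all 10 steps -> motion blur
# (the peak is smeared to 1/10 of its true brightness).
long_exposure = latent.mean(axis=0)

# Short exposure: a single step, but fewer photons, so read noise is
# relatively larger (the noise level here is an illustrative value).
short_exposure = latent[0] + rng.normal(0.0, 0.05, size=12)
```

The long exposure spreads the unit-brightness dot over ten pixels; the short exposure keeps it sharp but noisy.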

SLIDE 13

Motivation

  • We need “smart” cameras that:
    • Can respond to high-speed motions (eliminate blur)
    • Do not always operate at high speed (less data redundancy)
  • Potential solution: event cameras

(Recap figure: the frame-based camera pipeline from Slide 12 — shutter on → photon integration → shutter off → A/D conversion → readout → other processing.)

SLIDE 14

What’s an event camera? Another high-speed camera?

Scenario: a moving poster with shapes. Capture: 22 FPS; display: 1.1 FPS. Data from the DAVIS dataset.

Each pixel:

  • Compares brightness variations (blue: increase; red: decrease)
  • Has small latency (microsecond level) – up to 10⁶ FPS
  • Works independently (asynchronous)
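Because events arrive asynchronously per pixel, fusing them with intensity frames typically starts by binning them into signed "event frames" over a time window. A minimal sketch (the `(t, x, y, polarity)` tuple layout is an assumption, not a fixed standard):

```python
import numpy as np

def accumulate_events(events, shape, t0, t1):
    # Bin asynchronous events (t, x, y, polarity) falling in [t0, t1)
    # into one signed event frame: +1 per ON event, -1 per OFF event.
    frame = np.zeros(shape, dtype=np.int32)
    for t, x, y, p in events:
        if t0 <= t < t1:
            frame[y, x] += 1 if p > 0 else -1
    return frame

events = [(0.001, 2, 3, +1), (0.002, 2, 3, +1),
          (0.003, 5, 1, -1), (0.020, 0, 0, +1)]
frame = accumulate_events(events, (8, 8), 0.0, 0.01)
```

Here two ON events at pixel (x=2, y=3) accumulate to +2, the OFF event gives −1, and the event at t=0.020 falls outside the window.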
SLIDE 15

But…

  • Events ≠ temporal gradient
  • Infinitely many solutions to infer intensity from events
  • Cannot capture weak variations
SLIDE 16

But…

  • Events ≠ temporal gradient
    • Infinitely many solutions to infer intensity from events
    • Cannot capture weak variations
  • Events are very noisy
    • Noise model not well understood (Gaussian on threshold)
  • Event denoisers are not advanced
    • Able to cancel isolated events (correlation)
    • Cannot handle complex scenarios, e.g., illumination change

Example images overlaid with neighboring events; data from the DAVIS dataset and Pan et al. CVPR’19.
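The isolated-event cancellation mentioned above can be sketched as a spatial-correlation filter: keep an event only if at least one of its 8 neighbors also fired. This is a toy version; practical denoisers also correlate in time, and no such filter handles, e.g., global illumination changes:

```python
import numpy as np

def filter_isolated_events(frame):
    # Keep an event only if a neighboring pixel also fired
    # (simple spatial-correlation denoising of an event frame).
    active = (frame != 0).astype(np.int32)
    padded = np.pad(active, 1)
    neighbors = np.zeros_like(active)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbors += padded[1 + dy : 1 + dy + frame.shape[0],
                                1 + dx : 1 + dx + frame.shape[1]]
    return np.where(neighbors > 0, frame, 0)

frame = np.zeros((5, 5), dtype=np.int32)
frame[2, 2] = 1                  # isolated event -> removed
frame[0, 0] = frame[0, 1] = -1   # correlated pair -> kept
clean = filter_isolated_events(frame)
```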

SLIDE 17

We propose intensity frame + events for high frame-rate video synthesis

SLIDE 18

Our approach: fusion of intensity frames + events

DMR: Differentiable Model-based Reconstruction

SLIDE 19

Differentiable model (event sensing)

Per-pixel sensing model and its differentiable approximation. [Equations shown on slide; notation: one symbol denotes a single pixel of the t-th intensity frame, another the t-th event frame.]
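The per-pixel sensing model follows standard event-camera behavior: an event fires whenever the log intensity changes by a contrast threshold C since the last event. A noiseless sketch for one pixel (the threshold value is an assumed example; the paper's differentiable approximation smooths this hard thresholding, which is not reproduced here):

```python
import numpy as np

C = 0.2  # assumed contrast threshold (illustrative value)

def events_from_intensity(signal):
    # Emit (+1/-1) events whenever the log-intensity change since the
    # last event crosses the threshold C (idealized, noiseless model).
    ref = np.log(signal[0])
    events = []
    for t, v in enumerate(signal[1:], start=1):
        logv = np.log(v)
        while logv - ref >= C:
            events.append((t, +1)); ref += C
        while ref - logv >= C:
            events.append((t, -1)); ref -= C
    return events

evs = events_from_intensity(np.array([1.0, 1.5, 1.5, 0.9]))
```

A brightness jump from 1.0 to 1.5 (Δlog ≈ 0.405) crosses the 0.2 threshold twice, producing two ON events; the drop to 0.9 produces two OFF events.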

SLIDE 20

Differentiable model (frame sensing)

We consider 3 temporal settings: interpolation, prediction and motion deblur.
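The three temporal settings differ only in which frame measurements are observed. A sketch of the corresponding frame-sensing operators (the function name and mode strings are illustrative, not the paper's API):

```python
import numpy as np

def measure(latent, mode):
    # latent: (T, H, W) high-speed video the method aims to recover.
    if mode == "interpolation":   # start & end frames are observed
        return latent[0], latent[-1]
    if mode == "prediction":      # only the start frame is observed
        return latent[0]
    if mode == "deblur":          # blurry frame = integration over exposure
        return latent.mean(axis=0)
    raise ValueError(mode)
```

In every case the events fill in the temporal information that the frame measurements miss.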

SLIDE 21

Reconstruction loss and optimization

Objective: a pixel loss (frame pixel error + event pixel error) plus a sparsity loss.

Use stochastic gradient descent (SGD) to minimize the loss. As the loss decreases, the results get closer to the ground truth.
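A simplified version of this objective can be minimized directly. The sketch below assumes linear frame/event forward operators given as matrices and uses plain (sub)gradient descent in place of SGD; λ, the learning rate, and the step count are arbitrary illustrative values:

```python
import numpy as np

def dmr_optimize(frame_obs, event_obs, A_frame, A_event,
                 lam=0.01, lr=0.1, steps=300):
    # Minimize ||A_frame x - frame_obs||^2 + ||A_event x - event_obs||^2
    #          + lam * ||x||_1   via (sub)gradient descent.
    x = np.zeros(A_frame.shape[1])
    for _ in range(steps):
        g = (2 * A_frame.T @ (A_frame @ x - frame_obs)
             + 2 * A_event.T @ (A_event @ x - event_obs)
             + lam * np.sign(x))          # subgradient of the sparsity term
        x = x - lr * g
    return x
```

With identity forward operators and both observations equal to 1, the minimizer is 1 − λ/4 per pixel: the two data terms pull x toward the observations while the sparsity term shrinks it slightly toward zero.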

SLIDE 22

Results (DMR)

  • Interpolation case: given start & end frames + the events in between, recover the intermediate frames.

Inputs: low-speed intensity frames (2 frames) and event frames (20 frames); output: high-speed video (21 frames). The middle frame is withheld for evaluation. [Figure: Frame #10 and its error map.]

SLIDE 23

Results (DMR)

  • Prediction case: given the start frame and future events, recover the future frames.

[Figure: comparison of CF [ACCV’18] vs. ours.]

SLIDE 24

Results (DMR)

  • Motion deblur case: given a blurry image + the events during the exposure, recover the intermediate sharp frames.

[Figure: blurry images and events during exposure; video recovery by EDI [CVPR’19] vs. ours.]

SLIDE 25

Overview of our approach

SLIDE 26

Residual “denoiser”

  • Use a CNN to learn the residual of the DMR output w.r.t. the ground truth
  • Designed to enhance DMR results
    • Easy to train: model DMR artifacts as residual “noise”
    • Actually goes beyond Gaussian denoising
  • Single-frame based
    • Interfaces well with DMR
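The residual formulation can be sketched as: output = DMR result − predicted residual, where the predictor stands in for the trained CNN. Here a 3×3 box-blur heuristic treats the high-frequency content as the "noise"; this is purely illustrative, not the paper's network:

```python
import numpy as np

def box_residual(img):
    # Stand-in for the trained CNN: estimate the high-frequency residual
    # as the difference between the image and its 3x3 box blur.
    padded = np.pad(img, 1, mode="edge")
    blur = np.zeros_like(img, dtype=float)
    for dy in range(3):
        for dx in range(3):
            blur += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    blur /= 9.0
    return img - blur

def residual_denoise(dmr_output, predict_residual=box_residual):
    # Subtract the predicted artifact residual from the DMR output.
    return dmr_output - predict_residual(dmr_output)
```

Learning the residual rather than the clean image is what makes the network easy to train: on artifact-free inputs the target residual is simply zero.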
SLIDE 27

Results (residual denoiser)

[Figure: Ours (DMR) vs. DnCNN [TIP’17] vs. FFDNet [TIP’18] vs. Ours (RD) vs. ground truth.]

SLIDE 28

[Figure: another example – Ours (DMR), DnCNN [TIP’17], FFDNet [TIP’18], Ours (RD), ground truth.]

SLIDE 29

Results

  • Comparison with a non-event-based frame interpolation approach
  • Events provide additional information that is useful for challenging motions.

[Figure: ground truth vs. Ours (DMR + RD) vs. SepConv [CVPR’17].]

SLIDE 30

Summary: from an image + events, DMR (+ RD) handles interpolation, prediction, and motion deblur.

Thank you! Zihao (Winston) Wang – zwinswang@gmail.com