SLIDE 1
Motion Denoising with Application to Time-lapse Photography
Michael Rubinstein (MIT CSAIL)
Ce Liu (Microsoft Research NE)
Peter Sand, Fredo Durand (MIT)
Bill Freeman (MIT)
SLIDE 2
Time-lapse Videos
Construction, Natural phenomena, Medical, Biological/Botanical
SLIDE 3 For Personal Use Too!
Source: YouTube
9 months 7 years 16 years
http://www.danhanna.com/aging_project/p.html
SLIDE 4
“Stylized Jerkiness”
SLIDE 5
Motion Denoising
[Figure: World, Time-lapse, Motion denoising; axes: Space, Time]
SLIDE 6
Motion Denoising
Motion denoising
SLIDE 7
Time-lapse in Vision/Graphics Research
- Video summarization (video time-lapse)
- Time-lapse editing
[Bennett and McMillan 2007] [Pritch et al. 2008] [Sunkavalli et al. 2007]
SLIDE 8
Motion Denoising is Challenging!
- Naïve low-pass (temporal) filtering
– Pixels of different objects are averaged
- Smoothing motion trajectories
– Motion estimation in time-lapse videos is hard!
* Motion discontinuities
* Color inconsistencies
[Figure: KLT tracks]
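To illustrate the first bullet, here is a minimal sketch (not from the talk) of a naïve temporal box filter at a single pixel location. The function name `temporal_box_filter`, the window radius, and the toy intensities are all assumptions made for illustration.

```python
def temporal_box_filter(series, radius):
    """Average each sample with its temporal neighbors within `radius` frames."""
    out = []
    for t in range(len(series)):
        lo = max(0, t - radius)
        hi = min(len(series), t + radius + 1)
        window = series[lo:hi]
        out.append(sum(window) / len(window))
    return out


# Toy pixel: dark background (0) with a bright object (100) visible for one frame.
pixel_over_time = [0, 0, 100, 0, 0]
filtered = temporal_box_filter(pixel_over_time, radius=1)
print(filtered)
```

In this toy case the one-frame object is not removed. Its intensity is blended with the background over three frames, which is the "pixels of different objects are averaged" failure mode named above.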
SLIDE 9
Formulation
- Key idea: long-term events in videos can be statistically explained within some local spatiotemporal support, while short-term events are more distinctive
– Assumption: world is smooth
– Short-term variation = noise, long-term variation = signal
- Our algorithm reshuffles the pixels in both space and time to maintain long-term events in the video, while removing short-term noisy motions
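SLIDE 10
Formulation
\[
F(x) = \sum_{q} \left| J(q + x(q)) - J(q) \right|
     + \beta \sum_{q,\, s \in O_u(q)} \left\| J(q + x(q)) - J(s + x(s)) \right\|^2
     + \delta \sum_{q,\, r \in O(q)} \mu_{qr} \left| x(q) - x(r) \right|
\]
The three terms are, in order: fidelity (to input), temporal coherence (of the result), and regularization (of the warp).
- $q = (y, z, u)$
- $J$: input video; $J(q + x(q))$: output video
- $O_u(q)$: temporal neighbors of $q$; $O(q)$: spatiotemporal neighbors of $q$
- $x(q) \in \{(\varepsilon_y, \varepsilon_z, \varepsilon_u) : |\varepsilon_y| \le \Delta_t,\ |\varepsilon_z| \le \Delta_t,\ |\varepsilon_u| \le \Delta_u\}$: displacement field
- $\mu_{qr} = \exp\left(-\gamma \|J(q) - J(r)\|^2\right)$, $\gamma = \left(2 \langle \|J(q) - J(r)\|^2 \rangle\right)^{-1}$, with $\langle\cdot\rangle$ denoting an average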
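As a rough illustration of how the three terms interact, the sketch below evaluates F(x) on a toy grayscale video with one spatial dimension plus time. It is not the authors' implementation. The layout `J[u][y]`, the `(dy, du)` shift tuples, clamping at the borders, counting each neighbor pair once, and averaging over all neighbor pairs to obtain γ are assumptions of this sketch.

```python
import math


def energy(J, x, beta, delta):
    """Evaluate F(x) on a toy video J[u][y] with displacements x[u][y] = (dy, du)."""
    T, W = len(J), len(J[0])

    def warped(u, y):
        # J(q + x(q)); shifts are clamped to the video bounds (sketch assumption).
        dy, du = x[u][y]
        uu = min(max(u + du, 0), T - 1)
        yy = min(max(y + dy, 0), W - 1)
        return J[uu][yy]

    # Forward neighbor pairs, each counted once (sketch assumption).
    temporal = [((u, y), (u + 1, y)) for u in range(T - 1) for y in range(W)]
    spatial = [((u, y), (u, y + 1)) for u in range(T) for y in range(W - 1)]
    pairs = temporal + spatial

    # gamma = 1 / (2 * mean squared intensity difference over neighbor pairs).
    sq = [(J[a][b] - J[c][d]) ** 2 for (a, b), (c, d) in pairs]
    mean_sq = sum(sq) / len(sq)
    gamma = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0

    fidelity = sum(abs(warped(u, y) - J[u][y]) for u in range(T) for y in range(W))
    coherence = sum((warped(*q) - warped(*s)) ** 2 for q, s in temporal)
    regular = 0.0
    for (q, r), d2 in zip(pairs, sq):
        mu = math.exp(-gamma * d2)  # weaker smoothing across intensity edges
        (dy_q, du_q), (dy_r, du_r) = x[q[0]][q[1]], x[r[0]][r[1]]
        regular += mu * (abs(dy_q - dy_r) + abs(du_q - du_r))
    return fidelity + beta * coherence + delta * regular


# Three frames of three pixels; the middle frame has a one-frame flicker.
J = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
identity = [[(0, 0)] * 3 for _ in range(3)]
shifted = [row[:] for row in identity]
shifted[1][1] = (0, 1)  # take the flickering pixel from the next frame instead

print(energy(J, identity, beta=1.0, delta=1.0))
print(energy(J, shifted, beta=1.0, delta=1.0))
```

With these toy values the identity warp pays a large temporal-coherence cost, while the single one-frame shift trades a small fidelity and regularization cost for a temporally smooth result.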
SLIDE 11
Optimization
- Optimized discretely on a 3D MRF
– Nodes represent pixels
– State space of each pixel = volume of possible spatiotemporal shifts
- Complicated (huge!) inference problem
– E.g. 500^3 nodes, 10^3 states per node
– Optimize using Loopy Belief Propagation
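A back-of-the-envelope calculation shows why this inference problem is called huge. The node and state counts come from the slide. The 6-connected grid and 32-bit message entries are assumptions of this sketch, not figures from the talk.

```python
nodes = 500 ** 3        # pixels in a 500 x 500 x 500 video volume
states = 10 ** 3        # candidate spatiotemporal shifts per pixel
neighbors = 6           # assumption: 6-connected 3D grid
bytes_per_value = 4     # assumption: 32-bit floats

# One message per directed edge, with one entry per state of the receiving node.
message_values = nodes * neighbors * states
message_bytes = message_values * bytes_per_value

print(f"{nodes:,} nodes, {message_values:.2e} message entries, "
      f"{message_bytes / 1e12:.1f} TB")
```

Under these assumptions the messages alone take on the order of terabytes, far beyond typical main memory.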
SLIDE 12
Optimization
message passing
– Message structure stored on disk; read and write message chunks on need
\[ \omega_q(x(q)) = \left| J(q + x(q)) - J(q) \right| \]
Linear in state space + pre-compute
\[ \omega^u_{qs}(x(q), x(s)) = \beta \left\| J(q + x(q)) - J(s + x(s)) \right\|^2 + \delta \mu_{qs} \left| x(q) - x(s) \right| \]
Quadratic in state space (non-convex)
\[ \omega^u_{qr}(x(q), x(r)) = \delta \mu_{qr} \left| x(q) - x(r) \right| \]
Quadratic in state space, but can be computed in linear time (distance transforms)
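The linear-time claim for the last potential can be illustrated with the standard two-pass distance transform for an L1 pairwise cost. This is a sketch over a 1D label axis, not the talk's code. The names `l1_message` and `l1_message_bruteforce` are hypothetical, `h` stands for the costs accumulated at the sending node, and `c` plays the role of the weight δ·μ.

```python
def l1_message(h, c):
    """Return m[j] = min_i (h[i] + c * |i - j|) in O(n) using two sweeps."""
    m = list(h)
    for j in range(1, len(m)):            # forward sweep: best source to the left
        m[j] = min(m[j], m[j - 1] + c)
    for j in range(len(m) - 2, -1, -1):   # backward sweep: best source to the right
        m[j] = min(m[j], m[j + 1] + c)
    return m


def l1_message_bruteforce(h, c):
    """Reference O(n^2) computation of the same message."""
    n = len(h)
    return [min(h[i] + c * abs(i - j) for i in range(n)) for j in range(n)]


h = [5.0, 1.0, 4.0, 7.0, 0.5]
print(l1_message(h, 1.5))
print(l1_message_bruteforce(h, 1.5))
```

For a 3D volume of shifts with an L1 displacement cost, the same two sweeps can be applied along each axis in turn. The temporal potential does not admit this shortcut because its intensity term depends on both shifts jointly.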
SLIDE 13
Multi-scale Processing
- Spatiotemporal video pyramid
– Smooth spatially
– Sample temporally
- Displacements in the coarse level are used as centers for the search volume in the finer level
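A minimal sketch of the pyramid construction, on a toy video whose frames are 1D scanlines. The talk does not give the filter or sampling factor, so the [1, 2, 1]/4 blur, the temporal step of 2, the absence of spatial subsampling, and all function names are assumptions.

```python
def smooth_spatially(frame):
    """3-tap [1, 2, 1] / 4 blur along a 1D scanline, with edge replication."""
    n = len(frame)
    return [
        (frame[max(i - 1, 0)] + 2 * frame[i] + frame[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]


def coarser_level(video, step=2):
    """Blur each kept frame spatially and keep every `step`-th frame."""
    return [smooth_spatially(frame) for frame in video[::step]]


def build_pyramid(video, levels):
    """Return [finest, ..., coarsest] by repeatedly applying coarser_level."""
    pyramid = [video]
    for _ in range(levels - 1):
        pyramid.append(coarser_level(pyramid[-1]))
    return pyramid


def search_center(coarse_shift, step=2):
    """Map a coarse (dy, du) shift to the center of the finer-level search volume."""
    dy, du = coarse_shift
    return (dy, du * step)  # only time was subsampled in this sketch


video = [[0, 4, 0], [0, 0, 0], [8, 8, 8], [1, 1, 1]]
pyramid = build_pyramid(video, levels=2)
print(pyramid[1])
print(search_center((1, -2)))
```

Searching a small volume around `search_center(...)` at each finer level keeps the number of states per node small while still allowing large overall displacements.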
SLIDE 14
Results
[Figure labels: x, y; future, past]
SLIDE 15
Comparing with Other Optimization Techniques
SLIDE 16
Results
[Figure labels: x, y; future, past]
SLIDE 17
Results
SLIDE 18 Comparison with Naïve Temporal Filtering
[Figure axes: x, t]
SLIDE 19
Support Size
SLIDE 20
Motion-scale Decomposition
SLIDE 21
Motion-scale Decomposition
SLIDE 22
Other Scenarios
SLIDE 23
Future Work
- User-controlled motion scales
– Not necessarily a binary decomposition into long-term and short-term
- Modify the time-lapse capturing process to help post-processing
– E.g. use short videos instead of still images and find the best “path” through the video
- Explore motion denoising with time-lapse from other domains
– Embryo research, satellite imagery
SLIDE 24
http://csail.mit.edu/mrub/timelapse
Thank you!