Video Stabilization
CS448V — Computational Video Manipulation April 2019
Fundamental problem that became even more relevant in recent years
Important for producing high-quality video and as a first step of many algorithms
“In forming a video loop, we assume that the input video has already been stabilized.” [Liao et al. ’15]
Both at capture time and in post
Capture time: tripod, gimbal, OIS
Post production: manual, automatic; 2D, 3D [Liu et al. ’13]
Input frames
Detect features: raw pixels, SURF, SIFT, …
Calculate relation between photos: homography, 3D camera location, …
Smooth relation between photos: low-pass filter, spline fitting, bilateral filter, …
Create frames using smoothed relation: warp frames, reconstruct from 3D, …
Output frames
Toy example: SIFT, 2D translation, Gaussian, warp
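A minimal sketch of the toy example (2D translation, Gaussian, warp), assuming feature matches between consecutive frames are already available as point arrays, so the SIFT step itself is not shown. The function names and the median estimator are illustrative choices, not taken from the slides.

```python
import numpy as np

def estimate_translation(pts_prev, pts_next):
    """2D translation between two frames: median displacement of matched
    feature points (both arrays are N x 2). The median is an assumption
    made here for robustness to a few bad matches."""
    return np.median(pts_next - pts_prev, axis=0)

def gaussian_smooth(path, sigma):
    """Smooth a (T x 2) path with a normalized Gaussian kernel.
    The path is edge-padded so the output keeps length T."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(path, ((radius, radius), (0, 0)), mode="edge")
    return np.stack(
        [np.convolve(padded[:, d], kernel, mode="valid")
         for d in range(path.shape[1])],
        axis=1,
    )

def stabilizing_offsets(translations, sigma=5.0):
    """translations: (T-1) x 2 frame-to-frame translations.
    Returns the raw path, the smoothed path, and the per-frame shift
    (smooth - raw) that moves each frame onto the smoothed path."""
    path = np.vstack([np.zeros(2), np.cumsum(translations, axis=0)])
    smooth = gaussian_smooth(path, sigma)
    return path, smooth, smooth - path
```

The returned offset for frame t is the shift to apply to that frame's content in the warp step; any image translation routine could perform the warp, and the shifted frames typically need cropping afterwards.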
2D: Input frames → warp → Output frames
[Snavely et al. ’06]
3D: Input frames → Output frames
Liu et al. SIGGRAPH 2013
Input frames → Detect features → Calculate relation between photos → Smooth relation between photos → Create frames using smoothed relation → Output frames
warping-based motion representation
adaptive space-time path smoothing
frame t, frame t+1
Given that:
$$p = V_p w_p, \quad V_p = [v_p^1 \; v_p^2 \; v_p^3 \; v_p^4], \quad w_p = [w_p^1 \; w_p^2 \; w_p^3 \; w_p^4]^\top, \quad \sum_{i=1}^{4} w_p^i = 1$$
We would like:
$$\hat{p} = \hat{V}_p w_p, \quad \hat{V}_p = [\hat{v}_p^1 \; \hat{v}_p^2 \; \hat{v}_p^3 \; \hat{v}_p^4]$$
Data term:
$$\sum_p \| \hat{V}_p w_p - \hat{p} \|^2$$
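As a small illustration of the data term, the sketch below computes the four weights of a feature point inside its grid cell and evaluates the sum of squared residuals. It assumes an axis-aligned rectangular cell and bilinear weights; the function names and the corner ordering are hypothetical choices made for this example.

```python
import numpy as np

def bilinear_weights(p, cell_origin, cell_size):
    """Weights w_p of the 4 cell corners for a point p inside an
    axis-aligned cell. Corner order (an assumption of this sketch):
    top-left, top-right, bottom-left, bottom-right."""
    p = np.asarray(p, dtype=float)
    u, v = (p - np.asarray(cell_origin, float)) / np.asarray(cell_size, float)
    return np.array([(1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v])

def cell_corners(cell_origin, cell_size):
    """V_p as a 2 x 4 matrix whose columns are the cell corners,
    in the same order as bilinear_weights."""
    x0, y0 = cell_origin
    w, h = cell_size
    return np.array([[x0, y0], [x0 + w, y0],
                     [x0, y0 + h], [x0 + w, y0 + h]], dtype=float).T

def data_term(V_hats, weights, p_hats):
    """Sum over features of || V_hat_p w_p - p_hat ||^2."""
    return sum(np.sum((V_hat @ w - p_hat) ** 2)
               for V_hat, w, p_hat in zip(V_hats, weights, p_hats))

# Usage: weights sum to 1 and reproduce p from the undeformed corners.
origin, size = (10.0, 20.0), (40.0, 30.0)
p = np.array([25.0, 41.0])
w = bilinear_weights(p, origin, size)
V = cell_corners(origin, size)
```

In the optimization the weights stay fixed (they come from frame t) and the unknowns are the warped corners, which keeps the data term quadratic in those unknowns.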
Shape-preserving term: Distance from similarity transform
sounds familiar?…
Energy: data term, shape-preserving term
[Figure: results with and without the shape-preserving term]
frame t frame t+1
We now have a local homography Fi(t) for each cell i of frame t
Outlier rejection: dual-scale RANSAC
Coarse: global homography, discard outliers by threshold
Fine: local homographies, discard outliers by threshold
Adaptive regularization
Calculate α per frame
Fitting error: average residual of feature matching
Smoothness error: L2 distance between neighboring homographies
Estimate for different α and pick minimal error
Adaptive space-time path smoothing
Optimize all, optimize one
Data term: blue should match red
Smoothness term: blue at time t should match the (60) frames around t
$$\min \sum_t \Big( \|P(t) - C(t)\|^2 + \lambda_t \sum_{r \in \Omega_t} \omega_{t,r}(C) \cdot \|P(t) - P(r)\|^2 \Big)$$
First term: data term. Second term: smoothness term.
Slides adapted from Sylvain Paris
1D image = line of pixels
Better visualized as a plot (pixel intensity vs. pixel position)
Gaussian blur
$$I_p = \sum_q G_{\sigma_s}(\|p - q\|)\, I_q$$
space: distance between p and q
Bilateral filter
[Aurich 95, Smith 97, Tomasi 98]
spatial and range distances
$$I_p = \frac{1}{W_p} \sum_q G_{\sigma_s}(\|p - q\|)\, G_{\sigma_r}(|I_p - I_q|)\, I_q$$
First factor: space. Second factor: range.
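To make the difference between the two filters concrete, here is a brute-force sketch on a 1D image. It assumes unnormalized Gaussians with explicit normalization by the weight sum (the 1/W_p factor); the function names are illustrative, and the O(n²) weight matrices are only meant for short signals.

```python
import numpy as np

def gaussian(x, sigma):
    """Unnormalized Gaussian falloff."""
    return np.exp(-x**2 / (2 * sigma**2))

def gaussian_blur_1d(I, sigma_s):
    """Each output pixel is a weighted average of all pixels,
    weighted only by spatial distance |p - q|."""
    pos = np.arange(len(I))
    w = gaussian(np.abs(pos[:, None] - pos[None, :]), sigma_s)
    return (w * I[None, :]).sum(axis=1) / w.sum(axis=1)

def bilateral_filter_1d(I, sigma_s, sigma_r):
    """Same as the blur, but each weight is also multiplied by a
    range term on the intensity difference |I_p - I_q|."""
    pos = np.arange(len(I))
    space = gaussian(np.abs(pos[:, None] - pos[None, :]), sigma_s)
    range_ = gaussian(np.abs(I[:, None] - I[None, :]), sigma_r)
    w = space * range_
    return (w * I[None, :]).sum(axis=1) / w.sum(axis=1)  # divide by W_p

# Usage: a step edge is smeared by the blur but kept by the bilateral filter.
step = np.concatenate([np.zeros(10), np.ones(10)])
blurred = gaussian_blur_1d(step, sigma_s=3.0)
filtered = bilateral_filter_1d(step, sigma_s=3.0, sigma_r=0.1)
```

With a small `sigma_r`, pixels on the other side of the edge receive almost zero weight, which is why the edge survives.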
Back to stabilization…
$$\min \sum_t \Big( \|P(t) - C(t)\|^2 + \lambda_t \sum_{r \in \Omega_t} \omega_{t,r}(C) \cdot \|P(t) - P(r)\|^2 \Big)$$
First term: data term. Second term: smoothness term.
$$\omega_{t,r} = G_t(\|r - t\|) \cdot G_m(\|C(r) - C(t)\|)$$
$G_t$: distance between frames. $G_m$: distance between camera poses.
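The objective is quadratic in P once the weights are fixed, so it can be minimized by solving one linear system. The sketch below does that for a simplified case: it assumes the camera path C(t) is a 2D translation per frame rather than a homography, uses a dense solver, and picks parameter names and defaults (`radius`, `sigma_t`, `sigma_m`) purely for illustration.

```python
import numpy as np

def bilateral_weights(C, radius, sigma_t, sigma_m):
    """omega[t, r] = G_t(|r - t|) * G_m(||C(r) - C(t)||),
    zero outside the window Omega_t (|r - t| > radius) and for r == t."""
    t = np.arange(len(C))
    dt = np.abs(t[:, None] - t[None, :])
    dC = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)
    W = np.exp(-dt**2 / (2 * sigma_t**2)) * np.exp(-dC**2 / (2 * sigma_m**2))
    W[(dt > radius) | (dt == 0)] = 0.0
    return W

def objective(P, C, lam, W):
    """Data term plus lambda-weighted smoothness term."""
    lam = np.broadcast_to(np.asarray(lam, float), (len(C),))
    diff2 = np.sum((P[:, None, :] - P[None, :, :]) ** 2, axis=2)
    return np.sum((P - C) ** 2) + np.sum(lam[:, None] * W * diff2)

def smooth_path(C, lam, radius=30, sigma_t=10.0, sigma_m=10.0):
    """Minimize the objective exactly. Setting the gradient to zero gives
    (I + L) P = C, where L is the graph Laplacian of the symmetrized
    weights lam_t * omega[t, r] + lam_r * omega[r, t]."""
    T = len(C)
    lam = np.broadcast_to(np.asarray(lam, float), (T,))
    M = lam[:, None] * bilateral_weights(C, radius, sigma_t, sigma_m)
    S = M + M.T
    A = np.eye(T) + np.diag(S.sum(axis=1)) - S
    return np.linalg.solve(A, C)
```

A window radius of 30 corresponds to roughly the 60 frames around t mentioned above. Because the weights depend only on the original path C, large jumps in C reduce the coupling between frames on either side of the jump, which is the bilateral behaviour carried over from image filtering.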
Setting the weights $\lambda_t$:
Run optimization with global weight
For each frame
  While too much cropping or distortion
    Decrease weight and re-run
Optimize all paths: single-path term for each cell i, plus smoothness between neighboring paths $j \in N(i)$ at each time t
always check supplemental…
Detect features Calculate relation between photos Smooth relation between photos Create frames using smoothed relation Input Output