Facial Landmark Tracking for Mobile
Instructor - Simon Lucey
16-623 - Designing Computer Vision Apps
Thatcher Effect
Thompson, P. (1980). "Margaret Thatcher: a new illusion". Perception. 9 (4)
Evil Twin App
Snapchat’s Filters
Taken from http://petapixel.com/2016/06/30/snapchats-powerful-facial-recognition-technology-works/
Facial Landmark Alignment History
Cootes and Taylor, 1992 (Active Shape Models)
Cootes, Edwards and Taylor, 1998 (Active Appearance Models)
Zhou, Gu, and Zhang, 2003 (Bayesian Tangent Shape Models)
Matthews and Baker, 2004 (Active Appearance Models Revisited)
Cristinacce and Cootes, 2004 (Constrained Local Models)
How many points?
(a) MultiPIE/IBUG (b) XM2VTS (c) FRGC-V2 (d) AR (e) LFPW (f) HELEN (g) AFW (h) AFLW
Sagonas, Christos, et al. "300 faces in-the-wild challenge: Database and results." Image and Vision Computing 47 (2016): 3-18.
Human Annotator Error
Sagonas, Christos, et al. "300 faces in-the-wild challenge: Database and results." Image and Vision Computing 47 (2016): 3-18.
300 Faces in the Wild DB
Sagonas, Christos, et al. "300 faces in-the-wild challenge: Database and results." Image and Vision Computing 47 (2016): 3-18.
“Indoor” “Outdoor”
Today
Constrained Local Models
CLM - Extracting Patch Responses
(110x110 pixels) “Current Estimate for point n”
“Groundtruth for point n”
Constrained Local Models
(110x110 pixels) “nth Constrained Local Search Area”
ith Patch Expert
ith Constrained Local Search Area
Constrained Local Models
(e.g., 30x30 pixels) “nth Patch Response”
“Uses Convolution” (∗)
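A minimal sketch of how a patch response could be computed, assuming the patch expert is a small linear template that is slid over the constrained local search area. The function name `patch_response`, the toy 5x5 search area, and the 2x2 expert are hypothetical and chosen only for illustration. The code computes a sliding dot product (correlation, i.e. convolution without flipping the kernel) with no padding.

```python
def patch_response(search_area, patch_expert):
    """Slide patch_expert over search_area and return the map of dot products.

    Both arguments are lists of rows. No padding is used, so an H x W search
    area and an h x w expert give an (H-h+1) x (W-w+1) response map.
    """
    H, W = len(search_area), len(search_area[0])
    h, w = len(patch_expert), len(patch_expert[0])
    response = []
    for r in range(H - h + 1):
        row = []
        for c in range(W - w + 1):
            score = 0.0
            for i in range(h):
                for j in range(w):
                    score += search_area[r + i][c + j] * patch_expert[i][j]
            row.append(score)
        response.append(row)
    return response


# Toy data (hypothetical): a bright 2x2 blob at rows 2-3, columns 1-2.
search_area = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
patch_expert = [[1, 1], [1, 1]]  # hypothetical expert that responds to bright 2x2 blobs

response = patch_response(search_area, patch_expert)

# Location of the strongest response (top-left corner of the best window).
best = max(
    ((r, c) for r in range(len(response)) for c in range(len(response[0]))),
    key=lambda rc: response[rc[0]][rc[1]],
)
```

The four nested loops are what makes this step expensive: one such response map is needed per landmark, per iteration.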
CLM - Extracting Patch Responses
(110x110 pixels)
Patch responses: $D_1(p_1), \ldots, D_N(p_N)$
CLM Algorithm
Step 1. Pre-compute N local responses $D_i^t(p_i)$. “Generate responses.”
Step 2. $\Delta p^* = \arg\min_{\Delta p} \sum_{i=1}^{N} D_i^t(p_i + \Delta p_i) + \lambda R(p + \Delta p)$ “Optimization step.”
Step 3. $p^{t+1} \leftarrow p^t + \Delta p^*$ “Update the patch center positions.”
Step 4. Repeat steps 1 - 3. “Iterate until convergence.”
Solving Step 2 is the crucial and most challenging component.
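The four steps can be sketched as a toy loop. This is only an illustration under strong assumptions: landmarks are 1-D integer positions, `local_cost(i, x)` is a hypothetical stand-in for $D_i$ (lower is better), and the regularizer is a separable `lam * dp**2` per landmark instead of a real shape prior, so Step 2 reduces to an exhaustive search per landmark. A real CLM couples all landmarks through $R$.

```python
def clm_fit(p, local_cost, lam=0.1, radius=3, max_iters=50):
    """Toy version of the four-step CLM loop with 1-D integer landmark positions.

    local_cost(i, x) plays the role of D_i at position x (lower is better).
    The regularizer is lam * dp**2 per landmark, a separable stand-in for
    R(p + dp); it is an assumption made to keep the example short.
    """
    p = list(p)
    for _ in range(max_iters):
        new_p = []
        for i, p_i in enumerate(p):
            candidates = range(-radius, radius + 1)
            # Step 1: local responses inside the search window around p_i.
            responses = {dp: local_cost(i, p_i + dp) for dp in candidates}
            # Step 2: displacement minimizing response + regularizer.
            best_dp = min(candidates, key=lambda dp: responses[dp] + lam * dp * dp)
            # Step 3: update the patch center position.
            new_p.append(p_i + best_dp)
        # Step 4: iterate until no landmark moves.
        if new_p == p:
            break
        p = new_p
    return p


# Hypothetical ground-truth positions and a simple distance-based local cost.
targets = [10, 20, 35]


def cost(i, x):
    return abs(x - targets[i])


fitted = clm_fit([4, 22, 30], cost)
```

With these toy values the landmarks reach the targets after two updates; the search radius limits how far each landmark can move per iteration.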
Point Distribution Model
$W(x; p)$
$R(p + \Delta p)$
$\Delta p^T L \Delta p$
Mode Seeking
all responses that is consistent with the global geometry.
strong.
iteratively.
Constrained Mean Shift
fitting.
“Constrained Mean Shift”
Saragih, Lucey and Cohn, ICCV 2009. (Constrained Mean Shifts)
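A rough sketch of the mean-shift update for a single landmark, assuming a Gaussian kernel and a response map stored as a likelihood (higher is better) at each candidate location. The names `mean_shift_step`, `candidates`, `likelihoods`, and `bandwidth` are hypothetical. The shape constraint that makes the method "constrained" is deliberately left out here; only the unconstrained mode-seeking step is shown.

```python
import math


def mean_shift_step(x, candidates, likelihoods, bandwidth=1.0):
    """One mean-shift update: kernel-weighted average of candidate positions."""
    num_r = num_c = total = 0.0
    for (r, c), lik in zip(candidates, likelihoods):
        dist2 = (r - x[0]) ** 2 + (c - x[1]) ** 2
        # Weight = response likelihood times a Gaussian kernel around the estimate.
        w = lik * math.exp(-0.5 * dist2 / bandwidth ** 2)
        num_r += w * r
        num_c += w * c
        total += w
    return (num_r / total, num_c / total)


# Toy 7x7 response map (hypothetical) with a single mode at (3, 3).
candidates = [(r, c) for r in range(7) for c in range(7)]
likelihoods = [math.exp(-0.5 * ((r - 3) ** 2 + (c - 3) ** 2)) for r, c in candidates]

first_step = mean_shift_step((1.0, 1.0), candidates, likelihoods)

estimate = (1.0, 1.0)
for _ in range(30):
    estimate = mean_shift_step(estimate, candidates, likelihoods)
```

Each update moves the estimate toward the nearest mode of the response map without an explicit gradient or step size.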
Results
[Figure panels: ASM, CQF, BTSM, CMS]
Results
[Plot: Proportion of Images vs. Shape RMS Error for ASM (88 ms), CQF (98 ms), BTSM (2410 ms), CMS (121 ms)]
Saragih, Lucey, Cohn, ICCV 2009. Saragih, Lucey, Cohn, IJCV 2011.
CMS Results
Saragih, Lucey, Cohn, AFGR 2011.
CLM Extensions
CLM extensions have been proposed over the years.
Conditional Local Neural Fields (CLNF) obtain almost state-of-the-art performance for facial landmark tracking.
OpenFace provides an open source implementation.
http://www.cl.cam.ac.uk/research/rainbow/projects/openface/
CLM Drawbacks
CLM’s real-time performance is heavily linked to a large number of convolution operations.
application due to its computational cost.
Problem
[Figure: NCC (⊗), CPU vs. GPU]
Solution
[Figure: Multiplication (× R), CPU vs. GPU]
Today
Reminder: SDMs
Supervised Descent Method (SDM) to explicitly learn $R$.
from given data by sampling
Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.
Ground-Truth Artificial Noise
Reminder: SDMs
and geometry:
$\Delta p = R[I(p) - T(0)]$
$R^{(k)}$
Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.
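A small sketch of how a cascade of regressors $R^{(k)}$ might be learned by sampling perturbations around the ground truth and solving a least-squares problem at each stage. Everything here is a toy assumption: the position is a single scalar, the ground truth is at 0, and `feature(p)` is a hypothetical scalar stand-in for $I(p) - T(0)$. In practice the features are high-dimensional descriptors and $R$ is a matrix.

```python
import math


def feature(p):
    """Hypothetical scalar appearance residual I(p) - T(0); zero at the true position p = 0."""
    return math.tanh(0.5 * p)


def train_sdm(start_positions, num_stages=3):
    """Learn one scalar regressor per stage by least squares on sampled perturbations."""
    positions = list(start_positions)
    regressors = []
    for _ in range(num_stages):
        feats = [feature(p) for p in positions]
        targets = [-p for p in positions]  # ideal update: ground truth (0) minus p
        # Closed-form least squares for a scalar: R = sum(t * f) / sum(f * f).
        R = sum(t * f for t, f in zip(targets, feats)) / sum(f * f for f in feats)
        regressors.append(R)
        # Move the training samples with the regressor just learned.
        positions = [p + R * f for p, f in zip(positions, feats)]
    return regressors


def apply_sdm(p, regressors):
    """Run the cascade: p <- p + R_k * feature(p) for each stage k."""
    for R in regressors:
        p = p + R * feature(p)
    return p


# Ground truth plus artificial noise: perturbed start positions from -3 to 3.
samples = [-3.0 + 0.5 * k for k in range(13)]
regressors = train_sdm(samples)
refined = apply_sdm(2.2, regressors)
```

At test time each stage costs one feature extraction and one multiplication, with no search over a response map.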
SDMs vs. CLMs
extraction, and matrix multiplication (no convolution).
IntraFace
http://www.humansensing.cs.cmu.edu/intraface
Calculating SIFT for each landmark
costly for real-time performance.
SIFT Speed - Mac vs. iPhone 7
Reminder: Binary Features
chipset architectures,
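$\psi_I(x, \Delta_1, \Delta_2) = \begin{cases} 1 & : I(x + \Delta_1) > I(x + \Delta_2) \\ 0 & : \text{otherwise} \end{cases}$
$\psi_I(x) = \sum_i 2^{i-1} \psi_I(x, \Delta_1^{(i)}, \Delta_2^{(i)})$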
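The two formulas above can be sketched directly in code. The image, the center `x`, and the offset pairs are hypothetical toy values, and offsets are given as (row, column) displacements.

```python
def binary_test(image, x, d1, d2):
    """Return 1 if I(x + d1) > I(x + d2), else 0."""
    r, c = x
    a = image[r + d1[0]][c + d1[1]]
    b = image[r + d2[0]][c + d2[1]]
    return 1 if a > b else 0


def binary_descriptor(image, x, offset_pairs):
    """Pack the binary tests into one integer: sum over i of 2^(i-1) * test_i."""
    code = 0
    for i, (d1, d2) in enumerate(offset_pairs, start=1):
        code += 2 ** (i - 1) * binary_test(image, x, d1, d2)
    return code


# Toy 3x3 image and three hypothetical offset pairs around the center pixel.
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
pairs = [
    ((0, 1), (0, -1)),    # right vs. left
    ((-1, 0), (1, 0)),    # up vs. down
    ((1, 1), (-1, -1)),   # bottom-right vs. top-left
]
code = binary_descriptor(image, (1, 1), pairs)

# The same image under a gain of 2 and a bias of 7.
brighter = [[2 * v + 7 for v in row] for row in image]
code_brighter = binary_descriptor(brighter, (1, 1), pairs)
```

Only the ordering of pixel pairs matters, so the code is unchanged by the gain and bias applied to `brighter`.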
Why do Binary Features Work?
perception itself.
gain and bias of pixels, but the local ordering should remain relatively constant.
Reminder: BRIEF Descriptor
randomly and sparsely,
{∆1, ∆2}
Hamming distance in least squares
1 0 1 1 0 0 1 0 (178)
0 0 1 1 0 0 1 0 (50)
1 0 0 0 0 0 0 0 (128, difference)
Hamming distance: 1 vs. squared distance: 16384
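The numbers on this slide can be checked with a few lines. The variable names are only for illustration.

```python
a = 0b10110010  # 178
b = 0b00110010  # 50

# Hamming distance: number of bit positions in which the two codes differ.
hamming = bin(a ^ b).count("1")

# Squared distance when the codes are treated as ordinary integers.
squared = (a - b) ** 2
```

A single flipped bit gives a Hamming distance of 1, but a squared integer distance of 128 squared, because the flipped bit happens to be the most significant one.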
$\phi_1 = [\ldots]^\top \in [0, 1]$
. . .
$\phi_F = [\ldots]^\top \in [0, 1]$
Local Binary Feature - SDMs
φ2
φ3
1 1 1
[0 1 0]
Local Binary Feature - SDMs
concatenating
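A sketch of how local binary features could be assembled: each tree routes the input to one leaf, the leaf index becomes a one-hot vector such as [0 1 0], and the vectors from all trees are concatenated before a linear projection. The tree encoding (nested tuples with integer leaves), the thresholds, and the weight matrix `W` are all hypothetical.

```python
def leaf_of(tree, x):
    """Walk a tree of (feature_index, threshold, left, right) tuples down to an integer leaf."""
    node = tree
    while not isinstance(node, int):
        feat, thresh, left, right = node
        node = left if x[feat] < thresh else right
    return node


def one_hot(index, size):
    return [1 if k == index else 0 for k in range(size)]


def local_binary_features(forest, num_leaves, x):
    """Concatenate the one-hot leaf indicators of every tree in the forest."""
    phi = []
    for tree in forest:
        phi.extend(one_hot(leaf_of(tree, x), num_leaves))
    return phi


# Two hypothetical trees with three leaves each (leaf ids 0, 1, 2).
forest = [
    (0, 0.5, 0, (1, 0.5, 1, 2)),
    (1, 0.2, (0, 0.9, 0, 1), 2),
]
x = [0.7, 0.3]  # hypothetical pixel-difference features for one landmark

phi = local_binary_features(forest, 3, x)

# Linear projection: because phi is binary, this just sums selected columns of W.
W = [
    [1, 2, 3, 4, 5, 6],
    [0, -1, 0, 1, 0, 2],
]
delta = [sum(w * f for w, f in zip(row, phi)) for row in W]
```

Because exactly one entry per tree is non-zero, the projection amounts to a table lookup and a few additions.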
Local Binary Feature - SDMs
[Figure: Feature Mapping, Local Binary Features, Linear Projection, Estimated Shape, Ground Truth Shape]
Local Binary Feature - SDMs
300-W (68 landmarks)
Method      Fullset   Common Subset   Challenging Subset   FPS
ESR [5]     7.58      5.28            17.00                120
SDM [32]    7.52      5.60            15.40                70
LBF         6.32      4.95            11.98                320
LBF fast    7.37      5.38            15.50                3100
... datasets, respectively. The errors of ESR and SDM are from our ...
LBF-SDMs - Drawbacks
mobile face tracking.
approach is only touching a sparse set of pixels.
Binary Approximated SIFT
for SDMs.
$\theta = \operatorname{atan2}(\nabla_y I(x), \nabla_x I(x))$
$\nabla_y I(x) > 0$, $\nabla_x I(x) > 0$, $|\nabla_y I(x)| > |\nabla_x I(x)|$
$2^3 = 8$ orientation bins
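A sketch of how the three comparisons could replace the atan2 call when only an 8-way orientation bin is needed. The function name and the bit order are assumptions; the resulting codes are labels for the eight octants and are not sorted by angle.

```python
import math


def orientation_code(gx, gy):
    """Map a gradient (gx, gy) to one of 2^3 = 8 codes using three comparisons."""
    b_y = 1 if gy > 0 else 0             # upper vs. lower half-plane
    b_x = 1 if gx > 0 else 0             # right vs. left half-plane
    b_d = 1 if abs(gy) > abs(gx) else 0  # closer to the y axis than to the x axis
    return (b_y << 2) | (b_x << 1) | b_d


# One test gradient in the middle of each 45-degree octant.
angles = [22.5 + 45.0 * k for k in range(8)]
codes = [
    orientation_code(math.cos(math.radians(a)), math.sin(math.radians(a)))
    for a in angles
]
```

No trigonometric function is evaluated per pixel; the `math` calls above only generate the test gradients.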
Binary Approximated SIFT
for SDMs.
comparison.
8 orientations
Calculating BASIFT is extremely efficient.
Inspiration from the FFT
Sparse Compositional SDM
(a) Component 1 Structure (b) Component 2 Structure (c) Component 3 Structure (d) Composition
Speed Comparison
90 FPS - iPhone 7
More Examples.....