Facial Landmark Tracking for Mobile Instructor - Simon Lucey - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Facial Landmark Tracking for Mobile

Instructor - Simon Lucey

16-623 - Designing Computer Vision Apps

slide-2
SLIDE 2

Thatcher Effect

Thompson, P. (1980). "Margaret Thatcher: a new illusion". Perception. 9 (4)

slide-4
SLIDE 4

Evil Twin App

slide-5
SLIDE 5

Snapchat’s Filters

Taken from http://petapixel.com/2016/06/30/snapchats-powerful-facial-recognition-technology-works/

slide-7
SLIDE 7

Facial Landmark Alignment History

  • Active Appearance Models (AAMs) - LK inspired
  • Active Shape Models (ASMs)

Cootes, Edwards and Taylor, 1998 (Active Appearance Models)
Matthews and Baker, 2004 (Active Appearance Models Revisited)
Cootes and Taylor, 1992 (Active Shape Models)
Cristinacce and Cootes, 2004 (Constrained Local Models)
Zhou, Gu, and Zhang, 2003 (Bayesian Tangent Shape Models)
slide-8
SLIDE 8

How many points?

(a) MultiPIE/IBUG (b) XM2VTS (c) FRGC-V2 (d) AR (e) LFPW (f) HELEN (g) AFW (h) AFLW

Sagonas, Christos, et al. "300 faces in-the-wild challenge: Database and results." Image and Vision Computing 47 (2016): 3-18.

slide-9
SLIDE 9

Human Annotator Error

Sagonas, Christos, et al. "300 faces in-the-wild challenge: Database and results." Image and Vision Computing 47 (2016): 3-18.

slide-10
SLIDE 10

300 Faces in the Wild DB

  • Contains 300 images from indoor and outdoor conditions.

Sagonas, Christos, et al. "300 faces in-the-wild challenge: Database and results." Image and Vision Computing 47 (2016): 3-18.

“Indoor” “Outdoor”

slide-11
SLIDE 11

Today

  • Constrained Local Models
  • Supervised Descent Models
  • Efficient Sparse SDMs
  • Efficient Dense SDMs
slide-12
SLIDE 12

Constrained Local Models

  • Start with an initial
slide-14
SLIDE 14

Constrained Local Models

  • Start with an initial
  • Warp to a fixed size template (e.g., 110x110 pixels).
slide-15
SLIDE 15

CLM - Extracting Patch Responses

(110x110 pixels)

  • Try to fit a single point.
slide-17
SLIDE 17

CLM - Extracting Patch Responses

(110x110 pixels) “Current Estimate for point n”

  • Try to fit a single point.
slide-18
SLIDE 18

CLM - Extracting Patch Responses

(110x110 pixels) “Current Estimate for point n”

  • Try to fit a single point.

“Groundtruth for point n”

slide-19
SLIDE 19

Constrained Local Models

(110x110 pixels) “nth Constrained Local Search Area”

slide-20
SLIDE 20

Constrained Local Models

ith Patch Expert

ith Constrained Local Search Area

(e.g., 30x30 pixels) “nth Patch Response”

“Uses Convolution“
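The patch-response step can be sketched as follows. This is a minimal illustration, assuming a linear patch expert (a weight template) slid over the search area as a sliding-window correlation, which equals convolution with a flipped kernel. The names `patch_response`, `expert`, and `search_area`, and the 11x11 and 40x40 sizes, are hypothetical and chosen only so that the response comes out at 30x30.

```python
import numpy as np

def patch_response(search_area, expert):
    """Slide a linear patch expert over a search area (valid cross-correlation)."""
    H, W = search_area.shape
    h, w = expert.shape
    out = np.empty((H - h + 1, W - w + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(search_area[r:r + h, c:c + w] * expert)
    return out

# Toy data: plant the expert's own template at a known offset in a noisy area.
rng = np.random.default_rng(0)
expert = rng.standard_normal((11, 11))            # hypothetical 11x11 expert
search_area = 0.1 * rng.standard_normal((40, 40))
search_area[12:23, 17:28] += expert               # planted at row 12, col 17

response = patch_response(search_area, expert)    # 30x30 response map
peak = np.unravel_index(np.argmax(response), response.shape)
```

In practice the double loop would be replaced by an optimized convolution routine; the loop form is kept here only to make the operation explicit.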

slide-21
SLIDE 21

CLM - Extracting Patch Responses

(110x110 pixels)

  • 68 points in total.

slide-23
SLIDE 23

CLM - Extracting Patch Responses

(110x110 pixels)

  • Get responses for all N patch experts.

$D_1(p_1), \ldots, D_N(p_N)$ (one response map per patch expert)

slide-24
SLIDE 24

CLM Algorithm

  • Unlike Graph Fitting, the CLM algorithm is iterative:

Step 1. Pre-compute N local responses $D_i^t(p_i)$. “Generate responses.”

Step 2. $\Delta p^* = \arg\min_{\Delta p} \sum_{i=1}^{N} D_i^t(p_i + \Delta p_i) + \lambda R(p + \Delta p)$ “Optimization step.”

Step 3. $p^{t+1} \leftarrow p^t + \Delta p^*$ “Update the patch center positions.”

Step 4. Repeat steps 1 - 3. “Iterate until convergence.”

slide-25
SLIDE 25

CLM Algorithm

  • Unlike Graph Fitting, the CLM algorithm is iterative:

Step 1. Pre-compute N local responses $D_i^t(p_i)$. “Generate responses.”

Step 2. $\Delta p^* = \arg\min_{\Delta p} \sum_{i=1}^{N} D_i^t(p_i + \Delta p_i) + \lambda R(p + \Delta p)$ “Optimization step.”

Step 3. $p^{t+1} \leftarrow p^t + \Delta p^*$ “Update the patch center positions.”

Step 4. Repeat steps 1 - 3. “Iterate until convergence.”

Solving Step 2 is the crucial and most challenging component!

slide-26
SLIDE 26

Point Distribution Model

W(x; p)

R(p + ∆p)

slide-28
SLIDE 28

Point Distribution Model

W(x; p)

$\Delta p^{T} L \Delta p$

slide-29
SLIDE 29

Mode Seeking

  • One can interpret the problem as seeking the “mode” across all responses that is consistent with the global geometry.

slide-30
SLIDE 30

Mode Seeking

  • One can interpret the problem as seeking the “mode” across all responses that is consistent with the global geometry.
  • Gaussian assumptions simplify the problem but are often too strong.

slide-31
SLIDE 31

Mode Seeking

  • One can interpret the problem as seeking the “mode” across all responses that is consistent with the global geometry.
  • Gaussian assumptions simplify the problem but are often too strong.
  • If non-Gaussian, however, mode(s) MUST be found iteratively.
  • Uses variations of the EM algorithm.
  • Can be slow (iterative).
  • Only guaranteed to reach a local minimum.
slide-32
SLIDE 32

Constrained Mean Shift

slide-33
SLIDE 33

Constrained Mean Shift

  • Saragih et al. proposed constrained mean shift.
  • Can be applied extremely efficiently using a look-up table (LUT).
  • Vary the kernel density estimate (KDE) to perform coarse-to-fine fitting.

“Constrained Mean Shift”

Saragih, Lucey and Cohn, ICCV 2009. (Constrained Mean Shifts)
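The idea can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation. Each landmark takes a Gaussian-kernel mean-shift step on its own response map, and the stacked steps are then projected by least squares onto a linear shape basis. The look-up table and the coarse-to-fine schedule are omitted. The names `basis` and `sigma` and the translation-only shape model are assumptions made for the example.

```python
import numpy as np

def mean_shift_target(response, current, sigma):
    """Kernel-weighted mean of grid positions, weighted by the response map."""
    rows, cols = np.indices(response.shape)
    d2 = (rows - current[0]) ** 2 + (cols - current[1]) ** 2
    weights = response * np.exp(-0.5 * d2 / sigma ** 2)
    return np.array([(weights * rows).sum(), (weights * cols).sum()]) / weights.sum()

def constrained_mean_shift_step(responses, points, basis, sigma):
    """Mean-shift each landmark, then project the stacked shifts onto the shape basis."""
    shifts = np.concatenate(
        [mean_shift_target(r, p, sigma) - p for r, p in zip(responses, points)]
    )
    coeffs, *_ = np.linalg.lstsq(basis, shifts, rcond=None)
    return points + (basis @ coeffs).reshape(points.shape)

def gaussian_bump(shape, center, spread):
    """Synthetic non-negative response map with a single peak (toy data)."""
    rows, cols = np.indices(shape)
    return np.exp(-0.5 * ((rows - center[0]) ** 2 + (cols - center[1]) ** 2) / spread ** 2)

# Toy setup: two landmarks whose response peaks are both offset by (+3, -2).
points = np.array([[15.0, 15.0], [14.0, 16.0]])
peaks = points + np.array([3.0, -2.0])
responses = [gaussian_bump((30, 30), pk, 2.0) for pk in peaks]
# Hypothetical shape model that only allows a global translation of all landmarks.
basis = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

for _ in range(10):
    points = constrained_mean_shift_step(responses, points, basis, sigma=4.0)
```

Here `sigma` plays the role of the kernel bandwidth, so shrinking it across iterations would be one way to imitate a coarse-to-fine schedule.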

slide-34
SLIDE 34

Constrained Mean Shift

slide-35
SLIDE 35

Constrained Mean Shift

Saragih, Lucey and Cohn, ICCV 2009. (Constrained Mean Shifts)

slide-36
SLIDE 36

Results

[Figure: example fitting results for ASM, CQF, BTSM, and CMS]

slide-38
SLIDE 38

Results

[Plot: Proportion of Images vs. Shape RMS Error for ASM (88ms), CQF (98ms), BTSM (2410ms), and CMS (121ms)]

slide-39
SLIDE 39

CMS Results

Saragih, Lucey, Cohn, ICCV 2009. Saragih, Lucey, Cohn, IJCV 2011.

slide-40
SLIDE 40

Saragih, Lucey, Cohn, AFGR 2011.

slide-42
SLIDE 42

CLM Extensions

  • Numerous extensions to CLMs have been proposed over the years.
  • A variant, referred to as Conditional Local Neural Fields (CLNF), obtains almost state-of-the-art performance for facial landmark tracking.
  • A package called OpenFace provides an open-source implementation.

http://www.cl.cam.ac.uk/research/rainbow/projects/openface/

  • T. Baltrusaitis, L.-P. Morency, and P. Robinson. Constrained local neural fields for robust facial landmark detection in the wild. In ICCVW, 2013.
slide-43
SLIDE 43

CLM Drawbacks

  • Although CLMs exhibit excellent, state-of-the-art accuracy, their real-time performance is heavily tied to a large number of convolution operations.
  • This makes the approach ill-suited to mobile devices due to its computational cost.
  • T. Baltrusaitis, L.-P. Morency, and P. Robinson. Constrained local neural fields for robust facial landmark detection in the wild. In ICCVW, 2013.

slide-44
SLIDE 44

Problem

CPU GPU NCC

slide-45
SLIDE 45

Solution

CPU GPU Multiplication x R =

slide-47
SLIDE 47

Today

  • Constrained Local Models
  • Supervised Descent Models
  • Efficient Sparse SDMs
  • Efficient Dense SDMs
slide-48
SLIDE 48
Reminder: SDMs

  • Xiong & De La Torre proposed the Supervised Descent Method (SDM) to explicitly learn R.

Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.

slide-49
SLIDE 49
Reminder: SDMs

  • Xiong & De La Torre proposed the Supervised Descent Method (SDM) to explicitly learn R.
  • R can be learned from data

Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.

slide-50
SLIDE 50
Reminder: SDMs

  • Xiong & De La Torre proposed the Supervised Descent Method (SDM) to explicitly learn R.
  • R can be learned from data
  • One can generate a synthetic training set from given data by sampling

Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.

slide-52
SLIDE 52
Reminder: SDMs

  • Xiong & De La Torre proposed the Supervised Descent Method (SDM) to explicitly learn R.
  • R can be learned from data
  • One can generate a synthetic training set from given data by sampling
  • Training objective:
  • Learns to directly predict the geometric displacement conditioned on appearance.

Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.

slide-53
SLIDE 53

Ground-Truth Artificial Noise
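A minimal sketch of learning a regressor from ground truth plus artificial noise, under stated assumptions. The appearance residual is modelled as a linear function of the perturbation through a hypothetical matrix `J`, and the regressor is fit by ridge regression, one common choice of least-squares objective (the slides do not show the exact objective). All names and sizes are illustrative.

```python
import numpy as np

def learn_regressor(features, displacements, ridge=1e-3):
    """Ridge regression so that displacements ~= features @ R.T."""
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ displacements).T

rng = np.random.default_rng(0)
J = rng.standard_normal((20, 4))         # hypothetical appearance Jacobian
noise = rng.standard_normal((500, 4))    # artificial perturbations of the ground truth
# Toy appearance residual for each perturbed sample (assumed linear plus small noise).
residuals = noise @ J.T + 0.01 * rng.standard_normal((500, 20))
# The update that undoes each perturbation is simply its negative.
R = learn_regressor(residuals, -noise)
```

In this linear toy case a single regressor is enough, and `R @ J` comes out close to the negative identity, meaning the learned update cancels the perturbation.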

slide-55
SLIDE 55

Reminder: SDMs

  • Like IC-LK, SDM assumes a linear relationship between appearance and geometry:

∆p = R[I(p) − T(0)]

Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.

slide-56
SLIDE 56

Reminder: SDMs

  • Like IC-LK, SDM assumes a linear relationship between appearance and geometry:

∆p = R[I(p) − T(0)]

  • Iteratively updates until convergence

Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.

slide-63
SLIDE 63

Reminder: SDMs

  • Like IC-LK, SDM assumes a linear relationship between appearance and geometry:

∆p = R[I(p) − T(0)]

  • Iteratively updates until convergence
  • However, R^(k) is iteration specific.

Xiong, Xuehan, and F. De la Torre. "Supervised descent method and its applications to face alignment." CVPR 2013.

slide-64
SLIDE 64

SDMs vs. CLMs

  • SDMs are inherently more computationally efficient than CLMs.
  • Computational cost is associated only with image warping, feature

extraction, and matrix multiplication (no convolution).
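To illustrate why an SDM update reduces to feature extraction followed by a matrix multiplication, here is a toy cascade with one regressor per iteration. It is a sketch under assumptions. The function `residual` is a made-up nonlinear stand-in for the appearance residual (it uses the true parameters, which a real system would not have), and `A`, `train_cascade`, and `align` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((12, 2))   # hypothetical appearance model

def residual(p, p_true):
    """Toy stand-in for I(p) - T(0): a nonlinear function of the alignment error."""
    return np.tanh((p - p_true) @ A.T)

def train_cascade(num_stages=4, num_samples=2000):
    """Fit one least-squares regressor per stage on the errors left by earlier stages."""
    p_true = np.zeros((num_samples, 2))
    p = p_true + rng.uniform(-1.0, 1.0, size=p_true.shape)    # artificial noise
    cascade = []
    for _ in range(num_stages):
        F = residual(p, p_true)
        R_T, *_ = np.linalg.lstsq(F, p_true - p, rcond=None)  # min ||F R^T - dp||^2
        cascade.append(R_T.T)
        p = p + F @ R_T                                       # advance the training samples
    return cascade

def align(p0, p_true, cascade):
    """Apply the cascade: each step is one matrix-vector product."""
    p = np.array(p0, dtype=float)
    for R in cascade:
        p = p + R @ residual(p, p_true)
    return p

cascade = train_cascade()
start = np.array([0.6, -0.4])
final = align(start, np.zeros(2), cascade)
```

Each stage is trained on the errors left by the previous one, which is one way to see why the regressor differs from iteration to iteration.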

slide-65
SLIDE 65

IntraFace

http://www.humansensing.cs.cmu.edu/intraface

slide-67
SLIDE 67
  • SIFT Feature - SDMs
slide-68
SLIDE 68
  • SIFT Feature - SDMs
  • 1. Compute image gradients
  • 2. Pool into local histograms
  • 3. Concatenate histograms
  • 4. Normalize histograms
slide-69
SLIDE 69
  • SIFT Feature - SDMs
  • 1. Compute image gradients
  • 2. Pool into local histograms
  • 3. Concatenate histograms
  • 4. Normalize histograms

Calculating SIFT for each landmark on a mobile device is in general too

costly for real-time performance.
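The four steps listed above can be sketched as a simplified SIFT-style descriptor. This is an illustration only. It omits Gaussian weighting, interpolation between bins, and scale and rotation handling, and the 4x4 cell grid with 8 orientation bins is an assumed configuration.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, bins=8):
    """Simplified SIFT-style descriptor for a square grayscale patch."""
    gy, gx = np.gradient(patch.astype(float))              # 1. image gradients
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)
    bin_index = np.minimum((angle / (2 * np.pi) * bins).astype(int), bins - 1)
    size = patch.shape[0] // cells
    histograms = []
    for r in range(cells):
        for c in range(cells):
            block = (slice(r * size, (r + 1) * size), slice(c * size, (c + 1) * size))
            hist = np.bincount(bin_index[block].ravel(),
                               weights=magnitude[block].ravel(), minlength=bins)
            histograms.append(hist)                         # 2. pool into local histograms
    descriptor = np.concatenate(histograms)                 # 3. concatenate histograms
    return descriptor / (np.linalg.norm(descriptor) + 1e-12)  # 4. normalize

rng = np.random.default_rng(0)
descriptor = sift_like_descriptor(rng.random((16, 16)))    # 4 * 4 * 8 = 128 values
```

Even in this stripped-down form, every pixel requires a gradient, an `arctan2`, and a weighted histogram update, which hints at why the per-landmark cost adds up.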

slide-70
SLIDE 70

SIFT Speed - Mac vs. iPhone 7

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-71
SLIDE 71

Reminder: Binary Features

  • Proposed by Calonder et al. ECCV 2010 (BRIEF).
  • Borrows the idea that binary comparison is very fast on modern chipset architectures,

$\psi_I(x, \Delta_1, \Delta_2) = \begin{cases} 1 & : I(x + \Delta_1) > I(x + \Delta_2) \\ 0 & : \text{otherwise} \end{cases}$

  • Combine features together compactly,

$\psi_I(x) = \sum_i 2^{i-1} \psi_I(x, \Delta_1^{(i)}, \Delta_2^{(i)})$

slide-72
SLIDE 72

Why do Binary Features Work?

  • Success of binary features says something about perception itself.
  • Absolute values of pixels do not matter.
  • Makes sense, as a variable lighting source will affect the gain and bias of pixels, but the local ordering should remain relatively constant.

slide-73
SLIDE 73

Reminder: BRIEF Descriptor

  • Do not need to “touch” all pixels, can choose pairs

randomly and sparsely,

{∆1, ∆2}
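A small sketch of the binary descriptor described on the preceding slides, assuming random, sparse offset pairs and bit packing with weights 2^(i-1). The function name `brief_descriptor`, the 16 tests, and the offset range are illustrative choices, not the configuration of Calonder et al.

```python
import numpy as np

def brief_descriptor(image, x, pairs):
    """Pack binary intensity comparisons around pixel x into one integer."""
    code = 0
    for i, (d1, d2) in enumerate(pairs):
        a = image[x[0] + d1[0], x[1] + d1[1]]
        b = image[x[0] + d2[0], x[1] + d2[1]]
        code |= int(a > b) << i    # i is 0-based, so this is weight 2^(i-1) for test i
    return code

rng = np.random.default_rng(0)
image = rng.random((32, 32))
pairs = rng.integers(-8, 9, size=(16, 2, 2))    # 16 random, sparse offset pairs
x = (16, 16)

code = brief_descriptor(image, x, pairs)
# Same patch under a gain and bias change: the pixel ordering is preserved.
relit = brief_descriptor(2.5 * image + 0.3, x, pairs)
```

Because only the ordering of pixel pairs is used, `code` and `relit` agree even though every absolute intensity has changed.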

slide-74
SLIDE 74

Hamming distance in least squares

1 0 1 1 0 0 1 0 = 178
0 0 1 1 0 0 1 0 = 50
1 0 0 0 0 0 0 0 = 128 (difference)

Hamming distance: 1

vs.

Squared distance: 16384

  • H. Alismail, B. Browning, S. Lucey. "Enhancing Direct Camera Tracking with Feature Descriptors" ACCV 2016.
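The numbers on this slide can be checked directly. In this short sketch the two descriptors differ in a single bit, but because that bit is the most significant one, treating the packed codes as integers inflates the squared distance.

```python
a = 0b10110010   # 178
b = 0b00110010   # 50

hamming = bin(a ^ b).count("1")   # number of differing bits
squared = (a - b) ** 2            # squared difference of the packed integers

print(hamming, squared)           # 1 16384
```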
slide-75
SLIDE 75
  • Local Binary Feature - SDMs
  • S. Ren, et al. "Face alignment at 3000 fps via regressing local binary features." CVPR 2014.
slide-77
SLIDE 77
  • Local Binary Feature - SDMs

φ1 =

> ∈ [0, 1]

  • S. Ren, et al. "Face alignment at 3000 fps via regressing local binary features." CVPR 2014.
slide-78
SLIDE 78
  • Local Binary Feature - SDMs

φ1 =

> ∈ [0, 1]

. . .

> ∈ [0, 1]

φF =

  • S. Ren, et al. "Face alignment at 3000 fps via regressing local binary features." CVPR 2014.
slide-79
SLIDE 79

Local Binary Feature - SDMs

slide-80
SLIDE 80

Local Binary Feature - SDMs

  • φ1

1

slide-81
SLIDE 81

Local Binary Feature - SDMs

  • φ1

φ2

1 1

slide-82
SLIDE 82

Local Binary Feature - SDMs

  • φ1

φ2

φ3

1 1 1

slide-83
SLIDE 83

Local Binary Feature - SDMs

  • φ1

φ2

φ3

1 1 1

[0 1 0]

slide-84
SLIDE 84

Local Binary Feature - SDMs

concatenating

  • (a)

(b)

  • S. Ren, et al. "Face alignment at 3000 fps via regressing local binary features." CVPR 2014.
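A rough sketch of how concatenated local binary features could be formed. Each tree routes a patch to one leaf using pixel-difference tests, emits a one-hot leaf indicator, and the indicators from all trees are concatenated. The depth-2 trees with random tests are a hypothetical stand-in (in Ren et al. the trees are learned), and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_tree(patch_size):
    """Three internal nodes, each a (pixel_a, pixel_b, threshold) test.
    Hypothetical: real trees would be learned from data."""
    return [
        (tuple(rng.integers(0, patch_size, 2)), tuple(rng.integers(0, patch_size, 2)), 0.0)
        for _ in range(3)
    ]

def leaf_index(patch, tree):
    """Route a patch through a depth-2 tree of pixel-difference tests."""
    node = 0
    for _ in range(2):
        (r1, c1), (r2, c2), threshold = tree[node]
        go_right = patch[r1, c1] - patch[r2, c2] > threshold
        node = 2 * node + 1 + int(go_right)
    return node - 3    # leaves are nodes 3..6, mapped to 0..3

def local_binary_feature(patch, forest):
    """Concatenate one one-hot leaf indicator per tree."""
    feats = []
    for tree in forest:
        one_hot = np.zeros(4, dtype=np.uint8)
        one_hot[leaf_index(patch, tree)] = 1
        feats.append(one_hot)
    return np.concatenate(feats)

forest = [random_tree(8) for _ in range(5)]
feature = local_binary_feature(rng.random((8, 8)), forest)
```

The resulting vector is very sparse (exactly one non-zero entry per tree), so a subsequent linear projection only needs to sum a handful of columns.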
slide-85
SLIDE 85

Local Binary Feature - SDMs

[Figure: LBF pipeline. Train: learning the feature mapping and learning the linear projection (estimated shape, ground truth shape). Test: local binary features, feature mapping, linear projection, estimated shape.]
  • S. Ren, et al. "Face alignment at 3000 fps via regressing local binary features." CVPR 2014.
slide-86
SLIDE 86

Local Binary Feature - SDMs

300-W (68 landmarks)

Method     Fullset  Common Subset  Challenging Subset  FPS
ESR [5]    7.58     5.28           17.00               120
SDM [32]   7.52     5.60           15.40               70
LBF        6.32     4.95           11.98               320
LBF fast   7.37     5.38           15.50               3100

[...] datasets, respectively. The errors of ESR and SDM are from our [...]

  • S. Ren, et al. "Face alignment at 3000 fps via regressing local binary features." CVPR 2014.
slide-87
SLIDE 87

LBF-SDMs - Drawbacks

  • LBF-SDMs offer unprecedented speed, making them ideal for mobile face tracking.
  • In reality, however, performance is often noisy/jittery as the approach only touches a sparse set of pixels.

slide-88
SLIDE 88

Binary Approximated SIFT

  • Recently, Fagg et al. proposed Binary Approximated SIFT

for SDMs.

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-89
SLIDE 89

Binary Approximated SIFT

  • Recently, Fagg et al. proposed Binary Approximated SIFT

for SDMs.

$\theta = \operatorname{atan2}(\nabla_y I(x), \nabla_x I(x))$

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-90
SLIDE 90

Binary Approximated SIFT

  • Recently, Fagg et al. proposed Binary Approximated SIFT

for SDMs.

$\theta = \operatorname{atan2}(\nabla_y I(x), \nabla_x I(x))$

[Figure: gradient components $\nabla_y I(x)$ and $\nabla_x I(x)$]

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-91
SLIDE 91

Binary Approximated SIFT

  • Recently, Fagg et al. proposed Binary Approximated SIFT

for SDMs.

$\nabla_y I(x) > 0$, $\nabla_x I(x) > 0$, $|\nabla_y I(x)| > |\nabla_x I(x)|$

$2^3 = 8$ orientation bins

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-92
SLIDE 92

Binary Approximated SIFT

  • Recently, Fagg et al. proposed Binary Approximated SIFT

for SDMs.

  • The approach approximates binning through binary comparison.

8 orientations

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-93
SLIDE 93
  • BA-SIFT Feature - SDMs

slide-94
SLIDE 94
  • BA-SIFT Feature - SDMs

Calculating BA-SIFT is extremely efficient.

slide-95
SLIDE 95

Inspiration from the FFT

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-96
SLIDE 96

Sparse Compositional SDM

(a) Component 1 Structure (b) Component 2 Structure (c) Component 3 Structure (d) Composition
  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-97
SLIDE 97

Speed Comparison

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-98
SLIDE 98

90 FPS - iPhone 7

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.

slide-99
SLIDE 99

90 FPS - iPhone 7

  • A. Fagg, S. Sridharan, and S. Lucey, "Fast, Dense Feature SDM on an iPhone" Arxiv 2016.
slide-100
SLIDE 100

More Examples.....
