Regularizing Part Geometry Instructor - Simon Lucey 16-623 - - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Regularizing Part Geometry

Instructor - Simon Lucey

16-623 - Designing Computer Vision Apps

slide-2
SLIDE 2

Today

  • Parts Based Registration
  • Regularizing Parts (Heuristic)
  • Regularizing Parts (Learned)


slide-3
SLIDE 3

What is an Object?

Face Body

slide-4
SLIDE 4

What is an Object?

Face Body

slide-5
SLIDE 5

What is an Object?

Face Body

Left Eye Right Eye Nose Mouth Right Lower Arm Head Left Upper Arm Left Lower Arm Left Thigh Left Calf Right Upper Arm Torso Right Thigh Right Calf

slide-6
SLIDE 6

What is a Part?

Face

Left Eye Left Eye Not Left Eye

slide-7
SLIDE 7

When is a Collection of Parts an Object?

slide-8
SLIDE 8

When is a Collection of Parts an Object?

Face

slide-9
SLIDE 9

When is a Collection of Parts an Object?

Face Face

slide-10
SLIDE 10

When is a Collection of Parts an Object?

Face Not a Face Face

slide-11
SLIDE 11

What is an Object?

“A collection of semantically meaningful components with geometrical constraints on their spatial configuration’’

Data

slide-12
SLIDE 12

Registration

slide-13
SLIDE 13

Registration

This is an image of a body.

slide-14
SLIDE 14

Registration

This is an image of a body. Show me the...

slide-15
SLIDE 15

Registration

This is an image of a body.

Right Lower Arm? Head? Left Upper Arm? Left Lower Arm? Left Thigh? Left Calf? Right Upper Arm? Torso? Right Thigh? Right Calf?

Show me the...

slide-16
SLIDE 16

Parts Based Registration

...

Right Lower Arm Head

slide-17
SLIDE 17

Parts Based Registration

...

Right Lower Arm Head

slide-18
SLIDE 18

Parts Based Registration

...

Right Lower Arm Head

slide-19
SLIDE 19

Parts Based Registration

...

Right Lower Arm Head

$\min_{x} \sum_{i=1}^{N} D_i(x_i) + \lambda R(x)$

slide-20
SLIDE 20

...

Right Lower Arm Head

Parts Based Registration

$\min_{x} \sum_{i=1}^{N} D_i(x_i) + \lambda R(x)$

slide-21
SLIDE 21

...

Right Lower Arm Head

Parts Based Registration

$\min_{x} \sum_{i=1}^{N} D_i(x_i) + \lambda R(x)$

$D_i(x_i)$: Does the image at $x_i$ look like the $i$th part?

slide-22
SLIDE 22

...

Right Lower Arm Head

Parts Based Registration

$\min_{x} \sum_{i=1}^{N} D_i(x_i) + \lambda R(x)$

$D_i(x_i)$: Does the image at $x_i$ look like the $i$th part?

$R(x)$: Do the joint locations of the parts match the object?

slide-23
SLIDE 23

Today

  • Parts Based Registration
  • Regularizing Parts (Heuristic)
  • Regularizing Parts (Learned)


slide-24
SLIDE 24

Why Regularize?

slide-25
SLIDE 25

Why Regularize?

Appearance Variability

slide-26
SLIDE 26

Why Regularize?

Appearance Variability

[7] Gross et al.’08

slide-27
SLIDE 27

Why Regularize?

Appearance Variability

[7] Gross et al.’08

False Negatives

slide-28
SLIDE 28

Why Regularise?

slide-29
SLIDE 29

Why Regularise?

Appearance Ambiguity

slide-30
SLIDE 30

Why Regularise?

? ?

[7] Gross et al.’08

Appearance Ambiguity

slide-31
SLIDE 31

Why Regularise?

? ?

[7] Gross et al.’08

Appearance Ambiguity False Positives

slide-32
SLIDE 32

Why Regularise?

Head Right Shoulder

[1] Felzenszwalb et al.’09

Left Lip Corner Right Eye Corner

slide-33
SLIDE 33

What is a Regulariser?

slide-34
SLIDE 34

What is a Regulariser?

p(x|I) ∝ p(I|x)p(x)

slide-35
SLIDE 35

What is a Regulariser?

p(x|I) ∝ p(I|x)p(x)

Likelihood

slide-36
SLIDE 36

What is a Regulariser?

p(x|I) ∝ p(I|x)p(x)

Likelihood Prior

slide-37
SLIDE 37

What is a Regulariser?

$p(x|I) \propto p(I|x)\,p(x)$

Likelihood: $p(I|x)$. Prior: $p(x)$.

$\max_x \; p(I|x)\,p(x)$

slide-38
SLIDE 38

What is a Regulariser?

$p(x|I) \propto p(I|x)\,p(x)$

Likelihood: $p(I|x)$. Prior: $p(x)$.

$\max_x \; p(I|x)\,p(x)$

$\min_x \; -\log\{p(I|x)\,p(x)\}$

slide-39
SLIDE 39

What is a Regulariser?

$p(x|I) \propto p(I|x)\,p(x)$

Likelihood: $p(I|x)$. Prior: $p(x)$.

$\max_x \; p(I|x)\,p(x)$

$\min_x \; -\log\{p(I|x)\,p(x)\}$

$= \min_x \; -\log\{p(I|x)\} - \log\{p(x)\}$

slide-40
SLIDE 40

What is a Regulariser?

$p(x|I) \propto p(I|x)\,p(x)$

Likelihood: $p(I|x)$. Prior: $p(x)$.

$\max_x \; p(I|x)\,p(x)$

$\min_x \; -\log\{p(I|x)\,p(x)\}$

$= \min_x \; -\log\{p(I|x)\} - \log\{p(x)\}$

$= \min_x \; \sum_i D(x_i) + \lambda R(x)$

slide-41
SLIDE 41

What is a Regulariser?

$p(x|I) \propto p(I|x)\,p(x)$

Likelihood: $p(I|x)$. Prior: $p(x)$.

$\max_x \; p(I|x)\,p(x)$

$\min_x \; -\log\{p(I|x)\,p(x)\}$

$= \min_x \; -\log\{p(I|x)\} - \log\{p(x)\}$

$= \min_x \; \sum_i D(x_i) + \lambda R(x)$

$R(x) \propto -\log\{p(x)\}$
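A minimal sketch of the equivalence above, assuming a made-up 1-D problem: the Gaussian likelihood and prior, their means and widths, and the candidate grid are all hypothetical. It only illustrates that maximising $p(I|x)\,p(x)$ and minimising the sum of negative logs pick the same candidate (here $\lambda$ is absorbed into the prior width).

```python
import math

def neg_log_gauss(x, mean, std):
    """Negative log of an unnormalised Gaussian density."""
    return 0.5 * ((x - mean) / std) ** 2

# Hypothetical candidate part locations on a 1-D grid: 0.0, 0.5, ..., 10.0
candidates = [i * 0.5 for i in range(21)]

def likelihood(x):
    """p(I|x): the (made-up) part detector responds most strongly near x = 7."""
    return math.exp(-neg_log_gauss(x, 7.0, 2.0))

def prior(x):
    """p(x): the (made-up) geometry prior expects the part near x = 4."""
    return math.exp(-neg_log_gauss(x, 4.0, 1.0))

# MAP estimate: maximise the unnormalised posterior p(I|x) p(x).
x_map = max(candidates, key=lambda x: likelihood(x) * prior(x))

def energy(x):
    """Data term D = -log p(I|x) plus regulariser R = -log p(x)."""
    return -math.log(likelihood(x)) - math.log(prior(x))

# Energy minimisation selects the same candidate.
x_min = min(candidates, key=energy)
```

The prior pulls the estimate away from the detector peak at 7 towards the expected location at 4, which is the disambiguating role the regulariser plays in the slides that follow.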

slide-42
SLIDE 42

What is a Regulariser?

Face Not a Face Face

slide-43
SLIDE 43

What is a Regulariser?

Face Not a Face Face

Regularisation helps disambiguate candidate locations of parts by enforcing geometric dependencies between parts!

slide-44
SLIDE 44

What is a Regulariser?

Face Not a Face Face

Regularisation helps disambiguate candidate locations of parts by enforcing geometric dependencies between parts!

[Figure: $p(I|x_1)$, $p(I|x_2)$ and $p(x|I)$ plotted against $\theta$]

[22] Saragih et al.’10

slide-45
SLIDE 45

What is a Regulariser?

Face Not a Face Face

Regularisation helps disambiguate candidate locations of parts by enforcing geometric dependencies between parts!

[Figure: $p(I|x_1)$, $p(I|x_2)$ and $p(x|I)$ plotted against $\theta$]

[22] Saragih et al.’10

slide-46
SLIDE 46

What is a Regulariser?

Face Not a Face Face

Regularisation helps disambiguate candidate locations of parts by enforcing geometric dependencies between parts!

[Figure: $p(I|x_1)$, $p(I|x_2)$ and $p(x|I)$ plotted against $\theta$]

[22] Saragih et al.’10

slide-47
SLIDE 47

Heuristic Regularisation

slide-48
SLIDE 48

Heuristic Regularisation

slide-49
SLIDE 49

Heuristic Regularisation

slide-50
SLIDE 50

Heuristic Regularisation

slide-51
SLIDE 51

Heuristic Regularisation

$u$, $v$

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

slide-52
SLIDE 52

Heuristic Regularisation

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

slide-53
SLIDE 53

Heuristic Regularisation

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

slide-54
SLIDE 54

Heuristic Regularisation

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

slide-55
SLIDE 55

Heuristic Regularisation

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

slide-56
SLIDE 56

Heuristic Regularisation

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

slide-57
SLIDE 57

Heuristic Regularisation

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

slide-58
SLIDE 58

Heuristic Regularisation

$R(x) = \|\nabla u\|^2 + \|\nabla v\|^2$

[2] Horn and Schunck’81

  • Regularisation enforces a notion of smoothness
  • Points in close spatial proximity should move in similar ways
  • Not always true, but quite effective when we know nothing else about the object of interest
  • An extension using the L1 penalty is considered state of the art in heuristic regularisation
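A rough sketch of the smoothness penalty on a discrete flow field, assuming forward differences between neighbouring pixels as the gradient approximation (the discretisation, the function name `smoothness`, and the toy fields are illustrative choices, not taken from the slides).

```python
import numpy as np

def smoothness(u, v):
    """Discrete R = ||grad u||^2 + ||grad v||^2 using forward differences."""
    total = 0.0
    for f in (u, v):
        dy = np.diff(f, axis=0)   # differences between vertical neighbours
        dx = np.diff(f, axis=1)   # differences between horizontal neighbours
        total += float((dy ** 2).sum() + (dx ** 2).sum())
    return total

h, w = 4, 5
# A rigid translation: every pixel moves the same way, so the penalty is zero.
u_const = np.full((h, w), 2.0)
v_const = np.full((h, w), -1.0)

# One interior pixel moving differently from its four neighbours is penalised.
u_spike = u_const.copy()
u_spike[2, 2] += 1.0

r_const = smoothness(u_const, v_const)
r_spike = smoothness(u_spike, v_const)
```

The single deviating pixel disagrees with four neighbours by one unit each, so the penalty grows by four; nearby points are pushed to move in similar ways.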

slide-59
SLIDE 59

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

slide-60
SLIDE 60

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2$

slide-61
SLIDE 61

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2 = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2$

slide-62
SLIDE 62

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2 = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2 = \|x - x^0\|_L^2$

slide-63
SLIDE 63

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2 = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2 = \|x - x^0\|_L^2$

$L$: Graph Laplacian

slide-64
SLIDE 64

Laplacian Regularisation

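[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2 = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2 = \|x - x^0\|_L^2$

Graph Laplacian:

$L_{ij} = \begin{cases} |N_i| & \text{if } i = j \\ -1 & \text{if } j \in N_i \\ 0 & \text{otherwise} \end{cases}$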

slide-65
SLIDE 65

Laplacian Regularisation

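[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2 = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2 = \|x - x^0\|_L^2$

Graph Laplacian:

$L_{ij} = \begin{cases} |N_i| & \text{if } i = j \\ -1 & \text{if } j \in N_i \\ 0 & \text{otherwise} \end{cases}$

$L = \Phi \Lambda \Phi^T$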

slide-66
SLIDE 66

Laplacian Regularisation

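[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2 = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2 = \|x - x^0\|_L^2$

Graph Laplacian:

$L_{ij} = \begin{cases} |N_i| & \text{if } i = j \\ -1 & \text{if } j \in N_i \\ 0 & \text{otherwise} \end{cases}$

$L = \Phi \Lambda \Phi^T = \sum_i \lambda_i \Phi_i \Phi_i^T$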

slide-67
SLIDE 67

Laplacian Regularisation

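[Figure: grid of nodes $x_1, \dots, x_9$]

$x = [x_1; \dots; x_n]$

$R(x) = \|\nabla u\|^2 = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2 = \|x - x^0\|_L^2$

Graph Laplacian:

$L_{ij} = \begin{cases} |N_i| & \text{if } i = j \\ -1 & \text{if } j \in N_i \\ 0 & \text{otherwise} \end{cases}$

$L = \Phi \Lambda \Phi^T = \sum_i \lambda_i \Phi_i \Phi_i^T$

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$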
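A small numerical check of the identities above, assuming a 4-connected 3 × 3 grid for the nodes $x_1, \dots, x_9$ and a random displacement $d = x - x^0$ in one coordinate (the helper `grid_laplacian` and the random data are illustrative). One detail worth noting: summing over $j \in N_i$ for every $i$ visits each edge twice, so that double sum equals $2\,d^T L d$; the sketch counts each edge once and treats the constant factor as absorbed into $\lambda$.

```python
import numpy as np

def grid_laplacian(rows, cols):
    """Graph Laplacian of a 4-connected grid: |N_i| on the diagonal, -1 for neighbours."""
    n = rows * cols
    L = np.zeros((n, n))
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))       # right neighbour
            if r + 1 < rows:
                edges.append((i, i + cols))    # neighbour below
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L, edges

L, edges = grid_laplacian(3, 3)
rng = np.random.default_rng(0)
d = rng.normal(size=9)                         # displacement x - x^0

quad = float(d @ L @ d)                        # ||x - x^0||_L^2
edge_sum = float(sum((d[i] - d[j]) ** 2 for i, j in edges))   # each edge once

lam, Phi = np.linalg.eigh(L)                   # L = Phi diag(lam) Phi^T
spectral = float(np.sum(lam * (Phi.T @ d) ** 2))
```

All three quantities agree, and the smallest eigenvalue is zero: a constant displacement of every node costs nothing.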

slide-68
SLIDE 68

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$
slide-69
SLIDE 69

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$
slide-70
SLIDE 70

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$
slide-71
SLIDE 71

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$ and displacement $x - x^0$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

slide-72
SLIDE 72

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$, displacement $x - x^0$, and $\Phi_i^T (x - x^0)$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

slide-73
SLIDE 73

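Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$, displacement $x - x^0$, and $\Phi_i^T (x - x^0)$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

$\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$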

slide-74
SLIDE 74

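Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$, displacement $x - x^0$, and $\Phi_i^T (x - x^0)$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

$\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$

[Plot: $\lambda_i$ against $i$]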

slide-75
SLIDE 75

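Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$, displacement $x - x^0$, and $\Phi_i^T (x - x^0)$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

$\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$

[Plot: $\lambda_i$ against $i$; eigenvector $\Phi_0$]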

slide-76
SLIDE 76

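Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$, displacement $x - x^0$, and $\Phi_i^T (x - x^0)$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

$\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$

[Plot: $\lambda_i$ against $i$; eigenvectors $\Phi_0$, $\Phi_i$]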

slide-77
SLIDE 77

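Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$, displacement $x - x^0$, and $\Phi_i^T (x - x^0)$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

$\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$

[Plot: $\lambda_i$ against $i$; eigenvectors $\Phi_0$, $\Phi_i$, $\Phi_{n-1}$]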

slide-78
SLIDE 78

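Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$; eigenvector $\Phi_i$, displacement $x - x^0$, and $\Phi_i^T (x - x^0)$]

$R(x) = \sum_i \lambda_i \|\Phi_i^T (x - x^0)\|^2$

$\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$

[Plot: $\lambda_i$ against $i$; eigenvectors $\Phi_0$, $\Phi_i$, $\Phi_{n-1}$, $\Phi_n$]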

slide-79
SLIDE 79

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

slide-80
SLIDE 80

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

slide-81
SLIDE 81

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

$\mu = x^0$

slide-82
SLIDE 82

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

$\mu = x^0$

$\Sigma \propto L^{-1}$

slide-83
SLIDE 83

Laplacian Regularisation

[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

$\mu = x^0$

$\Sigma \propto L^{-1} = \Phi \Lambda^{-1} \Phi^T$

slide-84
SLIDE 84

Laplacian Regularisation

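[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

$\mu = x^0$

$\Sigma \propto L^{-1} = \Phi \Lambda^{-1} \Phi^T = \Psi \Omega \Psi^T$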

slide-85
SLIDE 85

Laplacian Regularisation

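[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

$\mu = x^0$

$\Sigma \propto L^{-1} = \Phi \Lambda^{-1} \Phi^T = \Psi \Omega \Psi^T$

$\Omega = \mathrm{diag}(\sigma_1, \dots, \sigma_n)$

[Plot: $\sigma_i$ against $i$]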

slide-86
SLIDE 86

Laplacian Regularisation

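[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

$\mu = x^0$

$\Sigma \propto L^{-1} = \Phi \Lambda^{-1} \Phi^T = \Psi \Omega \Psi^T = \sum_i \sigma_i \Psi_i \Psi_i^T$

$\Omega = \mathrm{diag}(\sigma_1, \dots, \sigma_n)$

[Plot: $\sigma_i$ against $i$]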

slide-87
SLIDE 87

Laplacian Regularisation

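[Figure: grid of nodes $x_1, \dots, x_9$]

$R(x) = \|x - x^0\|_L^2$

  • Gaussian Distribution: $N(x) \propto \exp\{-\tfrac{1}{2}\|x - \mu\|_{\Sigma^{-1}}^2\}$

$\mu = x^0$

$\Sigma \propto L^{-1} = \Phi \Lambda^{-1} \Phi^T = \Psi \Omega \Psi^T = \sum_i \sigma_i \Psi_i \Psi_i^T$

$\Omega = \mathrm{diag}(\sigma_1, \dots, \sigma_n)$

[Plot: $\sigma_i$ against $i$; eigenvectors $\Psi_0$, $\Psi_1$, $\Psi_n$]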

slide-88
SLIDE 88

Example: Optical Flow

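$\arg\min_{\Delta x} \left\| I_0(x) + \frac{\partial I_0(x)}{\partial x^T} \Delta x - I_1(x) \right\|_2^2$

$\Delta x = \begin{bmatrix} \Delta x_1 \\ \vdots \\ \Delta x_N \end{bmatrix}$

$N$ = no. of pixels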

slide-89
SLIDE 89

General Topology

What if $x^0 \neq$ grid?

slide-90
SLIDE 90

General Topology

What if $x^0 \neq$ grid?

slide-91
SLIDE 91

General Topology

What if $x^0 \neq$ grid?

$R(x) = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2$

slide-92
SLIDE 92

General Topology

What if $x^0 \neq$ grid?

$R(x) = \sum_i \sum_{j \in N_i} [(x_i - x_i^0) - (x_j - x_j^0)]^2$

$x = x^0 + \Psi \alpha$
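A hedged sketch of the parameterisation $x = x^0 + \Psi\alpha$, assuming a chain of 20 points as the (hypothetical) non-grid topology and taking $\Psi$ to be the few lowest-frequency eigenvectors of that chain's graph Laplacian. The two test deformations and the basis size `k` are made up for illustration.

```python
import numpy as np

def chain_laplacian(n):
    """Graph Laplacian of n points connected in a chain (hypothetical topology)."""
    L = np.zeros((n, n))
    for i in range(n - 1):
        L[i, i] += 1
        L[i + 1, i + 1] += 1
        L[i, i + 1] -= 1
        L[i + 1, i] -= 1
    return L

n, k = 20, 4
L = chain_laplacian(n)
lam, Phi = np.linalg.eigh(L)       # eigenvalues returned in ascending order
Psi = Phi[:, :k]                   # the k smoothest (lowest-frequency) modes

x0 = np.linspace(0.0, 1.0, n)      # rest shape, one coordinate per point
t = np.linspace(0.0, 1.0, n)
smooth = x0 + 0.1 * np.cos(np.pi * t)         # slowly varying deformation
jagged = x0 + 0.1 * (-1.0) ** np.arange(n)    # alternating, high-frequency deformation

def fit(x):
    """Least-squares alpha and residual norm for x ~ x0 + Psi @ alpha."""
    alpha = Psi.T @ (x - x0)       # valid because Psi has orthonormal columns
    residual = float(np.linalg.norm(x - (x0 + Psi @ alpha)))
    return alpha, residual

_, res_smooth = fit(smooth)
_, res_jagged = fit(jagged)
```

The smooth deformation is represented far better by the truncated basis than the jagged one, so restricting $x$ to $x^0 + \Psi\alpha$ acts as a hard version of the smoothness penalty.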

slide-93
SLIDE 93

Smooth Deformation Basis

Ψ0 Ψ1 Ψ2 Ψ3 Ψ25 Ψ50

~Frequency x y z

slide-94
SLIDE 94

Smooth Deformation Basis

Ψ0 Ψ1 Ψ2 Ψ3 Ψ25 Ψ50

~Frequency x y z

slide-95
SLIDE 95

Heuristic Regularisation: Recap

  • Regularisation is important because image measurements are not enough
  • Priors model the space of valid instances of an object’s geometry
  • Regularisers penalise object geometry outside the space of valid instances
  • The smoothness assumption is a good heuristic
  • Laplacian regularisers enforce smoothness by penalising high-frequency variations more heavily than lower-frequency variations
  • The concept of frequency that is penalised can be specialised to the topology of the object through defining a specialised graph Laplacian
  • But... is that the best we can do?
slide-96
SLIDE 96

Today

  • Parts Based Registration
  • Regularizing Parts (Heuristic)
  • Regularizing Parts (Learned)


slide-97
SLIDE 97

Data Driven (Learned) Regularisers

What if we have annotated data?

[3] Huang et al.’07

slide-98
SLIDE 98

Topology of Samples vs. Parts

Samples Data Parts

slide-99
SLIDE 99

$\Re^{3d}$

Topology of Samples vs. Parts

Samples Data Parts Sample Topology

slide-100
SLIDE 100

$\Re^{d}$, $\Re^{3d}$

Topology of Samples vs. Parts

Samples Data Parts Part Topology Sample Topology

slide-101
SLIDE 101

Exhaustive Search (N > 1)


slide-102
SLIDE 102

Exhaustive Search (N > 1)


slide-103
SLIDE 103

Exhaustive Search (N > 1)

[Figure: Feature Extraction of $\phi\{I\}$ from the image $I$; Local Search (∗) with the part templates $\phi\{T_1(0)\}, \phi\{T_2(0)\}, \dots, \phi\{T_N(0)\}$; Part Responses $D_1, D_2, \dots, D_N$; Color Encoding of Filter Responses]

Felzenszwalb, Girshick, McAllester & Ramanan, 2010

slide-104
SLIDE 104

Exhaustive Search (N > 1)

[Figure: part responses $D_1, D_2, \dots, D_N$]

slide-105
SLIDE 105

Computational Cost

[Figure: part responses $D_1, D_2, \dots, D_N$ with locations $p_1^*, p_2^*, \dots, p_N^*$]

$p^* = \arg\min_{p} \sum_{i=1}^{N} D_i(p_i) + \lambda R(p)$

$p^* = [p_1^T, \dots, p_N^T]^T$

$O(M^N)$

slide-106
SLIDE 106

Computational Cost

[Figure: part responses $D_1, D_2, \dots, D_N$ with locations $p_1^*, p_2^*, \dots, p_N^*$]

$p^* = \arg\min_{p} \sum_{i=1}^{N} D_i(p_i) + \lambda R(p)$

$p^* = [p_1^T, \dots, p_N^T]^T$

$O(M^N)$

slide-107
SLIDE 107

Computational Cost

[Figure: part responses $D_1, D_2, \dots, D_N$]

$p_1^* = \arg\min_{p_1} D_1(p_1)$

$p_2^* = \arg\min_{p_2} D_2(p_2)$

$p_N^* = \arg\min_{p_N} D_N(p_N)$

slide-108
SLIDE 108

Computational Cost

[Figure: part responses $D_1, D_2, \dots, D_N$ with locations $p_1^*, p_2^*, \dots, p_N^*$]

$p_1^* = \arg\min_{p_1} D_1(p_1)$, $O(M)$

$p_2^* = \arg\min_{p_2} D_2(p_2)$, $O(M)$

$p_N^* = \arg\min_{p_N} D_N(p_N)$, $O(M)$
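A toy sketch of the two costs contrasted above, assuming $N = 3$ parts with $M = 3$ candidate locations each and made-up response values in `D`. Without a coupling term $R(p)$, minimising each part on its own gives the same answer as the joint search over all $M^N$ configurations.

```python
from itertools import product

# Hypothetical part responses: D[i][m] is the cost of placing part i at candidate m.
D = [
    [4.0, 1.0, 3.0],   # D_1
    [2.0, 5.0, 0.5],   # D_2
    [0.2, 6.0, 2.0],   # D_3
]
N, M = len(D), len(D[0])

# Independent search: one O(M) pass per part.
p_independent = [min(range(M), key=lambda m: D[i][m]) for i in range(N)]

# Joint exhaustive search: enumerates all M**N configurations.
n_configurations = M ** N
p_joint = min(
    product(range(M), repeat=N),
    key=lambda p: sum(D[i][p[i]] for i in range(N)),
)
```

Once $\lambda R(p)$ couples the parts, the independent shortcut is no longer valid, which is why the structure of the graph matters for the search cost.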

slide-109
SLIDE 109

Exhaustive Search

[Figure: graph over parts $p_1, \dots, p_5$]

$O(M^N)$

“We can do much better than this if the graph is sparse.”

slide-110
SLIDE 110

Exhaustive Search

[Figure: graph over parts $p_1, \dots, p_5$]

$O(NM^2)$

“We can do much better than this if the graph is sparse.”

slide-111
SLIDE 111

Types of Graphs

slide-112
SLIDE 112

Types of Graphs

Maximally Connected: $R(x) = R(x_1, \dots, x_n)$

slide-113
SLIDE 113

Types of Graphs

Maximally Connected: $R(x) = R(x_1, \dots, x_n)$

Unconnected: $R(x) = \sum_i \phi_i(x_i)$

slide-114
SLIDE 114

Types of Graphs

Maximally Connected: $R(x) = R(x_1, \dots, x_n)$

General Graph: $R(x) = \sum_i \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j)$

Unconnected: $R(x) = \sum_i \phi_i(x_i)$

slide-115
SLIDE 115

Types of Graphs

Maximally Connected: $R(x) = R(x_1, \dots, x_n)$

General Graph: $R(x) = \sum_i \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j)$

Unconnected: $R(x) = \sum_i \phi_i(x_i)$

Optimal Sparse Graph

slide-116
SLIDE 116

Types of Graphs

Maximally Connected: $R(x) = R(x_1, \dots, x_n)$

General Graph: $R(x) = \sum_i \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j)$

Unconnected: $R(x) = \sum_i \phi_i(x_i)$

Optimal Sparse Graph: LASSO

slide-117
SLIDE 117

Types of Graphs

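Maximally Connected: $R(x) = R(x_1, \dots, x_n)$

General Graph: $R(x) = \sum_i \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j)$

Unconnected: $R(x) = \sum_i \phi_i(x_i)$

Optimal Sparse Graph: LASSO

$\min_{\theta_j} \sum_i \left\| x_{ij} - \sum_{k \neq j} \theta_{jk} x_{ik} \right\|^2 + \lambda |\theta_j|$

[24] Gu et al.’07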

slide-118
SLIDE 118

Types of Graphs

Maximally Connected: $R(x) = R(x_1, \dots, x_n)$

General Graph: $R(x) = \sum_i \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j)$

Unconnected: $R(x) = \sum_i \phi_i(x_i)$

Optimal Sparse Graph: LASSO

$\min_{\theta_j} \sum_i \left\| x_{ij} - \sum_{k \neq j} \theta_{jk} x_{ik} \right\|^2 + \lambda |\theta_j|$

Loopy... but often converges to good solutions!

[24] Gu et al.’07
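A minimal sketch of the sparse-graph idea, assuming the LASSO objective above is solved per part with plain coordinate descent. The solver `lasso_cd`, the synthetic data, and the value of `lam` are illustrative assumptions and are not the procedure of [24]. Each part is regressed on all the others, and the parts with non-zero coefficients become its graph neighbours.

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(A, y, lam, iters=200):
    """Coordinate descent for min ||y - A theta||^2 + lam * ||theta||_1."""
    theta = np.zeros(A.shape[1])
    col_sq = (A ** 2).sum(axis=0)
    for _ in range(iters):
        for k in range(A.shape[1]):
            # Residual with column k's current contribution added back in.
            r = y - A @ theta + A[:, k] * theta[k]
            theta[k] = soft(A[:, k] @ r, lam / 2.0) / col_sq[k]
    return theta

# Synthetic coordinates for 4 parts over 500 samples (made up):
# part 0 follows part 1; parts 2 and 3 are unrelated to it.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[:, 0] = 0.9 * X[:, 1] + 0.1 * rng.normal(size=500)

j = 0
others = [k for k in range(X.shape[1]) if k != j]
theta_j = lasso_cd(X[:, others], X[:, j], lam=50.0)
neighbours = [k for k, t in zip(others, theta_j) if abs(t) > 1e-8]
```

The L1 penalty drives the coefficients of the unrelated parts to exactly zero, so only part 1 is kept as a neighbour of part 0. Repeating this for every `j` gives a graph that may contain loops.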

slide-119
SLIDE 119

Lasso

$\min_\theta \|Y - X\theta\|^2 + \lambda \|\theta\|_c^c$

$\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|_c^c \le \gamma$

slide-120
SLIDE 120

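Lasso

[Figure: axes $\theta_1$, $\theta_2$]

$\min_\theta \|Y - X\theta\|^2 + \lambda \|\theta\|_c^c$

$\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|_c^c \le \gamma$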

slide-121
SLIDE 121

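Lasso

[Figure: axes $\theta_1$, $\theta_2$]

$\min_\theta \|Y - X\theta\|^2 + \lambda \|\theta\|_c^c$

$\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|_c^c \le \gamma$

Ridge Regression: $\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|^2 \le \gamma$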

slide-122
SLIDE 122

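Lasso

[Figure: axes $\theta_1$, $\theta_2$]

$\min_\theta \|Y - X\theta\|^2 + \lambda \|\theta\|_c^c$

$\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|_c^c \le \gamma$

Ridge Regression: $\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|^2 \le \gamma$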

slide-123
SLIDE 123

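Lasso

[Figure: axes $\theta_1$, $\theta_2$]

$\min_\theta \|Y - X\theta\|^2 + \lambda \|\theta\|_c^c$

$\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|_c^c \le \gamma$

Ridge Regression: $\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|^2 \le \gamma$

Lasso: $\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; |\theta| \le \gamma$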

slide-124
SLIDE 124

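Lasso

[Figure: axes $\theta_1$, $\theta_2$]

$\min_\theta \|Y - X\theta\|^2 + \lambda \|\theta\|_c^c$

$\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|_c^c \le \gamma$

Ridge Regression: $\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; \|\theta\|^2 \le \gamma$

Lasso: $\min_\theta \|Y - X\theta\|^2 \;\; \text{s.t.} \;\; |\theta| \le \gamma$

[14] Tibshirani’96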
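A small worked comparison of the two penalties, under the simplifying assumption that $X$ has orthonormal columns, so the penalised problem separates per coefficient with $z = X^T Y$. The values in `z` and the choice `lam = 1.0` are made up; the closed forms are for the penalised (not the constrained) versions of the problems above.

```python
def ridge_1d(z, lam):
    """argmin over t of (z - t)^2 + lam * t^2."""
    return z / (1.0 + lam)

def lasso_1d(z, lam):
    """argmin over t of (z - t)^2 + lam * |t| (soft-thresholding at lam / 2)."""
    if z > lam / 2.0:
        return z - lam / 2.0
    if z < -lam / 2.0:
        return z + lam / 2.0
    return 0.0

# Hypothetical least-squares coefficients z = X^T Y for an orthonormal X.
z = [3.0, -0.4, 0.1, -2.0]
lam = 1.0

ridge = [ridge_1d(v, lam) for v in z]   # every entry shrinks, none becomes zero
lasso = [lasso_1d(v, lam) for v in z]   # small entries become exactly zero
```

Ridge rescales every coefficient, while the lasso sets the two small ones exactly to zero; this is the property that makes it useful for selecting a sparse set of graph edges.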

slide-125
SLIDE 125

Optimal Sparse Graphs

[24] Gu et al.’07

[Figure: Mean, Procrustes, Correlation and Sparse Graph panels for Face, Body and Hand]

slide-126
SLIDE 126

Optimal Sparse Graphs

Maximally Connected Star Unconnected Sparse

[24] Gu et al.’07

slide-127
SLIDE 127

Tree Regularization

  • A sparse graph of particular interest is a tree.

Felzenszwalb & Huttenlocher, 2005

slide-128
SLIDE 128

Tree Regularization

  • A sparse graph of particular interest is a tree.

Felzenszwalb & Huttenlocher, 2005

slide-129
SLIDE 129

Dynamic Programming

  • The globally optimal solution for any tree graph can be found using “Dynamic Programming”.

slide-130
SLIDE 130

Dynamic Programming

  • The globally optimal solution for any tree graph can be found using “Dynamic Programming”.

$\mathrm{score}_j(p_j) = D_j(p_j) + \sum_{k \in \mathrm{kids}(j)} m_k(p_j)$

slide-131
SLIDE 131

Dynamic Programming

  • The globally optimal solution for any tree graph can be found using “Dynamic Programming”.

$\mathrm{score}_j(p_j) = D_j(p_j) + \sum_{k \in \mathrm{kids}(j)} m_k(p_j)$

[Figure: tree over parts $p_1, \dots, p_5$]

slide-132
SLIDE 132

Dynamic Programming

  • The globally optimal solution for any tree graph can be found using “Dynamic Programming”.

$\mathrm{score}_j(p_j) = D_j(p_j) + \sum_{k \in \mathrm{kids}(j)} m_k(p_j)$

[Figure: tree over parts $p_1, \dots, p_5$, with one node labelled “parent”]

slide-133
SLIDE 133

Dynamic Programming

  • The globally optimal solution for any tree graph can be found using “Dynamic Programming”.

$\mathrm{score}_j(p_j) = D_j(p_j) + \sum_{k \in \mathrm{kids}(j)} m_k(p_j)$

[Figure: tree over parts $p_1, \dots, p_5$, with four nodes labelled “kids”]
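A rough sketch of the recursion above on a small tree. The slides do not define the message $m_k(p_j)$; here it is assumed to be the usual min-sum message, $m_k(p_j) = \min_{p_k} [\mathrm{score}_k(p_k) + \mathrm{pair}(p_j, p_k)]$, where `pair` is a placeholder deformation cost standing in for the $\lambda$-weighted pairwise term. The tree, the costs in `D`, and all function names are hypothetical. A brute-force search over all $M^N$ configurations is included only to check the result.

```python
from itertools import product

def tree_min_sum(D, kids, pair, root=0):
    """Min-sum dynamic programming on a tree; returns (placements, total cost)."""
    M = len(D[0])
    best_child = {}                       # (child k, parent position) -> best position of k

    def score(j):
        s = list(D[j])                    # score_j(p_j) starts from D_j(p_j)
        for k in kids[j]:
            sk = score(k)
            for pj in range(M):           # message m_k(p_j): best placement of child k
                costs = [sk[pk] + pair(pj, pk) for pk in range(M)]
                pk_best = min(range(M), key=costs.__getitem__)
                best_child[(k, pj)] = pk_best
                s[pj] += costs[pk_best]
        return s

    root_score = score(root)
    p = {root: min(range(M), key=root_score.__getitem__)}
    stack = [root]
    while stack:                          # backtrack from the root to the leaves
        j = stack.pop()
        for k in kids[j]:
            p[k] = best_child[(k, p[j])]
            stack.append(k)
    return [p[j] for j in range(len(D))], min(root_score)

def pair(a, b):
    """Placeholder spring-like cost between a parent at a and a child at b."""
    return 2 * abs(a - b)

# Hypothetical 5-part tree: part 0 is the parent of 1 and 2; part 2 is the parent of 3 and 4.
D = [[3, 0, 2, 4], [1, 5, 0, 2], [4, 1, 3, 0], [0, 2, 2, 5], [2, 3, 1, 0]]
kids = {0: [1, 2], 1: [], 2: [3, 4], 3: [], 4: []}

p_star, cost = tree_min_sum(D, kids, pair)

# Brute-force check over all M**N configurations.
edges = [(j, k) for j in kids for k in kids[j]]

def total(p):
    unary = sum(D[i][p[i]] for i in range(len(D)))
    pairwise = sum(pair(p[j], p[k]) for j, k in edges)
    return unary + pairwise

brute = min(total(p) for p in product(range(len(D[0])), repeat=len(D)))
```

Each edge of the tree needs one $M \times M$ table of costs, which is where the $O(NM^2)$ figure quoted earlier comes from.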

slide-134
SLIDE 134

Tree-Structured Graphs

slide-135
SLIDE 135

Tree-Structured Graphs

slide-136
SLIDE 136

Forwards Backwards

Tree-Structured Graphs

slide-137
SLIDE 137

Star Spanning Tree Forwards Backwards

Tree-Structured Graphs

slide-138
SLIDE 138

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$

Forwards Backwards

Tree-Structured Graphs

slide-139
SLIDE 139

Maximum Spanning Tree Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$

Forwards Backwards

Tree-Structured Graphs

slide-140
SLIDE 140

Maximum Spanning Tree Prim’s Algorithm

[17] Prim’57

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$

Forwards Backwards

Tree-Structured Graphs

slide-141
SLIDE 141

Maximum Spanning Tree Prim’s Algorithm

Initialise: $V^* = \{i\}$

[17] Prim’57

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$

Forwards Backwards

Tree-Structured Graphs

slide-142
SLIDE 142

Maximum Spanning Tree Prim’s Algorithm

Initialise: $V^* = \{i\}$

While $V^* \neq V$:

[17] Prim’57

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$

Forwards Backwards

Tree-Structured Graphs

slide-143
SLIDE 143

Maximum Spanning Tree Prim’s Algorithm

Initialise: $V^* = \{i\}$

While $V^* \neq V$:

    $j = \arg\min_j W_{ij}$ s.t. $i \in V^*$, $j \notin V^*$

[17] Prim’57

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$

Forwards Backwards

Tree-Structured Graphs

slide-144
SLIDE 144

Maximum Spanning Tree Prim’s Algorithm

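$W_{ij} = \begin{cases} {} & \text{if } i = j \\ \sum_k \frac{x_i \cdot x_j}{x_i^2 + x_j^2} & \text{otherwise} \end{cases}$

Initialise: $V^* = \{i\}$

While $V^* \neq V$:

    $j = \arg\min_j W_{ij}$ s.t. $i \in V^*$, $j \notin V^*$

[17] Prim’57

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$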

Forwards Backwards

Tree-Structured Graphs

slide-145
SLIDE 145

Maximum Spanning Tree Prim’s Algorithm

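$W_{ij} = \begin{cases} {} & \text{if } i = j \\ \sum_k \frac{x_i \cdot x_j}{x_i^2 + x_j^2} & \text{otherwise} \end{cases}$

Initialise: $V^* = \{i\}$

While $V^* \neq V$:

    $j = \arg\min_j W_{ij}$ s.t. $i \in V^*$, $j \notin V^*$

    $V^* \leftarrow V^* \cup \{j\}$

[17] Prim’57

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$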

Forwards Backwards

Tree-Structured Graphs

slide-146
SLIDE 146

Maximum Spanning Tree Prim’s Algorithm

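$W_{ij} = \begin{cases} {} & \text{if } i = j \\ \sum_k \frac{x_i \cdot x_j}{x_i^2 + x_j^2} & \text{otherwise} \end{cases}$

Initialise: $V^* = \{i\}$

While $V^* \neq V$:

    $j = \arg\min_j W_{ij}$ s.t. $i \in V^*$, $j \notin V^*$

    $V^* \leftarrow V^* \cup \{j\}$

[17] Prim’57

Star Spanning Tree

$R_c(x_j) = \phi_j(x_j) + \sum_{i \in C_j} \psi_j(x_j, x_i) + \sum_k R_c(x_k)$

$R(x) = R_c(x_{\mathrm{root}})$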

Forwards Backwards

Tree-Structured Graphs

[1] Felzenszwalb et al.’09
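A compact sketch of Prim's algorithm as outlined above, assuming a dense symmetric cost matrix. The affinity values in `A` are made up, and the maximum spanning tree is obtained by running the arg-min version on negated weights, which is one common way to reconcile the two. This simple version rescans all candidate edges on each step and is not tuned for speed.

```python
def prim(W, start=0):
    """Grow a spanning tree by repeatedly adding the cheapest edge leaving V*.

    W is a symmetric matrix of edge costs. Returns the tree as (parent, child) pairs.
    """
    n = len(W)
    in_tree = {start}                     # V* = {i}
    tree = []
    while len(in_tree) < n:               # loop until V* = V
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: W[e[0]][e[1]],
        )
        in_tree.add(j)                    # V* <- V* U {j}
        tree.append((i, j))
    return tree

# Hypothetical affinities between four parts (larger = more strongly related).
A = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.8, 0.3],
    [0.2, 0.8, 0.0, 0.7],
    [0.1, 0.3, 0.7, 0.0],
]

# A maximum spanning tree is a minimum spanning tree of the negated weights.
cost = [[-a for a in row] for row in A]
tree = prim(cost)
```

The returned (parent, child) pairs define the tree on which the recursion for $R_c$ can then be evaluated.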

slide-147
SLIDE 147

Graph Potentials

$\psi_{ij}(x_i, x_j)$: encodes knowledge of the distribution of the relative spatial locations of parts

slide-148
SLIDE 148

Graph Potentials

Spring Models:

[1] Felzenszwalb et al.’09 [25] Yang and Ramanan’11

$\psi_{ij}(x_i, x_j)$: encodes knowledge of the distribution of the relative spatial locations of parts

slide-149
SLIDE 149

Graph Potentials

Spring Models:

[1] Felzenszwalb et al.’09 [25] Yang and Ramanan’11

$\psi_{ij}(x_i, x_j)$: encodes knowledge of the distribution of the relative spatial locations of parts

$\psi_{ij}(x_i, x_j) = w_{ij}^T [dx; dx^2; dy; dy^2]$

$[dx; dy] = x_i - x_j$

slide-150
SLIDE 150

Graph Potentials

Spring Models:

[1] Felzenszwalb et al.’09 [25] Yang and Ramanan’11

$\psi_{ij}(x_i, x_j)$: encodes knowledge of the distribution of the relative spatial locations of parts

$\psi_{ij}(x_i, x_j) = w_{ij}^T [dx; dx^2; dy; dy^2] = ([x_i; x_j] - \mu_{ij})^T \Sigma_{ij}^{-1} ([x_i; x_j] - \mu_{ij})$

$[dx; dy] = x_i - x_j$

slide-151
SLIDE 151

Graph Potentials

Spring Models:

[1] Felzenszwalb et al.’09 [25] Yang and Ramanan’11

$\psi_{ij}(x_i, x_j)$: encodes knowledge of the distribution of the relative spatial locations of parts

$\psi_{ij}(x_i, x_j) = w_{ij}^T [dx; dx^2; dy; dy^2] = ([x_i; x_j] - \mu_{ij})^T \Sigma_{ij}^{-1} ([x_i; x_j] - \mu_{ij})$

$[dx; dy] = x_i - x_j$

Sparse Gaussian Prior: $R(x) = (x - \mu)^T \Sigma^{-1} (x - \mu)$

slide-152
SLIDE 152

Graph Potentials

Spring Models:

[1] Felzenszwalb et al.’09 [25] Yang and Ramanan’11

$\psi_{ij}(x_i, x_j)$: encodes knowledge of the distribution of the relative spatial locations of parts

$\psi_{ij}(x_i, x_j) = w_{ij}^T [dx; dx^2; dy; dy^2] = ([x_i; x_j] - \mu_{ij})^T \Sigma_{ij}^{-1} ([x_i; x_j] - \mu_{ij})$

$[dx; dy] = x_i - x_j$

Sparse Gaussian Prior: $R(x) = (x - \mu)^T \Sigma^{-1} (x - \mu)$

[Figure: $\Sigma^{-1}$]

slide-153
SLIDE 153

Graph Potentials

Spring Models:

[1] Felzenszwalb et al.’09 [25] Yang and Ramanan’11

$\psi_{ij}(x_i, x_j)$: encodes knowledge of the distribution of the relative spatial locations of parts

$\psi_{ij}(x_i, x_j) = w_{ij}^T [dx; dx^2; dy; dy^2] = ([x_i; x_j] - \mu_{ij})^T \Sigma_{ij}^{-1} ([x_i; x_j] - \mu_{ij})$

$[dx; dy] = x_i - x_j$

Sparse Gaussian Prior: $R(x) = (x - \mu)^T \Sigma^{-1} (x - \mu)$

[Figure: $\Sigma^{-1}$]

Sparsity structure follows a tree topology!
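A small worked example of the spring potential, assuming an axis-aligned quadratic around a rest offset `mu` with per-axis stiffness (both hypothetical). It shows that a linear function of $[dx; dx^2; dy; dy^2]$ equals that quadratic up to an additive constant, which does not change the minimiser.

```python
def spring(xi, xj, w):
    """psi_ij = w^T [dx, dx^2, dy, dy^2] with [dx, dy] = xi - xj."""
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    feats = [dx, dx * dx, dy, dy * dy]
    return sum(wk * fk for wk, fk in zip(w, feats))

def weights_from_rest_offset(mu, stiffness):
    """Weights for which psi is a(dx - mx)^2 + b(dy - my)^2 minus a constant."""
    (mx, my), (a, b) = mu, stiffness
    w = [-2.0 * a * mx, a, -2.0 * b * my, b]
    const = a * mx * mx + b * my * my
    return w, const

# Hypothetical rest offset (10, -4) between the two parts, with stiffness 0.5 in x and 2.0 in y.
mu, stiffness = (10.0, -4.0), (0.5, 2.0)
w, const = weights_from_rest_offset(mu, stiffness)

xi, xj = (33.0, 18.0), (20.0, 21.0)      # dx = 13, dy = -3
psi = spring(xi, xj, w)

# Direct evaluation of the quadratic around the rest offset: 4.5 + 2.0
quadratic = 0.5 * (13.0 - 10.0) ** 2 + 2.0 * (-3.0 + 4.0) ** 2
```

Because the potential is linear in `w`, the spring parameters can be learned with the same machinery as any other linear model.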

slide-154
SLIDE 154

Learning Sample Topology

$\Re^{3d}$

Sample Topology

slide-155
SLIDE 155

Procrustes Analysis

$x = [x_1; \dots; x_n; y_1; \dots; y_n]$

slide-156
SLIDE 156

Procrustes Analysis

$x = [x_1; \dots; x_n; y_1; \dots; y_n]$

[4] Matthews et al.’07

slide-157
SLIDE 157

Procrustes Analysis

$x = [x_1; \dots; x_n; y_1; \dots; y_n]$

Procrustes

[4] Matthews et al.’07

slide-158
SLIDE 158

Procrustes Analysis

$x = [x_1; \dots; x_n; y_1; \dots; y_n]$

Procrustes Learning

[4] Matthews et al.’07