High-Dimensional Pattern Recognition via Sparse Representation (PowerPoint PPT Presentation)


SLIDE 1

High-Dimensional Pattern Recognition via Sparse Representation

Allen Y. Yang, Department of EECS, UC Berkeley (yang@eecs.berkeley.edu). HP Labs, March 2012.


SLIDE 2

Shifting Paradigms in High-Dimensional Pattern Recognition

Face Recognition: Yale B, CMU Multi-PIE, Facebook photo tagging

Object Recognition: ETHZ Cows vs Cars, Caltech 101, Caltech 256

3D Reconstruction: Oxford Corridor, Berkeley Downtown, Google Earth


SLIDE 3

Shifting Paradigms in Distributed Computing

The Internet


50B webpages · average 1B voxels · average 1M pixels

From desktop computing to mobile computing.
Vision-based robot control and navigation.


SLIDE 4

Accurate recognition of HD patterns presents unique challenges

1. Programming real-time systems on low-power mobile devices is difficult.
2. Distributed, real-time applications demand extremely high accuracy.
3. Scenarios require the ability to obtain detailed 3-D representations of the models.

Figure credits: (a) Frueh & Zakhor ’03; (b) Su et al. ’09.


SLIDE 5

Outline

Main Message: The rich phenomena of sparse representation in HD data can provide novel pattern recognition solutions and successfully mitigate the curse of dimensionality and other challenges.

1. Robust face recognition via ℓ1-minimization and group sparsity
2. Accelerated sparse optimization algorithms and parallelization
3. Sparsity in matrix rank to extract robust image representations for 3-D reconstruction

Ongoing projects:

1. Informative feature selection via Sparse PCA
2. Robust 3-D motion registration via sparse online low-rank projection
3. Compressive phase retrieval via semidefinite programming



SLIDE 8

Face Recognition via Sparse Representation

1. Face-subspace model [Belhumeur et al. ’97, Basri & Jacobs ’03]: assume b belongs to Class i out of K classes. Then
b = α_{i,1} a_{i,1} + α_{i,2} a_{i,2} + ⋯ + α_{i,n_i} a_{i,n_i} = A_i α_i.

2. However, the class label i is unknown; we must solve for a sparse representation over all classes:
b = [A_1, A_2, ⋯, A_K] [α_1; α_2; ⋯; α_K] = Ax.

3. x∗ = [0, ⋯, 0, α_i^T, 0, ⋯, 0]^T ∈ R^n.

The sparse representation x∗ encodes class membership through its nonzero coefficients! (A code sketch of this decision rule follows the reference below.)

Reference: Wright, AY, Sastry, Ma, Robust face recognition via sparse representation. IEEE PAMI, 2009.
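
To make the decision rule concrete, below is a minimal Python sketch of SRC-style classification, assuming CVXPY as the ℓ1 solver and synthetic Gaussian data in place of real face images; the residual-based decision rule follows the PAMI reference above.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
d, n_i, K = 32, 8, 5                       # feature dim, images per class, classes
A_blocks = [rng.standard_normal((d, n_i)) for _ in range(K)]
A = np.hstack(A_blocks)                    # dictionary A = [A_1, ..., A_K]

# Test sample generated from class 2's subspace.
b = A_blocks[2] @ rng.standard_normal(n_i)

# l1-minimization: x* = argmin ||x||_1  subj. to  Ax = b.
x = cp.Variable(K * n_i)
cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == b]).solve()

# Classify by the smallest class-wise reconstruction residual.
x_hat = x.value
residuals = [np.linalg.norm(b - A_blocks[k] @ x_hat[k * n_i:(k + 1) * n_i])
             for k in range(K)]
print("predicted class:", int(np.argmin(residuals)))   # expect 2
```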


SLIDE 10

Sparse Optimization via ℓ1-Minimization

An inverse problem with an underdetermined system of linear equations, where A is in general full rank:
b = Ax, where A ∈ R^{d×n} (d < n).

Two interpretations:

1. Compressive sensing: A is a sensing matrix.
2. Sparse representation: A is a prior dictionary.

Without extra regularization, there are infinitely many solutions for x:

(P0): x∗_0 = arg min ‖x‖₀ subj. to Ax = b
(P1): x∗_1 = arg min ‖x‖₁ subj. to Ax = b


SLIDE 12

Image Occlusion, Corruption, and Disguise

1. Sparse representation + sparse error: b = Ax + e.

2. Cross-and-bouquet (CAB) model [Wright et al. ’09, ’10]:
b = [A | I] [x; e] = Bw.
When the size of A grows in proportion to the sparsity of x, the CAB model can asymptotically correct 100% of the errors in e. (A sketch of this construction follows the reference below.)

Reference: Wright and Ma, Dense Error Correction via ℓ1 Minimization, IEEE Trans. IT, 2011.
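
A minimal sketch of the CAB construction on synthetic data, again assuming CVXPY as the solver: the dictionary is augmented with an identity block, so gross corruption in b is absorbed by the sparse error term e.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
d, n = 100, 40
A = rng.standard_normal((d, n))

x0 = np.zeros(n); x0[:3] = rng.standard_normal(3)              # sparse coefficients
e0 = np.zeros(d); e0[rng.choice(d, 15, replace=False)] = 5.0   # sparse corruption
b = A @ x0 + e0

# Cross-and-bouquet: B = [A | I], w = [x; e]; solve min ||w||_1 s.t. Bw = b.
B = np.hstack([A, np.eye(d)])
w = cp.Variable(n + d)
cp.Problem(cp.Minimize(cp.norm1(w)), [B @ w == b]).solve()

x_hat, e_hat = w.value[:n], w.value[n:]
print("x recovery error:", np.linalg.norm(x_hat - x0))
print("e recovery error:", np.linalg.norm(e_hat - e0))
```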


SLIDE 13

Performance on the Yale B database

Among the top 100 IEEE Xplore downloads in June 2010; 800+ citations on Google Scholar.


SLIDE 17

Face Alignment Problem: Misalignment violates linear subspace model

1. Find an image transformation τ (a 2-D function that transforms image coordinates):
min ‖e‖₁ subj. to b ∘ τ_i = A_i x + e    (1)
for each class A_i, while minimizing the alignment error e.

2. Iterative linear approximation [Lucas & Kanade ’81, Hager & Belhumeur ’98]:
b ∘ τ_i + ∇τ(b ∘ τ_i) · Δτ_i ≈ A_i x + e.    (2)
This converts the problem to a linear sparse optimization constraint (a 1-D sketch follows below).

3. The compensated training images are fed back into the sparse representation model:
b = [τ_1^{-1}(A_1), ⋯, τ_K^{-1}(A_K)] x + e.

SLIDE 18

Region of Convergence for 2-D Alignment

Using affine transformations, the ℓ1-min approach can compensate for 3-D rotation up to 30 degrees and 2-D translation up to 5 pixels.

Reference: Ganesh, Ma, Wagner, Wright, AY, Zhou, Face recognition by sparse representation, Cambridge University Press, 2011.


SLIDE 20

Minimizing Group Sparsity

Group sparsity for structured data (in face recognition), A = [A_1, ⋯, A_K]:

(P0,p): x∗_{0,p} = arg min_x Σ_{k=1}^K I(‖x_k‖_p > 0) subj. to Ax ≐ [A_1 ⋯ A_K][x_1; ⋯; x_K] = b.

ℓ0-min becomes the special case that enforces entry-wise sparsity.

Convexification of the NP-hard group sparsity minimization:

(P1,p): x∗_{1,p} = arg min_x Σ_{k=1}^K ‖x_k‖_p subj. to Ax = b,

where p ≥ 1. A popular choice is p = 2 [Eldar & Mishali ’09, Stojnic et al. ’09, Sprechmann et al. ’10, Elhamifar & Vidal ’11]. (A proximal-gradient sketch of the p = 2 case follows below.)
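
A minimal Python sketch of the p = 2 case, solving the penalized variant min ½‖Ax − b‖₂² + λ Σ_k ‖x_k‖₂ by proximal gradient descent (ISTA with block soft-thresholding); the equality-constrained program above would instead use an ALM or interior-point solver. The function names here are my own.

```python
import numpy as np

def block_soft_threshold(v, thresh):
    """Prox of thresh*||.||_2: shrink the whole block toward zero."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= thresh else (1.0 - thresh / norm) * v

def group_lasso_ista(A, b, groups, lam=0.1, iters=500):
    """min_x 0.5*||Ax - b||_2^2 + lam * sum_k ||x_k||_2 via proximal gradient."""
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2         # 1/L, L = Lipschitz const. of grad
    for _ in range(iters):
        x = x - step * (A.T @ (A @ x - b))         # gradient step on the smooth part
        for g in groups:                           # prox step, one block at a time
            x[g] = block_soft_threshold(x[g], step * lam)
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 60))
groups = [np.arange(k * 10, (k + 1) * 10) for k in range(6)]   # K = 6 groups
x0 = np.zeros(60); x0[groups[1]] = rng.standard_normal(10)     # one active group
b = A @ x0
x_hat = group_lasso_ista(A, b, groups, lam=0.05)
print("active groups:", [k for k, g in enumerate(groups)
                         if np.linalg.norm(x_hat[g]) > 1e-3])  # expect [1]
```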


SLIDE 22

Robust Face Recognition as a Group Sparsity Recovery Problem

Does enforcing group sparsity boost robust face recognition performance?

1. A naive solution, b = Ax + e = [A_1, A_2, ⋯, A_K, I][x_1; x_2; ⋯; x_K; e], where the error e is treated as the (K+1)-th group, has a trivial 1-group-sparse solution: e = b, x = 0.

2. A more appropriate formulation is the mixed sparsity minimization (MSM) problem:
(MP0,p): {x∗_{0,p}, e∗_0} = arg min_{(x,e)} ℓ_{0,p}(x) + γ‖e‖₀ subj. to [A_1 ⋯ A_K][x_1; ⋯; x_K] + e = b.

3. A biduality approach derives convex surrogates of sparse minimization problems:
{x∗_bidual, e∗_bidual} = arg min_{(x,e)} Σ_{k=1}^K ‖x_k‖_∞ + γ‖e‖₁ subj. to Ax + e = b.
This offers a new perspective on ℓ0/ℓ1 equivalence and yields the tightest convex lower bound of the primal NP-hard problem.

Reference: Singaraju, Tron, Elhamifar, AY, Sastry. On the Lagrangian biduality of sparsity minimization problems. ICASSP, 2012.


SLIDE 23

Improved Face Recognition Performance via MOSEK

(c) Unoccluded images. (d) Occluded images.
Figure: Images from one session of the AR database.

Group Sparsity   ℓ1       ℓ1,2     ℓ1,∞
Unoccluded       92%      93.6%    94.7%
Occluded         49.7%    53.6%    57.6%
Total            65.3%    68.3%    69.7%
Speed            53.7 s   256.5 s  60.9 s

Table: The 100-subject test set consists of 700 unoccluded images and 1200 occluded images.

The biduality objective function is a cost-effective convex surrogate for mixed group sparsity problems, with improved classification performance.


SLIDE 24

Numerical Implementation of Sparse Optimization

General linear-programming toolboxes exist (CVX, SparseLab), but standard interior-point methods are very expensive in HD spaces.

Standard form:
min_x 1ᵀx subj. to Ax = b, x ≥ 0.
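
A minimal sketch of this ℓ1-min-to-LP reduction, assuming SciPy's linprog as the solver: split x = u − v with u, v ≥ 0, so that ‖x‖₁ = 1ᵀ(u + v) and the problem becomes exactly the standard form above.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
d, n = 30, 80
A = rng.standard_normal((d, n))
x0 = np.zeros(n); x0[rng.choice(n, 5, replace=False)] = rng.standard_normal(5)
b = A @ x0

# x = u - v with u, v >= 0, so ||x||_1 = 1^T (u + v):
# min 1^T [u; v]  subj. to  [A, -A][u; v] = b,  [u; v] >= 0.
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]
print("recovery error:", np.linalg.norm(x_hat - x0))
```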


SLIDE 25

ℓ1-Min Literature Review: Towards a real-time face recognition system

1. Primal-Dual Interior-Point: Log-Barrier [Frisch ’55, Karmarkar ’84, Megiddo ’89, Monteiro-Adler ’89, Kojima-Megiddo-Mizuno ’93]

2. Homotopy: Homotopy [Osborne-Presnell-Turlach ’00, Malioutov-Cetin-Willsky ’05, Donoho-Tsaig ’06]; Polytope Faces Pursuit (PFP) [Plumbley ’06]; Least Angle Regression (LARS) [Efron-Hastie-Johnstone-Tibshirani ’04]

3. Gradient Projection: Gradient Projection Sparse Representation (GPSR) [Figueiredo-Nowak-Wright ’07]; Truncated Newton Interior-Point Method (TNIPM) [Kim-Koh-Lustig-Boyd-Gorinevsky ’07]

4. Iterative Thresholding: Soft Thresholding [Donoho ’95]; Sparse Reconstruction by Separable Approximation (SpaRSA) [Wright-Nowak-Figueiredo ’08]

5. Proximal Gradient [Nesterov ’83, Nesterov ’07]: FISTA [Beck-Teboulle ’09]; Nesterov’s Method (NESTA) [Becker-Bobin-Candès ’09]

6. Augmented Lagrangian Methods [Yang-Zhang ’09, AY et al. ’10]: Bregman [Yin et al. ’08]; YALL1 [Yang-Zhang ’09]; SALSA [Figueiredo et al. ’09]; Primal ALM, Dual ALM [AY et al. ’10]

Reference (MATLAB implementation available on our website): AY, Ganesh, Ma, Sastry, A review of fast ℓ1-minimization algorithms for robust face recognition. ICIP, 2010.


SLIDE 28

Augmented Lagrangian Method (ALM)

ℓ1-min: x∗ = arg min ‖x‖₁ subj. to b = Ax.

Adding a quadratic penalty term for the equality constraint:
Lµ(x) = ‖x‖₁ + (µ/2)‖b − Ax‖₂² subj. to b = Ax.

Augmented Lagrangian function:
Lµ(x, y) = ‖x‖₁ + ⟨y, b − Ax⟩ + (µ/2)‖b − Ax‖₂²,
where y collects the Lagrange multipliers for the constraint b = Ax.

Theorem: Convergence of ALM [Bertsekas ’03]. When Lµ(x, y) is optimized over a sequence µ_k → ∞ and {y_k} is bounded, the limit point of {x_k} is the global minimum of the original problem, namely ℓ1-min.

An alternating direction method for the optimization (a numerical sketch follows the list):

1. Fix y, update x_{k+1}: soft-thresholding.
2. Fix x, update y_{k+1}: method of multipliers.
3. Let µ → ∞ and repeat (1) and (2).
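
A minimal numpy sketch of this alternating scheme, under the simplifying assumption that the x-update is a single proximal-gradient (soft-thresholding) step rather than an exact minimization of Lµ; production solvers such as Primal/Dual ALM are considerably more refined, and the function names here are my own.

```python
import numpy as np

def soft_threshold(v, thresh):
    """Entry-wise shrinkage: prox of thresh*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def alm_l1(A, b, iters=300, mu=1.0, rho=1.05):
    """Approximately solve min ||x||_1 subj. to Ax = b by augmented Lagrangian."""
    x, y = np.zeros(A.shape[1]), np.zeros(A.shape[0])
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz const. of the gradient
    for _ in range(iters):
        # (1) fix y: one soft-thresholding step on L_mu(x, y)
        grad = -A.T @ (y + mu * (b - A @ x))
        x = soft_threshold(x - grad / (mu * L), 1.0 / (mu * L))
        # (2) fix x: method-of-multipliers update of y
        y = y + mu * (b - A @ x)
        # (3) increase mu and repeat
        mu *= rho
    return x

rng = np.random.default_rng(5)
A = rng.standard_normal((60, 100))
x0 = np.zeros(100); x0[rng.choice(100, 8, replace=False)] = rng.standard_normal(8)
b = A @ x0
print("recovery error:", np.linalg.norm(alm_l1(A, b) - x0))   # should be small
```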


SLIDE 29

Simulation: Speed of ℓ1-Min Solvers

Table: x₀ ∈ R^1000, ‖x₀‖₀ = 200, b = Ax₀ ∈ R^600.

Algorithm        Estimated runtime
Interior Point   63 s
Homotopy         1.7 s
ALM              0.16 s


SLIDE 30

Parallelization of First-Order ℓ1-Min Solvers

Among the solver families surveyed in the literature review, the first-order methods (iterative thresholding, proximal gradient, and the augmented Lagrangian methods) rely chiefly on matrix-vector products and entry-wise shrinkage, and are therefore the natural candidates for multi-core and GPU parallelization.


SLIDE 31

ℓ1-Min Simulation: Face Alignment Time vs Number of Subjects

Figure (Problem Parallelism): Elapsed time (s) vs. number of training users (50-250) for a GPU implementation, CPU with library threading, and CPU with manual threading.

Reference (Parallel C/CUDA-C implementation available upon request): Wagner, Shia, AY, Sastry, Ma, Fast ℓ1-minimization and parallelization for face recognition, Asilomar Conf, 2011.


SLIDE 33

Sparsity in rank of matrices as low-rank representation of images

Most symmetric image patterns (if treated as matrices) are low-rank.
Camera projection and pose variation distort or destroy the low-rank representation.
Canonical representation of low-rank texture: minimizing the rank of the image within a texture region can recover hidden information about the orientation of the pattern in 3-D space.


SLIDE 35

Transform Invariant Low-rank Texture (TILT)

Objective function [Zhang et al. ’10]:
min_{A,E,τ} rank(A) + λ‖E‖₀ subj. to I ∘ τ = A + E,
where A is low-rank, E is sparse, and τ parametrizes an image transformation.

An iterative solution using Robust PCA [Candès et al. ’10]:
min_{A,E,Δτ} ‖A‖∗ + λ‖E‖₁ subj. to I ∘ τ_k + ∇I Δτ = A + E,
which also has a corresponding ALM formulation [Lin et al. ’10]:
‖A‖∗ + λ‖E‖₁ + ⟨Y, I ∘ τ_k + ∇I Δτ − A − E⟩ + (µ/2)‖I ∘ τ_k + ∇I Δτ − A − E‖_F².
Its cost is approximately a small constant times the cost of an SVD.
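
A minimal numpy sketch of the Robust PCA core with τ held fixed, i.e. min ‖A‖∗ + λ‖E‖₁ subj. to D = A + E, via the inexact-ALM iteration of singular-value thresholding and entry-wise shrinkage; the full TILT loop would re-linearize the warp τ around each iterate. Function names and parameter defaults are my own.

```python
import numpy as np

def svt(M, thresh):
    """Singular-value thresholding: prox of thresh*||.||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - thresh, 0.0)) @ Vt

def shrink(M, thresh):
    """Entry-wise soft-thresholding: prox of thresh*||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - thresh, 0.0)

def rpca_ialm(D, lam=None, mu=1.0, rho=1.2, iters=100):
    """min ||A||_* + lam*||E||_1  subj. to  D = A + E (inexact ALM)."""
    lam = lam or 1.0 / np.sqrt(max(D.shape))
    A, E, Y = np.zeros_like(D), np.zeros_like(D), np.zeros_like(D)
    for _ in range(iters):
        A = svt(D - E + Y / mu, 1.0 / mu)          # low-rank update (one SVD)
        E = shrink(D - A + Y / mu, lam / mu)       # sparse-error update
        Y = Y + mu * (D - A - E)                   # multiplier update
        mu *= rho
    return A, E

rng = np.random.default_rng(6)
L0 = rng.standard_normal((50, 4)) @ rng.standard_normal((4, 50))   # rank 4
E0 = np.zeros((50, 50))
mask = rng.random((50, 50)) < 0.05
E0[mask] = 10.0 * rng.standard_normal(mask.sum())                  # sparse errors
A, E = rpca_ialm(L0 + E0)
print("rank of A:", np.linalg.matrix_rank(A, tol=1e-3))            # expect ~4
```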


SLIDE 36

More TILT Examples


SLIDE 37

How to use low-rank texture in 3-D reconstruction

Challenges in image matching: the ambiguity in matching local features (points, lines, patches, etc.) stems from the fact that traditional features do NOT possess any 3-D geometric information on their own. We introduce TILT as a new class of image features for matching.


SLIDE 38

Group two TILT features in one image

Detect the intersection of two TILT patterns:
min ‖[A₁, A₂]‖∗ + λ‖[E₁, E₂]‖₁ subj. to [I₁ ∘ τ₁, I₂ ∘ τ₂] = [A₁, A₂] + [E₁, E₂].

More examples.


SLIDE 39

Use one TILT feature in two views to match the images

Pairwise matching ⇒ stitching into a full 3-D reconstruction (multiresolution matching; 3-D reconstruction).

Reference: Mobahi, Zhou, AY, Ma. Holistic 3D reconstruction of urban structures from low-rank textures. ICCV Workshop, 2011.


SLIDE 40

Future Topic: Large-Scale 3-D Reconstruction on Bing Map using TILT

Figure: Cells → Complexes → Facades → City Model.


SLIDE 41

Future Topic: Select Strong Features in Object Recognition via Sparse PCA

Reference: Naikal, AY, Sastry, ICCV (most remembered poster), 2011.


SLIDE 43

Future Topic: Online Motion Registration and Outlier Rejection

1. 3-D depth cameras have become popular in robot navigation and SLAM thanks to the Kinect.

2. Tracked rigid features factorize into a motion matrix and a shape matrix [Tomasi & Kanade ’92]:
X ≐ [x_{1,1} ⋯ x_{1,m}; ⋮ ; x_{F,1} ⋯ x_{F,m}] = [Πg₁; ⋮ ; Πg_F][x₁, ⋯, x_m; 1, ⋯, 1] ∈ R^{3F×m}.    (3)

3. In the presence of outliers and missing data, one can solve the Robust PCA problem
min_{L,E} ‖L‖∗ + λ‖E‖₁ subj. to P_Ω(L + E) = P_Ω(X).    (4)

4. We propose an online low-rank projection algorithm that updates the rank-4 motion model (see the sketch below):
min_{E,A} ‖E‖₁ subj. to W_i = AVᵀ + E.    (5)

Reference: Slaughter, AY, Bagwell, Checkles, Sentis, Vishwanath, ICRA, 2012.
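
A minimal sketch of the per-frame update in Eq. (5), assuming the rank-4 basis V is already known from a Tomasi-Kanade-style factorization; here the ℓ1 regression min_A ‖W_i − AVᵀ‖₁ is solved by iteratively reweighted least squares (IRLS), which is a stand-in for, and may differ from, the SOLO algorithm of the reference.

```python
import numpy as np

def l1_projection(W, V, iters=50, eps=1e-6):
    """min_A ||W - A V^T||_1 via iteratively reweighted least squares."""
    A = W @ V @ np.linalg.inv(V.T @ V)            # least-squares initialization
    for _ in range(iters):
        R = W - A @ V.T                            # residual (the outliers E)
        for r in range(W.shape[0]):                # re-solve each row with IRLS weights
            w = 1.0 / np.maximum(np.abs(R[r]), eps)
            Vw = V * w[:, None]                    # weighted basis
            A[r] = np.linalg.solve(V.T @ Vw, Vw.T @ W[r])
    return A, W - A @ V.T

rng = np.random.default_rng(7)
m = 100                                            # tracked feature points
V = rng.standard_normal((m, 4))                    # rank-4 shape basis (from SVD)
A_true = rng.standard_normal((3, 4))               # one new frame's motion rows
W = A_true @ V.T
W[0, rng.choice(m, 10, replace=False)] += 20.0     # gross outliers in the frame
A_hat, E_hat = l1_projection(W, V)
print("motion error:", np.linalg.norm(A_hat - A_true))   # should be small
```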


SLIDE 45

Compressive Phase Retrieval in Diffraction Crystallography

1. Compressive phase retrieval: in the setting of compressive sensing, what if the observations b lose their phase information? (The sensing apparatus is sensitive only to the intensity of the signal response.)

2. A new formulation via a lifting technique, X ≐ xx^H ∈ C^{n×n} (a sketch follows the references):
min_X trace(X) + λ‖X‖₁ subj. to y_i = |b_i|² = trace(Φ_i X), i = 1, ⋯, N; X ⪰ 0.    (6)

3. We have barely scratched the surface with the Advanced Light Source (LBL synchrotron).

References: Ohlsson, AY, Dong, Sastry. Snowbird Workshop, 2012; SysID, 2012; CDC (submitted), 2012.
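
A minimal real-valued sketch of the lifted program (6), assuming CVXPY with an SDP-capable solver such as SCS (the complex Hermitian case is analogous); the signal is read off from the top eigenvector of the ideally rank-1 minimizer. The weight λ = 0.5 is an arbitrary choice for this toy problem.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(8)
n, N = 10, 30
x0 = np.zeros(n); x0[[2, 7]] = [1.0, -2.0]                    # sparse signal
Phi = [np.outer(a, a) for a in rng.standard_normal((N, n))]   # Phi_i = a_i a_i^T
y = np.array([x0 @ P @ x0 for P in Phi])                      # |<a_i, x0>|^2

# Lifting: X = x x^T; minimize trace(X) + lambda*||X||_1 over X >= 0.
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0] + [cp.trace(Phi[i] @ X) == y[i] for i in range(N)]
objective = cp.Minimize(cp.trace(X) + 0.5 * cp.sum(cp.abs(X)))
cp.Problem(objective, constraints).solve(solver=cp.SCS)

# Read x off the lifted solution: top eigenvector scaled by sqrt(eigenvalue).
vals, vecs = np.linalg.eigh(X.value)
x_hat = np.sqrt(max(vals[-1], 0.0)) * vecs[:, -1]
if x_hat @ x0 < 0: x_hat = -x_hat                             # fix the global sign
print("recovery error:", np.linalg.norm(x_hat - x0))
```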


SLIDE 47

What Have We Learned from the Yearbook of Pattern Recognition?

1. The high dimensionality of data can be both a curse and a blessing for pattern recognition.

2. The rich phenomena of sparsity, in representing image pixels and their matrix ranks, enable new solutions that compensate for corruption, occlusion, and distortion in images and video.

3. In this process, the proper use of the true structure of individual problems can and does help find optimal solutions that exceed the pessimistic lower bounds of the theory.

Figure: From an engineering standpoint, real-time, multi-modal applications over the network call for a new cloud-computing model, not just as a data-storage solution but as a unified data analysis and modeling solution. This new computational model is in high demand from both government and industry.


SLIDE 48

Acknowledgments

UC Berkeley: Dr. S. Sastry, N. Naikal, Dr. D. Singaraju, V. Shia, Dr. H. Ohlsson, R. Dong
Lawrence Berkeley Lab: Dr. S. Marchesini
Univ. Illinois: A. Ganesh, H. Mobahi, A. Wagner, Z. Zhou
Columbia: Dr. J. Wright
MSR Asia: Dr. Y. Ma
UT Austin: Dr. S. Vishwanath, C. Slaughter

Publications

Wright, Yang, Ganesh, Sastry, Ma. “Robust face recognition via sparse representation.” IEEE PAMI, 2009.
Ganesh, Ma, Wagner, Wright, Yang, Zhou. “Face recognition by sparse representation.” Cambridge University Press, 2011.
Yang, Ganesh, Zhou, Sastry, Ma. “A review of fast ℓ1-minimization algorithms in robust face recognition.” arXiv, 2010.
Wagner, Shia, Yang, Sastry, Ma. “Fast ℓ1-minimization and parallelization for face recognition.” Asilomar, 2011.
Singaraju, Tron, Elhamifar, Yang, Sastry. “On the Lagrangian biduality of sparsity minimization problems.” ICASSP, 2012.
Mobahi, Zhou, Yang, Ma. “Holistic 3D reconstruction of urban structures from low-rank textures.” ICCV Workshop, 2011.
Naikal, Yang, Sastry. “Informative feature selection for object recognition via Sparse PCA.” ICCV, 2011.
Slaughter, Yang, Bagwell, Checkles, Sentis, Vishwanath. “Sparse online low-rank projection and outlier rejection (SOLO) for 3-D rigid-body motion registration.” ICRA, 2012.
Ohlsson, Yang, Dong, Sastry. “Compressive phase retrieval from squared output measurements via semidefinite programming.” SysID, 2012.


SLIDE 49

Sparse Representation vs Dense Representation in Classification

Dense representation: performance is comparable if the samples are well conditioned:
x_{ℓ2} = arg min ‖x‖₂ subj. to b = Ax.
Sparse representation is more discriminative when the samples are highly coherent. (A small numerical contrast follows the reference below.)

Reference: Zhang et al, Sparse representation or collaborative representation, ICCV, 2011.
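
A small numpy contrast of the two representations on synthetic data: the minimum ℓ2-norm solution Aᵀ(AAᵀ)⁻¹b spreads energy over essentially all coefficients (dense), while the generating vector is sparse; an ℓ1 solver, such as the ALM sketch earlier, would recover the sparse one.

```python
import numpy as np

rng = np.random.default_rng(9)
d, n = 30, 90
A = rng.standard_normal((d, n))
x0 = np.zeros(n); x0[rng.choice(n, 4, replace=False)] = 1.0   # sparse ground truth
b = A @ x0

# Minimum l2-norm solution of the underdetermined system b = Ax.
x_l2 = A.T @ np.linalg.solve(A @ A.T, b)

print("nonzeros in x0:  ", int(np.sum(np.abs(x0) > 1e-6)))    # 4
print("nonzeros in x_l2:", int(np.sum(np.abs(x_l2) > 1e-6)))  # typically all 90
```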


SLIDE 50

Sketch of the Biduality Approach

Primal (NP-hard):
arg min_{x⁺≥0, x⁻≥0, z, g} αᵀg + βᵀz
subj. to g ∈ {0,1}^K, z ∈ {0,1}^n, A(x⁺ − x⁻) + (e⁺ − e⁻) = b, Πg ≥ (1/M)(x⁺ + x⁻), z ≥ (1/M)(e⁺ + e⁻).

⇒ Lagrangian dual: a concave lower bound, and an LP.
⇔ Bidual (convex, an LP):
arg min_{x⁺≥0, x⁻≥0, z, g} αᵀg + βᵀz
subj. to g ∈ [0,1]^K, z ∈ [0,1]^n, A(x⁺ − x⁻) + (e⁺ − e⁻) = b, Πg ≥ (1/M)(x⁺ + x⁻), z ≥ (1/M)(e⁺ + e⁻).

Lagrangian bidual of the mixed sparsity minimization problem:
x∗_bidual = arg min_x Σ_{k=1}^K ‖x_k‖_∞ + γ‖e‖₁ subj. to Ax + e = b.

The LP from the bidual is an (optimal) lower bound of the primal problem.

References: Singaraju, Tron, Elhamifar, AY, Sastry. On the Lagrangian biduality of sparsity minimization problems. ICASSP, 2012. Chrétien. An alternating ℓ1 approach to the compressed sensing problem. 2010.
