Pattern Recognition Theory
- Prof. Marios Savvides
Lecture 12 : Correlation Filters
How to match two patterns? How do you locate where the pattern is in a long sequence of patterns? Are these two patterns the same? How do we compute a match metric?
The squared error between two patterns a and b (each an N-point vector) is

e^2 = (a - b)^T (a - b) = a^T a + b^T b - 2 a^T b

where

a^T a = sum_{i=1}^{N} a(i)^2 = energy of a
b^T b = sum_{i=1}^{N} b(i)^2 = energy of b
a^T b = sum_{i=1}^{N} a(i) b(i) = correlation term
Assume we normalize the energies of a and b to 1, i.e. a^T a = b^T b = 1. Then e^2 = 2 - 2 a^T b, so to minimize the error we seek to maximize the correlation a^T b. When performing correlation, the maximum correlation point is the best-match location.
Normalized correlation between a(x) and b(x) gives 1 if they match
perfectly (i.e. only if a(x) = b(x) ) and close to 0 if they don’t match.
Problem: reference patterns rarely have the same appearance.
Solution: find the pattern that is consistent (i.e., yields a large correlation) among the observed variations.
Normalized correlation: c = r^T s / sqrt( (r^T r)(s^T s) ), where r(x) is the test pattern and s(x) is the reference pattern.
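As a sketch, this normalized correlation metric can be computed as follows in Python/NumPy (the function name is ours; the slides use MATLAB-style notation):

```python
import numpy as np

def normalized_correlation(r, s):
    """Normalized correlation of test pattern r with reference s:
    r^T s / sqrt((r^T r)(s^T s)); equals 1 only for a perfect match
    (up to an overall positive scale factor)."""
    r = np.asarray(r, dtype=float).ravel()
    s = np.asarray(s, dtype=float).ravel()
    return float((r @ s) / np.sqrt((r @ r) * (s @ s)))
```

Identical patterns give 1.0, orthogonal patterns give 0.0, and scaling a pattern by a positive constant leaves the score unchanged, which is exactly the illumination-scale tolerance we want from a match metric.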
Built-in Shift-Invariance: shift in the input image leads to a corresponding
shift in the output peak. Classification result remains unchanged.
Matched filters are just replicas of the object that we are trying to find.
The problem is that we need as many matched filters as there are appearances the object can take under different conditions (i.e. a matched filter for every pose, illumination and expression variation).
Using matched filters is therefore very expensive in both computation and memory. Instead, we can synthesize distortion-tolerant filters that can each recognize more than one appearance.
We can build different types of distortion tolerance into each filter (e.g. scale, rotation, illumination, etc.).
We will show that advanced correlation filters exhibit graceful degradation with input image distortions.
[Figure: optical correlator. A laser beam illuminates the input SLM; a Fourier lens performs the Fourier transform onto the filter SLM; a second Fourier lens performs the inverse Fourier transform; correlation peaks are detected on the CCD. SLM: Spatial Light Modulator. CCD: Charge-Coupled Detector.]
How can we use Fast Fourier Transforms? The Fourier transform properties tell us:

Convolution Theorem: the convolution of two functions a(x) and b(x) in the spatial domain is equivalently computed in the Fourier domain by multiplying FT{a(x)} with FT{b(x)}. In MATLAB (assuming 1-D) this would be:
Convolution = IFFT( FFT(a) .* FFT(b) )

Correlation Theorem: similar to convolution, except the correlation of a(x) and b(x) in the spatial domain is equivalently computed in the Fourier domain by multiplying FT{a(x)} with the conjugate of FT{b(x)}:
Correlation = IFFT( FFT(a) .* conj(FFT(b)) )
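The correlation theorem can be checked numerically. Below is a small NumPy sketch (circular correlation; the signal values and the shift of 3 are illustrative choices of ours):

```python
import numpy as np

# reference signal a, and a test signal b that is a circularly shifted by 3
a = np.array([0, 0, 1, 2, 3, 2, 1, 0], dtype=float)
b = np.roll(a, 3)

# Correlation = IFFT( FFT(b) .* conj(FFT(a)) ), as in the MATLAB line above
corr = np.real(np.fft.ifft(np.fft.fft(b) * np.conj(np.fft.fft(a))))

# the correlation peak lands at the shift, and its height is the signal energy
peak_location = int(np.argmax(corr))   # 3
peak_height = corr.max()               # a^T a = 1+4+9+4+1 = 19
```

This is exactly the shift-invariance property: shifting the input shifts the output peak by the same amount, without changing its height.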
Take a signal a(x):

a = [0 0 1 2 3 2 1 0 0 0 0 0 0 0 0 1 1 0 0];

[1] Compute its Discrete Fourier Transform (FFT): a(x) -> A(u).
Correlation is just like convolution, except that we convolve with a time-reversed copy of the reference. Taking the conjugate in the Fourier domain time-reverses a (real) signal, and since convolution automatically time-reverses one of its arguments, the two reversals cancel, so multiplying by the conjugate spectrum implements correlation.
Notice the nice peak, with height 1, at the location where the images perfectly align. The peak height indicates the confidence of the match. Because s(x,y) is a random signal, no other shift of the signal matches, so the rest of the correlation plane stays near zero.
The matched filter is simply the reference image that you want to detect.
The matched filter is optimal for detecting the known reference signal in additive white noise.
OK... but what are the shortcomings of matched filters?
Matched filters only match the reference signal itself; even slight distortions will not produce a match.
Thus you need as many matched filters as the number of training images (N).
This is not feasible from a computational viewpoint, as you would need to perform N correlations and store all N filters.
Filter design: training images captured by the camera are each transformed into a frequency-domain array (N x N pixels, complex) and passed to the Filter Design Module, which outputs the correlation filter H (template), itself an N x N complex frequency-domain array.
Testing: the test image captured by the camera is transformed into a frequency-domain array (N x N pixels) and multiplied by the correlation filter H (template), giving the resulting frequency-domain array, which is inverse-transformed to produce the correlation output.
The idea is to overcome the limitations of matched filters by building a single filter from multiple training images. How? We want to keep the key idea of the matched filter, in that we want the correlation peak at the origin to take a specified value for each training image. And we want the filter to lie in the convex hull (a linear combination) of the training images.
Assume h is a vector containing the impulse response of the filter (vectorized from 2D). Let x_i be the i-th training image vectorized into a column vector. We want the peak at the origin to be 1, i.e. we want the inner product x_i^T h = 1 for every training image (since correlations are just inner products of the filter h and the signal x at different spatial shifts). Stacking these constraints gives X^T h = c, where the matrix X = [ x1 x2 x3 x4 ] contains the vectorized images x_i along its columns, and c is a column vector containing the peak constraints, in this case a vector of all ones.
We also said that the ECP-SDF filter is a linear combination of matched filters, or equivalently a linearly weighted combination of the training images, i.e.

h = X a = a_1 x_1 + a_2 x_2 + a_3 x_3 + ... + a_N x_N

Substituting h = X a into X^T h = c gives X^T X a = c.
Use X^T X a = c to solve for the linear combination weights: a = (X^T X)^{-1} c, so h = X (X^T X)^{-1} c. OK... now that we know the linear combination weights, let's build the filter and test it.
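A minimal NumPy sketch of this ECP-SDF construction, using random vectors in place of real vectorized images (all sizes and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 64, 4                      # d pixels per image, N training images
X = rng.standard_normal((d, N))   # columns are vectorized training images x_i
c = np.ones(N)                    # peak constraints: 1 for every training image

# solve X^T X a = c for the combination weights, then form h = X a
a = np.linalg.solve(X.T @ X, c)
h = X @ a

# each training image now gives a correlation value of exactly 1 at the origin
peaks = X.T @ h
```

Note that only the N origin values are constrained; nothing controls the rest of the correlation plane, which is exactly the weakness the MACE filter fixes later.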
Use these images for training (peak=1 for all correlation outputs)
ECP-SDF does not generalize well, at least not for test images outside the training set.
Discrimination is poor: as we saw, we got high peaks for impostor images as well.
It does not guarantee that the peak is at the origin; we may get a sharper, higher peak elsewhere in the correlation plane.
Solution: Minimum Average Correlation Energy (MACE) filters.
The MACE filter minimizes the average energy of the correlation outputs while maintaining the correlation value at the origin at a pre-specified level. Sidelobes are greatly reduced and discrimination performance is improved: the MACE filter produces a sharp peak at the origin of the correlation plane.
Minimizing the spatial correlation energy can be done directly in the frequency domain by expressing the energy E_i of the i-th correlation plane c_i as:

E_i = sum_{p=1}^{d} |c_i(p)|^2 = (1/d) sum_{k=1}^{d} |H(k)|^2 |X_i(k)|^2 = h^+ D_i h

where D_i is a d x d diagonal matrix containing the power spectrum of the i-th training image along its diagonal, D_i = (1/d) diag( |X_i(1)|^2, |X_i(2)|^2, |X_i(3)|^2, ..., |X_i(d)|^2 ).

The average correlation plane energy for the N training images is then

E_ave = (1/N) sum_{i=1}^{N} E_i = h^+ ( (1/N) sum_{i=1}^{N} D_i ) h = h^+ D h
The value at the origin of the correlation output for the i-th training image is

c_i(0) = (1/d) sum_{p=1}^{d} H(p) X_i^*(p) = x_i^+ h

i.e. an inner product between the filter and the training image in the frequency domain. Specify the correlation peak values for all the training images using a column vector u, which gives the constraint X^+ h = u. Minimizing the average correlation energy h^+ D h subject to the constraints X^+ h = u leads to the MACE filter solution

h = D^{-1} X (X^+ D^{-1} X)^{-1} u
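A toy frequency-domain sketch of this MACE solution in NumPy (random spectra stand in for real training images, and the diagonal matrix D is stored as a vector; these are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
d, N = 128, 3
# columns are FFTs of (random, illustrative) vectorized training images
X = np.fft.fft(rng.standard_normal((d, N)), axis=0)
u = np.ones(N)                        # desired correlation peak values

# D: average power spectrum of the training images (diagonal of the D matrix)
Dvec = np.mean(np.abs(X) ** 2, axis=1)

# MACE solution h = D^-1 X (X^+ D^-1 X)^-1 u
Dinv_X = X / Dvec[:, None]
h = Dinv_X @ np.linalg.solve(X.conj().T @ Dinv_X, u)

# the peak constraints X^+ h = u are satisfied exactly
peaks = X.conj().T @ h
```

Dividing by the average power spectrum is what suppresses correlation-plane energy away from the origin, so the resulting peaks come out sharp rather than broad.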
For the matched filter, we showed that the peak height is what measures the match. For MACE filters, the optimization not only keeps the peak at a specified value but also minimizes the energy of the rest of the correlation plane. Thus it makes sense to use a metric that measures more than peak height alone: we need a metric that can measure the peak sharpness!
PSR = (peak - mean) / σ, where the mean and standard deviation σ of the sidelobes are estimated in a bigger region centered at the peak, excluding a small pixel region immediately around the peak. The PSR is invariant to constant illumination changes. A match is declared when the PSR is large, i.e. the peak must not only be high but also sharp relative to its surroundings.
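The PSR computation can be sketched as follows (the 5x5 inner mask and 20x20 outer region are illustrative sizes of ours, not values fixed by the slides):

```python
import numpy as np

def psr(plane, inner=5, outer=20):
    """Peak-to-Sidelobe Ratio: (peak - mean) / std of a sidelobe region
    centered at the peak, with a small inner region masked out."""
    plane = np.asarray(plane, dtype=float)
    py, px = np.unravel_index(np.argmax(plane), plane.shape)
    peak = plane[py, px]
    half = outer // 2
    y0, x0 = max(py - half, 0), max(px - half, 0)
    region = plane[y0:py + half + 1, x0:px + half + 1]
    # mask out the small inner region immediately around the peak
    cy, cx = py - y0, px - x0
    ih = inner // 2
    mask = np.ones(region.shape, dtype=bool)
    mask[max(cy - ih, 0):cy + ih + 1, max(cx - ih, 0):cx + ih + 1] = False
    sidelobe = region[mask]
    return (peak - sidelobe.mean()) / sidelobe.std()
```

A sharp, isolated peak over low-variance sidelobes yields a large PSR; adding a constant offset to the whole plane cancels in (peak - mean), which is the illumination invariance mentioned above.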
Use these images for training (peak=1 for all correlation outputs)
Facial Expression Database (AMP Lab, CMU): 13 people, 75 images per person, varying expressions, 64x64 pixels, constant illumination; 1 filter made per person.
Response to the training images and to the face images of person A; response to the 75 face images of each of the other 12 people (75 x 12 = 900 PSRs).
49 Faces from PIE Database illustrating the variations in illumination
We used three face images to synthesize a correlation filter. The three selected training images consisted of 3 extreme lighting conditions.
[Plot: Equal Error Rate per person using the Individual Eigenface Subspace Method on the PIE Database with no background illumination.]
[Figure: the 21 PIE illumination conditions, numbered 1 to 21.]
Frontal-lighting PIE dataset (face images captured with room lights off). % Rec Rate (# Errors):

Training Images            | UMACE Filters | MACE Filters | Fisherfaces | 3D Linear Subspace | IPCA
18,19,20                   | 99.9% (1)     | 99.9% (2)    | 94.2% (79)  | 98.4% (22)         | 91.0% (122)
8,9,10                     | 99.9% (1)     | 99.9% (1)    | 82.1% (244) | 97.8% (30)         | 78.0% (300)
7,10,19                    | 99.1% (10)    | 99.1% (10)   | 73.3% (365) | 50.9% (670)        | 36.1% (872)
5,7,9,10                   | 99.7% (3)     | 99.9% (1)    | 71.4% (390) | 93.2% (93)         | 72.4% (337)
5,6,7,8,9,10               | 100% (0)      | 99.9% (1)    | 89.3% (145) | 97.1% (40)         | 91.4% (110)
5,6,7,8,9,10,11,18,19,20   | 100% (0)      | 100% (0)     | 97.3% (36)  | 97.3% (31)         | 97.6% (33)
We have shown that these correlation filters seem to be tolerant to illumination and expression variations. What about partial face recognition? In practice, a face detector will detect and retrieve only part of the face image.
*M. Savvides, B.V.K. Vijaya Kumar and P.K. Khosla, "Robust, Shift-Invariant Biometric Identification from Partial Face Images", Defense & Security Symposium, special session on Biometric Technologies for Human Identification (OR51) 2004.
Vertical cropping of test face image pixels (correlation filters are trained on FULL size images), using training set #1 (3 extreme lighting images) and training set #2 (3 frontal lighting images).
Using training set #1 (3 extreme lighting images) and training set #2 (3 frontal lighting images).
Zero intensity background vs. textured background.
*M.Savvides and B.V.K. Vijaya Kumar, "Efficient Design of Advanced Correlation Filters for Robust Distortion- Tolerant Face Identification", IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) 2003.
A biometric filter (stored on a card) can be lost or stolen.
Can we re-issue a different one (just as we re-issue a different credit card)?
There is only a limited set of biometric images per person (e.g., each person has only one face).
We can use standard encryption methods to encrypt the biometrics and then decrypt them for use during the recognition stage; however, there is a 'window' of opportunity where a hacker can obtain the decrypted biometric during the recognition stage.
We have to figure out a way to encrypt the biometrics and 'work', i.e. authenticate, in the encrypted domain and NOT directly in the original biometric domain.
*M. Savvides, B.V.K. Vijaya Kumar and P.K. Khosla, "Authentication-Invariant Cancelable Biometric Filters for Illumination- Tolerant Face Verification", Defense & Security Symposium, special session on Biometric Technologies for Human Identification, 2004.
[Enrollment: a PIN seeds a random number generator, which produces a random PSF; the training images are convolved with it to give encrypted training images, from which the encrypted MACE filter is designed.]
[Verification: the PIN (which can be obtained from another biometric, e.g. a fingerprint) seeds the same random number generator to regenerate the random convolution kernel; the test image is convolved with it to give the encrypted test image, which is correlated with the encrypted MACE filter, and the match is scored by PSR.]
We can show theoretically that performing this convolution with the same random kernel on both the training images and the test image leaves the correlation output (and hence the PSR) unchanged. Thus, working in this encrypted domain does not change the verification performance.
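This invariance can be checked numerically. Below is a hedged 1-D NumPy sketch of ours: a MACE filter built from randomly convolved ("encrypted") training images, applied to an identically convolved test image, produces exactly the same correlation output as the unencrypted pair, because the kernel spectrum cancels (K·conj(K)/|K|^2 = 1):

```python
import numpy as np

rng = np.random.default_rng(2)
d, N = 64, 3
X = np.fft.fft(rng.standard_normal((d, N)), axis=0)   # training image spectra
y = rng.standard_normal(d)                            # test image
u = np.ones(N)

def mace(X, u):
    """MACE filter h = D^-1 X (X^+ D^-1 X)^-1 u in the frequency domain."""
    Dvec = np.mean(np.abs(X) ** 2, axis=1)
    DX = X / Dvec[:, None]
    return DX @ np.linalg.solve(X.conj().T @ DX, u)

# spectrum of a random convolution kernel (magnitudes kept in [1, 2] so the
# kernel is comfortably invertible; this choice is ours, for stability)
K = (1.0 + rng.random(d)) * np.exp(2j * np.pi * rng.random(d))

h_plain = mace(X, u)                                  # filter from plain images
h_enc = mace(K[:, None] * X, u)                       # filter from encrypted images

Y = np.fft.fft(y)
out_plain = np.fft.ifft(Y * np.conj(h_plain))
out_enc = np.fft.ifft((K * Y) * np.conj(h_enc))       # encrypted test, encrypted filter
```

The two correlation planes agree to machine precision, so an attacker who steals the stored filter sees only the kernel-scrambled version, while verification performance is untouched.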
[Figure: the original training images, and the same images convolved with random convolution kernel 1 and with random convolution kernel 2.]