Multimodal Biometrics
Josef Kittler
Centre for Vision, Speech and Signal Processing
University of Surrey, Guildford GU2 7XH
J.Kittler@surrey.ac.uk
Acknowledgements: Dr Norman Poh
- False rejection
- False acceptance
- Total error rate/Half total error rate
- Operating point
  - Equal error rate (civilian)
  - Zero false acceptance (high security forensic)
  - Zero false rejection (low risk banking)
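The error measures listed above can be made concrete with a small sketch. It assumes that a claim is accepted when its score is at or above a threshold, and that the equal error rate operating point is found by scanning the observed scores as candidate thresholds. The function names `error_rates` and `equal_error_point` are hypothetical, chosen for illustration only.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR, HTER) for a given decision threshold.

    Assumption: a claim is accepted when score >= threshold.
    """
    # False acceptance: impostor scores that pass the threshold.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # False rejection: genuine scores that fall below the threshold.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    # Half total error rate is the average of the two error rates.
    return far, frr, (far + frr) / 2.0


def equal_error_point(genuine, impostor):
    """Return (threshold, FAR, FRR, HTER) where |FAR - FRR| is smallest."""
    def gap(t):
        far, frr, _ = error_rates(genuine, impostor, t)
        return abs(far - frr)

    # Candidate thresholds are the observed scores themselves.
    threshold = min(sorted(set(genuine) | set(impostor)), key=gap)
    return (threshold,) + error_rates(genuine, impostor, threshold)
```

Moving the threshold up trades false acceptances for false rejections, which is how the zero false acceptance and zero false rejection operating points are reached.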
[Figure: labels MFCC, GMM]
Different traits, different properties

- Motivation for multiple biometrics
  - To enhance performance
  - To increase population coverage by reducing the failure-to-enroll rate
  - To improve resilience to spoofing
  - To permit choice of biometric modality for authentication
  - To extend the range of environmental conditions under which authentication can be performed
- Fusion architectures
- Score level fusion: Problem formulation
- Estimation error
- Multiple expert paradigm
- Quality based fusion of biometric
- Discussion and conclusions
- Integration of multiple biometric
  - Sensor (data) level fusion
    - Linear/nonlinear combination of registered
    - Representation space augmentation
  - Feature level fusion
  - Soft decision level fusion
  - Decision level fusion
[Figure: processing chain from data to features to score to threshold. Labels: PCA, LDA, MFCC, PLP, DCT; GMM, MLP, MSE, HMM]
[Figure: fusion architectures showing data fusion, feature fusion and score fusion. Labels: PCA, LDA, MFCC, PLP, DCT; GMM, MLP, MSE, HMM; fusion; score; threshold]
[Figure: data fusion]
[Figure: feature fusion]
[Figure: score fusion]
- Given
- Bayes decision rule
  - Assign subject to class if
- Note
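The equations on this slide did not survive extraction. As a reminder only, the textbook form of the Bayes decision rule is sketched below. The symbols (measurement vector $x$, classes $\omega_k$, $m$ classes) are assumed notation and may differ from the original slide.

```latex
% Assumed notation: x is the measurement vector, \omega_1,\dots,\omega_m the classes.
% Bayes decision rule: assign the subject to the class with the largest posterior.
\text{assign } x \rightarrow \omega_j
\quad \text{if} \quad
P(\omega_j \mid x) = \max_{k=1,\dots,m} P(\omega_k \mid x)

% The posterior follows from Bayes' theorem.
P(\omega_k \mid x) = \frac{p(x \mid \omega_k)\, P(\omega_k)}{\sum_{l=1}^{m} p(x \mid \omega_l)\, P(\omega_l)}
```

In verification there are two classes, client and impostor, so the rule reduces to comparing one posterior (or a likelihood ratio) against a threshold.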
- The integration over x is marginalisation
  - x is a feature vector determined by all traits
  - Implicitly a multiple classifier fusion
- Marginalised estimate of class posterior
- Feature level fusion
  - Each modality has its own set of features x_i
  - Score is a function of all x_i jointly
  - Fusion process marginalisation is over the joint
  - In addition, there could be modality specific
- Score level fusion
  - Each modality has its own set of features x_i
  - The fused score is a product of individual
  - Fusion process marginalisation is over modality
- Basic score level fusion is by product
- Product can be approximated by a sum if
- The resulting decision rule becomes
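The step from product to sum can be sketched with the standard derivation from the classifier combination literature. The slide's own equations were not recoverable, so the notation ($R$ modalities with features $x_i$, classes $\omega_k$, deviations $\delta_{ki}$) is an assumption, as is the conditional independence of the $x_i$ given the class.

```latex
% Product rule, assuming the x_i are conditionally independent given the class:
P(\omega_k \mid x_1,\dots,x_R) \;\propto\; P(\omega_k)^{-(R-1)} \prod_{i=1}^{R} P(\omega_k \mid x_i)

% Assume the posteriors deviate only slightly from the priors:
P(\omega_k \mid x_i) = P(\omega_k)\,(1 + \delta_{ki}), \qquad |\delta_{ki}| \ll 1

% Expanding the product and dropping second and higher order terms in \delta_{ki}:
P(\omega_k)^{-(R-1)} \prod_{i=1}^{R} P(\omega_k \mid x_i)
\;\approx\; P(\omega_k)\Bigl(1 + \sum_{i=1}^{R} \delta_{ki}\Bigr)
\;=\; (1-R)\,P(\omega_k) + \sum_{i=1}^{R} P(\omega_k \mid x_i)

% Resulting sum rule: assign the subject to \omega_j if
(1-R)\,P(\omega_j) + \sum_{i=1}^{R} P(\omega_j \mid x_i)
= \max_{k} \Bigl[(1-R)\,P(\omega_k) + \sum_{i=1}^{R} P(\omega_k \mid x_i)\Bigr]
```

With equal priors the first term is the same for every class, so the decision depends only on the sum of the per-modality posteriors.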
- Decision level fusion
  - Builds on score level fusion
  - Different fusion rules (rank, vote, etc.)
- Each modality produces a hard decision
- Final decision
- In a two class case, a hard decision is made
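A minimal sketch of decision level fusion by voting in the two class case follows. It assumes each modality thresholds its own score to reach a hard accept/reject decision, and that ties are rejected; both choices, and the names `hard_decisions` and `majority_vote`, are illustrative assumptions rather than the rule used on the slide.

```python
def hard_decisions(scores, thresholds):
    """Turn per-modality scores into hard decisions (True = accept)."""
    return [s >= t for s, t in zip(scores, thresholds)]


def majority_vote(decisions):
    """Accept only if strictly more than half of the modalities accept.

    Assumption: a tie counts as a rejection.
    """
    accepts = sum(decisions)
    return accepts * 2 > len(decisions)
```

For example, `majority_vote(hard_decisions([0.7, 0.2, 0.9], [0.5, 0.5, 0.5]))` accepts the claim because two of the three modalities accept.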
- Consider a vector of normalised scores
  - with mean
  - and covariance matrix
- Rearranging
- Variance can be bounded
  - For uncorrelated scores, variance reduces by a factor
  - For negatively correlated scores, variance can be brought to zero
  - For negatively correlated scores the variance drops most when
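The effect of correlation on the variance of a fused score can be checked numerically. The sketch below assumes fusion by simple averaging of N scores, for which the variance equals the sum of all covariance matrix entries divided by N squared. With equal variances sigma^2 and equal pairwise correlation rho this gives sigma^2 (1 + (N - 1) rho) / N. The equal-correlation model and the function names are assumptions made for illustration.

```python
def variance_of_mean(cov):
    """Variance of the average of N scores with covariance matrix `cov`."""
    n = len(cov)
    # Var(mean) = (1 / N^2) * sum of all entries of the covariance matrix.
    return sum(sum(row) for row in cov) / (n * n)


def equicorrelated_cov(n, sigma2, rho):
    """Covariance matrix with equal variances and equal pairwise correlation."""
    return [[sigma2 if i == j else rho * sigma2 for j in range(n)]
            for i in range(n)]
```

Under this model, uncorrelated scores (rho = 0) give sigma^2 / N, and the variance reaches zero at rho = -1 / (N - 1), the most negative correlation that N scores can share.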
[Figure: panels labelled VOICE and FACE]
- simple rules (sum, product, max, min,
- trained fusion rule (logistic regression, decision templates, sparse based representation, SVM, deep architectures)
- multistage systems (stacking)
- machine learning tools
  - Separability measures
  - Feature selection
  - Clustering
  - Distance metric
  - Classification
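The simple fixed rules named above need no training and can be sketched in a few lines. The sketch assumes the scores have already been normalised to a common range, and implements the sum rule as an average. The function name `fuse` and its `rule` parameter are hypothetical.

```python
import math


def fuse(scores, rule="sum"):
    """Combine normalised per-modality scores with a fixed rule."""
    rules = {
        "sum": lambda s: sum(s) / len(s),  # average, equivalent to sum up to scale
        "product": math.prod,
        "max": max,
        "min": min,
    }
    return rules[rule](scores)
```

For example, `fuse([0.5, 0.8], "min")` returns the more pessimistic of the two scores, while `"max"` returns the more optimistic one.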
- a posteriori probability estimate
- Motivation for score normalisation
  - Non-homogeneous scores (distance, similarity)
  - Different ranges
  - Different distributions
- Desirable properties
  - Robustness
  - Efficiency
- Most effective methods
  - Nonlinear mapping with saturation for very large/small scores
  - Increased sensitivity near the boundaries (Ross and Jain)
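One nonlinear mapping with saturation is a tanh squashing of standardised scores into the interval (0, 1). The sketch below is written in the spirit of tanh based normalisation but, as a simplifying assumption, uses the plain mean and standard deviation of a reference score set instead of robust estimates. The `scale` value of 0.01 is a commonly quoted default and is also an assumption here.

```python
import math
import statistics


def tanh_normalise(score, reference_scores, scale=0.01):
    """Map a raw score into (0, 1) with saturation at both extremes.

    Assumption: reference_scores are not all identical, so sigma > 0.
    """
    mu = statistics.mean(reference_scores)
    sigma = statistics.pstdev(reference_scores)
    # tanh saturates for very large or very small standardised scores.
    return 0.5 * (math.tanh(scale * (score - mu) / sigma) + 1.0)
```

A score equal to the reference mean maps to 0.5, and outliers are compressed towards 0 or 1 rather than dominating the fused score.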
[Figure: score normalisation schemes. Labels: quality-based normalisation; cohort-based normalisation (online); client/user-specific normalisation (offline); changing signal quality; user-dependent score characteristics]
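User-specific and cohort-based normalisation can both be sketched as standardisation against a reference score set; they differ in where that set comes from. The sketch assumes the forms commonly called Z-norm (impostor scores against the claimed user's model, computed offline) and T-norm (scores of the test sample against cohort models, computed online). These names and the function names below are assumptions and are not taken from the slide.

```python
import statistics


def standardise(score, reference_scores):
    """Shift and scale a score by the statistics of a reference set."""
    mu = statistics.mean(reference_scores)
    sigma = statistics.stdev(reference_scores)  # needs at least two scores
    return (score - mu) / sigma


def user_specific_normalise(score, impostor_scores_for_claimed_user):
    """Offline variant: statistics are stored per enrolled user."""
    return standardise(score, impostor_scores_for_claimed_user)


def cohort_normalise(score, cohort_scores_for_test_sample):
    """Online variant: statistics are computed for each test sample."""
    return standardise(score, cohort_scores_for_test_sample)
```

The offline variant compensates for user-dependent score characteristics, while the online variant tracks the quality of the current sample because the cohort scores are computed from that same sample.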
[Figure: quality based fusion. Labels: PCA, LDA, MFCC, PLP, DCT; GMM, MLP, MSE, HMM; fusion; face quality detectors; speech quality detectors]
- Different spectral bands
- Voice/image sensed lips dynamics
- Visual/language modalities for person
- Video surveillance very important tool for crime
  - Watch list
  - Forensic video analysis
- Hard biometrics (face) not always available
- Other video analytics tools are useful alternatives
  - Soft biometrics (clothing, gait)
  - Tracking
- Person Re-Identification
  - Recognising a person from non-
  - Formulated as a ranking problem
  - The majority of existing methods are
    - Images or videos
- Joint vision and language modelling
  - Image and video captioning, Visual
  - Can language help vision in Re-ID?
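Re-identification as a ranking problem can be sketched as sorting gallery entries by their distance to the query descriptor. The sketch assumes fixed-length feature vectors and plain Euclidean distance; in practice a learnt metric would replace it. The function name `rank_gallery` and the dictionary layout are hypothetical.

```python
import math


def rank_gallery(query, gallery):
    """Return gallery identities ordered from best to worst match.

    `gallery` maps an identity label to its feature vector.
    """
    return sorted(gallery, key=lambda pid: math.dist(query, gallery[pid]))
```

The position of the true identity in the returned list is what rank-based measures summarise: a rank-1 hit means the first entry is the correct person.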
- Augmenting existing datasets
  - CUHK03: ~2700 descriptions
  - VIPeR: ~1300 descriptions
- Crowd-sourced, 8 annotators
- Annotation
  - Free style sentences, not attributes
  - Encouraged to cover details
  - On average 45 words per description
  - Per image rather than per identity
A front profile of a young, slim and average height, black female with long brown hair. She wears sunglasses and possibly earrings and
coloured print on its chest, blue jeans and white sports shoes.
A short and slim young woman carrying a tortilla coloured rectangular shoulder bag with caramel straps,
straight auburn hair worn loose. She wears a dark brown short sleeved top along with bell bottomed ice blue jeans and her shoes can’t be seen but she might be wearing light colored flat shoes.
- ResNet-50 for visual information
- Word2Vec embedding
- Neural networks: CNN and LSTM
- Multi-class setting, 2 examples per
- Data augmentation
- Metric learning with learnt
- Canonical Correlation
training
- Consider features x and y extracted from
- Basic principle: find direction in the
- Gauging linear relationship between two
  - Finding two sets of basis vectors such that the
  - Determine correlation coefficients
- Training set of pairs of vectors
- Maximisation of the correlation of the projections
- Leads to an eigenvalue problem
- With covariance matrices regularised by
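The canonical correlation analysis steps listed above can be written out in the standard textbook form. The slide's own equations were lost, so the symbols below (projection directions $w_x$, $w_y$, covariance matrices $C_{xx}$, $C_{yy}$, $C_{xy}$, regularisation constants $r_x$, $r_y$) are assumed notation, and the ridge form of the regulariser is an assumption as well.

```latex
% Training pairs (x_i, y_i), i = 1..n, assumed centred.
C_{xx} = \tfrac{1}{n}\sum_{i} x_i x_i^{\top}, \qquad
C_{yy} = \tfrac{1}{n}\sum_{i} y_i y_i^{\top}, \qquad
C_{xy} = \tfrac{1}{n}\sum_{i} x_i y_i^{\top} = C_{yx}^{\top}

% Maximise the correlation of the projections w_x^T x and w_y^T y:
\rho = \max_{w_x, w_y}
\frac{w_x^{\top} C_{xy}\, w_y}
     {\sqrt{w_x^{\top} C_{xx}\, w_x}\;\sqrt{w_y^{\top} C_{yy}\, w_y}}

% Setting the derivatives to zero leads to a generalised eigenvalue problem:
C_{xy}\, C_{yy}^{-1}\, C_{yx}\, w_x = \rho^{2}\, C_{xx}\, w_x, \qquad
w_y \propto C_{yy}^{-1}\, C_{yx}\, w_x

% Regularisation (assumed ridge form) keeps the inverses well conditioned:
C_{xx} \leftarrow C_{xx} + r_x I, \qquad C_{yy} \leftarrow C_{yy} + r_y I
```

Successive eigenvectors give further pairs of basis vectors, and the square roots of the eigenvalues are the canonical correlation coefficients.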
- Three sets:
  - Training, query, gallery
  - Training: image and language pairs
- Various settings, query x gallery:
  - V x V, L x L, V x L, V x VL, VL x VL
- Asymmetric settings:
  - Transfer language info. with CCA
- XQDA as metric learning
- Crossmodal & multimodal matching facilitated by
- Performance gain due to
  - Joint training
  - Fusion of modalities
A tall, slim man, probably an Asian in his early twenties. He has dark short
is holding something in his hand, probably a letter or envelope cover. He is wearing a multi-colour polo t- shirt with blue, white, black and red stripes on it. He is wearing a pair of dark colour pants and brown shoes.
[Figure: SE-ResNet based vision model (50 layers, 3 x 3 kernel) and SE-ResNet based language model (50 layers, 1 x 2 kernel), producing f_img and f_txt; vision and text branches each labelled LID; joint CCA embedding space learning]
- Role of multimodal biometrics
- Fusion levels
- Math formulation of different alternatives
- The concept of marginalisation/multiple classifier
- Notion of quality based, user specific and cohort
- Multimodal sensing and fusion of a single
- Example: fusion of vision/language modalities