SLIDE 1

Multimodal Biometrics
Josef Kittler

Centre for Vision, Speech and Signal Processing
University of Surrey, Guildford GU2 7XH
J.Kittler@surrey.ac.uk

Acknowledgements: Dr Norman Poh

SLIDE 2

Biometric authentication and Performance characterisation

• False rejection
• False acceptance
• Total error rate/Half total error rate
• Operating point
  – Equal error rate (civilian)
  – Zero false acceptance (high security forensic)
  – Zero false rejection (low risk banking)

[Figure: authentication pipeline; labels: MFCC, GMM]
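The error measures on this slide can be sketched in a few lines. This is a minimal illustration, assuming a claim is accepted when its score is at or above the threshold; the function names and the score values are hypothetical and not taken from the slides.

```python
def error_rates(client_scores, impostor_scores, threshold):
    """Return (FAR, FRR, HTER); a claim is accepted when score >= threshold."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in client_scores) / len(client_scores)
    return far, frr, (far + frr) / 2.0


def equal_error_threshold(client_scores, impostor_scores):
    """Return the observed score at which |FAR - FRR| is smallest."""
    def gap(t):
        far, frr, _ = error_rates(client_scores, impostor_scores, t)
        return abs(far - frr)
    return min(sorted(set(client_scores) | set(impostor_scores)), key=gap)


# Hypothetical scores, for illustration only.
clients = [0.9, 0.8, 0.75, 0.6, 0.4]
impostors = [0.7, 0.5, 0.3, 0.2, 0.1]

far, frr, hter = error_rates(clients, impostors, threshold=0.55)
eer_threshold = equal_error_threshold(clients, impostors)
print(far, frr, hter, eer_threshold)
```

Moving the threshold up trades false acceptances for false rejections, which is how the zero false acceptance and zero false rejection operating points are reached.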

SLIDE 3

Multimodal biometrics

Different biometric modalities developed

– fingerprint
– iris
– face (2D, 3D)
– voice
– hand
– lip dynamics
– gait

Different traits, different properties

usability
acceptability
performance
robustness in changing environment
reliability
applicability (different scenarios)

SLIDE 4

Benefits of multimodality

• Motivation for multiple biometrics
  – To enhance performance
  – To increase population coverage by reducing the failure-to-enroll rate
  – To improve resilience to spoofing
  – To permit choice of biometric modality for authentication
  – To extend the range of environmental conditions under which authentication can be performed

SLIDE 5

OUTLINE

• Fusion architectures
• Score level fusion: Problem formulation
• Estimation error
• Multiple expert paradigm
• Quality based fusion of biometric modalities
• Discussion and conclusions

SLIDE 6

Fusion architectures

• Integration of multiple biometric modalities
• Sensor (data) level fusion
  – Linear/nonlinear combination of registered variables
  – Representation space augmentation
• Feature level fusion
• Soft decision level fusion
• Decision level fusion

SLIDE 7

Decision level fusion

[Figure: decision level fusion pipeline. Feature extractors: PCA, LDA, MFCC, PLP, DCT. Classifiers: GMM, MLP, MSE, GMM, HMM. Legend: data, features, score, threshold]

SLIDE 8

Decision-level fusion

• How useful?

[Figure: client and impostor scores plotted as score modality 1 against score modality 2, with thresholds T1 and T2]

SLIDE 9

Decision-level fusion

• Accepted by either modality

[Figure: client and impostor scores plotted as score modality 1 against score modality 2, with thresholds T1 and T2]

SLIDE 10

Decision-level fusion

• Accepted by both

[Figure: client and impostor scores plotted as score modality 1 against score modality 2, with thresholds T1 and T2]

SLIDE 11

Decision-level fusion

[Figure: client and impostor scores plotted as score modality 1 against score modality 2]

Better performance by adapting the thresholds
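The "accepted by either" and "accepted by both" rules from the preceding slides can be sketched as an OR and an AND over two thresholded decisions. The score pairs and thresholds below are hypothetical, chosen only to show the trade-off.

```python
def accept(score, threshold):
    """Hard decision of one modality."""
    return score >= threshold


def fuse_or(s1, s2, t1, t2):
    """Accepted by either modality."""
    return accept(s1, t1) or accept(s2, t2)


def fuse_and(s1, s2, t1, t2):
    """Accepted by both modalities."""
    return accept(s1, t1) and accept(s2, t2)


def rates(rule, clients, impostors, t1, t2):
    """Return (FAR, FRR) of a fusion rule on (score1, score2) pairs."""
    frr = sum(not rule(s1, s2, t1, t2) for s1, s2 in clients) / len(clients)
    far = sum(rule(s1, s2, t1, t2) for s1, s2 in impostors) / len(impostors)
    return far, frr


# Hypothetical (modality 1, modality 2) score pairs.
clients = [(0.9, 0.8), (0.7, 0.3), (0.4, 0.9), (0.8, 0.7)]
impostors = [(0.2, 0.1), (0.6, 0.2), (0.3, 0.7), (0.1, 0.3)]

or_far, or_frr = rates(fuse_or, clients, impostors, 0.5, 0.5)
and_far, and_frr = rates(fuse_and, clients, impostors, 0.5, 0.5)
print("OR :", or_far, or_frr)
print("AND:", and_far, and_frr)
```

On this toy data the OR rule removes false rejections at the cost of more false acceptances, and the AND rule does the opposite, which is why adapting T1 and T2 to the chosen rule matters.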

SLIDE 12

Score-level fusion

• Should improve performance

[Figure: client and impostor scores plotted as score modality 1 against score modality 2]

SLIDE 13

Levels of Fusion

[Figure: pipeline showing data fusion, feature fusion and score fusion; annotation: less information to deal with at later levels. Feature extractors: PCA, LDA, MFCC, PLP, DCT. Classifiers: GMM, MLP, MSE, GMM, HMM. Legend: score, threshold]

SLIDE 14

Data level fusion

[Figure: pipeline with the data fusion stage highlighted; annotation: less information to deal with. Legend: score, threshold]

SLIDE 15

Feature level fusion

[Figure: pipeline with the feature fusion stage highlighted; annotation: less information to deal with. Legend: score, threshold]

SLIDE 16

Score level fusion

[Figure: pipeline with the score fusion stage highlighted. Legend: score, threshold]

SLIDE 17

Biometric system

Pattern representation
Pattern recognition problem

N - number of classes
b - biometric trait
x - feature vector
$P(\omega_i)$ - a priori probability of class $\omega_i$
$p(x \mid \omega_i)$ - measurement distribution of patterns in class $\omega_i$

SLIDE 18

Bayesian decision making

[Figure: a posteriori class probabilities $P(\omega_1 \mid b_k)$, $P(\omega_2 \mid b_k)$, $P(\omega_3 \mid b_k)$ at $x_k$]

Bayes minimum error rule

SLIDE 19

Problem formulation

• Given biometric traits $b_1, \dots, b_K$
• Bayes decision rule
  – Assign subject to class $\omega_j$ if
    $P(\omega_j \mid b_1, \dots, b_K) = \max_i P(\omega_i \mid b_1, \dots, b_K)$
• Note

SLIDE 20

Fusion options

• The integration over x is marginalisation over the distribution
• x is a feature vector determined by all traits
• Implicitly a multiple classifier fusion
  – Bagging, boosting, dropout, hard sample mining
• Marginalised estimate of class posterior

SLIDE 21

Fusion options

• Feature level fusion
  – Each modality has its own set of features $x_i$
  – Score is a function of all $x_i$ jointly
  – Fusion process marginalisation is over the joint distribution of all modalities
  – In addition, there could be modality specific marginalisation at the feature extraction level

SLIDE 22

Fusion options

• Score level fusion
  – Each modality has its own set of features $x_i$
  – The fused score is a product of individual modality specific scores
  – Fusion process marginalisation is over modality specific distributions

SLIDE 23

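Problem formulation: comments

• basic score level fusion is by product
• product can be approximated by a sum if the posterior $P(\omega_j \mid x_i)$ does not deviate much from the prior $P(\omega_j)$, i.e. $P(\omega_j \mid x_i) = P(\omega_j)(1 + \delta_{ji})$ with $\delta_{ji} \ll 1$
• the resulting decision rule becomes the sum rule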
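The product and sum rules can be compared numerically. The sketch below assumes R modalities that are conditionally independent given the class, so the fused support for class j is proportional to P(ω_j)^(1-R) times the product of the per-modality posteriors, and that the sum rule is its first-order approximation. The function names and the posterior values are hypothetical.

```python
import math


def product_rule(posteriors, priors):
    """Support per class: P(w_j)^(1-R) * prod_i P(w_j | x_i)."""
    R = len(posteriors)
    return [priors[j] ** (1 - R) * math.prod(p[j] for p in posteriors)
            for j in range(len(priors))]


def sum_rule(posteriors, priors):
    """Support per class: (1-R) * P(w_j) + sum_i P(w_j | x_i)."""
    R = len(posteriors)
    return [(1 - R) * priors[j] + sum(p[j] for p in posteriors)
            for j in range(len(priors))]


def argmax(values):
    return max(range(len(values)), key=lambda j: values[j])


# Hypothetical posteriors from three modalities for two classes;
# each stays close to the prior, as the sum approximation requires.
priors = [0.5, 0.5]
posteriors = [[0.60, 0.40], [0.55, 0.45], [0.45, 0.55]]

prod_support = product_rule(posteriors, priors)
sum_support = sum_rule(posteriors, priors)
print(prod_support, argmax(prod_support))
print(sum_support, argmax(sum_support))
```

With posteriors this close to the priors, the two rules give nearly the same support values and the same decision; the gap grows as the posteriors move away from the priors.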

SLIDE 24

Fusion options

• Decision level fusion
  – Builds on score level fusion
  – Different fusion rules (rank, vote, etc.)
• Example: Vote fusion
  – Each modality produces a hard decision
  – Count the modalities outputting each class
  – Final decision: the class with the most votes
  – In a two class case, a hard decision is made by comparing the score against a threshold
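Vote fusion in the two class case can be sketched as follows. The class labels, scores and thresholds are hypothetical; an odd number of modalities is assumed so that ties cannot occur.

```python
from collections import Counter


def hard_decision(score, threshold):
    """Two class hard decision of a single modality."""
    return "client" if score >= threshold else "impostor"


def vote_fusion(scores, thresholds):
    """Count the hard decisions and return (winning class, vote counts)."""
    votes = Counter(hard_decision(s, t) for s, t in zip(scores, thresholds))
    winner = votes.most_common(1)[0][0]
    return winner, dict(votes)


# Hypothetical scores from three modalities, each with its own threshold.
decision, counts = vote_fusion([0.8, 0.4, 0.6], [0.5, 0.5, 0.5])
print(decision, counts)
```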

SLIDE 25

Fixed fusion strategies

SLIDE 26

Effect of estimation errors

[Figure: a posteriori class probabilities $P(\omega_1 \mid x_k)$ and $P(\omega_2 \mid x_k)$ at $x_k$, the margin between them, and the estimation error distribution]

SLIDE 27

Sources of estimation errors

Feature vector output by sensor i
Training set for the i-th expert
Classifier model
Distribution of models
Parameters for expert i
Distribution of expert i parameters

SLIDE 28

Coping with estimation errors

[Figure: a posteriori class probabilities $P(\omega_1 \mid x_k)$ and $P(\omega_2 \mid x_k)$ at $x_k$, the margin between them, and the estimation error distribution]

Reducing the variance

SLIDE 29

Variance reduction

• Consider a vector of normalised scores
• with mean
• and covariance matrix

SLIDE 30

Variance reduction

• Fuse scores by
• Average class conditional variance
• Variance of fused score

SLIDE 31

Variance reduction

• Rearranging
• Variance can be bounded
• For uncorrelated scores, variance reduces by a factor of R
• For negatively correlated scores, variance can be brought to zero
• For negatively correlated scores the variance drops most when
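The effect of correlation on the fused variance can be checked numerically. This sketch assumes the scores are fused by simple averaging, so the fused variance is the sum of all covariance entries divided by R squared, and it uses a simplified covariance with equal variances and one common correlation; both choices are assumptions made for illustration.

```python
def fused_variance(cov):
    """Variance of the average of R scores with covariance matrix cov."""
    R = len(cov)
    return sum(cov[i][j] for i in range(R) for j in range(R)) / R ** 2


def equicorrelated_cov(R, sigma2, rho):
    """Covariance with variance sigma2 on the diagonal and rho * sigma2 elsewhere."""
    return [[sigma2 if i == j else rho * sigma2 for j in range(R)]
            for i in range(R)]


R = 4
var_uncorrelated = fused_variance(equicorrelated_cov(R, 1.0, 0.0))
var_positive = fused_variance(equicorrelated_cov(R, 1.0, 0.5))
var_negative = fused_variance(equicorrelated_cov(R, 1.0, -1.0 / (R - 1)))
print(var_uncorrelated, var_positive, var_negative)
```

In this equicorrelated setting the fused variance equals sigma2 * (1 + (R - 1) * rho) / R, so it is sigma2 / R for uncorrelated scores and reaches zero at rho = -1 / (R - 1).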

SLIDE 32

Biometric Personal Identity Authentication

[Figure: fusion of face and voice; panels: VOICE, FACE]

SLIDE 33

Performance of individual and fused experts

Toy example

Modalities    FAR    FRR    HTER
Face          1.75   2.00   1.88
Voice         1.47   1.00   1.23
Fusion SVM    0.32   0.25   0.28
Fusion MLP    0.34   0.25   0.29

SLIDE 34

Merits of multimodal fusion

SLIDE 35

Fusion strategies

• simple rules (sum, product, max, min, rank)
• trained fusion rule (logistic regression, decision templates, sparse based representation, SVM, deep architectures)
• multistage systems (stacking)
• machine learning tools
  – Separability measures
  – Feature selection
  – Clustering
  – Distance metric
  – Classification
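As an illustration of a trained fusion rule, the sketch below fits a logistic regression on score vectors by plain batch gradient descent. It uses only the standard library; the training scores, learning rate and number of epochs are hypothetical choices, and a practical system would use a tested library and held-out data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fused_probability(w, b, scores):
    """Fused output in [0, 1] for one vector of modality scores."""
    return sigmoid(b + sum(wi * si for wi, si in zip(w, scores)))


def train_logistic_fusion(score_vectors, labels, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    n, dim = len(score_vectors), len(score_vectors[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for s, y in zip(score_vectors, labels):
            err = fused_probability(w, b, s) - y
            grad_b += err
            for k in range(dim):
                grad_w[k] += err * s[k]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical (face, voice) scores: label 1 = client, 0 = impostor.
X = [(0.9, 0.8), (0.7, 0.6), (0.8, 0.4), (0.6, 0.9),
     (0.2, 0.3), (0.4, 0.1), (0.3, 0.5), (0.1, 0.2)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train_logistic_fusion(X, y)
fused = [fused_probability(w, b, s) for s in X]
print(w, b)
```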

SLIDE 36

Direct score fusion: score normalisation

• A posteriori class probabilities are automatically normalised to [0,1]
• Some systems compute a matching score, rather than an a posteriori probability estimate
• Scores have to be normalised to facilitate fusion by simple rules

SLIDE 37

Score normalisation (cont)

• Motivation for score normalisation
  – Non-homogeneous scores (distance, similarity)
  – Different ranges
  – Different distributions
• Desirable properties
  – Robustness
  – Efficiency
• Most effective methods
  – Nonlinear mapping with saturation for very large/small scores
  – Increased sensitivity near the boundaries (Ross and Jain)

SLIDE 38

Score normalisation (cont)

• Min-max
• Scaling
• Z-score
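Two of the listed normalisations, min-max and z-score, can be sketched with the standard library. The formulas used are the usual textbook definitions, which is an assumption since the slide's own equations are not available; the raw scores are hypothetical.

```python
import statistics


def min_max(scores):
    """Map scores linearly onto [0, 1] using the observed minimum and maximum."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)
    return [(s - mu) / sigma for s in scores]


raw = [2.0, 4.0, 6.0, 8.0]  # hypothetical matching scores
mm = min_max(raw)
zs = z_score(raw)
print(mm)
print(zs)
```

In practice the minimum, maximum, mean and standard deviation would be estimated on training data and then applied to new scores. Min-max is sensitive to outliers because it depends on the two extreme values.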

SLIDE 39

Information sources

[Figure: information sources for score normalisation. Labels: quality-based normalization; cohort-based normalization (online); changing signal quality (twice); client/user-specific normalization (offline); user-dependent score characteristics]

SLIDE 40

Confidence-based Fusion Algorithms

[Figure: fusion pipeline with face quality detectors and speech quality detectors feeding the fusion stage. Feature extractors: PCA, LDA, MFCC, PLP, DCT. Classifiers: GMM, MLP, MSE, GMM, HMM]

SLIDE 41

Generative & Discriminative Approaches in QDF

Approach                              Example
Generative                            e.g. GMM
Discriminative (probability-based)    e.g. MLP, logistic regression
Discriminative (function-based)       e.g. SVM, MLP

[Legend: algorithm used in experiments]

x and q are vectors
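One very simple way to let quality measures influence fusion is a quality-weighted sum of scores. This is only an illustrative sketch and not the algorithm used in the experiments; the function name, the scalar quality values in [0, 1] and the scores are all hypothetical.

```python
def quality_weighted_sum(scores, qualities):
    """Average the scores with weights proportional to each modality's quality."""
    total = sum(qualities)
    if total == 0:
        raise ValueError("at least one modality must have non-zero quality")
    return sum(q * s for s, q in zip(scores, qualities)) / total


# Hypothetical case: a poorly lit face image (low quality) and clean speech.
scores = [0.3, 0.9]      # face score, voice score
qualities = [0.2, 0.8]   # face quality, speech quality

fused_quality = quality_weighted_sum(scores, qualities)
fused_plain = sum(scores) / len(scores)
print(fused_quality, fused_plain)
```

Here the unreliable face score is down-weighted, so the fused score moves towards the voice score compared with the plain average.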

SLIDE 42

Case study in multimodal soft biometric fusion

• Multimodal biometric traits
• Multimodal sensing of the same biometric trait
  – Different spectral bands
  – Voice/image sensed lip dynamics
  – Visual/language modalities for person re-identification

SLIDE 43

Background and motivation

• Video surveillance is a very important tool for crime prevention and detection
  – Watch list
  – Forensic video analysis
• Hard biometrics (face) not always available
• Other video analytics tools are useful alternatives
  – Soft biometrics (clothing, gait)
  – Tracking

SLIDE 44

Soft biometrics and re-identification

• Person Re-Identification
  – Recognising a person from non-overlapping cameras
  – Formulated as a ranking problem

SLIDE 45

Re-ID with V&L

• The majority of existing methods are vision only
  – Images or videos
• Joint vision and language modelling
  – Image and video captioning, visual question answering, image synthesis from language, …
• Can language help vision in Re-ID?

SLIDE 46

Language annotation

• Augmenting existing datasets
  – CUHK03: ~2700 descriptions
  – VIPeR: ~1300 descriptions
• Crowd-sourced, 8 annotators
• Annotation
  – Free style sentences, not attributes
  – Encouraged to cover details
  – On average 45 words per description
  – Per image rather than per identity

SLIDE 47

Language annotation

A front profile of a young, slim and average height, black female with long brown hair. She wears sunglasses and possibly earrings and necklace. She wears a brown t-shirt with a golden coloured print on its chest, blue jeans and white sports shoes.

A short and slim young woman carrying a tortilla coloured rectangular shoulder bag with caramel straps, on her right side. She has a light complexion and long, straight auburn hair worn loose. She wears a dark brown short sleeved top along with bell bottomed ice blue jeans and her shoes can’t be seen but she might be wearing light colored flat shoes.

SLIDE 48

Re-ID with language

• ResNet-50 for visual information
• Word2Vec embedding
• Neural networks: CNN and LSTM
• Multi-class setting, 2 examples per class (identity)
• Data augmentation
• Metric learning with learnt representations (XQDA)
• Canonical correlation analysis

SLIDE 49

Re-ID with language

Detecting the concept of “spectacles”
“bespectacled”, ”glasses”, “eye-glasses”, …
GT, CNN, LSTM
One channel becomes a “spectacles” detector during training

Good representation learnt from unstructured data

SLIDE 50

Canonical correlation analysis

• Consider features x and y extracted from two biometric modalities
• Basic principle: find directions in the respective feature spaces that yield maximum correlation
  – Gauging the linear relationship between two multidimensional random variables (feature vectors of two biometric modalities)
  – Finding two sets of basis vectors such that the correlation between the projections of the feature vectors onto these bases is maximised
  – Determine correlation coefficients

SLIDE 51

CCA problem formulation

• Training set of pairs of vectors
• Maximisation of the correlation of the projections
• Leads to an eigenvalue problem
• With cov matrices regularised by
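The four steps of the CCA formulation can be written out in the standard textbook form. The notation ($C_{xx}$, $C_{yy}$, $C_{xy}$, $w_x$, $w_y$) and the ridge-style regularisation with constants $r_x$, $r_y$ are assumptions; the slide's own symbols and regulariser are not available.

```latex
% Training set of n centred pairs of vectors
\{(x_i, y_i)\}_{i=1}^{n}, \qquad
C_{xx} = \tfrac{1}{n}\sum_i x_i x_i^{\top}, \quad
C_{yy} = \tfrac{1}{n}\sum_i y_i y_i^{\top}, \quad
C_{xy} = \tfrac{1}{n}\sum_i x_i y_i^{\top} = C_{yx}^{\top}

% Maximise the correlation of the projections w_x^T x and w_y^T y
\rho = \max_{w_x, w_y}
\frac{w_x^{\top} C_{xy} w_y}
     {\sqrt{w_x^{\top} C_{xx} w_x}\,\sqrt{w_y^{\top} C_{yy} w_y}}

% Resulting eigenvalue problem
C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx}\, w_x = \rho^{2} w_x,
\qquad w_y \propto C_{yy}^{-1} C_{yx}\, w_x

% Regularised covariance matrices (assumed ridge form)
C_{xx} \leftarrow C_{xx} + r_x I, \qquad
C_{yy} \leftarrow C_{yy} + r_y I
```

The regularisation keeps the covariance matrices invertible when the feature dimension is large relative to the number of training pairs.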

SLIDE 52

Re-ID with V&L

• Three sets:
  – Training, query, gallery
  – Training: image and language pairs
• Various settings, query x gallery:
  – V x V, L x L, V x L, V x VL, VL x VL
• Asymmetric settings:
  – Transfer language info with CCA
• XQDA as metric learning

SLIDE 53

Re-ID with V&L

Results on CUHK03, R1, R5, R10
LxL worse than VxV: more information in vision
VxVL 3.2 points higher than VxV
VLxVL 12.7 points higher than VxV, better than state-of-the-art

Language helps
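Re-ID results of this kind are usually reported as rank-k matching rates. The sketch below assumes that R1, R5 and R10 denote rank-1, rank-5 and rank-10 rates, and computes them from a query-by-gallery distance matrix; the identities and distances are hypothetical.

```python
def rank_k_accuracy(dist, query_ids, gallery_ids, k):
    """Fraction of queries whose true identity is among the k nearest gallery items."""
    hits = 0
    for q, row in enumerate(dist):
        # Gallery indices sorted from smallest to largest distance.
        ranked = sorted(range(len(row)), key=lambda g: row[g])
        top_ids = [gallery_ids[g] for g in ranked[:k]]
        hits += query_ids[q] in top_ids
    return hits / len(dist)


# Hypothetical distances: rows are queries, columns are gallery items.
gallery_ids = ["a", "b", "c", "d"]
query_ids = ["a", "b", "c"]
dist = [
    [0.1, 0.5, 0.9, 0.7],
    [0.4, 0.3, 0.8, 0.6],
    [0.2, 0.6, 0.5, 0.3],
]

r1 = rank_k_accuracy(dist, query_ids, gallery_ids, 1)
r3 = rank_k_accuracy(dist, query_ids, gallery_ids, 3)
print(r1, r3)
```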

SLIDE 54

Person Re-ID

• Crossmodal & multimodal matching facilitated by CCA
• Performance gain due to
  – Joint training
  – Fusion of modalities

A tall, slim man, probably an Asian in his early twenties. He has dark short hair. He is wearing spectacles and he is holding something in his hand, probably a letter or envelope cover. He is wearing a multi-colour polo t-shirt with blue, white, black and red stripes on it. He is wearing a pair of dark colour pants and brown shoes.

[Figure: SE-ResNet based vision model (50 layers, [3 x 3] kernel) and SE-ResNet based language model (50 layers, [1 x 2] kernel), followed by joint CCA embedding space learning. Labels: fimg, ftxt, LID (vision), LID (text)]

SLIDE 55

Take home message

• Role of multimodal biometrics
• Fusion levels
• Math formulation of different alternatives
• The concept of marginalisation/multiple classifier systems
• Notion of quality based, user specific and cohort based extensions of fusion
• Multimodal sensing and fusion of a single biometric
  – Example: fusion of vision/language modalities for soft biometrics
