Distances Between Sets Based on Set Commonality

Kathy Horadam and Michael Nyblom, RMIT University, Melbourne, Australia. Australian and New Zealand Mathematics Convention, Melbourne, 10 December 2014.


SLIDE 1

Distances Between Sets Based on Set Commonality

Kathy Horadam and Michael Nyblom

RMIT University, Melbourne, Australia

Australian and New Zealand Mathematics Convention, Melbourne 10 December, 2014

Horadam

SLIDE 2

Outline

  1. Motivation
       • Biometric Graph Matching
       • Scoring functions

  2. Minkowski-type Metrics for Sets
       • Minkowski-type metrics for sets
       • Normalising the Minkowski-type metrics for sets

SLIDE 4

Background

  • A novel area for me, probably a once-off!
  • Work arises from a problem in biometric matching.
  • Biometric matching: personal identification through "what you are" (fingerprint, face), not "what you know" (PIN, password) or "what you carry" (token, smartcard).

SLIDE 5

Biometric matching

  • Biometric samples from an individual vary each time they are presented.
  • Need error-tolerant or fuzzy matching of biometric features to authenticate.
  • But not too error-tolerant or fuzzy, or an imposter can be authenticated as you!

SLIDE 6

Example: Features of Retina biometric

Extracting graphs from biometric images.

[Figure: (a) a retina image and (b) its retina graph]

SLIDE 7

Comparing biometric graphs

  • Use an error-correcting graph matching algorithm to compare retina graphs $G$ and $G^*$.
  • The optimal edit path defines the Maximum Common Subgraph (mcs) $G \cap G^*$.
  • Can count many types of structural elements of each graph $G$, $G^*$, $G \cap G^*$; e.g. for vertices we count $|V|$, $|V^*|$ and $|V \cap V^*|$.
  • Many other subgraphs could be counted: # edges, # paths of length 2, # nodes of degree 3, # simple cycles, etc.

SLIDE 9

Scoring genuine samples against imposters

[Figure: density of DSQRT MATCH scores, Genuine vs Imposter]

Example: use $d_{sqrt}(G, G^*) = |V \cap V^*| / \sqrt{|V|\,|V^*|}$ as the scoring function.

$d_{sqrt}$ is a distance but NOT a metric (a metric is preferred).

SLIDE 10

What metrics are known for comparing sets?

A distance or dissimilarity $d : X \times X \to \mathbb{R}$ on a data set $X$ is a non-negative, symmetric and reflexive function.

  • It is normalised if $d(u, v) \le 1$ for all $u, v \in X$.
  • It is a metric if, for all $u, v, w \in X$, $d(u, v) = 0 \Leftrightarrow u = v$, and the triangle inequality holds: $d(u, v) \le d(u, w) + d(w, v)$.
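The difference between a distance and a metric can be checked by brute force on a small finite set. The sketch below is illustrative only: the helper names `is_distance` and `is_metric` are hypothetical, and "reflexive" is read here as d(u, u) = 0, which is an assumption about the slide's usage.

```python
from itertools import product

def is_distance(d, points, tol=1e-12):
    """Non-negative, symmetric and reflexive (read as d(u, u) = 0) on `points`."""
    pairs_ok = all(
        d(u, v) >= -tol and abs(d(u, v) - d(v, u)) <= tol
        for u, v in product(points, repeat=2)
    )
    reflexive = all(abs(d(u, u)) <= tol for u in points)
    return pairs_ok and reflexive

def is_metric(d, points, tol=1e-12):
    """A distance that also separates distinct points and obeys the triangle inequality."""
    if not is_distance(d, points, tol):
        return False
    separates = all(d(u, v) > tol for u, v in product(points, repeat=2) if u != v)
    triangle = all(
        d(u, v) <= d(u, w) + d(w, v) + tol
        for u, v, w in product(points, repeat=3)
    )
    return separates and triangle

points = [0, 1, 2, 3]
abs_diff = lambda u, v: abs(u - v)     # a metric on the integers
sq_diff = lambda u, v: (u - v) ** 2    # a distance only: d(0, 2) = 4 > d(0, 1) + d(1, 2) = 2

print(is_distance(abs_diff, points), is_metric(abs_diff, points))  # True True
print(is_distance(sq_diff, points), is_metric(sq_diff, points))    # True False
```

A check like this can only refute the metric property on the sample it is given; it does not prove it in general.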

SLIDE 11

What metrics are known for comparing sets?

Notation. $U \neq \emptyset$, $X$ = set of finite nonempty subsets of $U$. For $X_i, X_j, X_k \in X$, let $x_i = |X_i|$, $x_{ij} = |X_i \cap X_j|$, $x_{ijk} = |X_i \cap X_j \cap X_k|$. Let $m_{ij} = x_i - x_{ij}$, so $x_i > 0$ and $m_{ij} \ge 0$. Put

$$x_i = x_i^* + y_{ij} + y_{ik} + x_{ijk},$$

where $y_{ij} = y_{ji} = x_{ij} - x_{ijk}$, so $m_{ij} = x_i - x_{ij} = x_i^* + y_{ik}$.
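To make the notation concrete, the counts can be computed directly with Python sets. This is a minimal sketch; the function name `overlap_counts` and the example sets are made up for illustration, and $x_i^*$ is computed as the number of elements of $X_i$ lying in neither $X_j$ nor $X_k$, which is what the decomposition of $x_i$ implies.

```python
def overlap_counts(Xi, Xj, Xk):
    """Cardinalities from the notation slide, for a triple of finite sets."""
    x_i = len(Xi)
    x_ij = len(Xi & Xj)
    x_ik = len(Xi & Xk)
    x_ijk = len(Xi & Xj & Xk)
    return {
        "x_i": x_i,
        "x_ij": x_ij,
        "x_ijk": x_ijk,
        "m_ij": x_i - x_ij,            # elements of Xi not shared with Xj
        "y_ij": x_ij - x_ijk,          # shared with Xj but not with Xk
        "y_ik": x_ik - x_ijk,          # shared with Xk but not with Xj
        "x_i_star": len(Xi - Xj - Xk), # elements of Xi in neither Xj nor Xk
    }

# Hypothetical example sets
Xi, Xj, Xk = {1, 2, 3, 4, 5}, {4, 5, 6, 7}, {3, 5, 7, 8}
c = overlap_counts(Xi, Xj, Xk)
print(c)

# The two identities stated in the notation
assert c["x_i"] == c["x_i_star"] + c["y_ij"] + c["y_ik"] + c["x_ijk"]
assert c["m_ij"] == c["x_i_star"] + c["y_ik"]
```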

SLIDE 12

What metrics are known for comparing sets?

The following are known normalised metrics on $X$.

  • The Jaccard, or set-difference, metric
    $$d_{sd}(X_i, X_j) = \frac{m_{ij} + m_{ji}}{x_i + x_j - x_{ij}} = 1 - \frac{x_{ij}}{x_i + x_j - x_{ij}}.$$
  • The normalised maximum metric
    $$d_{max}(X_i, X_j) = \frac{\max\{m_{ij}, m_{ji}\}}{\max\{x_i, x_j\}} = 1 - \frac{x_{ij}}{\max\{x_i, x_j\}}.$$

CAN WE FIND MORE SET METRICS BASED ON SET COMMONALITY?
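Both metrics are short to compute from set sizes. The following sketch uses exact fractions so the two equivalent forms of each formula can be compared without rounding; the function names and the example sets are illustrative, not taken from the talk.

```python
from fractions import Fraction

def d_sd(A, B):
    """Jaccard (set-difference) metric: 1 - |A ∩ B| / |A ∪ B|."""
    common = len(A & B)
    return 1 - Fraction(common, len(A) + len(B) - common)

def d_max(A, B):
    """Normalised maximum metric: 1 - |A ∩ B| / max(|A|, |B|)."""
    return 1 - Fraction(len(A & B), max(len(A), len(B)))

# Hypothetical example: x_i = 5, x_j = 4, x_ij = 2, so m_ij = 3 and m_ji = 2
A, B = {1, 2, 3, 4, 5}, {4, 5, 6, 7}
print(d_sd(A, B))   # 5/7, i.e. (m_ij + m_ji) / (x_i + x_j - x_ij)
print(d_max(A, B))  # 3/5, i.e. max(m_ij, m_ji) / max(x_i, x_j)
```

Both functions assume nonempty inputs, matching the definition of $X$ as the set of finite nonempty subsets.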

SLIDE 13

Which measures separate genuine from imposter retinas best?

Score               Distance?   Metric?   NN
d_sqrt (good-ish)   ✓           ×          0.0486
d_sd (good-ish)     ✓           ✓          0.0154
d_max (ugly)        ✓           ✓         −0.0450

Comparison of nearest neighbour (NN) distances of vertex sets in the VARIA retina database for 3 scoring functions.

ARE THERE BETTER SCORING METRICS THAN d_max?

SLIDE 17

The Minkowski or p-norm metrics dp on Rn

These are defined for real $p \ge 1$ and dimension $n \ge 1$ to be

$$d_p((u_1, u_2, \dots, u_n), (v_1, v_2, \dots, v_n)) = \Big(\sum_{i=1}^{n} |u_i - v_i|^p\Big)^{1/p}.$$

  • $p = 1$: absolute value distance (taxicab, city-block or Manhattan distance).
  • $p = 2$: usual Euclidean distance.
  • $\lim_{p \to \infty} d_p = d_\infty$: infinity norm (Chebyshev) distance, max distance
    $$d_\infty((u_1, u_2, \dots, u_n), (v_1, v_2, \dots, v_n)) = \max\{|u_1 - v_1|, |u_2 - v_2|, \dots, |u_n - v_n|\}.$$
  • Varying $p$ changes the weight given to larger and smaller differences.
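A small sketch of the p-norm family may help show how the value moves from the taxicab distance towards the Chebyshev distance as p grows. The function names `minkowski` and `chebyshev` are chosen here for illustration.

```python
def minkowski(u, v, p):
    """Minkowski (p-norm) distance between equal-length real vectors, for p >= 1."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)

def chebyshev(u, v):
    """Infinity-norm distance: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(u, v))

u, v = (0, 0), (3, 4)
print(minkowski(u, v, 1))   # 7.0 (taxicab)
print(minkowski(u, v, 2))   # 5.0 (Euclidean)
print(minkowski(u, v, 50))  # already very close to the Chebyshev value
print(chebyshev(u, v))      # 4
```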

SLIDE 18

Eureka! Minkowski-type Metric Family for Sets

The following definition gives set-based metrics with analogous properties. For each $p \ge 1$, define $d_{2,p} : X \times X \to \mathbb{R}$ to be

$$d_{2,p}(X_i, X_j) = [m_{ij}^p + m_{ji}^p]^{1/p}.$$

Theorem

  1. $d_{2,p}$ is a metric;
  2. $d_{2,1} = d_{sd}$, $\lim_{p \to \infty} d_{2,p} = d_{max}$ and if $p < p'$, $d_{2,p} \ge d_{2,p'}$.

For sets, the Minkowski-type metric is a modification of the 2D real Minkowski metric:

$$d_{2,p}(X_i, X_j) = d_p((x_i, x_j), (x_{ij}, x_{ij})).$$

$d_{2,2}$ is analogous to the Euclidean metric $d_2$ in the plane.
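The link to the planar Minkowski metric can be checked numerically. In this sketch the names `d2p_raw` and `minkowski_plane` are hypothetical and the example sets are made up. Note that, with the formula exactly as written on this slide, p = 1 gives $m_{ij} + m_{ji}$ and large p approaches $\max\{m_{ij}, m_{ji}\}$, which are the numerators of the Jaccard and maximum metrics.

```python
import math

def m(A, B):
    """m_ij = |A| - |A ∩ B|: elements of A not shared with B."""
    return len(A) - len(A & B)

def d2p_raw(A, B, p):
    """Minkowski-type set distance as written on this slide (not normalised)."""
    return (m(A, B) ** p + m(B, A) ** p) ** (1 / p)

def minkowski_plane(u, v, p):
    """Ordinary p-norm distance between two points of the plane."""
    return (abs(u[0] - v[0]) ** p + abs(u[1] - v[1]) ** p) ** (1 / p)

# Hypothetical example: m_ij = 3, m_ji = 2, x_ij = 2
A, B = {1, 2, 3, 4, 5}, {4, 5, 6, 7}
x_i, x_j, x_ij = len(A), len(B), len(A & B)

for p in (1, 2, 4, 8):
    set_side = d2p_raw(A, B, p)
    plane_side = minkowski_plane((x_i, x_j), (x_ij, x_ij), p)
    assert math.isclose(set_side, plane_side)
    print(p, round(set_side, 4))  # values decrease as p grows
```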

SLIDE 20

Normalising the Minkowski-type Metric Family

Theorem. For each $p \ge 1$, define $d_{2,p}$ to be

$$d_{2,p}(X_i, X_j) = \frac{[m_{ij}^p + m_{ji}^p]^{1/p}}{x_{ij} + [m_{ij}^p + m_{ji}^p]^{1/p}}.$$

Then

  1. $d_{2,p}$ is a normalised metric on $X$;
  2. $d_{2,1} = d_{sd}$ (Jaccard metric);
  3. $\lim_{p \to \infty} d_{2,p} = d_{max}$ (maximum metric);
  4. $d_{2,p}(X_i, X_j) = 1 \Leftrightarrow X_i \cap X_j = \emptyset$;
  5. $d_{2,p}$ is monotone decreasing in $p$.
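Properties 2 to 5 can be spot-checked on one example. This is only a sketch under stated assumptions: the function name `d2p` and the example sets are illustrative, and a single example illustrates the theorem without proving it.

```python
import math

def d2p(A, B, p):
    """Normalised Minkowski-type set metric, for nonempty finite sets and p >= 1."""
    if not A or not B:
        raise ValueError("sets must be nonempty")
    common = len(A & B)
    n = ((len(A) - common) ** p + (len(B) - common) ** p) ** (1 / p)
    return n / (common + n)

# Hypothetical example: x_i = 5, x_j = 4, x_ij = 2
A, B = {1, 2, 3, 4, 5}, {4, 5, 6, 7}

jaccard = 1 - len(A & B) / len(A | B)
maximum = 1 - len(A & B) / max(len(A), len(B))

print(math.isclose(d2p(A, B, 1), jaccard))      # property 2: p = 1 gives Jaccard
print(abs(d2p(A, B, 200) - maximum) < 1e-9)     # property 3: large p approaches d_max
print(d2p({1, 2}, {3}, 2))                      # property 4: disjoint sets give 1.0
print([round(d2p(A, B, p), 4) for p in (1, 2, 4, 8)])  # property 5: decreasing in p
```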

SLIDE 21

Normalising the Minkowski-type Metric Family

Lemma (Reduction Lemma). The triangle inequality holds for $X_i, X_j, X_k$ if and only if it holds for the subsets $X_i' \subseteq X_i$, $X_j' \subseteq X_j$ and $X_k' \subseteq X_k$, where

$$X_i' = [X_i \setminus (X_i \cap X_j)] \cup (X_i \cap X_j \cap X_k),$$

$$X_j' = [X_j \setminus (X_i \cap X_j)] \cup (X_i \cap X_j \cap X_k) \quad \text{similarly, and}$$

$$X_k' = (X_i \cap X_k) \cup (X_j \cap X_k).$$

(only if) $d_{2,p}(X_i, X_j) \le d_{2,p}(X_i', X_j')$ since

$$[1 + x_{ij}/(m_{ij}^p + m_{ji}^p)^{1/p}]^{-1} \le [1 + x_{ijk}/(m_{ij}^p + m_{ji}^p)^{1/p}]^{-1},$$

and $d_{2,p}(X_i', X_k') + d_{2,p}(X_j', X_k') \le d_{2,p}(X_i, X_k) + d_{2,p}(X_j, X_k)$ since

$$[1 + x_{ik}/((x_i^*)^p + y_{jk}^p)^{1/p}]^{-1} \le [1 + x_{ik}/(m_{ik}^p + m_{ki}^p)^{1/p}]^{-1};$$

and by symmetry

$$[1 + x_{jk}/((x_j^*)^p + y_{ik}^p)^{1/p}]^{-1} \le [1 + x_{jk}/(m_{jk}^p + m_{kj}^p)^{1/p}]^{-1}.$$
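The reduction can be traced on a concrete triple. The sketch below builds the reduced sets and checks the three comparisons used in the lemma for p = 2. The names `d2p` and `reduce_triple` and the example sets are hypothetical, and the code assumes the reduced sets are nonempty, which holds for this example but not for every triple.

```python
def d2p(A, B, p):
    """Normalised Minkowski-type set metric (assumes nonempty sets, p >= 1)."""
    common = len(A & B)
    n = ((len(A) - common) ** p + (len(B) - common) ** p) ** (1 / p)
    return n / (common + n)

def reduce_triple(Xi, Xj, Xk):
    """Build the reduced sets X_i', X_j', X_k' of the Reduction Lemma."""
    triple = Xi & Xj & Xk
    Xi_r = (Xi - (Xi & Xj)) | triple
    Xj_r = (Xj - (Xi & Xj)) | triple
    Xk_r = (Xi & Xk) | (Xj & Xk)
    return Xi_r, Xj_r, Xk_r

# Hypothetical example triple
Xi, Xj, Xk = {1, 2, 3, 4, 5}, {4, 5, 6, 7}, {3, 5, 7, 8}
Xi_r, Xj_r, Xk_r = reduce_triple(Xi, Xj, Xk)
print(sorted(Xi_r), sorted(Xj_r), sorted(Xk_r))  # [1, 2, 3, 5] [5, 6, 7] [3, 5, 7]

p = 2
# The left-hand side of the triangle inequality can only grow under reduction
assert d2p(Xi, Xj, p) <= d2p(Xi_r, Xj_r, p)
# Each term on the right-hand side can only shrink under reduction
assert d2p(Xi_r, Xk_r, p) <= d2p(Xi, Xk, p)
assert d2p(Xj_r, Xk_r, p) <= d2p(Xj, Xk, p)
```

So if the triangle inequality holds for the reduced triple, it is inherited by the original one.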

SLIDE 22

Normalising the Minkowski-type Metric Family

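Proof. We want to prove (WTP) $d_{2,p}(X_i', X_j') \le d_{2,p}(X_i', X_k') + d_{2,p}(X_j', X_k')$.

Set $u = ((x_i^*)^p + y_{jk}^p)^{1/p}$, $v = ((x_j^*)^p + y_{ik}^p)^{1/p}$, $w = (m_{ij}^p + m_{ji}^p)^{1/p}$. Since $(m_{ij}, m_{ji}) = (x_i^*, y_{jk}) + (y_{ik}, x_j^*)$, Minkowski's inequality gives $w \le u + v$. We need to show that

$$\frac{u}{u + y_{ik} + x_{ijk}} + \frac{v}{v + y_{jk} + x_{ijk}} \ge \frac{w}{w + x_{ijk}}.$$

Since $y_{ik} \le v$ and $y_{jk} \le u$, the LHS is at least

$$\frac{u}{u + v + x_{ijk}} + \frac{v}{v + u + x_{ijk}} = \frac{u + v}{u + v + x_{ijk}}.$$

Since the function $t \mapsto \frac{t}{t + x_{ijk}}$ is increasing in $t$, the LHS is $\ge \frac{w}{w + x_{ijk}}$ as required.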
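As a sanity check on the result, the triangle inequality can be tested exhaustively over all nonempty subsets of a small universe. This is a numerical illustration under assumptions (a 4-element universe, a handful of p values, and a small floating-point tolerance), not a substitute for the proof; the helper names are hypothetical.

```python
from itertools import combinations, product

def d2p(A, B, p):
    """Normalised Minkowski-type set metric (assumes nonempty sets, p >= 1)."""
    common = len(A & B)
    n = ((len(A) - common) ** p + (len(B) - common) ** p) ** (1 / p)
    return n / (common + n)

def nonempty_subsets(universe):
    """All nonempty subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]

def triangle_violations(subsets, p, tol=1e-12):
    """Triples (A, B, C) with d(A, B) > d(A, C) + d(C, B), beyond rounding error."""
    return [
        (A, B, C)
        for A, B, C in product(subsets, repeat=3)
        if d2p(A, B, p) > d2p(A, C, p) + d2p(C, B, p) + tol
    ]

subsets = nonempty_subsets(range(4))   # 15 subsets, 3375 ordered triples
for p in (1, 1.5, 2, 3, 10):
    print(p, len(triangle_violations(subsets, p)))  # the theorem predicts no violations
```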

SLIDE 23

Reprise: Which measures separate genuine from imposter best?

Score       Distance?   Metric?   NN
d_sqrt      ✓           ×          0.0486
d_av        ✓           ×          0.0232
d_sd        ✓           ✓          0.0154
• d_{2,2}   ✓           ×         −0.0007
d_{2,2}     ✓           ✓         −0.0022
d_min       ×           ×         −0.0139
d_max       ✓           ✓         −0.0450

Comparison of nearest neighbour (NN) distances of vertex sets in the VARIA retina database for 7 scoring functions.

SLIDE 24

THANK YOU. QUESTIONS?

References:

  1. K.J. Horadam and M.A. Nyblom, Distances between sets based on set commonality, Discrete Applied Mathematics 167 (2014) 310–314.
  2. J. Jeffers, S.A. Davis and K.J. Horadam, Estimating individuality in feature point based retina templates, 5th IAPR International Conference on Biometrics (ICB), IEEE (2012) 454–459.