

SLIDE 1

Classification with mixtures of curved Mahalanobis metrics

— or LMNN in Cayley-Klein geometries —

arXiv:1609.07082

Frank Nielsen 1,2, Boris Muzellec 1, Richard Nock 3,4,5

1 Ecole Polytechnique, France; 2 Sony CSL, Japan; 3 Data61, Australia; 4 ANU, Australia; 5 The University of Sydney, Australia

26th September 2016

SLIDE 2

Mahalanobis distances

◮ For Q ≻ 0, a symmetric positive-definite matrix (e.g., a covariance matrix), define the Mahalanobis distance:

DQ(p, q) = √((p − q)⊤Q(p − q))

It is a metric distance (identity of indiscernibles, symmetry, triangle inequality). E.g., Q = precision matrix Σ^{−1}, where Σ is the covariance matrix.

◮ Generalizes the Euclidean distance, obtained for Q = I: DI(p, q) = ‖p − q‖

◮ The Mahalanobis distance is interpreted as a Euclidean distance after the Cholesky decomposition Q = LL⊤ and the affine transformation x′ ← L⊤x:

DQ(p, q) = DI(L⊤p, L⊤q) = ‖p′ − q′‖
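As a quick sanity check (a minimal numpy sketch, not part of the slides), the Cholesky route gives the same value as the direct quadratic form:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = rng.normal(size=3), rng.normal(size=3)

# A symmetric positive-definite Q (A A^T + I for good conditioning)
A = rng.normal(size=(3, 3))
Q = A @ A.T + np.eye(3)

# Direct quadratic form: D_Q(p, q) = sqrt((p - q)^T Q (p - q))
d_direct = np.sqrt((p - q) @ Q @ (p - q))

# Cholesky Q = L L^T, then D_Q(p, q) = ||L^T p - L^T q||
L = np.linalg.cholesky(Q)
d_chol = np.linalg.norm(L.T @ p - L.T @ q)

assert np.isclose(d_direct, d_chol)
print(d_direct)
```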

SLIDE 3

Generalizing Mahalanobis distances with Cayley-Klein projective geometries + Learning in Cayley-Klein spaces

SLIDE 4

Cayley-Klein geometry: Projective geometry [7, 3]

◮ RP^d: (λx, λ) ∼ (x, 1). Homogeneous coordinates x → x̃ = (x, w = 1), and dehomogenization by "perspective division" x̃ → x/w

◮ The cross-ratio measure is invariant under projectivities (homographies/collineations):

(p, q; P, Q) = ((p − P)(q − Q)) / ((p − Q)(q − P)), where p, q, P, Q are collinear

(Figure: four collinear points P, p, q, Q.)
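The invariance is easy to check numerically on the projective line (a small sketch; the particular homography coefficients below are arbitrary):

```python
from fractions import Fraction as F

def cross_ratio(p, q, P, Q):
    # (p, q; P, Q) = ((p - P)(q - Q)) / ((p - Q)(q - P)) for collinear (here scalar) points
    return ((p - P) * (q - Q)) / ((p - Q) * (q - P))

def homography(x, a, b, c, d):
    # Projectivity of the line: x -> (a x + b) / (c x + d), with ad - bc != 0
    return (a * x + b) / (c * x + d)

pts = [F(1), F(3), F(-2), F(7)]          # p, q, P, Q
a, b, c, d = F(2), F(1), F(1), F(4)      # ad - bc = 7 != 0

before = cross_ratio(*pts)
after = cross_ratio(*(homography(x, a, b, c, d) for x in pts))
assert before == after                   # exact equality with rational arithmetic
print(before)                            # -> 2/5
```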

SLIDE 5

Definition of Cayley-Klein geometries

A Cayley-Klein geometry is a triple K = (F, cdist, cangle):

1. A fundamental conic F
2. A unit constant cdist ∈ ℂ for measuring distances
3. A unit constant cangle ∈ ℂ for measuring angles

See the monograph [7].

SLIDE 6

Distance in Cayley-Klein geometries

dist(p, q) = cdist Log((p, q; P, Q)),

where P and Q are the intersection points of the line l = (pq) (l̃ = p̃ × q̃ in 2D) with the conic, and Log is the principal complex logarithm (modulo 2πi).

(Figure: line l through p and q meeting the conic F at P and Q.)
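For instance, with the unit circle as fundamental conic and cdist = 1/2 (a minimal sketch of the hyperbolic case; the constant is chosen to match the Klein disk model), the cross-ratio construction reproduces the classical Klein-disk distance:

```python
import math
import numpy as np

def ck_dist_unit_circle(p, q, c_dist=0.5):
    # P, Q: intersections of the line x(t) = p + t (q - p) with the unit circle
    d = q - p
    t1, t2 = np.roots([d @ d, 2 * p @ d, p @ p - 1.0])
    # Cross-ratio of p (t=0), q (t=1), P (t1), Q (t2) in the affine line parameter t
    cr = np.real(((0 - t1) * (1 - t2)) / ((0 - t2) * (1 - t1)))
    return c_dist * abs(math.log(cr))

p, q = np.array([0.0, 0.0]), np.array([0.5, 0.0])
d_cr = ck_dist_unit_circle(p, q)
# Klein-disk closed form, for comparison
d_klein = math.acosh((1 - p @ q) / math.sqrt((1 - p @ p) * (1 - q @ q)))
assert abs(d_cr - d_klein) < 1e-9
print(d_cr)  # ≈ 0.5493 = (1/2) log 3
```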

SLIDE 7

Key properties of Cayley-Klein distances

◮ dist(p, p) = 0 (identity of indiscernibles)
◮ Signed distances: dist(p, q) = −dist(q, p)
◮ When p, q, r are collinear: dist(p, q) = dist(p, r) + dist(r, q)

Geodesics in Cayley-Klein geometries are straight lines (possibly clipped to the conic domain).

The logarithm transfers the multiplicative property of the cross-ratio to the additive property of Cayley-Klein distances: when r is collinear with p, q, P, Q,

(p, q; P, Q) = (p, r; P, Q) · (r, q; P, Q)

SLIDE 8

Dual conics

In projective geometry, points and lines are dual concepts. Dual parameterization of the fundamental conic F = (A, A^Δ), with quadratic form QA(x̃) = x̃⊤A x̃:

◮ primal conic = set of border points: CA = {p̃ : QA(p̃) = 0}
◮ dual conic = set of tangent hyperplanes: C*A = {l̃ : QA^Δ(l̃) = 0}

A^Δ = A^{−1}|A| is the adjoint (adjugate) matrix. The adjoint can be computed even when A is not invertible (|A| = 0).

SLIDE 9

Taxonomy

Signature of a matrix = signs of the eigenvalues in its eigendecomposition.

| Type                  | A         | A^Δ       | Conic                                                   |
|-----------------------|-----------|-----------|---------------------------------------------------------|
| Elliptic              | (+, +, +) | (+, +, +) | non-degenerate complex conic                            |
| Hyperbolic            | (+, +, −) | (+, +, −) | non-degenerate real conic                               |
| Dual Euclidean        | (+, +, 0) | (+, +, 0) | two complex lines with a real intersection point        |
| Dual pseudo-Euclidean | (+, −, 0) | (+, 0, 0) | two real lines with a double real intersection point    |
| Euclidean             | (+, 0, 0) | (+, +, 0) | two complex points with a double real line through them |
| Pseudo-Euclidean      | (+, 0, 0) | (+, −, 0) | two complex points with a double real line through them |
| Galilean              | (+, 0, 0) | (+, 0, 0) | double real line with a real intersection point         |

Degenerate cases are obtained as limits of non-degenerate cases. Measurements can be elliptic, hyperbolic, or parabolic (degenerate case).

SLIDE 10

Real CK distances without cross-ratio expressions

For real Cayley-Klein measures, we choose the constants (κ is the curvature):

◮ Elliptic (κ > 0): cdist = κ/(2i)
◮ Hyperbolic (κ < 0): cdist = −κ/2

◮ Bilinear form: Spq = p̃⊤S q̃, with p̃ = (p⊤, 1)⊤

◮ Get rid of the cross-ratio using:

(p, q; P, Q) = (Spq + √(Spq² − SppSqq)) / (Spq − √(Spq² − SppSqq))

SLIDE 11

Elliptic Cayley-Klein metric distance

dE(p, q) = (κ/2i) Log( (Spq + √(Spq² − SppSqq)) / (Spq − √(Spq² − SppSqq)) )

dE(p, q) = κ arccos( Spq / √(SppSqq) )

Notice that dE(p, q) < κπ; the domain is DS = R^d in the elliptic case.

(Figure: gnomonic projection x → x′, y → y′, with dE(x, y) = κ · arccos⟨x′, y′⟩.)
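The arccos form is immediate to transcribe (a minimal sketch; taking S = I and κ = 1 purely for illustration, in which case the distance is the angle between the homogeneous lifts):

```python
import math
import numpy as np

def elliptic_ck(p, q, S, kappa=1.0):
    # d_E(p, q) = kappa * arccos( S_pq / sqrt(S_pp S_qq) ), with S positive definite
    pt, qt = np.append(p, 1.0), np.append(q, 1.0)   # homogeneous lifts (p, 1), (q, 1)
    Spq, Spp, Sqq = pt @ S @ qt, pt @ S @ pt, qt @ S @ qt
    return kappa * math.acos(Spq / math.sqrt(Spp * Sqq))

S = np.eye(3)                                       # simplest elliptic bilinear form (2D points)
p, q = np.array([1.0, 0.0]), np.array([0.0, 1.0])
d = elliptic_ck(p, q, S)
assert abs(elliptic_ck(p, p, S)) < 1e-12            # d(p, p) = 0
assert abs(d - elliptic_ck(q, p, S)) < 1e-12        # symmetry
assert abs(d - math.pi / 3) < 1e-12                 # arccos(1/2)
print(d)
```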

SLIDE 12

Hyperbolic Cayley-Klein distance

When p, q ∈ DS := {p : Spp < 0}, the hyperbolic domain:

dH(p, q) = (−κ/2) log( (Spq + √(Spq² − SppSqq)) / (Spq − √(Spq² − SppSqq)) )

dH(p, q) = −κ arctanh( √(1 − SppSqq/Spq²) )

dH(p, q) = −κ arccosh( Spq / √(SppSqq) )

with arccosh(x) = log(x + √(x² − 1)) and arctanh(x) = (1/2) log((1 + x)/(1 − x)).

Curvature κ < 0.
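A companion sketch with S = diag(1, 1, −1) and κ = −1 (the canonical Klein-disk case; note that |Spq| is used, since Spq < 0 on the domain):

```python
import math
import numpy as np

def hyperbolic_ck(p, q, S, kappa=-1.0):
    # d_H(p, q) = -kappa * arccosh( |S_pq| / sqrt(S_pp S_qq) ), for S_pp, S_qq < 0
    pt, qt = np.append(p, 1.0), np.append(q, 1.0)
    Spq, Spp, Sqq = pt @ S @ qt, pt @ S @ pt, qt @ S @ qt
    assert Spp < 0 and Sqq < 0, "points must lie inside the domain D_S"
    return -kappa * math.acosh(abs(Spq) / math.sqrt(Spp * Sqq))

S = np.diag([1.0, 1.0, -1.0])              # canonical hyperbolic form (Klein disk)
p, q = np.array([0.0, 0.0]), np.array([0.5, 0.0])
d = hyperbolic_ck(p, q, S)
assert abs(d - 0.5 * math.log(3)) < 1e-12  # known closed-form value for these points
print(d)  # ≈ 0.5493
```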

SLIDE 13

Decomposition of the bilinear form [1]

Write S = ( Σ  a ; a⊤  b ) = SΣ,a,b with Σ ≻ 0. Then

Sp,q = p̃⊤S q̃ = p⊤Σq + p⊤a + a⊤q + b

Let µ = −Σ^{−1}a (so a = −Σµ) and b = µ⊤Σµ + sign(κ)/κ², i.e.,

κ = (b − µ⊤Σµ)^{−1/2} if b > µ⊤Σµ,   κ = −(µ⊤Σµ − b)^{−1/2} if b < µ⊤Σµ

Then the bilinear form writes as:

S(p, q) = SΣ,µ,κ(p, q) = (p − µ)⊤Σ(q − µ) + sign(κ)/κ²

SLIDE 14

Curved Mahalanobis metric distances

We have [1]:

lim_{κ→0+} DΣ,µ,κ(p, q) = lim_{κ→0−} DΣ,µ,κ(p, q) = DΣ(p, q)

with the Mahalanobis distance DΣ(p, q) = DΣ,0,0(p, q). Thus hyperbolic/elliptic Cayley-Klein distances can be interpreted as curved Mahalanobis distances, or κ-Mahalanobis distances.

When S = diag(1, 1, ..., 1, −1), we recover the canonical hyperbolic distance [5] in the Cayley-Klein (Klein ball) model:

Dh(p, q) = arccosh( (1 − ⟨p, q⟩) / (√(1 − ⟨p, p⟩) √(1 − ⟨q, q⟩)) ),

defined in the interior of the unit ball.

SLIDE 15

Cayley-Klein bisectors are affine

Bisector Bi(p, q) = {x ∈ DS : distS(p, x) = distS(x, q)}. Since arccos and arccosh are strictly monotonic, this amounts to:

S(p, x)/√S(p, p) = S(q, x)/√S(q, q)

which expands into the affine equation

⟨x, √|S(p, p)| Σq − √|S(q, q)| Σp⟩ + √|S(p, p)| (a⊤(q + x) + b) − √|S(q, q)| (a⊤(p + x) + b) = 0

Bisectors are thus hyperplanes (restricted to the domain).
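A numerical check that equal-distance points satisfy a linear equation — a sketch in the elliptic case with an arbitrary S ≻ 0, where the bisector point `x_star` is found by solving the affine condition exactly along a random line:

```python
import math
import numpy as np

def d_E(x, y, S):
    # Elliptic Cayley-Klein distance (kappa = 1): arccos(S_xy / sqrt(S_xx S_yy))
    xt, yt = np.append(x, 1.0), np.append(y, 1.0)
    return math.acos((xt @ S @ yt) / math.sqrt((xt @ S @ xt) * (yt @ S @ yt)))

def f(x, p, q, S):
    # Affine function whose zero set is the bisector: S(p,x)/sqrt(S_pp) - S(q,x)/sqrt(S_qq)
    xt, pt, qt = np.append(x, 1.0), np.append(p, 1.0), np.append(q, 1.0)
    return (pt @ S @ xt) / math.sqrt(pt @ S @ pt) - (qt @ S @ xt) / math.sqrt(qt @ S @ qt)

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
S = A @ A.T + 3 * np.eye(3)        # positive definite S (2D points, 3x3 homogeneous form)
p, q = rng.normal(size=2), rng.normal(size=2)

# f is affine in x, so f = 0 can be solved exactly along a line x0 + t v
x0, v = rng.normal(size=2), rng.normal(size=2)
t = -f(x0, p, q, S) / (f(x0 + v, p, q, S) - f(x0, p, q, S))
x_star = x0 + t * v

assert abs(d_E(p, x_star, S) - d_E(q, x_star, S)) < 1e-9
print(x_star)
```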

SLIDE 16

Cayley-Klein Voronoi diagrams are affine

Can be computed from equivalent (clipped) power diagrams [2, 5] https://www.youtube.com/watch?v=YHJLq3-RL58

SLIDE 17

Cayley-Klein balls

(Figure: Cayley-Klein balls. Blue: Mahalanobis; red: elliptic; green: hyperbolic.)

Cayley-Klein balls have Mahalanobis ball shapes with displaced centers.

SLIDE 18

Learning curved Mahalanobis metrics

SLIDE 19

Large Margin Nearest Neighbors [8], LMNN

Learn a Mahalanobis distance M = L⊤L ≻ 0 for a given input data set P:

◮ Shrink the distance of each point to its target neighbors, εpull(L):

S = {(xi, xj) : yi = yj and xj ∈ N(xi)}

◮ Keep a distance margin of each point to its impostors, εpush(L):

R = {(xi, xj, xl) : (xi, xj) ∈ S and yi ≠ yl}

http://www.cs.cornell.edu/~kilian/code/lmnn/lmnn.html

SLIDE 20

LMNN: Cost function and optimization

Objective cost function [8], convex and piecewise linear (SDP):

εpull(L) = Σ_{i,i→j} ‖L(xi − xj)‖²

εpush(L) = Σ_{i,i→j} Σ_l (1 − yil) [1 + ‖L(xi − xj)‖² − ‖L(xi − xl)‖²]_+

ε(L) = (1 − µ) εpull(L) + µ εpush(L)

i → j: xj is a target neighbor of xi; yil = 1 iff xi and xl have the same label, yil = 0 otherwise; µ is set by cross-validation.

Optimize by gradient descent, L_{t+1} = L_t − γ ∂ε(L_t)/∂L, with

∂ε/∂L = (1 − µ) Σ_{i,i→j} Cij + µ Σ_{(i,j,l)∈R_t} (Cij − Cil), where Cij = (xi − xj)(xi − xj)⊤

Easy: no projection mechanism is needed, unlike for Mahalanobis Metric for Clustering (MMC) [9].
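The gradient structure above is easy to transcribe (a minimal sketch on synthetic data, not the authors' implementation; `targets`, `mu`, and the step size are illustrative choices):

```python
import numpy as np

def lmnn_loss_grad(M, X, y, targets, mu=0.5):
    # LMNN objective and its gradient w.r.t. M, for tiny data sets:
    # eps = (1-mu) sum d_M(xi,xj) + mu sum (1-y_il)[1 + d_M(xi,xj) - d_M(xi,xl)]_+
    def d2(i, j):
        z = X[i] - X[j]
        return z @ M @ z

    def C(i, j):                               # outer product C_ij = (xi-xj)(xi-xj)^T
        z = X[i] - X[j]
        return np.outer(z, z)

    loss, grad = 0.0, np.zeros_like(M)
    for i, j in targets:                       # (i, j): xj is a target neighbor of xi
        loss += (1 - mu) * d2(i, j)
        grad += (1 - mu) * C(i, j)
        for l in range(len(X)):
            if y[l] != y[i]:                   # impostor candidates: different label
                hinge = 1 + d2(i, j) - d2(i, l)
                if hinge > 0:                  # active triplet (i, j, l) in R_t
                    loss += mu * hinge
                    grad += mu * (C(i, j) - C(i, l))
    return loss, grad

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 2))
y = np.array([0, 0, 0, 1, 1, 1])
targets = [(0, 1), (1, 2), (3, 4), (4, 5)]

M = np.eye(2)
loss0, G = lmnn_loss_grad(M, X, y, targets)
loss1, _ = lmnn_loss_grad(M - 1e-3 * G, X, y, targets)
assert loss1 < loss0    # a small step along -grad decreases the convex objective
print(loss0, loss1)
```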

SLIDE 21

Elliptic Cayley-Klein LMNN [1], CVPR 2015

ε(L) = (1 − µ) Σ_{i,i→j} dE(xi, xj) + µ Σ_{i,i→j} Σ_l (1 − yil) ζijl

with the hinge loss ζijl = [1 + dE(xi, xj) − dE(xi, xl)]_+.

∂ε(L)/∂L = (1 − µ) Σ_{i,i→j} ∂dE(xi, xj)/∂L + µ Σ_{i,i→j} Σ_l (1 − yil) ∂ζijl/∂L

With Cij = (xi⊤, 1)⊤(xj⊤, 1) = x̃i x̃j⊤:

∂dE(xi, xj)/∂L = ( k / √(SiiSjj − Sij²) ) L ( (Sij/Sii) Cii + (Sij/Sjj) Cjj − (Cij + Cji) )

∂ζijl/∂L = ∂dE(xi, xj)/∂L − ∂dE(xi, xl)/∂L if ζijl ≥ 0, and 0 otherwise.

SLIDE 22

Hyperbolic Cayley-Klein LMNN (new case)

To ensure that S keeps the correct signature (1, d, 0) during the LMNN gradient descent, we decompose S = L⊤DL (with L ≻ 0) and perform a gradient descent on L with the gradient:

∂dH(xi, xj)/∂L = ( k / √(Sij² − SiiSjj) ) DL ( (Sij/Sii) Cii + (Sij/Sjj) Cjj − (Cij + Cji) )

Recall two difficulties of the hyperbolic case compared to the elliptic case:

◮ The hyperbolic Cayley-Klein distance may be very large (unbounded, vs. < κπ in the elliptic case)
◮ The data set must be contained inside the compact domain DS

SLIDE 23

HCK-LMNN: Initialization and learning rate

◮ Initialize L = ( L′  0 ; 0  1 ) and D so that P ⊂ DS, with Σ^{−1} = L′⊤L′ (e.g., the precision matrix of P), and

D = diag(−1, ..., −1, κ max_x ‖L′x‖²), with κ > 1.

◮ At iteration t, it may happen that P ⊄ DS_t, since we do not know the optimal learning rate γ. When this happens, we halve γ ← γ/2; otherwise we let γ ← 1.01γ.

SLIDE 24

Curved Mahalanobis learning: Results

Experimental results (classification accuracy) on some UCI data sets:

| k  | Data set | Elliptic | Hyperbolic | Mahalanobis |
|----|----------|----------|------------|-------------|
| 1  | wine     | 0.989    | 0.865      | 0.984       |
|    | vowel    | 0.832    | 0.797      | 0.827       |
|    | balance  | 0.924    | 0.891      | 0.846       |
|    | pima     | 0.726    | 0.706      | 0.709       |
| 3  | wine     | 0.983    | 0.871      | 0.984       |
|    | vowel    | 0.828    | 0.782      | 0.827       |
|    | balance  | 0.917    | 0.911      | 0.846       |
|    | pima     | 0.706    | 0.695      | 0.709       |
| 5  | wine     | 0.983    | —          | 0.984       |
|    | vowel    | 0.826    | 0.805      | 0.827       |
|    | balance  | 0.907    | 0.895      | 0.846       |
|    | pima     | 0.714    | 0.712      | 0.709       |
| 11 | wine     | 0.994    | 0.983      | 0.984       |
|    | vowel    | 0.839    | 0.767      | 0.827       |
|    | balance  | 0.874    | 0.897      | 0.846       |
|    | pima     | 0.713    | 0.698      | 0.709       |

For classification, it is enough to consider κ ∈ {−1, 0, +1}.

SLIDE 25

Spectral decomposition and fast proximity queries

◮ Avoid computing dE or dH for an arbitrary S.
◮ Apply a spectral decomposition (elliptic case S = L⊤L, hyperbolic case S = L⊤DL) and perform coordinate changes so that we consider the canonical metric distances:

dE(x′, y′) = arccos( ⟨x′, y′⟩ / (‖x′‖ ‖y′‖) )

dH(x′, y′) = arccosh( (1 − ⟨x′, y′⟩) / (√(1 − ⟨x′, x′⟩) √(1 − ⟨y′, y′⟩)) )

◮ Proximity queries: e.g., vantage point tree data structures [10, 6] (with metric pruning).
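For the elliptic case, the coordinate change is just the Cholesky factor applied to homogeneous coordinates (a sketch; writing x′ = L x̃ for the transformed point is my notation):

```python
import math
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(3, 3))
S = A @ A.T + np.eye(3)                    # arbitrary elliptic (positive definite) S

# S = L^T L via Cholesky (numpy returns the lower factor Lc with S = Lc Lc^T)
L = np.linalg.cholesky(S).T

x, y = rng.normal(size=2), rng.normal(size=2)
xt, yt = np.append(x, 1.0), np.append(y, 1.0)

# Distance from the bilinear form S ...
d_S = math.acos((xt @ S @ yt) / math.sqrt((xt @ S @ xt) * (yt @ S @ yt)))
# ... equals the canonical angular distance after the change of coordinates x' = L x~
xp, yp = L @ xt, L @ yt
d_canon = math.acos((xp @ yp) / (np.linalg.norm(xp) * np.linalg.norm(yp)))

assert abs(d_S - d_canon) < 1e-9
print(d_canon)
```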

SLIDE 26

Mixed curved Mahalanobis distance

d(x, y) = α dE(x, y) + (1 − α) dH(x, y)

1. A sum of Riemannian metric distances is a metric ("blending" positive with negative constant curvature)
2. Mix of a bounded distance (elliptic CK) with an unbounded distance (hyperbolic CK); the hyperparameter α is tuned

| Data set | Mahalanobis | Elliptic | Hyperbolic | Mixed | α     | β = (1 − α) |
|----------|-------------|----------|------------|-------|-------|-------------|
| Wine     | 0.993       | 0.984    | 0.893      | 0.986 | 0.741 | 0.259       |
| Sonar    | 0.733       | 0.788    | 0.640      | 0.802 | 0.794 | 0.206       |
| Balance  | 0.846       | 0.910    | 0.904      | 0.920 | 0.440 | 0.560       |
| Pima     | 0.709       | 0.712    | 0.699      | 0.720 | 0.584 | 0.416       |
| Vowel    | 0.827       | 0.825    | 0.816      | 0.841 | 0.407 | 0.593       |

Although the mixed CK distance is a Riemannian metric distance, it is not of constant curvature.
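A convex combination of two metrics is again a metric, which a quick triangle-inequality spot check illustrates (a sketch on the Klein disk with canonical forms; α = 0.6 is an arbitrary choice):

```python
import math
import numpy as np

def d_elliptic(x, y):
    xt, yt = np.append(x, 1.0), np.append(y, 1.0)
    return math.acos((xt @ yt) / (np.linalg.norm(xt) * np.linalg.norm(yt)))

def d_hyperbolic(x, y):                    # Klein-disk distance, points with |x| < 1
    return math.acosh((1 - x @ y) / math.sqrt((1 - x @ x) * (1 - y @ y)))

def d_mixed(x, y, alpha=0.6):
    return alpha * d_elliptic(x, y) + (1 - alpha) * d_hyperbolic(x, y)

rng = np.random.default_rng(5)
for _ in range(100):
    x, y, z = (0.7 * rng.uniform(-1, 1, size=2) for _ in range(3))
    assert d_mixed(x, z) <= d_mixed(x, y) + d_mixed(y, z) + 1e-12
print("triangle inequality holds on 100 random triples")
```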

SLIDE 27

Conclusion

SLIDE 28

Contributions and perspectives

◮ Study of Cayley-Klein elliptic/hyperbolic geometries: affine bisectors, Voronoi diagrams from (clipped) power diagrams, Cayley-Klein balls (Mahalanobis shapes with displaced centers), etc.

◮ Classification with Large Margin Nearest Neighbor (LMNN) in Cayley-Klein elliptic/hyperbolic geometries (hyperbolic geometry: compact domain & unbounded distance)

◮ Experiments on mixed Cayley-Klein distances

Ongoing work: extensions of Cayley-Klein geometries to machine learning.

SLIDE 29

Thank you!

https://www.lix.polytechnique.fr/~nielsen/CayleyKlein/

doi:10.1109/ICIP.2016.7532355

arXiv:1609.07082

SLIDE 30

[1] Yanhong Bi, Bin Fan, and Fuchao Wu. Beyond Mahalanobis metric: Cayley-Klein metric learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.

[2] Jean-Daniel Boissonnat, Frank Nielsen, and Richard Nock. Bregman Voronoi diagrams. Discrete & Computational Geometry, 44(2):281–307, 2010.

[3] C. Gunn. Geometry, Kinematics, and Rigid Body Mechanics in Cayley-Klein Geometries. PhD thesis, Technische Universität Berlin, 2011.

[4] Frank Nielsen, Boris Muzellec, and Richard Nock. Classification with mixtures of curved Mahalanobis metrics. In IEEE International Conference on Image Processing (ICIP), pages 241–245, September 2016.

[5] Frank Nielsen and Richard Nock. Hyperbolic Voronoi diagrams made easy. In IEEE International Conference on Computational Science and Its Applications (ICCSA), pages 74–80, 2010.

[6] Frank Nielsen, Paolo Piro, and Michel Barlaud. Bregman vantage point trees for efficient nearest neighbor queries. In IEEE International Conference on Multimedia and Expo (ICME), pages 878–881, 2009.

[7] Jürgen Richter-Gebert. Perspectives on Projective Geometry: A Guided Tour Through Real and Complex Geometry. Springer, 1st edition, 2011.

[8] Kilian Q. Weinberger, John Blitzer, and Lawrence K. Saul. Distance metric learning for large margin nearest neighbor classification. In Advances in Neural Information Processing Systems (NIPS). MIT Press, 2006.

SLIDE 31

[9] Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, and Stuart Russell. Distance metric learning, with application to clustering with side-information. In Advances in Neural Information Processing Systems 15, pages 505–512. MIT Press, 2003.

[10] Peter N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 311–321, 1993.

SLIDE 32

Overview

Review of Mahalanobis distances
Basics of Cayley-Klein geometry
Distance from cross-ratio measures
Distance expressions
Dual conics
Cayley-Klein distances as curved Mahalanobis distances
Computational geometry in Cayley-Klein geometries
Learning curved Mahalanobis metrics
Large Margin Nearest Neighbors (LMNN)
Elliptic Cayley-Klein LMNN
Hyperbolic Cayley-Klein LMNN
Experimental results
Nearest-neighbor classification in Cayley-Klein geometries
Mixed curved Mahalanobis distance
Contributions and perspectives
Bibliography
Supplemental information

SLIDE 33

Properties of the cross-ratio

◮ (p, p; P, Q) = 1
◮ (p, q; Q, P) = 1 / (p, q; P, Q)
◮ (p, q; P, Q) = (p, r; P, Q) · (r, q; P, Q) when r is collinear with p, q, P, Q

(Figure: collinear points P, p, r, q, Q.)

SLIDE 34

Measuring angles in Cayley-Klein geometries

angle(l, m) = cangle Log((l, m; L, M)),

where L and M are the tangent lines to the conic passing through the intersection point p (p̃ = l̃ × m̃ in 2D) of l and m.

(Figure: lines l and m meeting at p, with tangents L and M to the conic F.)

SLIDE 35

Interpretation of hyperbolic Cayley-Klein distance

dH(x, y) = κ arccosh(≺x′, y′≻)

(Figure: x and y in the disk lifted to x′ and y′.)
SLIDE 36

Cayley-Klein Voronoi diagrams from (clipped) power diagrams

ci = (Σpi + a) / (2√(Spipi))

ri² = ‖Σpi + a‖² / (4 Spipi) + (a⊤pi + b) / √(Spipi)
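These centers and radii are exactly what turns the Cayley-Klein bisector condition into a power-diagram condition, which can be verified numerically (a sketch for the elliptic case; the identity checked is ‖x − ci‖² − ri² = ‖x‖² − S(pi, x)/√(Spipi)):

```python
import math
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(2, 2))
Sigma = A @ A.T + np.eye(2)                # elliptic case: Sigma > 0
a, b = rng.normal(size=2), 10.0            # b large enough so that S_pp > 0 here

def S_bil(u, v):                           # S(u, v) = u^T Sigma v + u^T a + a^T v + b
    return u @ Sigma @ v + u @ a + a @ v + b

p = rng.normal(size=2)
Spp = S_bil(p, p)
assert Spp > 0

c = (Sigma @ p + a) / (2 * math.sqrt(Spp))                       # power-diagram center
r2 = (Sigma @ p + a) @ (Sigma @ p + a) / (4 * Spp) + (a @ p + b) / math.sqrt(Spp)

x = rng.normal(size=2)
power = (x - c) @ (x - c) - r2
assert np.isclose(power, x @ x - S_bil(p, x) / math.sqrt(Spp))   # matches the bisector form
print(power)
```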

SLIDE 37

Cayley-Klein balls have Mahalanobis ball shapes

Elliptic Cayley-Klein ball case:

Σ′ = r̃²Σ − aa⊤
c′ = Σ′^{−1}(b′a′ − r̃²a)
r′² = b′² − r̃²b + ⟨c′, c′⟩Σ′

with r̃ = √(Sc,c) cos(r), a′ = Σc + a, b′ = a⊤c + b.

SLIDE 38

Cayley-Klein balls have Mahalanobis ball shapes

Hyperbolic Cayley-Klein ball case:

Σ′ = aa⊤ − r̃²Σ
c′ = Σ′^{−1}(r̃²a − b′a′)
r′² = r̃²b − b′² + ⟨c′, c′⟩Σ′

with r̃ = √(Sc,c) cosh(r), a′ = Σc + a, b′ = a⊤c + b.

... and drawing a Mahalanobis ball amounts to drawing a Euclidean ball after the affine transformation x′ ← L⊤x.

SLIDE 39

Spectral decomposition and signature

◮ Eigenvalue decomposition: S = OΛO⊤, with Λ = diag(Λ1,1, ..., Λd+1,d+1)

◮ Canonical decomposition: S = O D^{1/2} ( I  0 ; 0  λ ) D^{1/2} O⊤, where λ ∈ {−1, 1} and O is an orthogonal matrix (O^{−1} = O⊤)

◮ The diagonal matrix D has all positive values, with Di,i = Λi,i and Dd+1,d+1 = |Λd+1,d+1|
