
7th Conference on Multivariate Distributions with Applications. Manifold Matching: Joint Optimization of Fidelity & Commensurability. Carey E. Priebe, Department of Applied Mathematics & Statistics, Johns Hopkins University. August 2010.


1. 7th Conference on Multivariate Distributions with Applications
Manifold Matching: Joint Optimization of Fidelity & Commensurability
Carey E. Priebe, Department of Applied Mathematics & Statistics, Johns Hopkins University
August 2010, Maresias, Brazil

2. Collaborators
David J. Marchette, Zhiliang Ma, Sancar Adali, &c.
Support: AFOSR, NSSEFF, ONR, HLTCOE, ASEE

3-6. Problem Formulation
Given x_{i1} ∼ ··· ∼ x_{ik} ∼ ··· ∼ x_{iK}, i = 1, …, n
• n objects are each measured under K different conditions
• x_{i1} ∼ ··· ∼ x_{ik} ∼ ··· ∼ x_{iK} denotes K matched feature vectors representing a single object O_i
• x_{ik} ∈ Ξ_k
• K new measurements {y_k}_{k=1}^K, y_k ∈ Ξ_k
Question: Are {y_k}_{k=1}^K matched feature vectors representing a single object measured under K conditions?

7-11. Hypotheses
             Ξ_1        ···        Ξ_K
Object O_1:  x_{11}  ∼  ···  ∼  x_{1K}
   ⋮
Object O_n:  x_{n1}  ∼  ···  ∼  x_{nK}
• Each space Ξ_k comes with a dissimilarity δ_k, yielding dissimilarity matrices Δ_1, …, Δ_K
• Given new measurements {y_k}_{k=1}^K we can obtain within-condition dissimilarities δ_k(y_k, x_{ik}), i = 1, …, n, k = 1, …, K (see the code sketch below)
• Goal (K = 2): determine whether y_1 and y_2 are a match
H_0: y_1 ∼ y_2 versus H_A: y_1 ≁ y_2 (we control the probability of missing a true match)
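A minimal Python sketch of the quantities on this slide, assuming Euclidean dissimilarities and toy data; the names `X`, `Delta`, `y`, and `u` are illustrative, not from the slides:

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
n, K, p = 100, 2, 5

# Toy matched measurements x_ik for n objects under K conditions
# (both conditions share dimension p here purely for convenience)
X = [rng.random((n, p)) for _ in range(K)]

# Within-condition dissimilarity matrices Delta_1, ..., Delta_K (Euclidean, for illustration)
Delta = [cdist(Xk, Xk) for Xk in X]

# New measurements y_1, ..., y_K and their within-condition dissimilarities
# u[k][i] = delta_k(y_k, x_ik), i = 1, ..., n
y = [rng.random(p) for _ in range(K)]
u = [cdist(yk[None, :], Xk).ravel() for yk, Xk in zip(y, X)]
```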

12. What are these "conditions", and what does it mean to be "matched"?
• let the condition be the language of a text document, and "matched" mean "on the same topic"
• let the condition be the imaging modality for a photo, and "matched" mean "of the same person"
  – indoor lighting vs outdoor lighting
  – two cameras of different quality
  – passport photos and airport surveillance photos
• let condition 1 be a wiki text document and condition 2 be the wiki hyperlink structure
• let condition 1 be a text document and condition 2 be a photo
• … or just a single space with multiple dissimilarities

13. (not matched) The English is clear enough to lorry drivers, but the Welsh reads "I am not in the office at the moment. Send any work to be translated." <http://news.bbc.co.uk/2/hi/uk_news/wales/7702913.stm>

14-15. Manifold Matching I
Conditional distributions are induced by maps π_k from "object space" Ξ.
[Diagram: Ξ maps via π_1, …, π_K to the conditional spaces Ξ_1, …, Ξ_K; ∃ φ?]
Conditional spaces Ξ_k are not commensurate.

16-17. Dirichlet Setting
Let S^p be the standard p-simplex in R^{p+1}.
Let Ξ_1 = S^p and Ξ_2 = S^p (but the fact that the two spaces are the same is unknown to the algorithms …).
Let α_i ~ iid Dirichlet(1) represent n "objects" or "topics".
Let X_{ik} ~ iid Dirichlet(rα_i + 1) represent K languages (WCHs).
• r controls "what it means to be matched" (document variability & translation quality analogy); see the simulation sketch below
[Figure: X_{i1} scattered about α_i in Ξ_1 and X_{i2} scattered about α_i in Ξ_2, with spread governed by r]
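A minimal NumPy simulation sketch of this generative setting, using parameter values taken from the later simulation slide (n=100, p=3, r=100); the variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r, K = 100, 3, 100, 2

# n latent "objects"/"topics": alpha_i ~ Dirichlet(1) on the standard p-simplex in R^{p+1}
alpha = rng.dirichlet(np.ones(p + 1), size=n)

# K matched measurements per object: X_ik ~ Dirichlet(r * alpha_i + 1);
# larger r concentrates X_i1 and X_i2 around alpha_i, i.e. a stricter notion of "matched"
X = [np.array([rng.dirichlet(r * a + 1) for a in alpha]) for _ in range(K)]
X1, X2 = X  # points in Xi_1 and Xi_2 (both S^p, though the algorithms do not know this)
```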

18-19. Manifold Matching II
Matched points are used to define maps ρ_k to the same space X (with distance d).
[Diagram: Ξ maps via π_1, …, π_K to Ξ_1, …, Ξ_K, which map via ρ_1, …, ρ_K to X = R^d]
Reject for d(ŷ_1, ŷ_2) "large"

20. canonical correlation
• Multidimensional scaling yields high-dimensional embeddings: Δ_1 ↦ X′_1 and Δ_2 ↦ X′_2
• Canonical correlation finds U_1: X′_1 ↦ X_1 and U_2: X′_2 ↦ X_2 to maximize correlation
• Out-of-sample embedding: y_1 ↦ y′_1, y_2 ↦ y′_2
• Both ŷ_1 = U_1^T y′_1 and ŷ_2 = U_2^T y′_2 are in R^d with the same coordinate system (i.e., they are commensurate)
• Reject for d(ŷ_1, ŷ_2) "large" (a rough code sketch follows)
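A rough scikit-learn sketch of this pipeline, assuming Euclidean dissimilarities and toy matched data; the embedding dimensions and variable names are illustrative, and the out-of-sample step is only indicated in a comment:

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cross_decomposition import CCA
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
# Toy matched measurements standing in for the two conditions (illustrative only)
X1 = rng.dirichlet(np.ones(4), size=100)
X2 = X1 + 0.05 * rng.standard_normal(X1.shape)

# High-dimensional MDS embeddings X'_1, X'_2 from the precomputed dissimilarities
mds = MDS(n_components=10, dissimilarity="precomputed", random_state=0)
X1p = mds.fit_transform(cdist(X1, X1))
X2p = mds.fit_transform(cdist(X2, X2))

# CCA finds U_1, U_2 maximizing correlation between matched rows;
# the transformed points share a coordinate system (they are commensurate)
cca = CCA(n_components=2)
Z1, Z2 = cca.fit_transform(X1p, X2p)

# A new pair (y_1, y_2) would be embedded out of sample, projected through U_1, U_2,
# and the match rejected when the distance between the projections is "large".
```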

21. procrustes ◦ mds
• Multidimensional scaling yields low-dimensional embeddings: Δ_1 ↦ X_1 and Δ_2 ↦ X_2
• Procrustes(X_1, X_2) yields Q* = argmin_{Q: Q^T Q = I} ‖X_1 − X_2 Q‖_F
• Out-of-sample embedding: y_1 ↦ ŷ_1, y_2 ↦ ŷ′_2
• Both ŷ_1 and ŷ_2 = Q* ŷ′_2 are in R^d with the same coordinate system (i.e., they are commensurate)
• Reject for d(ŷ_1, ŷ_2) "large" (a rough code sketch follows)
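A rough SciPy/scikit-learn sketch of procrustes ◦ mds, again assuming Euclidean dissimilarities and toy matched data (names and dimensions are illustrative):

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes
from scipy.spatial.distance import cdist
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
# Toy matched measurements standing in for the two conditions (illustrative only)
X1 = rng.dirichlet(np.ones(4), size=100)
X2 = X1 + 0.05 * rng.standard_normal(X1.shape)

# Low-dimensional MDS embeddings of Delta_1 and Delta_2
d = 2
mds = MDS(n_components=d, dissimilarity="precomputed", random_state=0)
Z1 = mds.fit_transform(cdist(X1, X1))
Z2 = mds.fit_transform(cdist(X2, X2))

# Orthogonal Procrustes: Q* minimizing ||Z1 - Z2 Q||_F, i.e. rotate Z2 into Z1's coordinates
Q, _ = orthogonal_procrustes(Z2, Z1)
Z2_aligned = Z2 @ Q

# A new pair (y_1, y_2) would be embedded out of sample, the condition-2 embedding mapped
# through Q*, and the match rejected when the distance between the two embeddings is "large".
```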

22. fidelity & commensurability
Fidelity is how well the mapping preserves original dissimilarities; our within-condition fidelity error is given by
\epsilon_{f_k} = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} \bigl( d(\hat{x}_{ik}, \hat{x}_{jk}) - \delta_k(x_{ik}, x_{jk}) \bigr)^2 .
Commensurability is how well the mapping preserves matchedness; our between-condition commensurability error is given by
\epsilon_{c_{k_1 k_2}} = \frac{1}{n} \sum_{1 \le i \le n} \bigl( d(\hat{x}_{i k_1}, \hat{x}_{i k_2}) - \delta_{k_1 k_2}(x_{i k_1}, x_{i k_2}) \bigr)^2 .
Alas, δ_{k_1 k_2} does not exist; however, our story seems to suggest that it might be reasonable to let δ_{k_1 k_2}(x_{i k_1}, x_{i k_2}) = 0 for all i, k_1, k_2.
NB: There is also a between-condition separability error, given by
\epsilon_{s_{k_1 k_2}} = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} \bigl( d(\hat{x}_{i k_1}, \hat{x}_{j k_2}) - \delta_{k_1 k_2}(x_{i k_1}, x_{j k_2}) \bigr)^2 .
(A small code sketch of the first two error terms follows.)
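A small sketch of how these two error terms could be computed from matched embeddings Z1, Z2 (rows aligned by object), taking δ_{k_1 k_2} ≡ 0 on matched pairs as the slide suggests; the function names are mine, not from the slides:

```python
import numpy as np
from scipy.spatial.distance import pdist

def fidelity_error(Z, Delta):
    """Within-condition fidelity error: mean squared difference between embedded
    pairwise distances d(x_hat_i, x_hat_j) and original dissimilarities delta(x_i, x_j)."""
    orig = Delta[np.triu_indices_from(Delta, k=1)]  # same pair ordering as pdist
    return np.mean((pdist(Z) - orig) ** 2)

def commensurability_error(Z1, Z2):
    """Between-condition commensurability error, with delta_{k1 k2}(x_i, x_i) taken to be 0:
    mean squared distance between the two embeddings of each object."""
    return np.mean(np.sum((Z1 - Z2) ** 2, axis=1))
```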

23-24. Methodological Comparison
• canonical correlation optimizes commensurability without regard for fidelity
• procrustes ◦ mds optimizes fidelity without regard for commensurability
• compare: joint optimization of fidelity & commensurability …

25. Omnibus Embedding Approach
[Diagram: a 2n × 2n omnibus dissimilarity matrix M with n × n blocks, Δ_1 and Δ_2 on the diagonal and an imputed between-condition block W off the diagonal, together with out-of-sample dissimilarity vectors (u_1^T, v_1^T) for y_1 and (u_2^T, v_2^T) for y_2]
• Under the "matched" assumption, impute dissimilarities δ_{12}(x_{i_1 1}, x_{i_2 2}) to obtain an omnibus dissimilarity matrix M
• Embed M as 2n points in R^d
• Let u_{i1} = δ_1(y_1, x_{i1}) and v_{i2} = δ_2(y_2, x_{i2})
• Under H_0: y_1 ∼ y_2, impute v_{i1} = δ_{12}(y_1, x_{i2}) and u_{i2} = δ_{12}(y_2, x_{i1})
• Out-of-sample embedding of (u_1^T, v_1^T)^T and (u_2^T, v_2^T)^T yields ŷ_1 and ŷ_2 (a rough code sketch follows)
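A rough end-to-end sketch of the omnibus construction on toy data; the choice W = (Δ_1 + Δ_2)/2 for the imputed between-condition block is an assumption made here for illustration (the slide only says these dissimilarities are imputed under the matched hypothesis), and the out-of-sample step is indicated only in a comment:

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
# Toy matched measurements standing in for the two conditions (illustrative only)
X1 = rng.dirichlet(np.ones(4), size=50)
X2 = X1 + 0.05 * rng.standard_normal(X1.shape)

D1, D2 = cdist(X1, X1), cdist(X2, X2)
n = D1.shape[0]

# Impute the between-condition block W under the "matched" assumption.
# The slide does not pin down the rule; the entrywise average is one simple choice.
W = (D1 + D2) / 2.0

# Omnibus dissimilarity matrix M (2n x 2n), embedded jointly as 2n points in R^d
M = np.block([[D1, W], [W.T, D2]])
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
Z = mds.fit_transform(M)
Z1, Z2 = Z[:n], Z[n:]  # joint embeddings of the condition-1 and condition-2 copies

# For a new pair: form u_1, v_2 from within-condition dissimilarities, impute the cross
# terms under H_0, embed (u_1, v_1) and (u_2, v_2) out of sample, and reject the match
# when d(y1_hat, y2_hat) is "large".
```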

26. Simulation Results
[Figure: ROC curves (β against α; axes: power vs. alpha) for pom, cca, and jofc; n=100, p=3, d=2, r=100, c=0.1, q=3]
Simulation results indicate that joint optimization of fidelity & commensurability via the omnibus embedding approach is (for this case) superior to canonical correlation and procrustes ◦ mds.
