

  1. Optimal Learning of Joint Alignments with a Faulty Oracle Charalampos E. Tsourakakis ctsourak@bu.edu Boston University ISIT 2020 Optimal Learning of Joint Alignments with a Faulty Oracle 1 / 35

  2. Joint work with: Kasper Green Larsen, Michael Mitzenmacher

  3. Datasets modeled as graphs: World Wide Web, Internet, social networks, connectomes, airline networks, images

  4. Graphs from probing/testing pairs of items: (a) humans in the loop for entity resolution; (b) protein-protein interactions

  5. Joint alignment from pairwise differences [Next four slides use material from Y. Chen’s slides] • n unknown variables g(0), …, g(n−1) • k possible states • described by the latent function g : [n] → [k], e.g., g(0) = 5, g(1) = 7, … • Think of [n] as a set of nodes, and g(u) as the cluster id that corresponds to node u ∈ [n]

  6. Joint alignment from pairwise differences • Goal: learn the latent function g : [n] → [k] • We obtain a noisy measurement of each pairwise difference: f̃(i, j) := (g(i) − g(j) + some iid noise) mod k.

  7. Joint alignment from pairwise differences Typical input to a multi-image alignment problem. We may compute pairwise noisy estimates of relative angles of rotation.

  8. Joint alignment from noisy pairwise differences Desired output

  9. Joint alignment from pairwise differences • Clusters: k groups, numbered {0, 1, …, k−1}, which we think of as being arranged modulo k • Cluster ids: g(u) refers to the cluster number associated with a vertex u • Query/measurement: when we query an edge e = (x, y), we obtain f̃(x, y) = (g(x) − g(y) + η_xy) mod k (1) where the additive noise values η_xy are i.i.d. random variables supported on {0, 1, …, k−1}. • Problem: Recover g (up to a cyclic offset) with high probability, using as few measurements as possible and as fast as possible.

  10. Noise probability distribution • When we query an edge e = (x, y), we obtain f̃(x, y) = (g(x) − g(y) + η_xy) mod k, where the additive noise values η_xy are i.i.d. random variables supported on {0, 1, …, k−1}: Pr[η_xy = i] = 1/k + δ, if i = 0; 1/k − δ/(k−1), for each i ≠ 0. (2) • We choose which pairs to query in a non-adaptive way. • We obtain a set of noisy measurements {f̃(i, j) = (g(i) − g(j) + noise) mod k}_{(i,j) ∈ Ω}, where Ω ⊆ ([n] choose 2) is a symmetric index set, wlog a set of pairs {i, j} with i < j.
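The noise model above is easy to simulate. A minimal sketch (my own helper name `make_oracle`, not from the talk): the oracle answers a query (x, y) with (g(x) − g(y) + η) mod k, where η = 0 with probability 1/k + δ and each nonzero value has probability 1/k − δ/(k−1).

```python
import random


def make_oracle(g, k, delta, seed=0):
    """Noisy difference oracle for latent labels g : [n] -> [k].

    Each query returns (g[x] - g[y] + eta) mod k, where
    Pr[eta = 0] = 1/k + delta, and the remaining mass
    1 - (1/k + delta) is split uniformly over the k - 1
    nonzero residues, i.e. 1/k - delta/(k-1) each.
    """
    rng = random.Random(seed)

    def query(x, y):
        if rng.random() < 1.0 / k + delta:
            eta = 0
        else:
            eta = rng.randrange(1, k)  # uniform over nonzero residues
        return (g[x] - g[y] + eta) % k

    return query
```

Sampling the same pair many times, the true difference appears with frequency about 1/k + δ, which is the bias the algorithms below exploit.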

  11. Remark Our MLE problem is a discrete, non-convex problem.

  12. Joint alignment - k = 2 - Let V = [n] be the set of items - (Unknown) g : V → {−1, +1} • Red (R = {v ∈ V(G) : g(v) = −1}) • Blue (B = {v ∈ V(G) : g(v) = +1}) - Observation: Define τ(u, v) = g(u)g(v) ∈ {±1} for any u, v ∈ V. Then, if τ(u, v) = −1, u is in a different cluster than v

  13. Joint alignment - k = 2 - Model: We can query any pair of nodes {u, v} once to get a noisy measurement of τ(u, v). The oracle returns • τ̃(u, v) = g(u)g(v)η_uv, where • η_uv ∈ {±1} is iid noise in the edge observations • E[η_uv] = δ for all pairs u, v ∈ V - Equivalently, each query receives the correct answer with probability 1 − q = 1/2 + δ/2, where q > 0 is the corruption probability. - Problem (k = 2): Recover g whp with as few queries to the oracle as possible.
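The k = 2 oracle can be sketched directly in the ±1 formulation (illustrative code, the helper name is my own): a query is correct with probability 1/2 + δ/2, which is exactly E[η] = δ.

```python
import random


def pm1_oracle(g, delta, seed=0):
    """k = 2 faulty oracle: returns tau(u, v) = g(u) g(v) times +-1 noise.

    The answer is correct with probability 1 - q = 1/2 + delta/2, so
    E[eta] = (+1)(1/2 + delta/2) + (-1)(1/2 - delta/2) = delta.
    """
    rng = random.Random(seed)

    def query(u, v):
        eta = 1 if rng.random() < 0.5 + delta / 2 else -1
        return g[u] * g[v] * eta

    return query
```

With δ = 0.6, for instance, each query is correct with probability 0.8, matching the "correct with probability 1 − q" view on the slide.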

  14. Related work – Overview

  15. Related Work – k = 2, Correlation Clustering • Correlation Clustering: given an undirected signed graph, partition the nodes into clusters so that the total number of disagreements is minimized [Bansal et al., 2004, Shamir et al., 2004] (NP-hard) • Excellent survey by Bonchi et al. [Bonchi et al., 2014] • Mathieu and Schudy initiated the study of noisy correlation clustering [Mathieu and Schudy, 2010] • complete information (all (n choose 2) signs) • cardinality constraints on clusters (Ω(√n))

  16. Related Work – k = 2, Planted Partition Planted Partition Model • Two groups (clusters) of nodes • A graph is generated as follows: edge probability is p within each cluster, and q < p across the clusters. • Problem: Recover the two clusters given such a graph. Results • If the two clusters are balanced, i.e., each cluster has Θ(n) nodes, then one can recover the clusters whp, see [McSherry, 2001, Vu, 2014, Abbe et al., 2016, Hajek et al., 2016].
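The planted partition generator can be sketched as follows (illustrative code; splitting [n] into the first and second halves is my own convention for the balanced case):

```python
import random


def planted_partition(n, p, q, seed=0):
    """Sample a graph from the two-cluster planted partition model.

    Nodes 0..n/2-1 form cluster A, the rest cluster B; each pair is
    connected independently with probability p within a cluster and
    q < p across the two clusters. Returns the edge list.
    """
    rng = random.Random(seed)
    half = n // 2
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            same = (u < half) == (v < half)
            if rng.random() < (p if same else q):
                edges.append((u, v))
    return edges
```

The recovery problem is then: given only the edge list (labels hidden), find the two clusters, which is possible whp in the balanced regime cited above.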

  17. Related Work – k = 2, # Queries as a function of the imbalance • Matrix completion techniques [Candès et al., 2006] can be used to predict signs of edges [Chiang et al., 2014] • γ = n / max_C |C|, where the max ranges over clusters C • The number of queries needed for exact recovery is O(γ⁴ n log² n) • Finally, Mazumdar and Saha study the case k = 2 and achieve recovery in poly-time using O(n log n / δ⁴) queries [Mazumdar and Saha, 2016] • State-of-the-art is due to [Tsourakakis, Mitzenmacher, Larsen, WebConf 2020]

  18. Related Work – k ≥ 3 • Joint alignment: Chen and Candès consider a setting similar to ours, and propose a projected power method to solve the non-convex maximum likelihood estimation problem [Chen and Candes, 2016].

  19. Related Work – k ≥ 3 • Chen and Candès formulate the problem as a constrained PCA problem, and show that a non-convex projected power method solves it with high probability when the queries form a random Erdős–Rényi graph.

  20. Related Work – k ≥ 3 • The Chen–Candès algorithm is non-adaptive, and the underlying queries form a random binomial graph • They show that, in the setting where queries form a random binomial graph, the minimax probability of error tends to 1 if the number of queries is below Ω(n log n / (kδ²)) • Their query complexity matches this lower bound • Previously, weaker results were obtained by Mitzenmacher and Tsourakakis.

  21. Older result (2018) – Mitzenmacher–T. We prove the following result. Our proof uses BFS as its subroutine. Theorem: There exists a polynomial-time algorithm that performs O(n^{1+o(1)}) queries and recovers g (up to some global offset) whp for any 1 − q = (1 + δ)/2, where 0 < δ ≤ 1 is any positive constant. • The o(1) term in the exponent is 1/log log n.

  22. Upper bound – Larsen–Mitzenmacher–T. (2019) Theorem 1 (extremely small bias): If (lg n/(nk))^{1/4} ≤ δ ≤ 1/(2k) and k ≤ n^{o(1)}, then there is a non-adaptive and deterministic query algorithm that makes O(n log n/(δ²k)) queries, runs in O(n log n/(δ²k)) time, and is correct whp. Theorem 2 (larger bias): If 1/(2k) ≤ δ ≤ 1/4 and k ≤ n^{o(1)}, then there is a non-adaptive and deterministic query algorithm that makes O(n log n/δ) queries, runs in O(n log n/δ) time, and is correct whp.

  23. Proposed algorithm – Step 1: O(n log n/(kδ²)) queries

  24. Proposed algorithm – Step 2: “grounding”

  25. Proposed algorithm – Learn {g(x)}_{x ∈ S} up to a cyclic offset

  26. Proposed algorithm – Learn {g(x)}_{x ∈ V \ S} up to (the same) cyclic offset

  27. Learning Joint Alignment with a Faulty Oracle 1 Choose S ⊆ V such that |S| = O(log n/(kδ²)) if 0 ≤ δ ≤ 1/(2k), and |S| = O(lg n/δ) if 1/(2k) ≤ δ ≤ 1/4. 2 Perform all queries between S and V \ S. 3 Fix a node s ∈ S and assign it the label ĝ(s) = 0. 4 For each s′ ∈ S \ {s}, compute an estimate μ_{s′} of (g(s′) − g(s)) mod k using the plurality vote among the queries {f̃(s′, b) − f̃(s, b)}_{b ∈ V \ S}, and assign s′ the label ĝ(s′) = μ_{s′}. 5 For each v ∈ V \ S, assign it a label corresponding to the result of the plurality vote among {ĝ(s) + f̃(v, s)}_{s ∈ S}.
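The five steps can be sketched in a few lines of Python (my own variable names; `query` stands for the faulty oracle f̃, and the anchor-set size is passed in directly rather than derived from δ and k):

```python
import random
from collections import Counter


def recover_alignment(n, k, anchor_size, query, seed=0):
    """Sketch of the five-step algorithm above (illustrative, not the
    authors' code). `query(x, y)` returns (g(x) - g(y) + noise) mod k.
    Returns labels g_hat, correct up to a global cyclic offset whp."""
    rng = random.Random(seed)
    nodes = list(range(n))
    S = rng.sample(nodes, anchor_size)      # Step 1: choose anchor set S
    S_set = set(S)
    rest = [v for v in nodes if v not in S_set]
    # Step 2: perform all queries between S and V \ S.
    f = {(s, b): query(s, b) for s in S for b in rest}
    g_hat = {}
    s0 = S[0]
    g_hat[s0] = 0                           # Step 3: ground one anchor
    for s in S[1:]:                         # Step 4: align anchors
        # Plurality vote over {f~(s, b) - f~(s0, b)}_{b in V \ S}.
        votes = Counter((f[(s, b)] - f[(s0, b)]) % k for b in rest)
        g_hat[s] = votes.most_common(1)[0][0]
    for v in rest:                          # Step 5: label everyone else
        # g_hat(s) - f~(s, v) is equivalent to g_hat(s) + f~(v, s),
        # since f~(v, s) = -f~(s, v) up to the (sign-symmetric) noise.
        votes = Counter((g_hat[s] - f[(s, v)]) % k for s in S)
        g_hat[v] = votes.most_common(1)[0][0]
    return g_hat
```

Feeding it a simulated noisy oracle with a reasonable bias δ recovers the true labeling up to one global cyclic shift, which is the best achievable since only differences are observed.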
