Euclidean Distance Geometry Leo Liberti IBM Research, USA CNRS LIX

NP-hardness
• Q is NP-hard if every problem in NP reduces to Q
• Q is NP-complete if it is NP-hard and is in NP
Why does it work?
(any P in NP) --[polytime reduction]--> (Q: how hard?)
• Suppose Q is easier than P
• Solve P by reducing it to Q in polytime and then solving Q
• Then P is as easy as Q, against the assumption
• ⇒ Q is at least as hard as P
So if Q is NP-hard it is at least as hard as any problem in NP ⇒ Q is as hard as the hardest problem in NP

NP-hardness proofs
Given a new problem Q, take any known NP-hard problem P and reduce it to Q
Why does it work?
(P: NP-hard) --[polytime reduction]--> (Q: how hard?)
• As before: Suppose . . . (etc.) ⇒ Q is at least as hard as P
• Since P is NP-hard, it is hardest in NP, and so is Q ⇒ Q is NP-hard

Complexity of the DGP
Outline:
1. Applications
2. Definition
3. Complexity primer
4. Complexity of the DGP
5. Number of solutions
6. Mathematical optimization formulations
7. Realizing complete graphs
8. The Branch-and-Prune algorithm
9. Symmetry in the KDMDGP
10. Tractability of protein instances
11. Finding vertex orders
12. Approximate realizations

DGP ∈ NP?
• NP: YES/NO problems with polytime-checkable proofs for YES
• DGP is a YES/NO problem
• DGP$_1$ ∈ NP, since $d_{uv} = |x_u - x_v|$ ⇒ ($d \in \mathbb{Q} \to x \in \mathbb{Q}$)
• Solutions might involve irrational numbers when $K > 1$
• Some empirical evidence that DGP ∉ NP [Beeker et al. 2013]

The DGP is NP-hard
Partition: given $a = (a_1, \ldots, a_n) \in \mathbb{N}^n$, is there $I \subseteq \{1, \ldots, n\}$ s.t. $\sum_{i \in I} a_i = \sum_{i \notin I} a_i$?
• Reduce (NP-hard) Partition to DGP$_1$
• $a \longrightarrow$ cycle $C$ with $V(C) = \{1, \ldots, n\}$, $E(C) = \{\{1,2\}, \ldots, \{n,1\}\}$
• For $i < n$ let $d_{i,i+1} = a_i$, and $d_{n,n+1} = d_{n1} = a_n$
• E.g. for $a = (1, 4, 1, 3, 3)$, get cycle graph:
[Figure: cycle on vertices 1 to 5 with edge weights 1, 4, 1, 3, 3]
[Saxe, 1979]

Partition is YES ⇒ DGP$_1$ is YES
• Given: $I \subset \{1, \ldots, n\}$ s.t. $\sum_{i \in I} a_i = \sum_{i \notin I} a_i$
• Construct: realization $x$ of $C$ in $\mathbb{R}$
  1. $x_1 = 0$ // start
  2. induction step: suppose $x_i$ known
     if $i \in I$ let $x_{i+1} = x_i + d_{i,i+1}$ // go right
     else $x_{i+1} = x_i - d_{i,i+1}$ // go left
• Correctness proof: by the same induction, but careful when $i = n$: have to show $x_{n+1} = x_1$

[Figure sequence, $I = \{1, 2, 3\}$: the points $x_1, \ldots, x_5$ are placed one at a time on a number line from 0 to 6, stepping right for $1, 2, 3 \in I$ and left for $4, 5 \notin I$; from left to right the points end up in the order $x_1, x_2, x_5, x_3, x_4$. The last frame asks: $x_6 = x_1$?]

Partition is YES ⇒ DGP$_1$ is YES
$$\sum_{i \in I} (x_{i+1} - x_i) = \sum_{i \in I} d_{i,i+1} = \sum_{i \in I} a_i \qquad (1)$$
$$= \sum_{i \notin I} a_i = \sum_{i \notin I} d_{i,i+1} = \sum_{i \notin I} (x_i - x_{i+1}) \qquad (2)$$
$$(1) = (2) \;\Rightarrow\; \sum_{i \in I} (x_{i+1} - x_i) = \sum_{i \notin I} (x_i - x_{i+1}) \;\Rightarrow\; \sum_{i \le n} (x_{i+1} - x_i) = 0$$
$$\Rightarrow\; (x_{n+1} - x_n) + (x_n - x_{n-1}) + \cdots + (x_3 - x_2) + (x_2 - x_1) = 0 \;\Rightarrow\; x_{n+1} = x_1$$
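The construction above can be traced in a few lines of Python. This is only a sketch: the function name `cycle_realization` and the convention that `I` holds 1-based indices are assumptions made for illustration.

```python
def cycle_realization(a, I):
    """Walk the cycle: step right by a_i if i is in I, left otherwise (i is 1-based)."""
    x = [0]  # x_1 = 0
    for i in range(1, len(a) + 1):
        step = a[i - 1]  # d_{i,i+1} = a_i
        x.append(x[-1] + step if i in I else x[-1] - step)
    return x  # x[0], ..., x[n]; the last entry plays the role of x_{n+1}


a = (1, 4, 1, 3, 3)
x = cycle_realization(a, I={1, 2, 3})
print(x)              # [0, 1, 5, 6, 3, 0]
print(x[-1] == x[0])  # True: the cycle closes, x_{n+1} = x_1
```

With an index set that does not split the sum evenly, e.g. `I={1, 2}`, the last point does not return to `x_1`.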

Partition is NO ⇒ DGP$_1$ is NO
• By contradiction: suppose DGP$_1$ is YES, $x$ realization of $C$
• $F = \{\{u,v\} \in E(C) \mid x_u \le x_v\}$, $E(C) \setminus F = \{\{u,v\} \in E(C) \mid x_u > x_v\}$
• Trace $x_1, \ldots, x_n$: follow edges in $F$ (→) and in $E(C) \setminus F$ (←)
$$\sum_{\{u,v\} \in F} (x_v - x_u) = \sum_{\{u,v\} \notin F} (x_u - x_v)$$
$$\sum_{\{u,v\} \in F} |x_u - x_v| = \sum_{\{u,v\} \notin F} |x_u - x_v|$$
$$\sum_{\{u,v\} \in F} d_{uv} = \sum_{\{u,v\} \notin F} d_{uv}$$
[Figure: points $x_4, x_1, x_5, x_3, x_2$ on a number line from -3 to 3]
• Let $J = \{i < n \mid \{i, i+1\} \in F\} \cup \{n \mid \{n, 1\} \in F\}$ ⇒ $\sum_{i \in J} a_i = \sum_{i \notin J} a_i$
• So $J$ solves the Partition instance, contradiction
• ⇒ DGP is NP-hard, DGP$_1$ is NP-complete
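The equivalence can be sanity-checked by brute force on tiny instances. The sketch below enumerates all left/right patterns along the cycle and keeps those that close it; the function name `closing_index_sets` is an assumption, and the enumeration is exponential in $n$, so it illustrates the argument rather than providing a method.

```python
from itertools import product


def closing_index_sets(a):
    """Return every J (1-based) such that stepping right on J and left elsewhere closes the cycle."""
    found = []
    for signs in product((1, -1), repeat=len(a)):
        # The cycle closes iff the signed steps sum to zero,
        # i.e. iff the right-steps and left-steps have equal total length.
        if sum(s * ai for s, ai in zip(signs, a)) == 0:
            found.append({i + 1 for i, s in enumerate(signs) if s == 1})
    return found


print(closing_index_sets((1, 4, 1, 3, 3)))  # two sets: {1, 2, 3} and its complement {4, 5}
print(closing_index_sets((1, 2, 4)))        # []: Partition is NO, so no realization closes
```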

Number of solutions

With congruences
• $(G, K)$: DGP instance
• $\tilde{X} \subseteq \mathbb{R}^{Kn}$: set of solutions
• Congruence: composition of translations, rotations, reflections
• $C$ = set of congruences in $\mathbb{R}^K$
• $x \sim y$ means $\exists \rho \in C\ (y = \rho x)$: distances in $x$ are preserved in $y$ through $\rho$
• ⇒ if $|\tilde{X}| > 0$, $|\tilde{X}| = 2^{\aleph_0}$

Modulo congruences
• Congruence is an equivalence relation $\sim$ on $\tilde{X}$ (reflexive, symmetric, transitive)
• $\sim$ partitions $\tilde{X}$ into equivalence classes
• $X = \tilde{X}/\!\sim$: set of representatives of equivalence classes
• Focus on $|X|$ rather than $|\tilde{X}|$

Cardinality of X
• infeasible ⇔ $|X| = 0$
• rigid graph ⇔ $|X| < \aleph_0$
• globally rigid graph ⇔ $|X| = 1$
• flexible graph ⇔ $|X| = 2^{\aleph_0}$
• $|X| = \aleph_0$: impossible by Milnor's theorem

Milnor's theorem implies $|X| \neq \aleph_0$
• System $S$ of polynomial equations of degree $d$: $\forall i \le m \quad p_i(x_1, \ldots, x_{nK}) = 0$
• Let $X$ be the set of $x \in \mathbb{R}^{nK}$ satisfying $S$
• Number of connected components of $X$ is $\le d(2d-1)^{nK-1}$ [Milnor 1964]
• If $|X|$ is countable then $G$ cannot be flexible
  ⇒ incongruent elements of $X$ are separate connected components
  ⇒ by Milnor's theorem, there are finitely many of them
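To get a feel for the size of the bound: the DGP equations $\|x_u - x_v\|^2 = d_{uv}^2$ are polynomials of degree $d = 2$, so the bound reads $2 \cdot 3^{nK-1}$. The helper name `milnor_bound` below is an assumption for illustration.

```python
def milnor_bound(d, n, K):
    """Upper bound d(2d-1)^(nK-1) on the number of connected components."""
    return d * (2 * d - 1) ** (n * K - 1)


# Degree-2 system for n = 4 points in the plane (K = 2): 2 * 3^7
print(milnor_bound(2, 4, 2))  # 4374
```

The bound is finite for every finite $n$ and $K$, which is all the argument needs; it is not meant as a sharp estimate of $|X|$.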

Examples
• $V^1 = \{1, 2, 3\}$, $E^1 = \{\{u,v\} \mid u < v\}$, $d^1 = 1$
  $\rho$ congruence in $\mathbb{R}^2$ ⇒ $\rho x$ valid realization
  $|X| = 1$
  [Figure: triangle $x_1, x_2, x_3$]
• $V^2 = V^1 \cup \{4\}$, $E^2 = E^1 \cup \{\{1,4\}, \{2,4\}\}$, $d^2 = 1 \wedge d_{14} = \sqrt{2}$
  $\rho$ reflects $x_4$ w.r.t. $x_1, x_2$ ⇒ $\rho x$ valid realization
  $|X| = 2$
  [Figure: the two realizations on $x_1, x_2, x_3, x_4$]
• $V^3 = V^2$, $E^3 = \{\{u, u+1\} \mid u \le 3\} \cup \{\{1,4\}\}$, $d^1 = 1$
  $\rho$ rotates $x_2 x_3$, $x_1 x_4$ by $\theta$ ⇒ $\rho x$ valid realization
  $|X|$ is uncountable
  [Figure: a family of realizations on $x_1, x_2, x_3, x_4$]

Mathematical optimization formulations

System of quadratic constraints
$$\forall \{u,v\} \in E \quad \|x_u - x_v\|^2 = d_{uv}^2$$
• Solvable directly only up to around 10 vertices
• Computationally useless

Quadratic objective
$$\min_{x \in \mathbb{R}^{nK}} \sum_{\{u,v\} \in E} \left( \|x_u - x_v\|^2 - d_{uv}^2 \right)^2$$
• Globally optimal value zero iff $x$ is a realization of $G$
• sBB (spatial Branch-and-Bound): 10-100 vertices, exact solutions
• heuristics: 100-1000 vertices, poor quality
[Lavor et al., 2006]
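Evaluating this objective is straightforward; the difficulty lies in minimizing it globally. A minimal sketch, assuming positions are stored in a dict `x` keyed by vertex and distances in a dict `dist` keyed by edge (both layouts and the name `dgp_penalty` are illustrative choices):

```python
def dgp_penalty(x, dist):
    """Sum over edges {u, v} of (||x_u - x_v||^2 - d_uv^2)^2."""
    total = 0.0
    for (u, v), d in dist.items():
        sq = sum((p - q) ** 2 for p, q in zip(x[u], x[v]))
        total += (sq - d * d) ** 2
    return total


# A 3-4-5 right triangle in the plane
dist = {(1, 2): 3, (2, 3): 4, (1, 3): 5}
good = {1: (0, 0), 2: (3, 0), 3: (3, 4)}
bad = {1: (0, 0), 2: (3, 0), 3: (3, 5)}
print(dgp_penalty(good, dist))  # 0.0: a realization
print(dgp_penalty(bad, dist))   # 162.0: two violated edges
```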

Convexity and concavity
$$\max_{x \in \mathbb{R}^{nK}} \sum_{\{u,v\} \in E} \|x_u - x_v\|^2 \quad \text{s.t.} \quad \forall \{u,v\} \in E \quad \|x_u - x_v\|^2 \le d_{uv}^2$$
• Convex constraints, concave objective
• Computationally no better than "quadratic objective"

Pointwise reformulation
$$\max_{x \in \mathbb{R}^{nK}} \sum_{\{u,v\} \in E,\ k \le K} \theta_{uvk} (x_{uk} - x_{vk}) \quad \text{s.t.} \quad \forall \{u,v\} \in E \quad \|x_u - x_v\|^2 \le d_{uv}^2$$
• Convex subproblem in stochastic iterative heuristics: "guess $\theta$ and solve"
• 100-1000 vertices, good quality
[L. IOS14/MAGO14(slides)]

SDP formulation
$$\min_{X \succeq 0} \sum_{\{u,v\} \in E} (X_{uu} + X_{vv} - 2X_{uv}) \quad \text{s.t.} \quad \forall \{u,v\} \in E \quad X_{uu} + X_{vv} - 2X_{uv} \ge d_{uv}^2$$
• Similar to those of Ye, Wolkowicz; works better for proteins
• 100 vertices, good quality
[D'Ambrosio et al., in progress]

Realizing complete graphs

Cliques
[Figure: a 2-clique, a 3-clique and a 4-clique]
$(K+1)$-clique = $K$-clique ⊕ a vertex

Triangulation
[Figure: triangle on vertices 1, 2, 3 with side lengths 1, 1, 2, mapped to the points 0, 1, 2 on a line]
Example: realize triangle on a line
• From $\|x_3 - x_1\| = 2$ and $\|x_3 - x_2\| = 1$ get
$$x_3^2 - 2x_1 x_3 + x_1^2 = 4 \qquad (1)$$
$$x_3^2 - 2x_2 x_3 + x_2^2 = 1 \qquad (2)$$
• (2) − (1) yields $2x_3(x_1 - x_2) = x_1^2 - x_2^2 - 3$ ⇒ (with $x_1 = 0$, $x_2 = 1$) $2x_3 = 4$
• Hence $x_3 = 2$

Realizing a $(K+1)$-clique in $\mathbb{R}^{K-1}$
• Apply triangulation inductively on $K$: assume $x_1, \ldots, x_K \in \mathbb{R}^{K-1}$ known, compute $y = x_{K+1}$
• $K$ quadratic eqns ($\forall j \le K\ \|y - x_j\|^2 = d_{j,K+1}^2$) in $K-1$ vars
$$\|y\|^2 - 2x_1 \cdot y + \|x_1\|^2 = d_{1,K+1}^2 \qquad [1]$$
$$\vdots$$
$$\|y\|^2 - 2x_K \cdot y + \|x_K\|^2 = d_{K,K+1}^2 \qquad [K]$$
• Form system $\forall j < K$ ($[j] - [K]$)
$$2(x_1 - x_K) \cdot y = \|x_1\|^2 - \|x_K\|^2 - d_{1,K+1}^2 + d_{K,K+1}^2 \qquad [1] - [K]$$
$$\vdots$$
$$2(x_{K-1} - x_K) \cdot y = \|x_{K-1}\|^2 - \|x_K\|^2 - d_{K-1,K+1}^2 + d_{K,K+1}^2 \qquad [K-1] - [K]$$
• This is a $(K-1) \times (K-1)$ linear system $Ay = b$
Solve to find $y$
[Dong, Wu 2002]

“Solve”?
1. What if A is singular?
2. Or: A nonsingular but instance is NO

Singularity: $\mathrm{rk}\, A = K - 2$
One row $x_j - x_K$ of $A$ depends on the others
• triangle in $\mathbb{R}^1$ ($K = 2$): $x_1 - x_2 = 0$, i.e. $x_1 = x_2$; where is $x_3$?
• 4-clique in $\mathbb{R}^2$ ($K = 3$): $x_1, x_2, x_3$ on a line; where is $x_4$?
• 5-clique in $\mathbb{R}^3$ ($K = 4$): $x_1, \ldots, x_4$ in a plane; where is $x_5$?
[Figures: in each case two candidate positions for the last vertex]
Trend continues: $\mathrm{rk}\, A = K - 2$ ⇒ $|X| = 2$ (see later)

Singularity: $\mathrm{rk}\, A = K - 3$
Two rows $x_j - x_K$ depend on the others
• 4-clique in $\mathbb{R}^2$ ($K = 3$): $x_1 = x_2 = x_3$; where is $x_4$?
• 5-clique in $\mathbb{R}^3$ ($K = 4$): $x_1, \ldots, x_4$ on a line; where is $x_5$?
[Figures: in each case the last vertex can move along a circle]
Trend continues: [Hendrickson, 1992] Thm. 5.8. If a graph $G$ is connected, flexible and has more than $K$ vertices, $X$ almost always contains a submanifold diffeomorphic to a circle

Hendrickson’s theorem also applies to non-cliques

Nonsingular matrix A with NO instance
• Infeasible quadratic system $\forall j \le K \quad \|x_j - x_{K+1}\|^2 = d_{j,K+1}^2$
• Take differences, get nonsingular $A$ and a value for $x_{K+1}$
• . . . but it's wrong! Shit happens!
Every time you solve the linear system $Ay = b$, check feasibility with the quadratic system
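The difference system, the singular case and the feasibility re-check can be sketched together. Assumptions in this sketch: the helper names `solve_linear` and `next_vertex` are hypothetical, distances are passed squared, and a plain Gauss-Jordan elimination with a numerical tolerance stands in for whatever linear solver one would actually use.

```python
def solve_linear(A, b, tol=1e-12):
    """Gauss-Jordan elimination with partial pivoting; None if A is numerically singular."""
    n = len(A)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < tol:
            return None
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def next_vertex(points, sqdist, tol=1e-9):
    """points: K known positions in R^(K-1); sqdist[j]: squared distance from y to points[j]."""
    *rest, xK = points
    dK = sqdist[-1]
    normK = sum(c * c for c in xK)
    # Rows [j] - [K]: 2 (x_j - x_K) . y = |x_j|^2 - |x_K|^2 - d_j^2 + d_K^2
    A = [[2 * (cj - cK) for cj, cK in zip(xj, xK)] for xj in rest]
    b = [sum(c * c for c in xj) - normK - dj + dK for xj, dj in zip(rest, sqdist)]
    y = solve_linear(A, b)
    if y is None:
        return None  # A singular
    # The linear system is only a necessary condition: re-check the quadratic equations
    for p, d in zip(points, sqdist):
        if abs(sum((yc - pc) ** 2 for yc, pc in zip(y, p)) - d) > tol:
            return None  # A nonsingular, but the instance is NO
    return y


print(next_vertex([(0,), (1,)], [4, 1]))                # [2.0]: the triangle on a line
print(next_vertex([(0, 0), (1, 0), (0, 1)], [2, 1, 1]))  # [1.0, 1.0]
print(next_vertex([(0,), (1,)], [4, 4]))                # None: linear solution 0.5 fails the check
```

The last call shows the situation on this slide: the differences give a nonsingular system with solution 0.5, but that point is not at distance 2 from either known vertex.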

Algorithm for realizing complete graphs in $\mathbb{R}^K$
• Assume:
  (i) $G = (V, E)$ complete
  (ii) $|V| = n \ge K + 2$
  (iii) we know $x_1, \ldots, x_{K+1}$
• Increase $K$ by one in the previous construction: from $K+1$ known points we know how to realize $x_{K+2}$ in $\mathbb{R}^K$
• Use this inductively for each $i \in \{K+2, \ldots, n\}$

Algorithm for realizing complete graphs in $\mathbb{R}^K$
  // realize next vertex iteratively
  for $i \in \{K+2, \ldots, n\}$ do
    // use $(K+1)$ immediate adjacent predecessors to compute $x_i$
    if $\mathrm{rk}\, A = K$ then
      $x_i = A^{-1} b$ // A, b defined as above
    else
      $x_i = \infty$ // A singular, mark ∞ and exit
      break
    end if
    // check that $x_i$ is feasible w.r.t. other distances
    for $\{j \in N(i) \mid j < i\}$ do
      if $\|x_i - x_j\| \neq d_{ij}$ then
        // if not, mark infeasible and exit loop ∗
        $x_i = \emptyset$
        break
      end if
    end for
    if $x_i = \emptyset$ then
      break
    end if
  end for
  return $x$
∗ the "ignore trouble" policy, a.k.a. "ignore probability zero events"

Complexity of Alg. 1
• Outer loop: $O(n)$
• Rank and inverse of $A$: $O(K^3)$
• Inner loop: $O(n)$
• Get $O(n^2 K^3)$
• But in most applications $K$ is fixed
• Get $O(n^2)$
But how do we find the realization of the first $K+1$ vertices?

Realizing $(K+1)$-cliques in $\mathbb{R}^K$
• Realizing $(K+1)$-cliques in $\mathbb{R}^{K-1}$ yields "flat simplices" (e.g. triangles on lines)
• Use "natural" embedding dimension $\mathbb{R}^K$
• Same reasoning as above: get system $Ay = b$ where $y = x_{K+1}$ and $A_j = 2(x_j - x_K)$
• But now $A$ is $(K-1) \times K$
• Same as previous case with $A$ singular

Almost square
How can you solve the following system $Ay = b$:
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ \vdots & \vdots & \ddots & \vdots \\ a_{K-1,1} & a_{K-1,2} & \cdots & a_{K-1,K} \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_K \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_{K-1} \end{pmatrix}$$
where $A$ has one more column than rows and rank $K-1$?

Basics and nonbasics
• Since $\mathrm{rk}\, A = K-1$, $\exists\ K-1$ linearly independent columns
• $\mathcal{B}$: set of their indices
• $\mathcal{N}$: index of the remaining column
• $B$: $(K-1) \times (K-1)$ square matrix of columns in $\mathcal{B}$
• ⇒ $B$ is nonsingular
• Can partition columns as $A = (B \mid N)$
Column $j$ corresponds to variable $y_j$
• Variables $y_B$ are called basic variables
• Variable $y_N$ is called the nonbasic variable

The dictionary
$$(B \mid N)\, y = b \;\Rightarrow\; B y_B + N y_N = b \;\Rightarrow\; y_B = B^{-1} b - B^{-1} N y_N$$
Basics expressed as a function of the nonbasic

One quadratic equation
• From value of $y_N$, can use dictionary to get $y$
• Use one quadratic equation
  1. Pick any $h \in \{1, \ldots, K-1\}$, equation is $\|x_h - y\|_2^2 = d_{h,K+1}^2$
  2. $y = (y_B \mid y_N)^\top$
  3. Replace $y_B$ with $B^{-1} b - B^{-1} N y_N$ in equation
  4. Solve resulting quadratic equation in one variable $y_N$
  5. Get 0, 1 or 2 values for $y_N$
  6. ⇒ Get 0, 1 or 2 positions for $x_{K+1}$

What if $B^{-1} N$ is zero?
• $y_B = B^{-1} b - B^{-1} N y_N$ reduces to $y_B = B^{-1} b$
• Use one quadratic equation
  1. Pick any $h \in \{1, \ldots, K-1\}$, equation is $\|x_h - y\|_2^2 = d_{h,K+1}^2$
  2. $y = (y_B \mid y_N)^\top$
  3. Replace $y_B$ with $B^{-1} b$ in equation
  4. Solve resulting quadratic equation in one variable $y_N$
  5. Get 0, 1 or 2 values for $y_N$
  6. ⇒ Get 0, 1 or 2 positions for $x_{K+1}$

The difference
• $B^{-1} N \neq 0$: $y_N \xrightarrow{\text{dictionary}} y_B$ → $y^+, y^-$ with different components
• Different values $y_N^+ \neq y_N^-$
• $B^{-1} N = 0$: $y_B \xrightarrow{\text{quadratic eqn.}} y_N$
• Even if $y_N^+ \neq y_N^-$, $K-1$ components of $y^+, y^-$ are equal
$\mathrm{aff}(x_1, \ldots, x_{K-1}) = \{y \in \mathbb{R}^K \mid y_N = 0\}$

The case of no solutions
• No realizations exist for this $(K+1)$-clique in $\mathbb{R}^K$
• DGP instance is NO

The case of one solution
• Assume for simplicity: $N = K$, $h = 1$, $B^{-1} N \neq 0$
Then $\|x_h - y\|^2 = d_{h,K+1}^2$ becomes
$$\lambda y_K^2 - 2\mu y_K + \nu = 0,$$
where
$$\lambda = 1 + \sum_{\ell, j < K} \beta_{\ell j}^2 a_{jK}^2$$
$$\mu = x_{1K} + \sum_{\ell, j < K} \beta_{\ell j} a_{jK} (\beta_{\ell j} b_\ell - x_{1\ell})$$
$$\nu = \sum_{\ell, j < K} \beta_{\ell j} b_\ell (\beta_{\ell j} b_\ell - 2x_{1\ell}) + \|x_1\|^2 - d_{1,K+1}^2$$
• (Exactly one solution for $y_K$) ⇔ $\mu^2 = \lambda\nu$, not a tautology
• The set of all $(K+1)$-clique DGP instances in $\mathbb{R}^K$ s.t. $\mu^2 = \lambda\nu$ has Lebesgue measure 0
• Ignore them, they happen with probability∗ 0!
∗ Assuming continuous distributions over the reals. For floating point numbers, who knows? . . . but we'll ignore these instances anyhow

Discriminant > 0, = 0
[Figure: drawings on vertices 1 to 4 illustrating the two cases]

The case of two solutions
• $K$ spheres $S_1^{K-1}, \ldots, S_K^{K-1}$ in $\mathbb{R}^K$ centered at $x_1, \ldots, x_K$ with radii $d_{1,K+1}, \ldots, d_{K,K+1}$
• $x_{K+1}$ must be at the intersection of $S_1^{K-1}, \ldots, S_K^{K-1}$
• If $\bigcap_j S_j^{K-1} \neq \emptyset$, then $|\bigcap_j S_j^{K-1}| = 2$ in general
• will not mention "probability 0" or "in general" anymore
[Coope 2000]

Mirror images
• Let $x^+ = \{x_1, \ldots, x_K, x_{K+1}^+\}$, $x^- = \{x_1, \ldots, x_K, x_{K+1}^-\}$
assume $\dim \mathrm{aff}(x_1, \ldots, x_K) = K - 1$ (†)
• Theorem: $x^+, x^- \in \mathbb{R}^K$ are reflections w.r.t. the hyperplane defined by $x_1, \ldots, x_K$
• Proof
  1. $x^+, x^-$ congruent by construction
  2. $\forall i \le K\ x_i \in x^+ \cap x^-$ → $x^+, x^-$ not translations
  3. $|x^+ \cap x^-| = K < |x^+| = |x^-|$ → $x^+, x^-$ not rotations by (†)
  4. ⇒ must be reflections

Algorithm for realizing $(K+1)$-cliques in $\mathbb{R}^K$
  // realize 1 at the origin
  $x_1 = (0, \ldots, 0)$
  // realize next vertex iteratively
  for $\ell \in \{2, \ldots, K+1\}$ do
    // at most two positions in $\mathbb{R}^{\ell-1}$ for vertex $\ell$
    $S = \bigcap_{i < \ell} S_i^{\ell-2}$
    if $S = \emptyset$ then
      // warn if infeasible
      return $\emptyset$
    end if
    // arbitrarily choose one of the two points
    choose any $x_\ell \in S$
  end for
  // return feasible realization
  return $x$
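For $K = 2$ the algorithm amounts to placing a triangle in the plane. The sketch below is one way to write it, under stated assumptions: the name `realize_triangle` is hypothetical, $d_{12} > 0$, vertex 2 is put on the positive first axis (fixing one of its two possible positions), and both candidates for vertex 3 are returned instead of choosing one.

```python
import math


def realize_triangle(d12, d13, d23):
    """Return the realizations (0, 1 or 2) of a 3-clique in R^2 as tuples (x1, x2, x3)."""
    x1 = (0.0, 0.0)  # realize 1 at the origin
    x2 = (d12, 0.0)  # one of the two points at distance d12 from x1 on a line
    # Difference of the two sphere equations is linear in the first coordinate of x3
    # (here the nonbasic column of A is zero, so B^{-1} N = 0):
    a = (d13 ** 2 - d23 ** 2 + d12 ** 2) / (2 * d12)
    # One quadratic equation left for the second coordinate: h^2 = d13^2 - a^2
    h2 = d13 ** 2 - a ** 2
    if h2 < 0:
        return []  # the two circles do not meet: instance is NO
    h = math.sqrt(h2)
    if h == 0:
        return [(x1, x2, (a, 0.0))]  # discriminant zero: flat triangle
    return [(x1, x2, (a, h)), (x1, x2, (a, -h))]  # mirror images


for sol in realize_triangle(3, 5, 4):
    print(sol[2])  # (3.0, 4.0) then (3.0, -4.0)
```

The two returned realizations differ only in the sign of the last coordinate of $x_3$, i.e. they are reflections across the line through $x_1$ and $x_2$.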

Complexity of Alg. 2
• Outer loop: $O(K)$
• Gaussian elimination on $A$: $O(K^3)$
• Some messing about to obtain $x_{K+1}^+, x_{K+1}^-$: $+\,O(K^2)$
• Get $O(K^4)$
• But in most applications $K$ is fixed
• Get $O(1)$

Back to complete graphs
• Alg. 2: realize $1, \ldots, K+1$ in $\mathbb{R}^K$: $O(1)$
• Alg. 1: realize $K+2, \ldots, n$: $O(n^2)$
• ⇒ $O(n^2)$
• What about $|X|$?
  – Alg. 1 is deterministic: one solution from $x_1, \ldots, x_{K+1}$
  – Alg. 2 is stochastic: pick one of two values $K$ times
  ⇒ $|X| = 2^K$

K-trilaterative graphs
• In Alg. 1 we only need each $v > K+1$ to have $K+1$ adjacent predecessors in order to find a unique solution for $x_v$
• Determination of $x_v$ from $K+1$ adjacent predecessors: K-trilateration
• K-trilaterative graph:
  (i) has a vertex order ensuring this property
  (ii) the initial $K+1$ vertices induce a $(K+1)$-clique
  the order is called a K-trilateration order
• Alg. 1 realizes all K-trilaterative graphs
The DGP restricted to K-trilaterative graphs in $\mathbb{R}^K$ is easy
[Eren et al. 2004]
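Checking whether a given vertex order is a K-trilateration order follows directly from conditions (i) and (ii). A small sketch, where the function name `is_trilateration_order` and the edge-list input format are illustrative assumptions; finding such an order, as opposed to checking one, is a separate question not addressed here.

```python
from itertools import combinations


def is_trilateration_order(order, edges, K):
    """True iff the first K+1 vertices form a clique and every later vertex
    has at least K+1 adjacent predecessors in the given order."""
    E = {frozenset(e) for e in edges}
    # (ii) initial (K+1)-clique
    if any(frozenset(p) not in E for p in combinations(order[:K + 1], 2)):
        return False
    # (i) K+1 adjacent predecessors for each v beyond the initial clique
    for pos in range(K + 1, len(order)):
        v = order[pos]
        if sum(frozenset((u, v)) in E for u in order[:pos]) < K + 1:
            return False
    return True


edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
print(is_trilateration_order([1, 2, 3, 4], edges, K=1))  # True
print(is_trilateration_order([1, 2, 3, 4], edges, K=2))  # False: vertex 4 has only 2 adjacent predecessors
```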

The story so far
• Lots of nice applications
• DGP is NP-hard
• May have 0, 1, finitely many or $2^{\aleph_0}$ solutions modulo congruences
• Continuous optimization techniques don't scale well
• Using $K+1$ adjacent predecessors, realize K-trilaterative graphs in $\mathbb{R}^K$ in polytime
• Do we need $K+1$ adjacent predecessors, or can we do with fewer?

The Branch-and-Prune algorithm

Fewer adjacent predecessors
• Alg. 2 only needs $K$ adjacent predecessors
• Extend to $n$ vertices: $(K-1)$-trilaterative graphs
• Can we realize $(K-1)$-trilaterative graphs in $\mathbb{R}^K$?
• A small case: graph consisting of two $(K+1)$-cliques
[Figure: two 4-cliques on vertices 1 to 5, shown in several drawings]

Take a closer look. . .
[Figure: the clique on vertices 2, 3, 4, 5, with two positions for vertex 5]
• Realization of a $(K+1)$-clique in $\mathbb{R}^K$ knowing $x_1, \ldots, x_K$
• We know how to do that!
• Consistent with 2 solutions for $x_5$, reflected across the plane through $x_2, x_3, x_4$

Discretization and pruning edges
• $(K-1)$-trilaterative graph $G = (V, E)$:
$$\forall v > K\ \exists U_v \subset V\ \big(|U_v| = K \wedge \forall u \in U_v\ (u < v \wedge \{u,v\} \in E)\big)$$
• Discretization edges:
$$E_D = \underbrace{\{\{u,v\} \in E \mid u, v \le K\}}_{\text{initial clique}} \cup \underbrace{\{\{u,v\} \in E \mid v > K \wedge u \in U_v\}}_{\text{vertex order}}$$
• Pruning edges: $E_P = E \setminus E_D$
[Figure: example graph on vertices 1 to 19]

Role of discretization edges
• Missing discretization edge ⇒ non-rigid structure ⇒ $X$ uncountable
• Else: $X$ finite

Role of pruning edges
No pruning edges: 8 incongruent realizations in $\mathbb{R}^2$
[Figure: the 8 realizations of a graph on vertices 1 to 5]

Role of pruning edges
Pruning edge $\{1, 4\}$: only 4 realizations remain valid
[Figure: the same 8 realizations of the graph on vertices 1 to 5, of which 4 remain valid]

Motivation
Protein backbones
• Total order $<$ on $V$
• Covalent bond distances: $\{u-1, u\} \in E$
• Covalent bond angles: $\{u-2, u\} \in E$
• NMR experiments: $\{u-3, u\} \in E$ (and other edges $\{u,v\}$ with $v - u > 3$)
Generalize "3" to $K$
[Lavor et al., COAP 2012]

KDMDGP graphs
[Figure: example graphs for K = 2 and K = 3 on vertices 1 to 5]
Generalization of protein backbone order: $v > K$ is adjacent to its $K$ immediate predecessors $v-1, \ldots, v-K$
KDMDGP: Discretizable Molecular Distance Geometry Problem

The Branch-and-Prune (BP) algorithm
BP($v$, $\bar{x}$, $X$):
  1. Given $v > K$, realization $\bar{x} = (x_1, \ldots, x_{v-1})$
  2. Compute $S = \bigcap_{u \in U_v} S_u^{K-1}$
  3. For each $x_v \in S$ s.t. $\forall \{u,v\} \in E_P\ (u < v \to \|x_u - x_v\| = d_{uv})$
     (a) let $x = (\bar{x}, x_v)$
     (b) if $v = n$ add $x$ to $X$, else call BP($v+1$, $x$, $X$)
• Recursive: starts with BP($K+1$, $(x_1, \ldots, x_K)$, $\emptyset$)
• All realizations in $X$ are incongruent∗
• Can be easily modified to find only $p$ solutions for given $p$
• Applies to all $(K-1)$-trilaterative graphs in $\mathbb{R}^K$
• Specialize to KDMDGP graph by setting $U_v = \{v-1, \ldots, v-K\}$
∗ with probability 1, and aside from one reflection at $v = K+1$
[L. et al. ITOR 2008]
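A compact sketch of BP for the KDMDGP with $K = 2$, where the sphere intersection at Step 2 is the intersection of two circles. Assumptions: the names `circle_intersections` and `branch_and_prune`, the dict-based data layout, and the numerical tolerances are all illustrative choices, and degenerate inputs (coincident predecessors) are simply treated as infeasible.

```python
import math


def circle_intersections(p, r, q, s, eps=1e-9):
    """Points at distance r from p and s from q (0, 1 or 2 of them)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    d = math.hypot(dx, dy)
    if d < eps:
        return []
    a = (r * r - s * s + d * d) / (2 * d)
    h2 = r * r - a * a
    if h2 < -eps:
        return []
    h = math.sqrt(max(h2, 0.0))
    mx, my = p[0] + a * dx / d, p[1] + a * dy / d
    pts = [(mx - h * dy / d, my + h * dx / d)]
    if h > eps:
        pts.append((mx + h * dy / d, my - h * dx / d))
    return pts


def branch_and_prune(n, dist, x1, x2, tol=1e-6):
    """dist[(u, v)] with u < v; edges (v-1, v), (v-2, v) discretize, all others prune."""
    X = []

    def bp(v, x):
        S = circle_intersections(x[v - 1], dist[(v - 1, v)], x[v - 2], dist[(v - 2, v)])
        for cand in S:
            # Step 3: keep cand only if it satisfies every pruning edge ending at v
            if all(abs(math.dist(x[u], cand) - d) <= tol
                   for (u, w), d in dist.items() if w == v and u < v - 2):
                y = {**x, v: cand}
                if v == n:
                    X.append(y)
                else:
                    bp(v + 1, y)

    bp(3, {1: x1, 2: x2})
    return X


dist = {(1, 2): 1.0, (2, 3): 1.0, (1, 3): 2 ** 0.5, (3, 4): 1.0, (2, 4): 2 ** 0.5}
print(len(branch_and_prune(4, dist, (0.0, 0.0), (1.0, 0.0))))  # 4 leaves, no pruning edges
dist[(1, 4)] = 5 ** 0.5  # add a pruning edge
print(len(branch_and_prune(4, dist, (0.0, 0.0), (1.0, 0.0))))  # 2 leaves remain
```

In this run the branching at $v = K + 1 = 3$ is the reflection mentioned in the footnote, so the leaves come in mirror-image pairs.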

Complexity of BP
• Most operations are $O(K^h)$ for some fixed $h$ ⇒ $O(1)$
• Distance check at Step 3: $O(n)$
• Recursion on at most 2 branches at each call: binary tree
• Only recurse when $v > K$, $v < n$: $2^{n-K}$ nodes
• Overall $O(n\, 2^{n-K}) = O(2^n)$
Worst-case exponential behaviour

Hardness of KDMDGP
• The KDMDGP is NP-hard for each $K$
  – every DGP instance is also DMDGP if $K = 1$
  – reduction from Partition can be extended to any $K$
• The DGP on $(K-1)$-trilaterative graphs is NP-hard by inclusion
• No polytime algorithm unless P = NP
Trilaterative graphs in $\mathbb{R}^K$ are complexity-wise borderline at $K$: K-trilaterative graphs are easy, $(K-1)$-trilaterative graphs are NP-hard

Correctness
Thm. When BP terminates, $X$ contains every incongruent realization of $G$
Proof.
• Let $\bar{y}$ be any realization of $G$
• Since $G$ has an initial $K$-clique, can rotate/translate/reflect $\bar{y}$ to $y$ s.t. $y[K] = x[K]$ for all $x \in X$
• BP exhaustively constructs every extension of $x[K]$ which is feasible with all distances, so $y \in X$
For a realization $y$, $y[h] = (y_1, \ldots, y_h)$ is the initial segment of $y$

Two examples

Empirical observations
• Fast: up to 10k vertices in a few seconds on 2010 hardware
• Precise: errors in range $O(10^{-9})$ to $O(10^{-12})$
• Number of solutions always a power of 2: obvious if $E_P = \emptyset$, but otherwise mysterious
• Linear-time behaviour on proteins: this really shouldn't happen

Symmetry in the KDMDGP
[L. et al. DAM 2014]

Partial reflections
• For each $v > K$, let $g_v(x) = (x_1, \ldots, x_{v-1}, R_x^v(x_v), \ldots, R_x^v(x_n))$ be the partial reflection of $x$ w.r.t. $v$
[Figure: reflection through $x_{v-3}, x_{v-2}, x_{v-1}$]
• Note: the $g_v$'s are involutions ($g_v \circ g_v = I$)
• $G_D = (V, E_D)$: subgraph of $G$ given by discretization edges
• $\forall v > K$ reflection $R_x^v$ gives a binary choice in general∗
• $X_D \subset \mathbb{R}^{nK}$ contains $2^{n-K}$ incongruent realizations of $G_D$
∗ subsequent results hold "with probability 1"

Discretization group
• $\mathcal{G}_D = \langle g_v \mid v > K \rangle$: the discretization group of $G$ w.r.t. $K$, a subgroup of a Cartesian product of reflection groups
• An element $g \in \mathcal{G}_D$ has the form $\prod_{v > K} g_v^{a_v}$, where $a_v \in \{0, 1\}$
• Action of $\mathcal{G}_D$ on $X_D$: $g(x) = \left(g_{K+1}^{a_{K+1}} \circ \cdots \circ g_n^{a_n}\right)(x)$

Commutativity of partial reflections
Lemma A: $\mathcal{G}_D$ is Abelian
Proof: Assume $K < u < v$. Then
$$g_u g_v(x) = g_u(x_1, \ldots, x_{v-1}, R_x^v(x_v), \ldots, R_x^v(x_n))$$
$$= (x_1, \ldots, x_{u-1}, R_{g_v(x)}^u(x_u), \ldots, R_{g_v(x)}^u R_x^v(x_v), \ldots, R_{g_v(x)}^u R_x^v(x_n))$$
$$= (x_1, \ldots, x_{u-1}, R_x^u(x_u), \ldots, R_{g_u(x)}^v R_x^u(x_v), \ldots, R_{g_u(x)}^v R_x^u(x_n))$$
$$= g_v(x_1, \ldots, x_{u-1}, R_x^u(x_u), \ldots, R_x^u(x_n))$$
$$= g_v g_u(x)$$
where equality of these terms holds by a Technical Lemma (next slide)
[L. et al. 2013]
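Lemma A can be checked numerically on a small planar example ($K = 2$), where $R_x^v$ is the reflection across the line through $x_{v-2}$ and $x_{v-1}$. The helper names `reflect` and `partial_reflection` are assumptions, and the realization is stored as a 0-based Python list holding $x_1, \ldots, x_n$.

```python
def reflect(p, a, b):
    """Reflect point p across the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    fx, fy = a[0] + t * dx, a[1] + t * dy  # foot of the perpendicular from p
    return (2 * fx - p[0], 2 * fy - p[1])


def partial_reflection(x, v):
    """g_v for K = 2 (v is 1-based, v >= 3): keep x_1..x_{v-1}, reflect x_v..x_n."""
    a, b = x[v - 3], x[v - 2]  # x_{v-2} and x_{v-1}
    return x[:v - 1] + [reflect(p, a, b) for p in x[v - 1:]]


x = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]
g3g4 = partial_reflection(partial_reflection(x, 4), 3)
g4g3 = partial_reflection(partial_reflection(x, 3), 4)
print(g3g4 == g4g3)                                           # True: g_3 and g_4 commute here
print(partial_reflection(partial_reflection(x, 3), 3) == x)  # True: g_3 applied twice is the identity
```

This is a check on one instance with exactly representable coordinates, not a proof; on general floating-point data the comparisons would need a tolerance.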

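Commutativity of partial reflections
Technical Lemma (Proof sketch for K = 2)
Let $y \perp \mathrm{Aff}(x_{v-1}, \ldots, x_{v-K})$ and $\rho^y = R_x^v$
$$\rho^z \rho^y t = \rho^{\rho^z y} \rho^z t$$
[Figure: origin $O$, vectors $y$, $z$, $\rho^z y$ and the points $t$, $\rho^y t$, $\rho^z t$, $\rho^z \rho^y t$]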

One realization generates all others
Lemma B: The action of $\mathcal{G}_D$ on $X_D$ is transitive
[Figure, K = 2: realizations $x$ and $y$ related by $g_3$, $g_4$, $g_5$]
$\exists g \in \mathcal{G}_D\ (y = g(x))$: namely, $y = g_5(g_4(g_3(x)))$
Proof: By induction on $v$: assume the result holds up to $v-1$ with $g'$; then either it holds for $v$ with $g = g'$, or else flip and let $g = g_v g'$
[L. et al. 2013]

Structure and invariance
• $\mathcal{G}_D$ is Abelian and generated by $n-K$ involutions
⇒ $\mathcal{G}_D \cong C_2^{n-K}$
• $\mathcal{G}_D \le \mathrm{Aut}(X_D)$ by construction

Solution sets
• $X$: set of incongruent realizations of $G$
• $G_D$ defined on same vertices but fewer edges
  ⇒ fewer distance constraints on realizations
  ⇒ more realizations
• All realizations of $G$ are also realizations of $G_D$ ⇒ $X \subseteq X_D$

Losing invariance on pruning edges
Lemma C: Let $W^{uv} = \{u+K+1, \ldots, v\}$ be the range of $\{u,v\}$
$$\forall x \in X,\ u, w, v \in V \quad \big(w \in W^{uv} \leftrightarrow \|x_u - x_v\| \neq \|g_w(x)_u - g_w(x)_v\|\big)$$
Proof sketch for K = 2
[Figure: vertices $u$, $w$, $v$]
Corollary: If $\{u,v\} \in E_P$ and $w \in W^{uv}$, $g_w(x) \notin X$
[L. et al. 2013]

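Pruning group
Define:
$$\Gamma = \{g_w \in \mathcal{G}_D \mid w > K \wedge \forall \{u,v\} \in E_P\ (w \notin W^{uv})\}$$
$$\mathcal{G}_P = \langle \Gamma \rangle$$
Lemma D: $X$ is invariant w.r.t. $\mathcal{G}_P$
Proof: Follows by the corollary, invariance of $X_D$ w.r.t. $\mathcal{G}_D$ and $X \subseteq X_D$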

Transitivity of the pruning group
Lemma E: The action of $\mathcal{G}_P$ on $X$ is transitive
• Given $x, y \in X$, aim to show $\exists g \in \mathcal{G}_P\ (y = g(x))$
• Lemma B ⇒ $\exists g \in \mathcal{G}_D$ with $y = g(x) \in X_D$
• Suppose $g \notin \mathcal{G}_P$ and aim for a contradiction
• ⇒ $\exists \{u,v\} \in E_P$ and $w \in W^{uv}$ s.t. $g_w$ is a component of $g$
• Lemma C ⇒ $\|g_w(x)_u - g_w(x)_v\| \neq d_{uv}$
• If $w$ is the only such vertex, $y = g(x) \notin X$ against hypothesis, done
• Suppose $\exists$ another $z \in W^{uv}$ s.t. $g_z$ is a component of $g$
• Set of cases s.t. $\|x_u - x_v\| = \|g_z g_w(x)_u - g_z g_w(x)_v\|$ given $\|g_w(x)_u - g_w(x)_v\| \neq \|x_u - x_v\| \neq \|g_z(x)_u - g_z(x)_v\|$ has Lebesgue measure 0 in all DGP inputs
• By induction, holds for any number of components $g_z$ of $g$ with $z \in W^{uv}$
• ⇒ $y = g(x) \notin X$ against hypothesis, done
[L. et al. 2013]

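The main result
Theorem: $|X| = 2^{|\Gamma|}$
• Lemma A ⇒ $\mathcal{G}_D \cong C_2^{n-K}$ ⇒ $|\mathcal{G}_D| = 2^{n-K}$
• $\mathcal{G}_P \le \mathcal{G}_D$ ⇒ $\exists \ell \in \mathbb{N}\ (\mathcal{G}_P \cong C_2^\ell)$, with $\ell = |\Gamma|$
• Lemma E ⇒ $\forall x \in X \quad \mathcal{G}_P x = X$
• Generators are involutions ⇒ $\forall g \in \mathcal{G}_P \quad g^{-1} = g$
• ⇒ $\forall g, h \in \mathcal{G}_P,\ x \in X\ (gx = hx \to h^{-1} g x = x \to hgx = x \to hg = I \to h = g^{-1} = g)$
  ⇒ the mapping $\mathcal{G}_P x \to \mathcal{G}_P$ given by $gx \to g$ is injective
• $\forall g, h \in \mathcal{G}_P,\ x \in X\ (g \neq h \to gx \neq hx)$
  ⇒ the mapping $gx \to g$ is surjective
• ⇒ the mapping $gx \to g$ is a bijection
• ⇒ $|\mathcal{G}_P x| = |\mathcal{G}_P|$
• ⇒ $\forall x \in X \quad |X| = |\mathcal{G}_P x| = |\mathcal{G}_P| = 2^{|\Gamma|}$
[L. et al. 2013]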

Symmetry-aware BP
• Don't need to explore all branches of BP tree
• Build $\Gamma$ as a pre-processing step
• Run BP, terminating as soon as $|X| = 1$
• For each $g \in \mathcal{G}_P$, compute $gx$
[Mucherino et al. JBCB 2012]

Complexity
• Computing $\Gamma$: $O(mn)$
  1. initialize indicator vector $\iota = (\iota_{K+1}, \ldots, \iota_n)$ for $g_v \in \Gamma$
  2. initialize $\iota = 1$
  3. for each $\{u,v\} \in E_P$ and $w \in W^{uv}$ let $\iota_w = 0$
• BP: $O(2^n)$
• Compute $gx$ for each $g \in \mathcal{G}_P$: $O(2^{|\Gamma|})$
• Overall: $O(2^n)$
• Gains depend on the instance
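The three-step computation of $\Gamma$ translates almost literally into code. In this sketch the function name `pruning_generators` and the representation of $\Gamma$ as the list of indices $w$ with $g_w \in \Gamma$ are assumptions; pruning edges are given as pairs $(u, v)$ with $u < v$.

```python
def pruning_generators(n, K, pruning_edges):
    """Indices w > K whose partial reflection g_w survives all pruning edges."""
    iota = {w: 1 for w in range(K + 1, n + 1)}  # steps 1-2: indicator vector, all ones
    for u, v in pruning_edges:                  # step 3
        for w in range(u + K + 1, v + 1):       # W^{uv} = {u+K+1, ..., v}
            iota[w] = 0
    return [w for w, keep in iota.items() if keep]


# The 5-vertex example in R^2 (K = 2) with pruning edge {1, 4}
gamma = pruning_generators(5, 2, [(1, 4)])
print(gamma)            # [3, 5]
print(2 ** len(gamma))  # 4 realizations, as predicted by |X| = 2^|Gamma|

print(2 ** len(pruning_generators(5, 2, [])))  # 8 with no pruning edges
```

The two counts agree with the "Role of pruning edges" slides: 8 realizations without pruning edges, 4 once $\{1, 4\}$ is added.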

Tractability of protein instances
[L. et al. 2013]

Let's handle the BP tree
[Figure: BP tree with nodes numbered 1 to 53]
Max depth: $n$, looks good!
Aim to prove width is bounded

Number of solutions at each BP tree level
Depends on range of longer pruning edge incident to level $v$
[Figure: diagram of node counts per BP tree level, with levels numbered 1 to 9 starting at $K+2$. The top row, labelled "no pruning edges", doubles at each level: 2, 4, 8, 16, 32, 64, 128, 256, 512. Each row below starts doubling one level later.]
Green path:
• 8 nodes at level $K+4$
• $\{3, K+5\} \in E_P$ ⇒ $g_{K+4}, g_{K+5} \notin \Gamma$
  ⇒ no symmetry at levels $K+4$ and $K+5$
  ⇒ only 4 nodes at level $K+5$
Pruning edges make the graph K-trilaterative


Euclidean Distance Geometry Leo Liberti IBM Research, USA CNRS LIX Ecole Polytechnique, France MFD 2014, Campinas [L., Lavor: Introduction to Euclidean Distance Geometry , in preparation] Table of contents 1. Applications 2. Definition 3.
