SLIDE 1

Clusters of solutions to random linear equations

Michael Molloy

Dept of Computer Science University of Toronto

includes work with Dimitris Achlioptas and with Jane Gao

Michael Molloy Clusters of solutions to random linear equations

SLIDE 2

A Random System of Linear Equations

We have M = cn linear equations over n {0, 1} variables. All addition is mod 2.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =
x6 + x4 + x2 = 1

Each equation contains a random k-tuple of variables and a random {0, 1} RHS.

SLIDE 3

A Random System of Linear Equations

We have M = cn linear equations over n {0, 1} variables. All addition is mod 2.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =
x6 + x4 + x2 = 1

Each equation contains a random k-tuple of variables and a random {0, 1} RHS.

Also known as k-XORSAT.
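As an illustrative sketch (not part of the talk), such a random system can be generated and solved by Gaussian elimination over GF(2). The helper names `random_kxorsat` and `gf2_solve` are invented for this example.

```python
import random

def random_kxorsat(n, m, k, seed=0):
    """A random k-XORSAT system: m equations over n {0,1} variables.
    Each equation is (vars, rhs): a random k-tuple of distinct variables
    and a random {0,1} right-hand side; addition is mod 2."""
    rng = random.Random(seed)
    return [(rng.sample(range(n), k), rng.randrange(2)) for _ in range(m)]

def gf2_solve(n, eqs):
    """Gaussian elimination over GF(2), rows kept as bitmasks.
    Returns one satisfying 0/1 assignment, or None if inconsistent."""
    basis = {}                              # pivot variable -> (row mask, rhs)
    for vs, rhs in eqs:
        mask = 0
        for v in vs:
            mask |= 1 << v
        while mask:
            p = mask.bit_length() - 1       # leading variable of the row
            if p not in basis:
                basis[p] = (mask, rhs)      # new pivot row
                break
            pm, pr = basis[p]
            mask ^= pm                      # eliminate the leading variable
            rhs ^= pr
        else:
            if rhs:                         # row reduced to 0 = 1
                return None
    x = [0] * n                             # free variables default to 0
    for p in sorted(basis):                 # back-substitute, low to high
        mask, rhs = basis[p]
        val = rhs
        rest = mask ^ (1 << p)              # the other variables of this row
        while rest:
            low = rest & -rest
            val ^= x[low.bit_length() - 1]
            rest ^= low
        x[p] = val
    return x
```

Since each pivot row's leading variable is its highest bit, eliminating against the basis strictly decreases the leading bit, so the reduction loop always terminates.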

SLIDE 4

Random Constraint Satisfaction Problems

This is one of the standard models of random constraint satisfaction problems. It is one of the simplest of the commonly studied models: it has lots of symmetry, and its solutions are well understood.

SLIDE 5

Random Constraint Satisfaction Problems

This is one of the standard models of random constraint satisfaction problems. It is one of the simplest of the commonly studied models: it has lots of symmetry, and its solutions are well understood.

We've been able to prove things here that we can't yet prove for, e.g., random k-SAT and random graph colouring.

SLIDE 6

Random Constraint Satisfaction Problems

This is one of the standard models of random constraint satisfaction problems. It is one of the simplest of the commonly studied models: it has lots of symmetry, and its solutions are well understood.

We've been able to prove things here that we can't yet prove for, e.g., random k-SAT and random graph colouring.

Satisfiability threshold: c = 0.917... for k = 3 (Dubois and Mandler, 2002); for k > 3 (Dietzfelbinger et al. 2010; Pittel and Sorkin 2012).

SLIDE 7

Clustering

[Figure: thresholds c_c and c_s.]

The phenomenon seems to hold for a wide variety of random CSPs.

SLIDE 8

Clustering

[Figure: thresholds c_c and c_s.]

Well-connected: one can move throughout the cluster changing o(n) variables at a time.
Well-separated: moving from one cluster to another requires changing Θ(n) variables in one step.

SLIDE 9

Clustering

This is mostly non-rigorous, but:
- It is based on substantial mathematical analysis.
- It explains a lot: earlier results, algorithmic challenges.
- "Knowing" that it is true can inspire proof approaches (e.g. the previous talk).
- Understanding random CSPs will require understanding clustering.

SLIDE 10

Clustering

[Figure: thresholds c_c and c_s.]

Well-connected: one can move throughout the cluster changing o(n) variables at a time.
Well-separated: moving from one cluster to another requires changing Θ(n) variables in one step.

SLIDE 11

Clustering

[Figure: thresholds c_c and c_s.]

Well-connected: one can move throughout the cluster changing O(log n) variables at a time.
Well-separated: moving from one cluster to another requires changing Θ(n) variables in one step.

Ibrahimi, Kanoria, Kraning and Montanari (2011); Achlioptas and M (2011).

SLIDE 12

The 2-core

Remove every variable that appears in at most one equation, along with the equation it belongs to.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =      ← Remove
x4 + x6 + x2 = 1

Reason: Every solution to what remains can easily be extended to the original system, by setting the deleted variable.

SLIDE 13

The 2-core

Remove every variable that appears in at most one equation, along with the equation it belongs to.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =      ← Remove
x6 + x4 + x2 = 1    ← Remove

Iterate.

SLIDE 14

The 2-core

Remove every variable that appears in at most one equation, along with the equation it belongs to.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =      ← Remove
x6 + x4 + x2 = 1    ← Remove

What remains is the 2-core of the system.
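The stripping procedure above can be sketched as a worklist peeling algorithm (the function name `two_core` and the `(vars, rhs)` input layout are my own, invented for illustration):

```python
from collections import defaultdict

def two_core(eqs):
    """Strip to the 2-core: repeatedly remove every variable that appears
    in at most one equation, together with that equation.
    eqs: list of (vars, rhs) pairs; returns the indices of the equations
    that survive, i.e. the 2-core of the system."""
    occ = defaultdict(set)                  # variable -> alive equation indices
    for i, (vs, _) in enumerate(eqs):
        for v in vs:
            occ[v].add(i)
    alive = set(range(len(eqs)))
    stack = [v for v in occ if len(occ[v]) == 1]
    while stack:
        v = stack.pop()
        if len(occ[v]) != 1:
            continue                        # degree already dropped to 0
        (i,) = occ[v]
        alive.discard(i)                    # delete v's only equation
        for u in eqs[i][0]:
            occ[u].discard(i)
            if len(occ[u]) == 1:
                stack.append(u)             # u may now be strippable
    return sorted(alive)
```

On the five-equation example above (taking the blank right-hand sides as 0 just for the test), x3 has degree 1, so its equation goes first; that drops x4 to degree 1, removing the last equation, and the first three equations form the 2-core.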

SLIDE 15

The 2-core

Remove every variable that appears in at most one equation, along with the equation it belongs to.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =      ← Remove
x6 + x4 + x2 = 1    ← Remove

What remains is the 2-core of the system. This is also the 2-core of the underlying hypergraph: vertices are the variables; hyperedges are the k-tuples of vertices that form equations.

SLIDE 16

The 2-core

Remove every variable that appears in at most one equation, along with the equation it belongs to.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =      ← Remove
x6 + x4 + x2 = 1    ← Remove

What remains is the 2-core of the system. The satisfiability threshold is the point where the 2-core has density 1.

SLIDE 17

The 2-core

Remove every variable that appears in at most one equation, along with the equation it belongs to.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =      ← Remove
x6 + x4 + x2 = 1    ← Remove

What remains is the 2-core of the system.

Clusters: σ is any solution to the 2-core. Cσ is all extensions of σ to the rest of the system.

SLIDE 18

The 2-core

Remove every variable that appears in at most one equation, along with the equation it belongs to.

x5 + x1 + x6 =
x2 + x6 + x1 = 1
x1 + x2 + x5 = 1
x3 + x5 + x4 =      ← Remove
x6 + x4 + x2 = 1    ← Remove

What remains is the 2-core of the system.

Technical correction: We actually need to work with the 2-core minus O(1) variables, because of short-cycle effects.

SLIDE 19

Clusters

Clusters: σ is any solution to the 2-core. Cσ is all extensions of σ to the rest of the system.

Roughly speaking, clusters are:
Well-connected: one can move throughout the cluster changing o(n) vertices at a time.
Well-separated: moving from one cluster to another requires changing Θ(n) vertices in one step.

SLIDE 20

Clusters

Clusters: σ is any solution to the 2-core. Cσ is all extensions of σ to the rest of the system.

Theorem (IKKM, AM): Any pair of 2-core solutions must differ on at least αn variables. We can move from any extension of σ to any other extension of σ by changing O(log n) variables at a time.

SLIDE 21

Clusters

Clusters: σ is any solution to the 2-core. Cσ is all extensions of σ to the rest of the system. By symmetry, all clusters are isomorphic.

SLIDE 22

Clusters

Clusters: σ is any solution to the 2-core. Cσ is all extensions of σ to the rest of the system. By symmetry, all clusters are isomorphic. The same variables are frozen in every cluster.

SLIDE 23

Clusters

Clusters: σ is any solution to the 2-core. Cσ is all extensions of σ to the rest of the system. By symmetry, all clusters are isomorphic. The same variables are frozen in every cluster. No condensation.

SLIDE 24

Clusters

Clusters: σ is any solution to the 2-core. Cσ is all extensions of σ to the rest of the system. By symmetry, all clusters are isomorphic. The same variables are frozen in every cluster. No condensation. We only need to analyze the random hypergraph, not actual solutions of the CSP.

SLIDE 25

Proven For Other CSPs

Well-Separated Clusters:
- Asymptotic-in-k threshold for k-SAT, k-colouring, hypergraph 2-colouring (Achlioptas and Coja-Oghlan 2008)
- Independent set (Coja-Oghlan and Efthymiou 2010)
- Several others (Montanari, Restrepo, Tetali 2009)

SLIDE 26

Proven For Other CSPs

Well-Separated Clusters:
- Asymptotic-in-k threshold for k-SAT, k-colouring, hypergraph 2-colouring (Achlioptas and Coja-Oghlan 2008)
- Independent set (Coja-Oghlan and Efthymiou 2010)
- Several others (Montanari, Restrepo, Tetali 2009)

Freezing:
- Occurs in k-SAT (Achlioptas and Ricci-Tersenghi 2006)
- Asymptotic-in-k threshold for k-SAT, k-colouring, hypergraph 2-colouring (Achlioptas and Coja-Oghlan 2008)
- Exact threshold for k-colouring (M 2011)
- Exact threshold for hypergraph 2-colouring and others (M and Restrepo 2013)

SLIDE 27

Gaussian Elimination

Stripping Digraph: When we remove x, we direct an edge from x to each of the other k − 1 variables in its equation.

SLIDE 28

Gaussian Elimination

Stripping Digraph: When we remove x, we direct an edge from x to each of the other k − 1 variables in its equation.

Gaussian Elimination: Eliminate variables as they are removed:

    x_i = Σ_{x_j ∈ χ_i} x_j

SLIDE 29

Gaussian Elimination

Stripping Digraph: When we remove x, we direct an edge from x to each of the other k − 1 variables in its equation.

Gaussian Elimination: Eliminate variables as they are removed:

    x_i = Σ_{x_j ∈ χ_i} x_j

Easy: x_i can reach every x_j ∈ χ_i.

SLIDE 30

Gaussian Elimination

Stripping Digraph: When we remove x, we direct an edge from x to each of the other k − 1 variables in its equation.

Gaussian Elimination: Eliminate variables as they are removed:

    x_i = Σ_{x_j ∈ χ_i} x_j

Easy: x_i can reach every x_j ∈ χ_i.

Lemma: Each non-2-core x_j is reachable from O(log n) other variables.

So we can move between solutions in Cσ by changing base variables one at a time. Each change affects O(log n) variables.

SLIDE 31

Depth in r-cores

r-core of a hypergraph: Repeatedly remove vertices of degree < r. Note that the order in which vertices are removed does not affect the core obtained. This is the largest subgraph with minimum degree at least r.

SLIDE 32

Depth in r-cores

r-core of a hypergraph: Repeatedly remove vertices of degree < r. Note that the order in which vertices are removed does not affect the core obtained. This is the largest subgraph with minimum degree at least r. Analyzed for random graphs in Pittel, Spencer and Wormald (1996) and many other papers. Applications include colouring, hashing, coding, orientability, ...

SLIDE 33

Depth in r-cores

r-core of a hypergraph: Repeatedly remove vertices of degree < r. The depth of a non-r-core vertex v is the length of the shortest sequence of deletions that leads to removing v.

SLIDE 34

Depth in r-cores

r-core of a hypergraph: Repeatedly remove vertices of degree < r. The depth of a non-r-core vertex v is the length of the shortest sequence of deletions that leads to removing v.

Theorem (Achlioptas and M): For any c ≠ c∗ (the r-core threshold), the maximum depth in Hk(n, M = cn) is O(log n).

SLIDE 35

Key Step

SLIDE 36

Key Step

When we remove a vertex, the expected number of degree-r neighbours is at most 1 − ε.

SLIDE 37

Inside the Clustering Threshold

c∗ is the clustering threshold.

SLIDE 38

Inside the Clustering Threshold

c∗ is the clustering threshold.

Theorem (IKKM, AM 2011): Any pair of 2-core solutions must differ on at least αn variables. We can move from any extension of σ to any other extension of σ by changing O(log n) variables at a time.

SLIDE 39

Inside the Clustering Threshold

c∗ is the clustering threshold.

Theorem (Gao and M): For sufficiently small δ > 0 and c = c∗ + n^{−δ}:
Any pair of 2-core solutions must differ on at least n^{1−β} variables.
We can move from any extension of σ to any other extension of σ by changing n^β variables at a time.
(β → 0 as δ → 0.)

SLIDE 40

Inside the Clustering Threshold

c∗ is the clustering threshold.

Theorem (Gao and M): For sufficiently small δ > 0 and c = c∗ + n^{−δ}:
Any pair of 2-core solutions must differ on at least n^{1−β} variables.
We can move from any extension of σ to any other extension of σ by changing n^β variables at a time.
(β → 0 as δ → 0.)

Theorem (Achlioptas and M): For any c ≠ c∗, the maximum depth in Hk(n, M = cn) is O(log n).

SLIDE 41

Inside the Clustering Threshold

c∗ is the clustering threshold.

Theorem (Gao and M): For sufficiently small δ > 0 and c = c∗ + n^{−δ}:
Any pair of 2-core solutions must differ on at least n^{1−β} variables.
We can move from any extension of σ to any other extension of σ by changing n^β variables at a time.
(β → 0 as δ → 0.)

Theorem (Gao and M): For sufficiently small δ > 0 and c = c∗ + n^{−δ}, the maximum depth in Hk(n, M = cn) is at most n^β.

SLIDE 42

Challenge

When we remove a vertex, the expected number of degree-r neighbours is at most 1 − ε.

SLIDE 43

Challenge

When we remove a vertex, the expected number of degree-r neighbours approaches 1 as we approach the r-core.

SLIDE 44

Parallel Stripping

Parallel Stripping Process: At each iteration, simultaneously remove every vertex of degree less than r. We begin by determining the number of iterations: for c = c∗ + n^{−δ}, there are ≈ n^{δ/2} iterations.
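A sketch of the parallel stripping process for a hypergraph (the function name `parallel_strip` and the edge-list representation are illustrative): each round simultaneously deletes every vertex of current degree < r together with its incident hyperedges, and records the round in which each vertex disappears.

```python
def parallel_strip(n, edges, r):
    """Parallel stripping to the r-core.
    n: number of vertices; edges: list of vertex tuples (hyperedges).
    Returns (core, round_removed): the surviving vertex set, and for each
    vertex the iteration at which it was deleted (None if in the r-core)."""
    inc = [set() for _ in range(n)]         # vertex -> alive incident edges
    for i, e in enumerate(edges):
        for v in e:
            inc[v].add(i)
    alive = set(range(n))
    round_removed = [None] * n
    t = 0
    while True:
        t += 1
        # all low-degree vertices are found before anything is deleted,
        # so the deletions within a round are simultaneous
        doomed = [v for v in alive if len(inc[v]) < r]
        if not doomed:
            return alive, round_removed
        dead_edges = set()
        for v in doomed:
            alive.discard(v)
            round_removed[v] = t
            dead_edges |= inc[v]
        for i in dead_edges:                # an edge dies with any endpoint
            for u in edges[i]:
                inc[u].discard(i)
```

The number of rounds executed is the iteration count the slide estimates as ≈ n^{δ/2}, and a vertex's round number is a natural quantity to compare against its depth.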

SLIDE 45

Parallel Stripping

Parallel Stripping Process: At each iteration, simultaneously remove every vertex of degree less than r. We begin by determining the number of iterations: for c = c∗ + n^{−δ}, there are ≈ n^{δ/2} iterations. This lower bounds the maximum depth.

SLIDE 46

Edge Switching

We carry out the parallel stripping process.

SLIDE 47

Edge Switching

We carry out the parallel stripping process. Then we randomly switch the blue edges.

SLIDE 48

Edge Switching

We carry out the parallel stripping process. Then we randomly switch the blue edges. Equivalently: we first expose the vertices deleted in each iteration, but not the edges between them.

SLIDE 49

Edge Switching

We carry out the parallel stripping process. Then we randomly switch the blue edges. Subject to conditions such as: every vertex has at least one neighbour in the preceding level.

SLIDE 50

Edge Switching

We carry out the parallel stripping process. Then we randomly switch the blue edges. Subject to conditions such as: every vertex has at least one neighbour in the preceding level. This provides enough randomness for us to analyze the stripping sequence leading to the removal of a particular vertex.

SLIDE 51

Big Steps Required Inside the Clusters

Within each Cσ, changing some variables requires changing n^{O(δ)} others.

SLIDE 52

Further Challenges

Push further into the clustering threshold. Gain a better rigorous understanding of clusters for other CSPs.
