Random Graphs (2 nd part) Omid Etesami Phase transitions for - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Topics in Algorithms and Data Science Random Graphs (2nd part)

Omid Etesami

slide-2
SLIDE 2

Phase transitions for CNF-SAT

slide-3
SLIDE 3

Phase transitions for other random structures

  • We already saw phase transitions for random graphs
  • Other random structures, like Boolean formulas in conjunctive normal form (CNF), also have phase transitions

slide-4
SLIDE 4

Random k-CNF formula

  • n variables
  • m clauses
  • k literals per clause (k constant)
  • literal = variable or negation
  • each clause is chosen independently at random from the possible clauses.
  • Unsatisfiability is an increasing property, so it has a phase transition.
slide-5
SLIDE 5

Satisfiability conjecture

  • Conjecture. There is a constant r_k such that m = r_k n is a sharp threshold for satisfiability.

The conjecture was recently proved for large k by Ding, Sly, Sun!

slide-6
SLIDE 6

Upper bound on rk

  • Let m = cn.
  • Each truth assignment satisfies the CNF with probability (1 – 2^{-k})^{cn}.
  • The probability that the CNF is satisfiable is at most 2^n (1 – 2^{-k})^{cn}.
  • Thus r_k ≤ 2^k ln 2.
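The first-moment bound above can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, and it simply solves 2^n (1 – 2^{-k})^{cn} = 1 for c and compares the result with the simpler bound 2^k ln 2.

```python
import math

def first_moment_threshold(k: int) -> float:
    """Value of c at which 2^n (1 - 2^-k)^(cn) equals 1.

    For larger c the expected number of satisfying assignments tends to 0.
    """
    return math.log(2) / -math.log(1 - 2.0 ** -k)

def simple_upper_bound(k: int) -> float:
    """The weaker closed form 2^k ln 2, using -ln(1 - x) >= x."""
    return 2 ** k * math.log(2)

if __name__ == "__main__":
    for k in (3, 4, 5):
        print(k, first_moment_threshold(k), simple_upper_bound(k))
```

Because -ln(1 – x) ≥ x, the exact first-moment threshold is always slightly below 2^k ln 2.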

3-SAT solution space (height represents # of unsatisfied constraints)!

slide-7
SLIDE 7

Lower bound on rk

  • Lower bound more difficult. 2nd moment method doesn’t work.
  • We focus on k = 3.
  • Smallest Clause (SC) heuristic finds a satisfying solution almost surely when m = cn and constant c < 2/3. Thus r_3 ≥ 2/3.

slide-8
SLIDE 8

Smallest Clause (SC) heuristic

While not all clauses are satisfied:
    assign true to a random literal in a random smallest-length clause;
    delete satisfied clauses;
    delete unsatisfied literals.
If a 0-length clause is ever found, we have failed.
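A minimal Python sketch of the heuristic above, for experimentation. The representation (literals as signed integers), the generator `random_kcnf` (which assumes the k variables in a clause are distinct), and all function names are assumptions made for illustration, not part of the slides.

```python
import random

def random_kcnf(n, m, k=3, rng=random):
    """m random clauses over variables 1..n (assumption: distinct variables per clause).
    A literal is +v for variable v and -v for its negation."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula

def smallest_clause(formula, rng=random):
    """Run the SC heuristic. Returns {variable: bool} or None on failure."""
    clauses = [list(c) for c in formula]
    assignment = {}
    while clauses:
        shortest = min(len(c) for c in clauses)
        clause = rng.choice([c for c in clauses if len(c) == shortest])
        lit = rng.choice(clause)
        assignment[abs(lit)] = lit > 0      # make the chosen literal true
        remaining = []
        for c in clauses:
            if lit in c:
                continue                    # clause satisfied: delete it
            c = [x for x in c if x != -lit] # delete the falsified literal
            if not c:
                return None                 # 0-length clause: failure
            remaining.append(c)
        clauses = remaining
    return assignment

def satisfies(formula, assignment):
    """Check an assignment; unassigned variables default to False."""
    return all(
        any(assignment.get(abs(l), False) == (l > 0) for l in c)
        for c in formula
    )
```

For example, `smallest_clause(random_kcnf(300, 150))` runs the heuristic at density c = 0.5, below the 2/3 bound from the previous slide.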

slide-9
SLIDE 9

Queue of 1-literal and 2-literal clauses

  • While the queue is not empty, a member of the queue is satisfied.
  • Setting a literal to true may add other clauses to the queue.
  • We will show that while the queue is non-empty, the arrival rate is less than the departure rate.

slide-10
SLIDE 10

Principle of deferred decisions

  • We pretend that we do not know the literals appearing in each clause.

  • During the algorithm, we only know the size of each clause.
slide-11
SLIDE 11

Queue arrival rate

  • When the t’th literal is assigned a value, each 3-literal clause is added to the queue with probability 3/(2(n – t + 1)).
  • (With the same probability, the clause is satisfied.)
  • Therefore, the average # of clauses added to the queue at each step is at most 3(cn – t + 1)/(2(n – t + 1)) = 1 – Ω(1).
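The final equality can be spelled out in one step. The following is a short worked check, assuming c < 2/3 (as on the lower-bound slide) and t ≥ 1:

```latex
\[
\frac{cn - t + 1}{n - t + 1} \le c
\;\iff\; cn - (t-1) \le cn - c\,(t-1)
\;\iff\; c\,(t-1) \le (t-1),
\]
which holds because $c \le 1$. Hence
\[
\frac{3\,(cn - t + 1)}{2\,(n - t + 1)} \;\le\; \frac{3c}{2} \;<\; 1
\qquad \text{for } c < \tfrac{2}{3}.
\]
```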

slide-12
SLIDE 12

The waiting time is O(lg n)

  • Thm. The # of steps any clause remains in the queue is Ω(lg n) with probability at most 1/n^3.

The probability that the queue becomes non-empty at step t and remains non-empty in steps t, t + 1, …, t + s – 1 is at most exp(–Ω(s)) by the multiplicative Chernoff bound: the # of arrivals must be at least s, while the mean # of arrivals is s(1 – Ω(1)). (We upper-bound the # of arrivals by a sum of independent Bernoulli random variables.)

There are only n choices for t. Therefore, for a suitable choice of s_0 = Θ(lg n), every non-empty episode has length at most s_0 with probability 1 – 1/n^3.

slide-13
SLIDE 13

The probability that setting a literal in the i’th clause makes the j’th clause false is o(1/n^2)

If this trouble happens, then

  • either the i’th or the j’th clause is added to the queue at some step t,
  • the j’th clause consists of 1 literal when the trouble happens,
  • by the SC rule, the i’th clause also consists of 1 literal when its literal is assigned,
  • with probability 1 – 1/n^3, the waiting time for both clauses is O(lg n).

If a_1, a_2, … is the sequence of literals that would be set to true (if clauses i and j didn’t exist), then 4 of the literals in these two clauses are negations of literals among a_t, a_{t+1}, …, a_{t’} for t’ = t + O(lg n). This happens with probability O((ln^4 n)/n^4) times the # of choices for t, which is O((ln^4 n)/n^3) = o(1/n^2) since there are at most n choices for t.

slide-14
SLIDE 14

Since there are O(n^2) pairs of clauses, the algorithm fails with probability o(1) by union bound.

slide-15
SLIDE 15

Nonuniform models of random graphs

slide-16
SLIDE 16

Nonuniform models

  • Fix a degree distribution: there are f(d) vertices of degree d
  • Choose a random graph among all graphs with this degree distribution

  • Edges are no longer independent
slide-17
SLIDE 17

Degree distribution: vertex perspective vs edge perspective

  • Consider a graph where half of vertices have degree 1, half have degree 2
  • A random vertex is equally likely of degree 1 or 2
  • A random endpoint of a random edge is twice as likely to be of degree 2
  • In many algorithms, we traverse a random edge to reach an endpoint: the probability of reaching a vertex of degree i is then proportional to i λ_i, where λ_i is the fraction of vertices of degree i
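The change of perspective is just a reweighting by degree. A small sketch (the function name and the dictionary representation of the degree distribution are illustrative choices):

```python
def edge_perspective(vertex_fractions):
    """Convert {degree i: fraction of vertices lambda_i} into the degree
    distribution seen at a random endpoint of a random edge,
    which is proportional to i * lambda_i."""
    total = sum(i * lam for i, lam in vertex_fractions.items())
    return {i: i * lam / total for i, lam in vertex_fractions.items()}

if __name__ == "__main__":
    # Half the vertices have degree 1, half have degree 2 (the slide's example).
    print(edge_perspective({1: 0.5, 2: 0.5}))
```

On the slide's example, degree 2 receives weight 2/3 and degree 1 receives weight 1/3, matching "twice as likely".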

slide-18
SLIDE 18

Giant component in random graphs with given degree distribution

slide-19
SLIDE 19

[Molloy, Reed] There will be a giant component iff Σ_i i(i – 2) λ_i > 0

  • Intuition: Consider BFS (branching process) from a fixed vertex.
  • After the first level, a vertex of degree i has exactly i – 1 children.
  • The branching process has probability of extinction < 1 iff the expected # children E[i – 1] > 1, or in other words E[i – 2] > 0.
  • In calculating the expectation, the probability of degree i is from the edge perspective (and not the vertex perspective). Thus it is proportional to i λ_i.
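The criterion is easy to evaluate for a concrete degree distribution. The sketch below uses hypothetical helper names and represents the distribution as a dictionary {degree i: λ_i}; it computes both the Molloy-Reed sum and the expected number of children from the edge perspective.

```python
def molloy_reed_sum(vertex_fractions):
    """Sum over i of i * (i - 2) * lambda_i; positive means a giant component."""
    return sum(i * (i - 2) * lam for i, lam in vertex_fractions.items())

def expected_children(vertex_fractions):
    """E[i - 1] where degree i is drawn with probability proportional to i * lambda_i."""
    mean_degree = sum(i * lam for i, lam in vertex_fractions.items())
    return sum((i - 1) * i * lam for i, lam in vertex_fractions.items()) / mean_degree

if __name__ == "__main__":
    for dist in ({1: 0.5, 2: 0.5}, {1: 0.5, 3: 0.5}):
        print(dist, molloy_reed_sum(dist), expected_children(dist))
```

The two quantities agree in sign: the sum is positive exactly when the expected number of children exceeds 1.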

slide-20
SLIDE 20

Example: G(n, p=1/n)

slide-21
SLIDE 21

Poisson degree distribution

If vertices have a Poisson degree distribution with mean d, then a random endpoint of a random edge has degree distribution 1 + Poisson(d).
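This follows from reweighting the Poisson probabilities by degree. A short worked derivation, using only the edge-perspective rule (probability proportional to k times the vertex-perspective probability):

```latex
\[
\Pr[\text{endpoint has degree } k]
= \frac{k \, e^{-d} d^{k}/k!}{\sum_{j \ge 0} j \, e^{-d} d^{j}/j!}
= \frac{k \, e^{-d} d^{k}/k!}{d}
= e^{-d}\,\frac{d^{\,k-1}}{(k-1)!}, \qquad k \ge 1 .
\]
```

So the degree minus one is Poisson(d), and the expected number of children in the branching process is d. For G(n, p = 1/n) the degrees are approximately Poisson(1), which puts the expected number of children exactly at the critical value 1.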

slide-22
SLIDE 22

Growth model without preferential attachment

slide-23
SLIDE 23

Growing graphs

  • Vertices and edges are added over time.
  • Preferential attachment = selecting endpoints for a new edge with probability proportional to degrees
  • Without preferential attachment = selecting endpoints for a new edge uniformly at random from the set of existing vertices

With preferential attachment

slide-24
SLIDE 24

Basic growth model without preferential attachment

  • Start with zero vertices and zero edges
  • At each time t, add a new vertex
  • With probability δ, join two random vertices by an edge

The resulting graph may become a multigraph. But since there are t^2 pairs of vertices and O(t) existing edges, a multiple edge or self-loop happens at each step with small probability, and we ignore these cases.
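The model is simple to simulate. The sketch below tracks only vertex degrees and compares them with a geometric law. Two things here are assumptions not shown in the extracted slides: the edge always joins two distinct vertices, and the closed form p_k = (1/(1 + 2δ)) (2δ/(1 + 2δ))^k, which is what the standard recurrence for this model gives. Function names are hypothetical.

```python
import random

def grow_graph(steps, delta, rng):
    """Simulate the growth model; return the list of vertex degrees."""
    degree = []
    for _ in range(steps):
        degree.append(0)                       # add a new vertex
        if len(degree) >= 2 and rng.random() < delta:
            # join two distinct uniformly random vertices (assumption)
            u, v = rng.sample(range(len(degree)), 2)
            degree[u] += 1
            degree[v] += 1
    return degree

def geometric_prediction(k, delta):
    """Assumed limiting fraction of vertices of degree k."""
    q = 2 * delta / (1 + 2 * delta)
    return (1 - q) * q ** k

if __name__ == "__main__":
    rng = random.Random(1)
    degrees = grow_graph(20000, 0.5, rng)
    for k in range(5):
        print(k, degrees.count(k) / len(degrees), geometric_prediction(k, 0.5))
```

Tracking degrees alone is enough here because the degree distribution does not depend on which components the edges fall in.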

new vertex new edge

slide-25
SLIDE 25

# vertices of each degree

Let d_k(t) be the expected # of vertices of degree k at time t.

new vertex new edge

slide-26
SLIDE 26

degree distribution

Let d_k(t) = p_k t in the limit as t tends to infinity. The p_k form a geometric distribution, which, like the Poisson degree distribution of Erdos-Renyi graphs, falls off exponentially fast, unlike the power law of preferential attachment.

slide-27
SLIDE 27

# components of each finite size

Let n_k(t) be the expected # of components of size k at time t

  • A randomly picked component is of size k with probability proportional to n_k(t)
  • A randomly picked vertex is in a component of size k with probability proportional to k n_k(t) (namely k n_k(t)/t, since there are t vertices at time t)

Components of size 4 and 2

slide-28
SLIDE 28

Recurrence relation for nk(t)

  • We use expectations instead of actual # of components of each size?!
  • We ignore edges falling inside components since we are interested in small component sizes.

j vertices k – j vertices

slide-29
SLIDE 29

Recurrence relation for a_k = n_k(t) / t

j vertices k – j vertices

slide-30
SLIDE 30

Phase transition for non-finite components

slide-31
SLIDE 31

Size of non-finite components below critical threshold

slide-32
SLIDE 32

Summary of phase transition

slide-33
SLIDE 33

Comparison with static random graph having degree distribution

  • Could you explain why giant components appear for smaller δ in the grown model?

slide-34
SLIDE 34

Why is δ = 1/4 the threshold for static model?
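One way to see the value 1/4 is to plug the grown model's degree distribution into the Molloy-Reed criterion. The derivation below is a sketch under the assumption that the limiting degree distribution is the geometric law p_k = (1 - q) q^k with q = 2δ/(1 + 2δ); the explicit form of p_k does not appear in the extracted slides.

```latex
\[
E[k] = \frac{q}{1-q} = 2\delta, \qquad
E[k^2] = \frac{q\,(1+q)}{(1-q)^2} = 2\delta\,(1 + 4\delta),
\]
\[
\sum_{k} k\,(k-2)\,p_k = E[k^2] - 2\,E[k]
= 2\delta\,(1+4\delta) - 4\delta
= 2\delta\,(4\delta - 1) .
\]
```

The sum is positive exactly when δ > 1/4, so a static random graph with this degree distribution has a giant component iff δ > 1/4.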

slide-35
SLIDE 35

Growth model with preferential attachment

slide-36
SLIDE 36

Description of the model

  • Begin with empty graph
  • At each time, add a new vertex and, with probability δ, attach the new vertex to an existing vertex selected at random with probability proportional to its degree

Obviously the graph has no cycles.

slide-37
SLIDE 37

Degree of vertex i at time t

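Let d_i(t) be the expected degree of vertex i at time t. The total degree at time t is about 2δt, so d_i(t) grows at rate δ · d_i(t)/(2δt) = d_i(t)/(2t). Thus d_i(t) = a t^{1/2}. Since d_i(i) = δ, we have d_i(t) = δ (t/i)^{1/2}.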

slide-38
SLIDE 38

Power-law degree distribution

Vertex number tδ^2/d^2 has degree d. Therefore, the # of vertices of degree d is |∂/∂d (tδ^2/d^2)| = 2tδ^2/d^3. In other words, the probability of degree d is 2δ^2/d^3.

slide-39
SLIDE 39

Small world graphs

slide-40
SLIDE 40

Milgram’s experiment

  • Ask a person in Nebraska to send a letter to a person in Massachusetts, given the target’s address and occupation
  • At each step, send the letter to someone you know on a “first name” basis who is closer to the target
  • In successful experiments, it took 5 or 6 steps
  • Called “six degrees of separation”
slide-41
SLIDE 41

The Kleinberg model for random graphs

  • n × n grid with local and global edges
  • From each vertex u, there is a long-distance edge to a vertex v
  • Vertex v is chosen with probability proportional to d(u,v)^{-r}, where distance is Manhattan distance.

slide-42
SLIDE 42

Normalization factor

  • Let c_r(u) = Σ_{v ≠ u} d(u,v)^{-r}.
  • # nodes at distance k from u is at most 4k.
  • # nodes at distance k from u is at least k for k ≤ n/2.
  • We have Σ_{k=1}^{n/2} k · k^{-r} ≤ c_r(u) ≤ Σ_{k=1}^{2n} 4k · k^{-r}.
  • c_r(u) = Θ(1) when r > 2.
  • c_r(u) = Θ(lg n) when r = 2.
  • c_r(u) = Ω(n^{2-r}) when r < 2.
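The three regimes can be observed by brute force on a small grid. The sketch below sums d(u,v)^{-r} over all v ≠ u; the function name and the choice of a central vertex u in the example are illustrative.

```python
def normalization(n, r, u):
    """c_r(u): sum of d(u, v)^(-r) over all v != u in the n x n grid,
    where d is Manhattan distance and u = (x, y) with 0 <= x, y < n."""
    ux, uy = u
    total = 0.0
    for x in range(n):
        for y in range(n):
            d = abs(x - ux) + abs(y - uy)
            if d > 0:
                total += d ** (-r)
    return total

if __name__ == "__main__":
    for n in (11, 21, 41, 81):
        centre = (n // 2, n // 2)
        print(n, [normalization(n, r, centre) for r in (0, 2, 3)])
```

As n doubles, the r = 3 column should level off, the r = 2 column should grow by a roughly constant amount, and the r = 0 column equals n^2 - 1.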
slide-43
SLIDE 43

No short (polylogarithmic) paths exist when r > 2.

  • The expected # of edges connecting vertices at distance ≥ d* is o(1) for some d* = n^{1-Ω(1)}.
  • Thus, with high probability there is no edge connecting vertices at distance at least d*.
  • Since many pairs of vertices are at distance Ω(n) from each other, the shortest path between these pairs has length at least n^{Ω(1)}.

A pair of vertices with distance Ω(n)

slide-44
SLIDE 44

Local algorithm when r = 2

The algorithm is local and greedy: At each step follow the edge that takes us closest to the target.
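A minimal simulation of greedy routing on the Kleinberg grid. This is a sketch under stated assumptions: long-distance contacts are sampled lazily, only for the vertices the walk visits (in the spirit of deferred decisions), which is consistent because greedy routing never revisits a vertex; all function names are hypothetical.

```python
import random

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_contact(u, n, r, rng):
    """Sample v != u with probability proportional to d(u, v)^(-r)."""
    others = [(x, y) for x in range(n) for y in range(n) if (x, y) != u]
    weights = [manhattan(u, v) ** (-r) for v in others]
    return rng.choices(others, weights=weights, k=1)[0]

def greedy_route(source, target, n, r=2, rng=random):
    """Follow, at each step, the edge that ends closest to the target.
    Returns the number of steps taken."""
    current, steps = source, 0
    while current != target:
        x, y = current
        # local grid edges
        neighbours = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < n and 0 <= y + dy < n
        ]
        # one long-distance edge, sampled when the vertex is first visited
        neighbours.append(long_range_contact(current, n, r, rng))
        current = min(neighbours, key=lambda v: manhattan(v, target))
        steps += 1
    return steps

if __name__ == "__main__":
    rng = random.Random(0)
    n = 30
    print(greedy_route((0, 0), (n - 1, n - 1), n, r=2, rng=rng))
```

Since some local edge always reduces the distance by 1, the walk terminates and never takes more steps than the initial Manhattan distance; the long-distance edges can only shorten it.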

Target w

slide-45
SLIDE 45

Analysis of the algorithm

Claim: with high probability, for any pair of vertices u, t, within O(ln^2 n) steps the distance from u to t decreases by half.

Proof: If the distance between u and t is k, there are Θ(k^2) vertices at distance ≤ k/2 from t. All these vertices are at distance Θ(k) from u. Thus, with probability Θ(k^2 k^{-r}/c_r(u)) = Θ(1/ln n), there is an edge to a vertex at most half as far from t. Repeating this process for Θ(ln^2 n) steps, by independence of edges, we fail with probability o(1/n^4). Since there are n^4 pairs, by the union bound, we succeed for every pair (u, t).

slide-46
SLIDE 46

Local algorithm when r = 2 takes polylogarithmic steps

Since the distance is at most 2n in the beginning and halves every O(ln^2 n) steps, we reach the target within O(ln^3 n) steps.

Target w

slide-47
SLIDE 47

No local algorithm finds polylogarithmic paths when r < 2

Take u and t at distance ≥ n^δ. We show that, for a small constant δ > 0, any local algorithm with high probability takes ≥ n^δ steps to go from u to t. Otherwise, the algorithm must use an edge that takes us to a point at distance < n^δ from t. At each step, this happens with probability at most O(n^{2δ}/c_r) = O(n^{-2+r+2δ}), since in a local algorithm we cannot plan on the outgoing edges of vertices we haven’t yet visited.

Since we must find such an edge in the first n^δ steps, we can find it only with probability O(n^{-2+r+3δ}), which is o(1) for small δ.

local algorithm for finding short path does not exist, despite existence of short paths

slide-48
SLIDE 48

Proof that logarithmic paths exist when r = 0

  • We show the diameter is O(lg n) in a way similar to the proof for Erdos-Renyi.
  • Partition the grid into 3×3 squares: now there are 9 non-local edges going out of each square.
  • There are Ω(lg n) squares within distance Θ(lg^{1/2} n) of any square.
  • W.h.p. the non-local neighbors of these squares are at least twice as many as these squares, since 9 > 2 and by the Chernoff bound.

slide-49
SLIDE 49

Proof that logarithmic paths exist when r = 0 (continued)

  • Similarly, one can show that while half of the squares have not been visited, the # of neighbors visited at each level is at least twice the number in the previous level (since # outgoing edges × fraction of remaining squares ≥ 9 × 1/2 > 2).
  • Therefore more than half the squares can be reached with O(lg n) edges from any vertex.
  • Any two sets consisting of more than half the squares have nonempty intersection. Q.E.D.