SLIDE 1

CS6100: Topics in Design and Analysis of Algorithms

All Pairs Shortest Paths

John Augustine

CS6100 (Even 2012): All Pairs Shortest Paths

SLIDE 2

Problem Definition

Input: An undirected and unweighted graph $G = (V, E)$, where $V = \{1, 2, \ldots, n\}$. In particular, we are given the adjacency matrix $A$ of $G$.

The distance matrix $D$ is an $n \times n$ matrix in which the entry $D_{ij}$ is the length of a shortest path between $i$ and $j$.

In the All Pairs Distance (or APD) Problem, we are asked to construct the distance matrix $D$.

In the All Pairs Shortest Paths (or APSP) Problem, we are asked to report a shortest path between each pair of vertices.

Naive Solution: Run BFS from each node. This takes $O(nm) = O(n^3)$ time, where $m = |E|$.

Can we solve APD in $o(n^3)$ time?
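The naive solution can be sketched as follows. This is a minimal illustration, not part of the original slides: the function name `apd_bfs`, the 0-based vertex numbering, and the use of `float("inf")` for unreachable pairs are assumptions made for the example. Because the graph is stored as an adjacency matrix, each BFS scans a full row per vertex, which gives the $O(n^3)$ bound directly.

```python
from collections import deque

def apd_bfs(adj):
    """Distance matrix of an unweighted graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for s in range(n):                      # one BFS per source vertex
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):              # scanning a matrix row costs O(n)
                if adj[u][v] and dist[s][v] == inf:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist

# Path graph 0 - 1 - 2 - 3 (hypothetical test input).
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
D = apd_bfs(path)
```

On the path graph above, `D[i][j]` comes out as `abs(i - j)`.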

SLIDE 3

Connection to Matrix Multiplication

Consider $A^2$ under Boolean matrix multiplication, i.e., product is Boolean "and" and addition is Boolean "or".

$A^2_{ij} = 1$ iff $\exists$ a path of length 2 from $i$ to $j$.

This notion extends to any $A^\ell$, $\forall \ell \in [n]$. Note that the closure $A^*$ of $A$ given by

$$A^* = \bigvee_{\ell \ge 1} A^\ell = \bigvee_{\ell = 1}^{n} A^\ell$$

gives the transitive closure of $G$.

To solve APD, however, we will need all $A^\ell$, $1 < \ell \le n$, which can take $O(n^4)$ time using naive matrix multiplication. This can be improved to $O(n \cdot MM(n))$, where $MM(n)$ is the time to multiply two $n \times n$ matrices [1].

[1] Recently improved to $O(n^{2.3727})$ by Virginia Vassilevska Williams.
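A small sketch of Boolean matrix multiplication and the closure may help here. The helper names (`bool_mult`, `bool_or`, `closure`) and the example graph are illustrative assumptions. The product is the naive cubic one, so the closure below costs $O(n^4)$, matching the naive bound rather than the $O(n \cdot MM(n))$ one.

```python
def bool_mult(X, Y):
    """Boolean product: entry (i, j) is OR over k of (X[i][k] AND Y[k][j])."""
    n = len(X)
    return [[int(any(X[i][k] and Y[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def bool_or(X, Y):
    """Entrywise Boolean OR of two 0/1 matrices."""
    return [[x | y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(X, Y)]

def closure(A):
    """A* = A or A^2 or ... or A^n, using n - 1 naive Boolean products."""
    power, result = A, A
    for _ in range(len(A) - 1):
        power = bool_mult(power, A)
        result = bool_or(result, power)
    return result

# Path 0 - 1 - 2 plus an isolated vertex 3 (hypothetical test input).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0]]
A2 = bool_mult(A, A)
A_star = closure(A)
```

Here `A2[0][2]` is 1 because of the length-2 path 0, 1, 2, and `A_star` connects exactly the vertices 0, 1, 2 with one another while leaving vertex 3 isolated.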

SLIDE 4

Towards APD using Matrix Closure

As usual, $A$ is the adjacency matrix (without self-loops). Our matrix multiplication, however, is over the closed semiring of reals augmented with $\infty$. More simply, to compute each element in the product $C$ of matrices $A$ and $B$, we use

$$C_{ij} = \min_{1 \le k \le n} (A_{ik} + B_{kj}).$$

Under this definition of matrix multiplication (reading non-edges as $\infty$ and diagonal entries as 0), $(A^2)_{ij}$ is the shortest distance between $i$ and $j$ as long as it is either 1 or 2. Generalizing this, the closure matrix $A^*$ captures the APD. See the next slide for deficiencies.
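The min-plus product can be sketched in a few lines. The encoding used by `to_weights` (0 on the diagonal, 1 for edges, infinity for non-edges) is an assumption made so that the claim about $(A^2)_{ij}$ holds; the slides do not spell it out.

```python
INF = float("inf")

def min_plus(X, Y):
    """Min-plus product: C[i][j] = min over k of (X[i][k] + Y[k][j])."""
    n = len(X)
    return [[min(X[i][k] + Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def to_weights(adj):
    """Assumed encoding: 0 on the diagonal, 1 for edges, INF for non-edges."""
    n = len(adj)
    return [[0 if i == j else (1 if adj[i][j] else INF)
             for j in range(n)] for i in range(n)]

# Path graph 0 - 1 - 2 - 3 (hypothetical test input).
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
W = to_weights(path)
W2 = min_plus(W, W)     # correct for all pairs at distance <= 2
W4 = min_plus(W2, W2)   # correct for all pairs at distance <= 4
```

After one squaring, vertices 0 and 3 (distance 3) are still at infinity; the second squaring resolves them.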

SLIDE 5

Deficiencies

  1. Computing $A^*$ leads to entries that require a super-linear number of bits.
  2. It does not lead to APSP.

SLIDE 6

Algorithm for APD that leads to APSP

Input: graph $G = (V, E)$ as adjacency matrix $A$.

Output: APD distance matrix $D$.

Observation 1. Let $Z = A^2$ (under integer multiplication). Then

($\exists$ a path of length 2 between $i$ and $j$) $\iff$ $Z_{ij} > 0$.

(Entries in $Z$ encode the number of distinct length-2 paths.)

SLIDE 7

Let $G' = (V, E')$ be the graph in which $(i, j) \in E'$ iff $\exists$ a path of length either 1 or 2 between $i$ and $j$ in $G$.

  • Let $A'$ be the adjacency matrix of $G'$ and
  • let $D'$ be the distance matrix of $G'$ (its entries can be $\infty$).

Note (given our definition of $G'$) that

$$A'_{ij} = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if either } A_{ij} = 1 \text{ or } Z_{ij} > 0 \\ 0 & \text{otherwise.} \end{cases}$$

Therefore, ($G'$ is complete) $\iff$ ($G$ has diameter at most 2).

Then, $D = 2A' - A$, i.e., $D$ can be constructed in $O(n^2)$ time.
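The construction of $Z$, $A'$, and the diameter-2 base case can be sketched as below. The helper names and the star-graph input are illustrative assumptions, and `int_mult` is a naive cubic product standing in for a fast matrix multiplication routine.

```python
def int_mult(X, Y):
    """Naive integer matrix product (stand-in for fast matrix multiplication)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def square_graph(A):
    """Return (Z, A') with Z = A^2 over the integers and A' the adjacency of G'."""
    n = len(A)
    Z = int_mult(A, A)
    A_prime = [[int(i != j and (A[i][j] == 1 or Z[i][j] > 0))
                for j in range(n)] for i in range(n)]
    return Z, A_prime

def is_complete(A_prime):
    """True iff every off-diagonal entry of A' is 1."""
    n = len(A_prime)
    return all(A_prime[i][j] == 1 for i in range(n) for j in range(n) if i != j)

# Star with centre 0 and leaves 1, 2, 3: diameter 2 (hypothetical test input).
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
Z, A_prime = square_graph(star)
D = [[2 * A_prime[i][j] - star[i][j] for j in range(4)] for i in range(4)]
```

For the star, $G'$ is complete, so $D = 2A' - A$ gives distance 1 from the centre to each leaf and distance 2 between any two leaves. Note also that `Z[0][0]` is 3, the degree of the centre.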

SLIDE 8

What if diameter > 2?

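Lemma 2.

  1. $D_{ij}$ is even $\Rightarrow$ $D_{ij} = 2D'_{ij}$.
  2. $D_{ij}$ is odd $\Rightarrow$ $D_{ij} = 2D'_{ij} - 1$.

The proof is trivial, but the lemma leads to a simple recursive algorithm (with one deficiency).

Algorithm for APD (but with a deficiency)

  Compute $Z = A^2$
  Compute $A'$
  if $A'_{ij} = 1$ $\forall i \ne j$ then
    Return $D = 2A' - A$.
  end if
  Recursively compute $D'$.
  Compute $D$ such that for each pair $i$ and $j$,
    $D_{ij} = 2D'_{ij}$ if $D_{ij}$ is even,
    $D_{ij} = 2D'_{ij} - 1$ if $D_{ij}$ is odd.

The deficiency: the last step needs the parity of $D_{ij}$, which is exactly what we do not yet know.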
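Lemma 2 is stated without proof. One way to see it, offered here as a sketch rather than as the argument intended in the lecture, is to show that $D'_{ij} = \lceil D_{ij}/2 \rceil$ and then read off the two parity cases.

```latex
% Claim: D'_{ij} = \lceil D_{ij} / 2 \rceil, which is Lemma 2 written as one formula.
\begin{align*}
&\text{Upper bound: let } i = v_0, v_1, \ldots, v_d = j \text{ be a shortest path in } G,
  \text{ with } d = D_{ij}. \\
&\text{Taking every second vertex (and ending at } v_d\text{) gives a path in } G'
  \text{ with } \lceil d/2 \rceil \text{ edges, so } D'_{ij} \le \lceil D_{ij}/2 \rceil. \\
&\text{Lower bound: each edge of } G' \text{ expands to at most 2 edges of } G,
  \text{ so } D_{ij} \le 2 D'_{ij}, \text{ i.e. } D'_{ij} \ge \lceil D_{ij}/2 \rceil. \\
&\text{Hence } D'_{ij} = \lceil D_{ij}/2 \rceil: \quad
  D_{ij} = 2\ell \Rightarrow D'_{ij} = \ell, \qquad
  D_{ij} = 2\ell - 1 \Rightarrow D'_{ij} = \ell.
\end{align*}
```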

SLIDE 9

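Focus Now Is on Finding the Parity of $D_{ij}$

First, a lemma for which the proof is trivial. Here $\Gamma(i)$ denotes the set of neighbours of $i$ in $G$.

Lemma 3. Suppose $i \ne j$.

  1. $\forall k \in \Gamma(i)$,
     $D_{ij} - 1 \le D_{kj} \le D_{ij} + 1$. (1)
  2. Furthermore, $\exists k \in \Gamma(i)$ such that
     $D_{kj} = D_{ij} - 1$. (2)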

SLIDE 10

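Lemma 4. Suppose $i \ne j$.

Case Even. ($D_{ij}$ is even) $\Rightarrow$ $D'_{kj} \ge D'_{ij}$ $\forall k \in \Gamma(i)$.

Case Odd. ($D_{ij}$ is odd) $\Rightarrow$ $D'_{kj} \le D'_{ij}$ $\forall k \in \Gamma(i)$. And $\exists k \in \Gamma(i)$ such that $D'_{kj} < D'_{ij}$.

Proof. Case Even. $D_{ij} = 2\ell$ for some integer $\ell$. Therefore, by Lemma 2,

$D'_{ij} = \ell$. (3)

From Lemma 3,

$D_{kj} \ge 2\ell - 1$ $\forall k \in \Gamma(i)$. (4)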

SLIDE 11

From Lemma 2 and Equation 4,

$$D'_{kj} \ge \frac{D_{kj}}{2} \ge \ell - \frac{1}{2}.$$

Since distances are integral,

$D'_{kj} \ge \ell$. (5)

Equations 3 and 5 complete the proof of Case Even. Case Odd is on the next slide.

SLIDE 12

Case Odd. We are given that $D_{ij} = 2\ell - 1$. From Lemma 2,

$D'_{ij} = \ell$. (6)

Lemma 3 $\rightarrow$ $D_{kj} \le 2\ell$ $\forall k \in \Gamma(i)$.

Lemma 2 $\rightarrow$ $D'_{kj} \le \frac{D_{kj} + 1}{2} \le \ell + \frac{1}{2}$.

Due to integrality of distances,

$D'_{kj} \le \ell$. (7)

From Equations 6 and 7, $D'_{kj} \le D'_{ij}$.

And, $\exists k \in \Gamma(i)$ such that $D_{kj} = D_{ij} - 1 = 2\ell - 2$. From Lemma 2, $D'_{kj} = \ell - 1 < \ell = D'_{ij}$ (by Equation 6). I.e., $D'_{kj} < D'_{ij}$.

SLIDE 13

A Cleaner Form

Corollary 1. Suppose $i \ne j$ and let $d(i)$ denote the degree of $i$.

$$(D_{ij} \text{ is even}) \iff \sum_{k \in \Gamma(i)} D'_{kj} \ge d(i) \cdot D'_{ij}.$$

$$(D_{ij} \text{ is odd}) \iff \sum_{k \in \Gamma(i)} D'_{kj} < d(i) \cdot D'_{ij}.$$

Proof. Sum the inequalities in Lemma 4 over the neighbours of each vertex $i$.

Now we need to compute:

$$\sum_{k \in \Gamma(i)} D'_{kj} = \sum_{k=1}^{n} A_{ik} D'_{kj} = S_{ij} \qquad (8)$$

Or, more compactly, $S = AD'$.
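The parity test of Corollary 1 can be checked numerically on a small instance. In this sketch the path graph, the closed forms `abs(i - j)` for $D$ and `(abs(i - j) + 1) // 2` for $D'$ (which follows from Lemma 2), and all variable names are assumptions chosen for illustration.

```python
def int_mult(X, Y):
    """Naive integer matrix product (stand-in for fast matrix multiplication)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

n = 4
# Path graph 0 - 1 - 2 - 3 (hypothetical test input).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
D = [[abs(i - j) for j in range(n)] for i in range(n)]                  # true distances
D_prime = [[(abs(i - j) + 1) // 2 for j in range(n)] for i in range(n)]  # ceil(D / 2)

S = int_mult(A, D_prime)            # S[i][j] = sum of D'[k][j] over neighbours k of i
degree = [sum(row) for row in A]    # d(i)

# Corollary 1: D[i][j] is even  iff  S[i][j] >= d(i) * D'[i][j]   (for i != j).
even_by_corollary = [[S[i][j] >= degree[i] * D_prime[i][j] for j in range(n)]
                     for i in range(n)]
```

For instance, with $i = 0$ and $j = 3$ the only neighbour of 0 is 1, so $S_{03} = D'_{13} = 1 < 1 \cdot D'_{03} = 2$, and the test correctly reports that $D_{03} = 3$ is odd.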

SLIDE 14

Algorithm for APD with No Deficiency

  Compute $Z = A^2$
  Compute $A'$
  if $A'_{ij} = 1$ $\forall i \ne j$ then
    Return $D = 2A' - A$.
  end if
  Recursively compute $D'$.
  $S = AD'$
  Compute $D$ such that for each pair $i$ and $j$,
    $D_{ij} = 2D'_{ij}$ if $S_{ij} \ge D'_{ij} Z_{ii}$,
    $D_{ij} = 2D'_{ij} - 1$ if $S_{ij} < D'_{ij} Z_{ii}$.

(Recall that $Z_{ii} = d(i)$, the degree of $i$.)

Theorem 5. For the APD algorithm:

  1. The time taken is $O(MM(n) \log n)$.
  2. Entries in the matrices are bounded in # bits by $O(\log n)$.

Proof Sketch. Let $\delta$ be the diameter of $G$. When $\delta > 2$,

$$T(n, \delta) = 2\,MM(n) + T(n, \lceil \delta/2 \rceil) + O(n^2).$$

The proof follows by solving this recurrence.
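The algorithm above translates almost line by line into code. The sketch below assumes a connected graph (otherwise $G'$ never becomes complete and the recursion does not terminate), uses 0-based vertices, and substitutes a naive cubic `int_mult` for the fast matrix multiplication that gives the stated running time. The function name `apd` is illustrative.

```python
def int_mult(X, Y):
    """Naive integer matrix product (stand-in for fast matrix multiplication)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def apd(A):
    """All-pairs distances of a connected, undirected, unweighted graph."""
    n = len(A)
    Z = int_mult(A, A)
    A_prime = [[int(i != j and (A[i][j] == 1 or Z[i][j] > 0))
                for j in range(n)] for i in range(n)]
    # Base case: G' is complete, so G has diameter at most 2.
    if all(A_prime[i][j] == 1 for i in range(n) for j in range(n) if i != j):
        return [[2 * A_prime[i][j] - A[i][j] for j in range(n)] for i in range(n)]
    D_prime = apd(A_prime)                  # distances in G'
    S = int_mult(A, D_prime)
    # Z[i][i] is the degree of i; subtract 1 exactly when D[i][j] is odd.
    return [[2 * D_prime[i][j] - (1 if S[i][j] < D_prime[i][j] * Z[i][i] else 0)
             for j in range(n)] for i in range(n)]

# Path on 5 vertices and cycle on 6 vertices (hypothetical test inputs).
path5 = [[int(abs(i - j) == 1) for j in range(5)] for i in range(5)]
cycle6 = [[int((i - j) % 6 in (1, 5)) for j in range(6)] for i in range(6)]
D_path = apd(path5)
D_cycle = apd(cycle6)
```

Both inputs need exactly one level of recursion, since their diameters (4 and 3) halve to 2.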

SLIDE 15

Witnessing Boolean Matrix Multiplication

Let $A$ and $B$ be two Boolean matrices and $P = AB$ under Boolean multiplication.

When will $P_{ij} = 1$? When there exists $k$ such that $A_{ik} = B_{kj} = 1$. Then, we say that $k$ “witnesses” $P_{ij}$. In a witness matrix for $P$, each entry $W_{ij}$ is a witness for $P_{ij}$.

Intuitively, the witness matrix $W$ plays the role of identifying an intermediate node between nodes $i$ and $j$ that are distance 2 apart.

So before we tackle APSP, we ask: How do we compute the witness matrix?

Answer: Easy if we use the naive $O(n^3)$-time algorithm for matrix multiplication, which examines every $k$ for each pair $(i, j)$.

What if we wanted to use black box matrix multiplication to get an $o(n^3)$-time algorithm?

SLIDE 16

A Warmup: Each $P_{ij}$ Has a Unique $W_{ij}$

Let $\hat{A}$ be the matrix with entries $\hat{A}_{ik} = k \cdot A_{ik}$. Then, $W = \hat{A}B$, where we use integer multiplication.

But of course, there can be multiple witnesses, so we need to “isolate” one. How?

Lemma 6. An urn contains $w$ white balls and $n - w$ black balls. Let $r$ be a value such that $n/2 \le wr \le n$. Suppose we choose $r$ balls at random (without replacement). Then,

$$\Pr[\text{exactly one white ball is chosen}] \ge \frac{1}{2e}.$$

(Proof is available on page 285 of Motwani and Raghavan.)
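The warmup case can be sketched as follows. The function name `witnesses_if_unique` is illustrative, and witnesses are reported 1-based (as in the slides, where $V = \{1, \ldots, n\}$) so that 0 can mean "no witness"; that encoding is an assumption of this example.

```python
def int_mult(X, Y):
    """Naive integer matrix product (stand-in for fast matrix multiplication)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def witnesses_if_unique(A, B):
    """Entry (i, j) is the 1-based witness k when P[i][j] has exactly one witness."""
    n = len(A)
    A_hat = [[(k + 1) * A[i][k] for k in range(n)] for i in range(n)]
    return int_mult(A_hat, B)

# With A the identity, the only possible witness for P[i][j] is k = i.
A = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
B = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
W = witnesses_if_unique(A, B)
```

If some $P_{ij}$ had two witnesses $k_1$ and $k_2$, the corresponding entry would be $k_1 + k_2$, which is in general not a witness. That is the reason for the isolation step that follows.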

SLIDE 17

Let $R$ be a random Boolean vector of length $n$ in which $r$ entries (chosen uniformly at random) are set to 1 and the rest are set to 0.

In similar spirit to $\hat{A}$, we obtain $A^R$ by setting each entry $A^R_{ik} = k R_k A_{ik}$.

We obtain $B^R$ by setting $B^R_{kj} = R_k B_{kj}$.

If $P_{ij}$ has a unique witness among the indices $k$ with $R_k = 1$, then entry $(i, j)$ of $A^R B^R$ is that unique witness.

By Lemma 6, we can isolate a unique witness with constant probability. We may have to try several values for $r$, but all of them powers of 2. Furthermore, we can boost the probability by repeating the process $O(\log n)$ times.

SLIDE 18

Algorithm BPWM

Require: Two $n \times n$ Boolean matrices $A$ and $B$.
Ensure: Compute witness matrix $W$ for $P = AB$.

  $W = -AB$.
  for $t = 0, \ldots, \lfloor \log n \rfloor$ do
    $r = 2^t$
    for $O(\log n)$ times do
      Choose a random $R$ with $r$ one entries.
      Compute $A^R$ and $B^R$.
      $Z = A^R B^R$
      for all $(i, j)$ do
        if $W_{ij} < 0$ and $Z_{ij}$ is a witness then
          $W_{ij} = Z_{ij}$.
        end if
      end for
    end for
  end for
  for all $(i, j)$ do
    if $W_{ij} < 0$ then
      Find a witness by brute force.
    end if
  end for
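A runnable sketch of Algorithm BPWM follows. Several details are assumptions of this example rather than part of the slides: witnesses are 1-based with 0 meaning "no witness", the number of repetitions per value of $r$ uses an arbitrary constant 4 in place of the $O(\log n)$, the optional `rng` parameter is added for reproducibility, and `int_mult` is a naive product standing in for black-box fast matrix multiplication.

```python
import math
import random

def int_mult(X, Y):
    """Naive integer matrix product (stand-in for fast matrix multiplication)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def bpwm(A, B, rng=random):
    """Witness matrix for the Boolean product of A and B (1-based, 0 = none)."""
    n = len(A)
    P = int_mult(A, B)
    # W = -AB under Boolean multiplication: -1 marks "witness still needed".
    W = [[-1 if P[i][j] > 0 else 0 for j in range(n)] for i in range(n)]
    rounds = max(1, math.ceil(4 * math.log2(n + 1)))  # assumed constant for O(log n)
    for t in range(int(math.log2(n)) + 1):
        r = 2 ** t
        for _ in range(rounds):
            chosen = set(rng.sample(range(n), r))
            R = [1 if k in chosen else 0 for k in range(n)]
            AR = [[(k + 1) * R[k] * A[i][k] for k in range(n)] for i in range(n)]
            BR = [[R[k] * B[k][j] for j in range(n)] for k in range(n)]
            Z = int_mult(AR, BR)
            for i in range(n):
                for j in range(n):
                    k = Z[i][j]
                    # "Z_ij is a witness" is checked directly against A and B.
                    if W[i][j] < 0 and 1 <= k <= n and A[i][k - 1] and B[k - 1][j]:
                        W[i][j] = k
    for i in range(n):
        for j in range(n):
            if W[i][j] < 0:  # brute-force fallback keeps the output always correct
                W[i][j] = next(k + 1 for k in range(n) if A[i][k] and B[k][j])
    return W

# Hypothetical test input; the seed only affects which witnesses are reported.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
B = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
W = bpwm(A, B, random.Random(0))
```

Because every candidate is verified and any remaining entries fall back to brute force, the randomness affects only the running time, which is what makes this a Las Vegas algorithm.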

SLIDE 19

Theorem 7. The BPWM algorithm is a Las Vegas algorithm for the BPWM problem with expected running time $O(MM(n) \log^2 n)$.

Proof Sketch. The non-trivial part of the proof is to limit the time taken in the brute force part.

In the first part, we have to show that a witness for each $P_{ij}$ is found with probability at least $1 - 1/n$. Then, in expectation, the brute force part is executed only for $O(n)$ entries, each taking $O(n)$ time.

Take any $P_{ij}$ with $P_{ij} = 1$; let it have $w$ witnesses. At least one $r$ value will satisfy the condition $n/2 \le wr \le n$ of Lemma 6. So a witness will be isolated with constant probability in each repetition. A simple application of Chernoff bounds will yield the result.

SLIDE 20

Determining APSP: First, a Super-Cubic Algorithm

First, note that APSP can be represented succinctly by a “successor” matrix $S$ in which $\forall (i, j)$, $S_{ij} = k$ where $k$ is a neighbour of $i$ that lies on a shortest path from $i$ to $j$.

For a pair $(i, j)$ at distance $d = D_{ij}$, $k$ is a valid choice for $S_{ij}$ $\iff$ ($D_{kj} = d - 1$ and $A_{ik} = 1$).

Define $B^d$ such that $(B^d_{kj} = 1) \iff (D_{kj} = d - 1)$.

Compute the witness matrix $W^d$ of $A$ and $B^d$. Note that $W^d$ will contain successors for pairs $(i, j)$ at distance $d$.

Repeating for all $2 \le d \le n - 1$, we can solve APSP.
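The per-distance construction can be sketched as below. To keep the example short, `brute_force_witness` is a cubic stand-in for BPWM, vertices and witnesses are 0-based with `None` for "no witness", and the loop starts at $d = 1$ so that adjacent pairs are filled in as well; these are all assumptions of the example.

```python
def brute_force_witness(A, B):
    """Stand-in for BPWM: W[i][j] is a 0-based witness k, or None if none exists."""
    n = len(A)
    return [[next((k for k in range(n) if A[i][k] and B[k][j]), None)
             for j in range(n)] for i in range(n)]

def successors_by_distance(A, D):
    """Successor matrix built from one witness computation per distance d."""
    n = len(A)
    S = [[None] * n for _ in range(n)]
    for d in range(1, n):
        # B_d[k][j] = 1 iff k is at distance d - 1 from j.
        B_d = [[int(D[k][j] == d - 1) for j in range(n)] for k in range(n)]
        W_d = brute_force_witness(A, B_d)
        for i in range(n):
            for j in range(n):
                if D[i][j] == d:
                    S[i][j] = W_d[i][j]
    return S

# Path graph 0 - 1 - 2 - 3 with its distance matrix (hypothetical test input).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
D = [[abs(i - j) for j in range(4)] for i in range(4)]
S = successors_by_distance(A, D)
```

With up to $n - 1$ witness computations, each costing at least one matrix multiplication, the total is super-cubic even with fast multiplication, which motivates the next slide.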

SLIDE 21

A Sub-Cubic Algorithm for APSP

Recall Lemma 3. For $i \ne j$, $\forall k \in \Gamma(i)$,

$D_{ij} - 1 \le D_{kj} \le D_{ij} + 1$. (9)

We are specifically interested in $k \in \Gamma(i)$ such that $D_{kj} = D_{ij} - 1$. Therefore, any $k$ such that $A_{ik} = 1$ and $D_{kj} \equiv D_{ij} - 1 \pmod 3$ is a valid candidate for $S_{ij}$.

Define $D^{(s)}$ to be an $n \times n$ Boolean matrix such that

$$(D^{(s)}_{kj} = 1) \iff D_{kj} + 1 \equiv s \pmod 3.$$

The witness matrix $W^{(s)}$ is computed for each of the three pairings of $A$ with $D^{(s)}$, $s \in \{0, 1, 2\}$. Finally, compute $S$ such that

$$S_{ij} = W^{(D_{ij} \bmod 3)}_{ij}.$$

Theorem 8. APSP can be computed in $O(MM(n) \log^2 n)$ expected time.
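The mod-3 trick needs only three witness computations. In this sketch, `brute_force_witness` is again a cubic stand-in for BPWM, indices are 0-based, and `path_from` is a hypothetical helper that follows successors to recover an actual shortest path.

```python
def brute_force_witness(A, B):
    """Stand-in for BPWM: W[i][j] is a 0-based witness k, or None if none exists."""
    n = len(A)
    return [[next((k for k in range(n) if A[i][k] and B[k][j]), None)
             for j in range(n)] for i in range(n)]

def successors_mod3(A, D):
    """Successor matrix from three witness matrices, one per residue s."""
    n = len(A)
    W = []
    for s in range(3):
        # D_s[k][j] = 1 iff D[k][j] + 1 is congruent to s modulo 3.
        D_s = [[int((D[k][j] + 1) % 3 == s) for j in range(n)] for k in range(n)]
        W.append(brute_force_witness(A, D_s))
    return [[W[D[i][j] % 3][i][j] if i != j else None for j in range(n)]
            for i in range(n)]

def path_from(S, i, j):
    """Follow successors from i until j is reached."""
    path = [i]
    while i != j:
        i = S[i][j]
        path.append(i)
    return path

# Path graph 0 - 1 - 2 - 3 with its distance matrix (hypothetical test input).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
D = [[abs(i - j) for j in range(4)] for i in range(4)]
S = successors_mod3(A, D)
```

The residue is enough because, by Lemma 3, a neighbour's distance to $j$ can only be $D_{ij} - 1$, $D_{ij}$, or $D_{ij} + 1$, and these three values are distinct modulo 3.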
