EFFICIENT MAXIMUM FLOW ALGORITHM, Nikolay Sakharnykh and Hugo Braun; May 10, 2017

SLIDE 1

Nikolay Sakharnykh – Hugo Braun; May 10, 2017

EFFICIENT MAXIMUM FLOW ALGORITHM

SLIDE 2

MAXIMUM FLOW

Definition

  • Directed graph with a source s and a sink t
  • Flow capacities on edges
  • Maximum flow from s to t?

Example: How much instantaneous power can Palo Alto get from this electric grid?

[Figure: electric grid drawn as a directed graph with nodes Power plant, SF, SJ and PA; edge capacities 20, 30, 8, +∞, 25, 12, 1]

SLIDE 3

MAXIMUM FLOW

Applications

  • Image segmentation
  • Community detection

SLIDE 4

MAXIMUM FLOW SOLVERS

AUGMENTING PATHS: iteratively find a new augmenting path

  • Ford–Fulkerson
  • Edmonds–Karp
  • Dinic’s/MPM

PREFLOW: push flow locally in a preflow graph

  • Push-relabel and its variants

OTHER

  • Linear programming

SLIDE 5

AGENDA

  • Edmonds-Karp
  • Push-relabel
  • MPM

SLIDE 6

FORD FULKERSON

Workflow

while a path p of capacity c > 0 exists from s to t:
    maxflow += c
    for all edges in path p:
        edge.capacity -= c
        add reverse edge from edge.destination to edge.source of capacity c

[Figure: graph with vertices s, a, b, c, d, t and edge capacities 1, 1, 3, 3, 5, 1, 1. Augmenting path on first iteration: capacity min(1, 3, 1) = 1.]

SLIDE 7

FORD FULKERSON

Workflow

[Figure: graph after the first iteration, with edge capacities 1, 2, 1, 5, 1, 1, 1, 3. Augmenting path on first iteration: capacity min(1, 3, 1) = 1.]

SLIDE 8

FORD FULKERSON

Workflow

[Figure: graph with edge capacities 1, 2, 1, 5, 1, 1, 1, 3. Augmenting path on first iteration: capacity min(1, 3, 1) = 1. Augmenting path on second iteration: capacity min(1, 1, 1, 3, 5) = 1.]
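The Ford-Fulkerson workflow can be sketched in a few lines of sequential Python. This is only an illustration under stated assumptions: the function name `ford_fulkerson`, the edge-list input format and the toy graph are hypothetical (the toy graph is not the one drawn on the slides), and paths are found with a plain depth-first search.

```python
from collections import defaultdict

def ford_fulkerson(edges, s, t):
    """Return the max-flow value; edges is a list of (u, v, capacity)."""
    # Residual capacities; missing entries default to 0.
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        residual[u][v] += c

    def find_path(u, bottleneck, visited):
        # Depth-first search for any s-t path with positive residual capacity.
        if u == t:
            return bottleneck
        visited.add(u)
        for v in list(residual[u]):
            c = residual[u][v]
            if c > 0 and v not in visited:
                pushed = find_path(v, min(bottleneck, c), visited)
                if pushed > 0:
                    residual[u][v] -= pushed   # edge.capacity -= c
                    residual[v][u] += pushed   # add reverse edge of capacity c
                    return pushed
        return 0

    max_flow = 0
    while True:
        pushed = find_path(s, float("inf"), set())
        if pushed == 0:          # no augmenting path left
            return max_flow
        max_flow += pushed

# Hypothetical toy graph (not the one from the slides).
edges = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(ford_fulkerson(edges, "s", "t"))  # prints 5
```

Each loop iteration runs a full graph search to find a single path, which is the cost the following slides try to reduce.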

SLIDE 9

EDMONDS-KARP

  • Edmonds-Karp: variation of Ford-Fulkerson
  • Main idea: use the shortest augmenting path
  • One augmenting path needs one BFS
  • Wikipedia graph: ~5000 augmenting paths

[Figure: graph with vertices s, a, b, c, d, t and edge capacities 1, 2, 1, 5, 1, 1, 1, 3]

Ford-Fulkerson and its variants use too many graph traversals
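To make the "one augmenting path needs one BFS" cost concrete, here is a minimal sequential sketch of Edmonds-Karp that also counts its BFS runs. The function name `edmonds_karp`, the returned pair and the toy graph are assumptions made for illustration, not part of the talk.

```python
from collections import defaultdict, deque

def edmonds_karp(edges, s, t):
    """Return (max_flow, bfs_runs); edges is a list of (u, v, capacity)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        residual[u][v] += c

    max_flow, bfs_runs = 0, 0
    while True:
        # One BFS finds one shortest (fewest-edges) augmenting path.
        bfs_runs += 1
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return max_flow, bfs_runs

        # Walk back from t to find the bottleneck capacity of the path.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Update forward and reverse residual capacities along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck

# Hypothetical toy graph (not the one from the slides).
edges = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(edmonds_karp(edges, "s", "t"))  # prints (5, 4)
```

On this toy graph the sketch finds three augmenting paths and needs four BFS runs, the last one only to detect that t is no longer reachable.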

SLIDE 10

PUSH-RELABEL

SLIDE 11

PUSH-RELABEL

Workflow

Saturate all out-edges of s, create reverse edges
l[s] = number of vertices
l[v] = 0 for all v ∈ V \ {s}
while there is an applicable push or relabel operation:
    execute the operation

push(u, v):
    if e[u] > 0 and l[u] == l[v] + 1:
        push min(e[u], residual capacity of (u, v)) units of flow from u to v

relabel(u):
    if e[u] > 0 and l[u] <= l[v] for all current neighbors v:
        l[u] = (minimum l[v] among neighbors) + 1

Here l[v] is the label of vertex v and e[v] is its excess flow.

[Figure: graph with vertices s, a, b, d and edge capacities 1, 1, 1; labels l and excesses e not yet assigned]

SLIDE 12

PUSH-RELABEL

Workflow

[Figure: state after initialization: s has l=6; a has l=0, e=1; b has l=0, e=1; d has l=0, e=0]

SLIDE 13

PUSH-RELABEL

Workflow

[Figure: state after relabeling b: s has l=6; a has l=0, e=1; b has l=1, e=1; d has l=0, e=0]

SLIDE 14

PUSH-RELABEL

Workflow

[Figure: state after b pushes its excess to d: s has l=6; a has l=0, e=1; b has l=1, e=0; d has l=0, e=1]
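The push and relabel rules can be tried out with a small sequential sketch. It is not the GPU implementation discussed in the talk: the function name `push_relabel`, the stack of active vertices and the toy graph are assumptions, and each push moves min(excess, residual capacity) units so that no edge is over-saturated.

```python
from collections import defaultdict

def push_relabel(edges, s, t):
    """Return the max-flow value; edges is a list of (u, v, capacity)."""
    residual = defaultdict(lambda: defaultdict(int))
    vertices = set()
    for u, v, c in edges:
        residual[u][v] += c
        residual[v][u] += 0          # make sure the reverse edge exists
        vertices.update((u, v))

    label = {v: 0 for v in vertices}
    excess = {v: 0 for v in vertices}
    label[s] = len(vertices)

    def push(u, v):
        delta = min(excess[u], residual[u][v])
        residual[u][v] -= delta
        residual[v][u] += delta
        excess[u] -= delta
        excess[v] += delta

    # Saturate all out-edges of s.
    excess[s] = sum(residual[s].values())
    for v in list(residual[s]):
        push(s, v)

    # Vertices other than s and t that still hold excess flow.
    active = [v for v in vertices if v not in (s, t) and excess[v] > 0]
    while active:
        u = active.pop()
        while excess[u] > 0:
            for v, c in residual[u].items():
                if c > 0 and label[u] == label[v] + 1:
                    was_idle = excess[v] == 0
                    push(u, v)
                    if was_idle and v not in (s, t):
                        active.append(v)
                    break
            else:
                # No admissible edge: relabel u just above its lowest neighbor.
                label[u] = 1 + min(label[v] for v, c in residual[u].items() if c > 0)
    return excess[t]

# Hypothetical toy graph (not the one from the slides).
edges = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(push_relabel(edges, "s", "t"))  # prints 5
```

The `active` stack fixes one processing order; a massively parallel version would process many active vertices at once and lose control of that order.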

SLIDE 15

PUSH-RELABEL

Parallelism issues

while there is an applicable push or relabel operation:
    execute the operation

[Figure: graph state: s has l=6; a has l=0, e=1; b has l=1, e=0; d has l=0, e=1]

At this step, we could relabel a or d. Which one?

Complexity of heuristics:

PRIORITY      LARGEST L     SMALLEST L    FIFO
Complexity    O(V²√E)       O(V²E)        O(V³)

Source of parallelism:

Order affects convergence. Massive parallelism yields random order.

SLIDE 16

PUSH-RELABEL

Parallelism issues

Parallelism drops. Not enough to saturate the GPU.

[Figure; source: The University of Texas at Austin]

In theory, number of threads = number of vertices
In practice, number of active vertices << number of vertices

SLIDE 17

PUSH-RELABEL

Conclusion

[Figure: graph state: s has l=6; a has l=0, e=1; b has l=1, e=0; d has l=0, e=1]

  • Actual parallelism is low
  • Massive parallelism yields random order, which damages performance
  • We need graph traversals (BFS) for some critical heuristics

Push-relabel is not suited for GPU implementation

road_usa: GPU does 20 BFS, CPU does only 3 BFS
CPU is faster since it requires fewer traversals

SLIDE 18

MPM

SLIDE 19

DINIC’S

Workflow

[Figure: graph with vertices s, a, b, c, d, t and edge capacities 2, 1, 3, 3, 2, 1, 1; edges on paths of length 3 are highlighted]

  • Two augmenting paths of length 3
  • They have been discovered using just one BFS
  • Avoid running BFS twice here

Main idea of Dinic’s: reuse BFS results

SLIDE 20

DINIC’S

Workflow

While s is still connected to t in G, do
    Create layer graph GL containing only shortest paths from s to t
    While s is still connected to t in GL, do
        Find augmenting path from s to t in GL
        Push corresponding flow in GL, update edges

[Figure: graph G with vertices s, a, b, c, d, t and edge capacities 2, 1, 3, 3, 2, 1, 1]

SLIDE 21

DINIC’S

Workflow

[Figure: layer graph GL(3) with vertices s, a, b, c, d, t; remaining edge capacities 2, 3, 3, 2, 1]

SLIDE 22

DINIC’S

Workflow

[Figure: layer graph GL(3) with the flow pushed along augmenting paths found by DFS]

SLIDE 23

DINIC’S

Workflow

[Figure: layer graph GL(4)]

SLIDE 24

DINIC’S

Workflow

[Figure: layer graph GL(4) with the flow pushed along an augmenting path found by DFS]

On the GPU, the DFS traverses all vertices.
We lose all advantages of Dinic’s.
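The Dinic's workflow (one BFS builds the layer graph, then several DFS searches reuse it) can be sketched sequentially as follows. The function name `dinic`, the per-vertex `nxt` pointers used to skip exhausted edges, and the toy graph are assumptions for illustration; this is a CPU sketch and not the GPU code from the talk.

```python
from collections import defaultdict, deque

def dinic(edges, s, t):
    """Return (max_flow, bfs_runs); edges is a list of (u, v, capacity)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        residual[u][v] += c
        residual[v][u] += 0          # make sure the reverse edge exists

    def bfs_levels():
        # Layer of each vertex = its BFS distance from s in the residual graph.
        level = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
        return level

    def dfs(u, limit, level, adj, nxt):
        # Find one augmenting path that only uses layer-graph edges.
        if u == t:
            return limit
        while nxt[u] < len(adj[u]):
            v = adj[u][nxt[u]]
            c = residual[u][v]
            if c > 0 and level.get(v) == level[u] + 1:
                pushed = dfs(v, min(limit, c), level, adj, nxt)
                if pushed > 0:
                    residual[u][v] -= pushed
                    residual[v][u] += pushed
                    return pushed
            nxt[u] += 1              # this edge is exhausted for the current phase
        return 0

    max_flow, bfs_runs = 0, 0
    while True:
        level = bfs_levels()
        bfs_runs += 1
        if t not in level:
            return max_flow, bfs_runs
        adj = {u: list(residual[u]) for u in level}
        nxt = {u: 0 for u in level}
        while True:
            pushed = dfs(s, float("inf"), level, adj, nxt)
            if pushed == 0:
                break
            max_flow += pushed

# Hypothetical toy graph (not the one from the slides).
edges = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(dinic(edges, "s", "t"))  # prints (5, 3)
```

On this toy graph the first layer graph yields two augmenting paths from a single BFS, the second yields one, and a third BFS detects that s and t are disconnected.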

SLIDE 25

MPM

Workflow

While s is still connected to t in G, do
    Create layer graph GL containing only shortest paths from s to t
    While s is still connected to t in GL, do
        Find vertex u with minimum potential m, with potential(u) = min(degree_in(u), degree_out(u))
        Push m from u to t, pull m from s to u
        Remove all vertices with min(degree_in(u), degree_out(u)) = 0

Here degree_in(u) and degree_out(u) are the total capacities of the edges entering and leaving u in GL.

[Figure: graph G with vertices s, a, b, c, d, t and edge capacities 2, 1, 3, 3, 2, 1, 1]

SLIDE 26

MPM

Workflow

[Figure: graph G with vertices s, b, c, d, t; edge capacities 2, 3, 3, 2, 1]

SLIDE 27

MPM

Workflow

[Figure: layer graph GL(3) with vertices s, b, c, d, t; edge capacities 2, 3, 3, 2, 1]

Vertex c is selected, so we know that 1 unit of flow will pass through it.

SLIDE 28

MPM

Workflow

[Figure: layer graph GL(3) with vertices s, b, c, d, t; edge capacities 1, 2, 3, 2]

Vertex c is selected, so we know that 1 unit of flow will pass through it.

SLIDE 29

MPM

Workflow

[Figure: layer graph GL(3) with vertices s, b, d, t; edge capacities 1, 3, 2]
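The push/pull/prune loop can be sketched sequentially as below. This is a simplified illustration, not the GPU design from the talk: the function name `mpm_max_flow`, the helper names and the toy graph are assumptions, and potentials are recomputed from scratch at every step, which a real implementation would avoid.

```python
from collections import defaultdict, deque

def mpm_max_flow(edges, s, t):
    """Return the max-flow value; edges is a list of (u, v, capacity)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        residual[u][v] += c
        residual[v][u] += 0          # make sure the reverse edge exists
    level, alive = {}, set()

    def out_edges(u):                # layer-graph edges leaving u
        return [(v, c) for v, c in residual[u].items()
                if c > 0 and v in alive and level[v] == level[u] + 1]

    def in_edges(u):                 # layer-graph edges entering u
        return [(w, residual[w][u]) for w in residual[u]
                if w in alive and level[w] + 1 == level[u] and residual[w][u] > 0]

    def potential(u):
        p_in = float("inf") if u == s else sum(c for _, c in in_edges(u))
        p_out = float("inf") if u == t else sum(c for _, c in out_edges(u))
        return min(p_in, p_out)

    def send(start, amount, forward):
        # forward: push `amount` from start to t; otherwise pull it from s to start.
        pending = {start: amount}
        queue = deque([start])       # FIFO order visits the layers one by one
        while queue:
            x = queue.popleft()
            need = pending.pop(x)
            if x == (t if forward else s):
                continue
            for y, c in (out_edges(x) if forward else in_edges(x)):
                if need == 0:
                    break
                d = min(need, c)
                a, b = (x, y) if forward else (y, x)
                residual[a][b] -= d
                residual[b][a] += d
                if y not in pending:
                    pending[y] = 0
                    queue.append(y)
                pending[y] += d
                need -= d

    max_flow = 0
    while True:
        # One BFS per phase assigns a layer (distance from s) to each vertex.
        level = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
        if t not in level:
            return max_flow
        alive = set(level)
        while s in alive and t in alive:
            u = min(alive, key=potential)
            m = potential(u)
            if m == 0:
                alive.remove(u)      # prune: no flow can pass through u
            else:
                send(u, m, forward=True)
                send(u, m, forward=False)
                max_flow += m

# Hypothetical toy graph (not the one from the slides).
edges = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(mpm_max_flow(edges, "s", "t"))  # prints 5
```

Because u has the minimum potential, every vertex it reaches can forward or supply the full amount m, so neither the push nor the pull needs to backtrack.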

SLIDE 30

MPM

Dinic’s vs MPM

Dinic’s: DFS

[Figure: layer graph with vertices s, a, b, c, d, e, f, g, h, t; edges are marked either "processed but useless" or "processed and acceptable"]

MPM: Push/Pull/Prune

[Figure: the part of the layer graph around vertices s, c, d, h, t that MPM uses]

  • Across the graph, min potential = 1 (vertex h)
  • Pushing 1 to t, pulling 1 from s, using any edges

SLIDE 31

MPM

Dinic’s vs MPM

[Figure: layer graph with vertices s, c, d, h, t]

Saturating one augmenting path on GPU:

  • MPM: push/pull/prune process, 30 µs
  • Edmonds-Karp/Dinic’s: one BFS, >1 ms
  • Performance bounded by kernel launch latency

Example: Wikipedia 2011

  • MPM: 5 BFS, 6000 augmenting paths
  • EK: 6000 BFS
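
A back-of-the-envelope calculation shows what these figures imply for the Wikipedia 2011 example. It is only a rough model built from the numbers above (30 µs per push/pull/prune, at least 1 ms per BFS); the function and constant names are hypothetical, and the results are estimates, not measured run times.

```python
BFS_SECONDS = 1e-3                 # one BFS takes more than 1 ms, so this is a lower bound
PUSH_PULL_PRUNE_SECONDS = 30e-6    # about 30 us per saturated augmenting path

def estimated_seconds(num_bfs, num_push_pull_prune):
    """Rough traversal cost: BFS runs plus push/pull/prune passes."""
    return num_bfs * BFS_SECONDS + num_push_pull_prune * PUSH_PULL_PRUNE_SECONDS

edmonds_karp_estimate = estimated_seconds(6000, 0)   # EK: 6000 BFS
mpm_estimate = estimated_seconds(5, 6000)            # MPM: 5 BFS, 6000 augmenting paths

print(f"Edmonds-Karp: at least {edmonds_karp_estimate:.3f} s")
print(f"MPM: about {mpm_estimate:.3f} s")
```

Under this model Edmonds-Karp spends at least 6 s in BFS alone, while MPM spends roughly 0.19 s, most of it in the 6000 short push/pull/prune passes.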

SLIDE 32

MPM

GPU design

[Figure: layer graph with vertices s, c, d, h, t]

  • The MPM paper gives a high-level implementation
  • Most of the work went into GPU implementation design (2 out of 3 months)

SLIDE 33

MAXIMUM FLOW RESULTS

GRAPH       N           NNZ          SPEED-UP AVG   SPEED-UP MIN   SPEED-UP MAX
wiki03      455436      3811198      9.1            1.7            15.3
wiki11      3721339     121043107    22.5           19.8           28.9
road_usa    23947347    57708624     2              0.7            4.2
road CA     1971281     5533214      2.3            0.8            4.9

Galois on dual-socket Haswell (16 cores) vs NVIDIA Titan X (Pascal)

SLIDE 34

EFFICIENT MAXIMUM FLOW ALGORITHM

Takeaways

  • Black-box solver: a large variety of applications can be seen as a flow problem.
  • Data-dependent, irregular algorithm: how to create enough “real” parallelism and how to avoid latency issues on the GPU.
  • Order-of-magnitude speed-ups on wide graphs. Long graphs require a more efficient graph traversal implementation.

SLIDE 35

REFERENCES

  • An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision, Yuri Boykov, Vladimir Kolmogorov
  • Finding Web Communities by Maximum Flow Algorithm using Well-Assigned Edge Capacities, Noriko Imafuji, Masaru Kitsuregawa
  • An O(|V|³) algorithm for finding maximum flows in networks, V.M. Malhotra, M. Pramodh Kumar, S.N. Maheshwari
  • Parallelizing the Push-Relabel Max Flow Algorithm, Victoria Popic, Javier Vélez

SLIDE 36