Parallel Depth First on GPU. M. Naumov, A. Vrielink and M. Garland, GTC 2017


  1. Parallel Depth First on GPU M. Naumov, A. Vrielink and M. Garland, GTC 2017

  2. Agenda: Introduction; Directed Trees; Directed Acyclic Graphs (DAGs), with • Path- and SSSP-based variants • Optimizations; Performance Experiments

  3. What is DFS? [Figure: example tree rooted at a.] Node: a,b,c,d,e,f,g,i,j. Parent: (empty). Discovery: (empty). Finish: (empty). Parent entries are listed in Node order; "/" marks the root and "_" an entry not yet assigned.

  4. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a. Discovery: a,b. Finish: (empty).

  5. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,_,_,b. Discovery: a,b,e. Finish: e.

  6. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,_,_,b,b. Discovery: a,b,e,f. Finish: e.

  7. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,_,_,b,b,_,f. Discovery: a,b,e,f,i. Finish: e,i.

  8. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,_,_,b,b,_,f,f. Discovery: a,b,e,f,i,j. Finish: e,i,j.

  9. What is DFS? Node: a,b,c,d,e,f,g,i,j. Parent: /,a,a,a,b,b,d,f,f. Discovery: a,b,e,f,i,j,c,d,g. Finish: e,i,j,f,b,c,g,d,a.
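
The walkthrough in slides 3 to 9 can be reproduced with a short sequential DFS. This is a minimal Python sketch, not code from the presentation: the `TREE` adjacency lists are read off the final Parent row, and the left-to-right child order is an assumption chosen to match the discovery order shown.

```python
# Example tree from slides 3-9 (child order assumed to be left-to-right).
TREE = {"a": ["b", "c", "d"], "b": ["e", "f"], "d": ["g"], "f": ["i", "j"]}

def sequential_dfs(children, root):
    """Return (parent, discovery order, finish order) of a recursive DFS."""
    parent = {root: None}
    discovery, finish = [], []

    def visit(u):
        discovery.append(u)              # u is discovered on entry
        for v in children.get(u, []):
            if v not in parent:          # skip nodes that were already reached
                parent[v] = u
                visit(v)
        finish.append(u)                 # u finishes after all its children

    visit(root)
    return parent, discovery, finish

parent, discovery, finish = sequential_dfs(TREE, "a")
print("".join(discovery))  # abefijcdg
print("".join(finish))     # eijfbcgda
```

The three outputs correspond to the Parent, Discovery and Finish rows on slide 9.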

  10. Previous Work on DFS. Lexicographic DFS. Planar Graphs: time O(log^2 n), processors O(n). Directed Graphs with Cycles: time O(√n log^11 n), processors O(n^3). Directed Acyclic Graphs (DAGs): time O(log^2 n), processors O(n^ω / log n), where ω < 2.373 is the matrix multiplication exponent.

  11. Previous Work on DFS. Lexicographic DFS. Planar Graphs: time O(log^2 n), processors O(n). Directed Graphs with Cycles: time O(√n log^11 n), processors O(n^3). Directed Acyclic Graphs (DAGs): time O(log^2 n), processors O(n^ω / log n), where ω < 2.373 is the matrix multiplication exponent. Applications: topological sort, bi-connectivity and planarity testing.

  12. DIRECTED TREES

  13. Directed Tree. Phase 2: Bottom-Up Traversal. Leaves c, e, g, i, j: [0].

  14. Directed Tree. Phase 2: Bottom-Up Traversal. d: [0,1]; f: [0,1,1]; leaves c, e, g, i, j: [0].

  15. Directed Tree. Phase 2: Bottom-Up Traversal. d: [0,1]; f: [0,1,2] (prefix sum); leaves c, e, g, i, j: [0].

  16. Directed Tree. Phase 2: Bottom-Up Traversal. b: [0,1,3]; d: [0,1]; f: [0,1,2]; leaves c, e, g, i, j: [0].

  17. Directed Tree. Phase 2: Bottom-Up Traversal. b: [0,1,4] (prefix sum); d: [0,1]; f: [0,1,2]; leaves c, e, g, i, j: [0].

  18. Directed Tree. Phase 2: Bottom-Up Traversal. a: [0,5,1,2]; b: [0,1,4]; d: [0,1]; f: [0,1,2]; leaves c, e, g, i, j: [0].

  19. Directed Tree. Phase 2: Bottom-Up Traversal. a: [0,5,6,8] (prefix sum); b: [0,1,4]; d: [0,1]; f: [0,1,2]; leaves c, e, g, i, j: [0].

  20. Directed Tree. a: [0,5,6,8]; b: [0,1,4]; d: [0,1]; f: [0,1,2]; leaves c, e, g, i, j: [0]. This phase is done, next phase is about to start …

  21. Directed Tree. Phase 3: Top-down Traversal. Prefix sums: a: [0,5,6,8]; b: [0,1,4]; d: [0,1]; f: [0,1,2]; leaves: [0]. e: offset 0.

  22. Directed Tree. Phase 3: Top-down Traversal. Prefix sums as on slide 21. e: offset 0; i: offset 1.

  23. Directed Tree. Phase 3: Top-down Traversal. Prefix sums as on slide 21. e: offset 0; i: offset 1; d: offset 6.

  24. Directed Tree. Phase 3: Top-down Traversal. discovery = offset + depth. d: discovery 6+1; e: discovery 0+2; i: discovery 1+3.

  25. Directed Tree. Phase 3: Top-down Traversal. finish = offset + sub-tree size. d: finish 6+1; e: finish 0+0; i: finish 1+0. In these values the sub-tree size counts descendants only, excluding the node itself.
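
The two traversals on slides 13 to 25 can be sketched in Python as follows. This is an illustration under assumptions, not the authors' GPU code: the loops run sequentially where the GPU version would process a level in parallel, the depth computation stands in for a Phase 1 that does not appear in this transcript, and all names (`TREE`, `tree_dfs_times`) are hypothetical. Times are 0-based, and finish uses the number of descendants, as the slide's values do.

```python
from itertools import accumulate

# Example tree from the slides (child order assumed to be left-to-right).
TREE = {"a": ["b", "c", "d"], "b": ["e", "f"], "d": ["g"], "f": ["i", "j"]}

def tree_dfs_times(children, root):
    # Assumed preliminary step: level order and depth of every node.
    order, depth = [root], {root: 0}
    for u in order:                      # the list grows while we iterate (BFS)
        for v in children.get(u, []):
            depth[v] = depth[u] + 1
            order.append(v)

    # Phase 2 (bottom-up): prefix sum over the children's sub-tree sizes.
    size, prefix = {}, {}
    for u in reversed(order):            # children are handled before parents
        kids = children.get(u, [])
        prefix[u] = [0] + list(accumulate(size[v] for v in kids))
        size[u] = 1 + prefix[u][-1]      # the node itself plus its descendants

    # Phase 3 (top-down): offset = number of nodes finished before this one starts.
    offset = {root: 0}
    for u in order:
        for k, v in enumerate(children.get(u, [])):
            offset[v] = offset[u] + prefix[u][k]

    discovery = {u: offset[u] + depth[u] for u in order}
    finish = {u: offset[u] + size[u] - 1 for u in order}
    return prefix, discovery, finish

prefix, discovery, finish = tree_dfs_times(TREE, "a")
print(prefix["a"], discovery["d"], finish["d"])  # [0, 5, 6, 8] 7 7
```

Sorting the nodes by `discovery` and by `finish` gives the Discovery and Finish rows of slide 9.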

  26. DIRECTED ACYCLIC GRAPHS: PATH-BASED VARIANT

  27. Path-Based (for DAGs). Phase 1. Collision at node f: left path [a,b,f], right path [a,d,f].

  28. Path-Based (for DAGs). Phase 1. Collision at node f: left path [a,b,f], right path [a,d,f]. • Wait until all paths to a node are traversed • Align path sequences • Compare left-to-right and choose smallest. Resolution: the lexicographically smallest path.

  29. Path-Based (for DAGs). This phase is done.
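
A rough Python sketch of the collision rule from slides 27 and 28 follows. It rests on several assumptions: the edge d→f is inferred from the collision paths, node labels are taken to sort in the same order as the children (so Python's list comparison acts as the left-to-right comparison), every node is reachable from the root, and the function name is hypothetical. Each node waits for all incoming paths, then keeps the lexicographically smallest one.

```python
# Example DAG: the tree from the earlier slides plus an assumed edge d -> f.
DAG = {"a": ["b", "c", "d"], "b": ["e", "f"], "d": ["f", "g"], "f": ["i", "j"]}

def path_based_parents(children, root):
    # In-degree tells us when all paths to a node have been traversed.
    indeg = {}
    for kids in children.values():
        for v in kids:
            indeg[v] = indeg.get(v, 0) + 1

    best = {root: [root]}                # smallest path found so far, per node
    parent = {root: None}
    frontier = [root]
    while frontier:
        nxt = []
        for u in frontier:
            for v in children.get(u, []):
                cand = best[u] + [v]
                # Lists compare element by element, i.e. left-to-right.
                if v not in best or cand < best[v]:
                    best[v], parent[v] = cand, u
                indeg[v] -= 1
                if indeg[v] == 0:        # every path to v has now arrived
                    nxt.append(v)
        frontier = nxt
    return parent, best

parent, best = path_based_parents(DAG, "a")
print(best["f"], parent["f"])  # ['a', 'b', 'f'] b
```

Here the left path [a,b,f] wins the collision at f, so b becomes the DFS parent of f.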

  30. OPTIMIZATIONS

  31. Path Pruning. Paths [a,c,d,f] and [a,b,e,f] meet at node f.

  32. Path Pruning. Paths [a,c,d,f] and [a,b,e,f] meet at node f. When two paths reach the same node: • There exists a parent "a" where the path split [a,b,…] and [a,c,…].

  33. Path Pruning. Paths [a,c,d,f] and [a,b,e,f] meet at node f. When two paths reach the same node: • There exists a parent "a" where the path split [a,b,…] and [a,c,…] • It is the comparison between "b" and "c" that allows us to distinguish between paths.

  34. Path Pruning. Paths [a,c,d,f] and [a,b,e,f] meet at node f. When two paths reach the same node: • There exists a parent "a" where the path split [a,b,…] and [a,c,…] • It is the comparison between "b" and "c" that allows us to distinguish between paths • Parent node with a single edge will never be a decision point.

  35. Path Pruning. Pruned paths [a,c,f] and [a,b,f] meet at node f. When two paths reach the same node: • There exists a parent "a" where the path split [a,b,…] and [a,c,…] • It is the comparison between "b" and "c" that allows us to distinguish between paths • Parent node with a single edge will never be a decision point • No need to store nodes with such parents.

  36. Path Pruning.
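
The pruning rule can be illustrated with a small Python helper. This is a sketch under assumptions: `prune_path` and `OUT_DEGREE` are hypothetical names, the out-degrees are read off the example on slides 31 to 35, and keeping the final node of the path follows the pruned paths shown on slide 35 rather than an explicit statement in the slides.

```python
# Out-degrees in the example: only "a" has more than one outgoing edge.
OUT_DEGREE = {"a": 2, "b": 1, "c": 1, "d": 1, "e": 1, "f": 0}

def prune_path(path, out_degree):
    """Drop interior nodes whose parent has a single outgoing edge."""
    if len(path) < 2:
        return list(path)
    kept = [path[0]]
    # Walk (parent, node) pairs over the interior of the path.
    for prev, node in zip(path, path[1:-1]):
        if out_degree[prev] > 1:         # prev is a possible decision point
            kept.append(node)
    kept.append(path[-1])                # assumed: the end node is always kept
    return kept

left = prune_path(["a", "b", "e", "f"], OUT_DEGREE)
right = prune_path(["a", "c", "d", "f"], OUT_DEGREE)
print(left, right)  # ['a', 'b', 'f'] ['a', 'c', 'f']
```

The comparison between the pruned paths gives the same winner as the comparison between the full paths, since the deciding position (b versus c) is preserved.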

  37. Phase Composition

  38. SSSP-BASED VARIANT

  39. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but • Propagate # of nodes to all the parents • Start prefix sum with 1 (instead of 0). Leaves c, e, g, i, j: [1].

  40. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but • Propagate # of nodes to all the parents • Start prefix sum with 1 (instead of 0). d: [1,1]; f: [1,1,1]; leaves c, e, g, i, j: [1].

  41. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but • Propagate # of nodes to all the parents • Start prefix sum with 1 (instead of 0). d: [1,2]; f: [1,2,3] (prefix sum); leaves c, e, g, i, j: [1].

  42. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but • Propagate # of nodes to all the parents • Start prefix sum with 1 (instead of 0). b: [1,1,3]; d: [1,2]; f: [1,2,3]; leaves c, e, g, i, j: [1].

  43. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but • Propagate # of nodes to all the parents • Start prefix sum with 1 (instead of 0). b: [1,2,4] (prefix sum); d: [1,2]; f: [1,2,3]; leaves c, e, g, i, j: [1].

  44. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but • Propagate # of nodes to all the parents • Start prefix sum with 1 (instead of 0). a: [1,5,1,2,1]; b: [1,2,4]; d: [1,2]; f: [1,2,3]; leaves c, e, g, i, j: [1].

  45. SSSP-based (for DAGs). Phase 1: Bottom-Up Traversal. Run the algorithm for Directed Trees, but • Propagate # of nodes to all the parents • Start prefix sum with 1 (instead of 0). a: [1,6,7,9,10] (prefix sum); b: [1,2,4]; d: [1,2]; f: [1,2,3]; leaves c, e, g, i, j: [1].

  46. SSSP-based (for DAGs). Assign # of nodes as the edge weight. Edge weights: a→b 1, a→c 6, a→d 7, a→j 9; b→e 1, b→f 2; d→g 1; f→i 1, f→j 2. This phase is done, next phase is about to start …

  47. SSSP-based (for DAGs). Phase 2: Top-down traversal. The path a→b→f→j has length 1+2+2=5 < 9, the weight of the direct edge a→j.

  48. SSSP-based (for DAGs). Phase 2: Top-down traversal. The path a→b→f→j has length 1+2+2=5 < 9, the weight of the direct edge a→j. Shortest Path is the DFS path.

  49. SSSP-based (for DAGs). Phase 2: This phase is done.
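
The SSSP-based variant on slides 39 to 49 can be sketched in Python as below. Treat it as an illustration under assumptions rather than the authors' implementation: the DAG (the earlier tree plus an edge a→j) is inferred from the edge weights and the "1+2+2=5 < 9" comparison, the traversal is sequential, a topological order from Kahn's algorithm stands in for the parallel level-by-level sweeps, and `sssp_based_dfs` is a hypothetical name.

```python
from itertools import accumulate

# Assumed example DAG: the tree from the earlier slides plus the edge a -> j.
DAG = {"a": ["b", "c", "d", "j"], "b": ["e", "f"], "d": ["g"], "f": ["i", "j"]}

def sssp_based_dfs(children, root):
    # Topological order (Kahn's algorithm), parents before children.
    indeg = {}
    for kids in children.values():
        for v in kids:
            indeg[v] = indeg.get(v, 0) + 1
    order, ready = [], [root]
    while ready:
        u = ready.pop()
        order.append(u)
        for v in children.get(u, []):
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    # Phase 1 (bottom-up): node counts go to all parents; prefix sum starts at 1.
    count, weight = {}, {}
    for u in reversed(order):
        kids = children.get(u, [])
        prefix = list(accumulate([1] + [count[v] for v in kids]))
        for k, v in enumerate(kids):
            weight[(u, v)] = prefix[k]   # the prefix value becomes the edge weight
        count[u] = prefix[-1]

    # Phase 2 (top-down): single-source shortest paths from the root.
    dist, parent = {root: 0}, {root: None}
    for u in order:
        for v in children.get(u, []):
            d = dist[u] + weight[(u, v)]
            if v not in dist or d < dist[v]:
                dist[v], parent[v] = d, u
    return weight, dist, parent

weight, dist, parent = sssp_based_dfs(DAG, "a")
print(weight[("a", "j")], dist["j"], parent["j"])  # 9 5 f
```

On this example the shortest path to j runs through f, which matches the DFS parent. One detail differs from the transcript: an inclusive prefix sum of b's list [1,1,3] is [1,2,5], while the slides show [1,2,4]. The edge weights are unaffected, because only the first two entries are used for b's two children.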

  50. OPTIMIZATIONS

  51. Discovery time. Phase 3a: Sorting. • The length of shortest path defines an ordering of nodes. Shortest-path lengths: a 0, b 1, e 2, f 3, i 4, j 5, c 6, d 7, g 8.

  52. Discovery time. Phase 3a: Sorting. • The length of shortest path defines an ordering of nodes • We can sort them to obtain discovery time. Shortest-path lengths: a 0, b 1, e 2, f 3, i 4, j 5, c 6, d 7, g 8.
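
The sorting step can be shown in a few lines of Python. This is a sketch: `DIST` holds the shortest-path lengths listed on slide 51, and `discovery_from_distances` is a hypothetical helper name. A likely reason for sorting rather than using the lengths directly is that in a general DAG the propagated node counts over-count shared descendants, so the lengths can have gaps; the second call uses made-up lengths to show that case.

```python
# Shortest-path lengths from slide 51.
DIST = {"a": 0, "b": 1, "c": 6, "d": 7, "e": 2, "f": 3, "g": 8, "i": 4, "j": 5}

def discovery_from_distances(dist):
    """Rank nodes by shortest-path length; the rank is the discovery time."""
    ranked = sorted(dist, key=dist.get)
    return {node: rank for rank, node in enumerate(ranked)}

print(discovery_from_distances(DIST)["c"])  # 6

# Hypothetical lengths with a gap (no node at length 4).
print(discovery_from_distances({"a": 0, "b": 1, "d": 2, "c": 3, "e": 5})["e"])  # 4
```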
