SLIDE 1

Broadcast Trees for Heterogeneous Platforms

Olivier Beaumont, Yves Robert and Loris Marchal

Laboratoire de l’Informatique du Parallélisme
École Normale Supérieure de Lyon, France
Yves.Robert@ens-lyon.fr
http://graal.ens-lyon.fr/~yrobert

December 2004

Yves Robert Broadcast Trees for Heterogeneous Platforms 1/ 34

SLIDE 2

Outline

◮ Introduction
◮ Models and Framework
◮ Platform-based Heuristics
  ◮ One-port model
  ◮ Multi-port
◮ LP-based heuristics
◮ Simulations
◮ Conclusion

SLIDE 9

Broadcasting data

◮ Key collective communication operation
◮ Start: one processor has the data
◮ End: all processors own a copy
◮ Vast literature about broadcast, MPI_Bcast
◮ Standard approach: use a spanning tree
◮ Finding the best spanning tree: NP-Complete problem (even in the telephone model)

SLIDE 14

Different broadcast problems

Broadcast large messages ⇒ pipelining strategies

◮ split the messages into slices (application level)
◮ route them concurrently, possibly using different spanning trees
◮ throughput optimization (relaxation of makespan minimization)

STA: Single Tree, Atomic message
  ◮ heuristics to minimize makespan: FNF...
STP: Single Tree, Pipelined series of messages
MTP: Multiple Tree, Pipelined series of messages
  ◮ polynomial algorithm to find optimal solution (LP formulation)
  ◮ hard to implement ⇒ concentrate on STP

SLIDE 22

Models

Network = directed graph $P = (V, E)$

[Figure: example platform with processors P0, P1, P2 and P3, and the timeline of one transfer of a message of size $L$ from P2 to P3: send 2,3 on P2, $T_{2,3}(L)$ on link $e_{2,3}$, recv 2,3 on P3. In the affine view, the link occupation is split into $\alpha_{2,3}$ and $\beta_{2,3} \cdot L$, the sender occupation into $s_{2,3}$ and $s_{2,3} \cdot L$, and the receiver occupation into $r_{2,3}$ and $r_{2,3} \cdot L$.]

◮ General case: affine model (includes latencies)
◮ Common variant: sending and receiving processors busy during whole transfer

SLIDE 30

Multi-port

◮ Banikazemi et al.: no overlap between link and processor occupation

[Figure: timeline of a transfer from P2 to P3, with send 2,3 on P2, $T_{2,3}(L)$ on link $e_{2,3}$ and recv 2,3 on P3 occupying disjoint time intervals]

⇒ methodology to instantiate parameters

SLIDE 31

Multi-port

◮ Bar-Noy et al.: occupation time of sender $P_u$ independent of target $P_v$

[Figure: timeline of a transfer from $P_u$ to $P_v$: send u on $P_u$, $T_{u,v}(L)$ on link $e_{u,v}$, recv v on $P_v$]

Not a fully multi-port model, but allows for starting a new transfer from $P_u$ without waiting for the previous one to finish

SLIDE 32

One-port

◮ Bhat et al.: same parameters for sender $P_u$, link $e_{u,v}$ and receiver $P_v$

[Figure: timeline of a transfer from $P_u$ to $P_v$; the occupations are labelled $s_{u,v}$ and $s_{u,v} \cdot L$ on $P_u$, $\alpha_{u,v}$ and $\beta_{u,v} \cdot L$ on link $e_{u,v}$, $r_{u,v}$ and $r_{u,v} \cdot L$ on $P_v$]

Two flavors:

◮ bidirectional: simultaneous send and receive transfers allowed
◮ unidirectional: only one send or receive transfer at a given time-step

SLIDE 35

Framework

◮ Platform graph $P = (V, E)$
◮ Source processor $P_{\text{source}}$
◮ Goal: broadcast a series of messages to all other nodes
◮ Transfers of successive messages are pipelined
◮ Send messages along a spanning tree
◮ Find a spanning tree with good throughput (neglect initialization and clean-up phases)

◮ Bidirectional one-port model:

[Figure: timeline of a transfer from P2 to P3: send 2,3 on P2, $T_{2,3}(L)$ on link $e_{2,3}$, recv 2,3 on P3]

◮ Multi-port model:

[Figure: timeline of a transfer from P2 to P3 with the same three rows: send 2,3 on P2, $T_{2,3}(L)$ on link $e_{2,3}$, recv 2,3 on P3]

SLIDE 44

One-port model

◮ Processors involved in one (sending or receiving) communication
◮ Duration of a transfer = f(link $e_{u,v}$):

$\mathrm{send}_{u,v}(L) = \mathrm{recv}_{u,v}(L) = T_{u,v}(L) = T_{u,v}$
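
The one-port rule above suggests a simple way to score a spanning tree. The sketch below is in Python and rests on an assumption that the slides imply through their examples but do not state as a formula: a node serializes the transfers to its children, so the period of a tree is the largest weighted out-degree (sum of $T_{u,v}$ over the children of one node) and the throughput is its inverse. The function name `tree_throughput` and both small trees are hypothetical and only serve as an illustration.

```python
from collections import defaultdict


def tree_throughput(tree_edges):
    """Steady-state throughput of a broadcast tree (one-port sketch).

    tree_edges maps each tree edge (u, v) to its transfer time T_uv.
    Assumption: a node sends to its children one after the other, so the
    period is the largest sum of T_uv over the children of a single node.
    """
    out_degree = defaultdict(float)
    for (u, _v), t_uv in tree_edges.items():
        out_degree[u] += t_uv
    period = max(out_degree.values())
    return 1.0 / period


# Hypothetical trees on nodes s, a, b, c, d (not the topology of the slides).
star = {("s", "a"): 1, ("s", "b"): 2, ("s", "c"): 2, ("s", "d"): 3}
relay = {("s", "a"): 1, ("s", "b"): 2, ("s", "c"): 2, ("a", "d"): 4}

print(tree_throughput(star))   # period 8 -> 0.125
print(tree_throughput(relay))  # period 5 -> 0.2
```

Moving one child from the source to a relay node lowers the source's weighted out-degree from 8 to 5, which is the effect the heuristics in the following slides try to obtain.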

SLIDE 46

Simple Platform Pruning

◮ Idea: delete edges of maximum weight, until we have a tree
◮ Algorithm:

SIMPLE-PLATFORM-PRUNING(P, Psource)
  TreeEdges ← all edges of E
  while |TreeEdges| > n − 1 do
    L ← edges of TreeEdges sorted by non-increasing weight $T_{u,v}$
    for each edge e ∈ L do
      if the graph (V, TreeEdges \ {e}) is still connected then
        TreeEdges ← TreeEdges \ {e}
  return (V, TreeEdges)
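
A minimal Python sketch of the pruning loop above. Two points are assumptions rather than facts from the slides: "still connected" is read as "every node is still reachable from the source along directed edges", and ties between equal weights are broken by edge name. The helper `all_reachable` and the five-node platform are hypothetical; the platform was chosen so that its edge costs (1, 2, 2, 3, 4, 6) match those of the example slides, whose actual topology is not recoverable from the text.

```python
def all_reachable(nodes, edges, source):
    """True if every node can be reached from source along directed edges."""
    successors = {u: [] for u in nodes}
    for u, v in edges:
        successors[u].append(v)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in successors[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def simple_platform_pruning(nodes, weights, source):
    """Prune the heaviest removable edges until n - 1 edges remain."""
    tree_edges = set(weights)
    target = len(nodes) - 1
    while len(tree_edges) > target:
        pruned = False
        # Non-increasing weight; ties broken by edge name (assumption).
        for e in sorted(tree_edges, key=lambda e: (weights[e], e), reverse=True):
            if all_reachable(nodes, tree_edges - {e}, source):
                tree_edges.remove(e)
                pruned = True
        if not pruned:
            raise ValueError("platform has no spanning tree rooted at the source")
    return tree_edges


# Hypothetical platform: keys are directed edges, values are costs T_uv.
nodes = ["s", "a", "b", "c", "d"]
weights = {("s", "a"): 1, ("s", "b"): 2, ("s", "c"): 2, ("s", "d"): 3,
           ("a", "d"): 4, ("a", "b"): 6}

tree = simple_platform_pruning(nodes, weights, "s")
print(sorted(tree))
# [('s', 'a'), ('s', 'b'), ('s', 'c'), ('s', 'd')]
```

On this platform the source ends up with weighted out-degree 1 + 2 + 2 + 3 = 8, which illustrates why pruning by edge weight alone can overload one sender.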

SLIDE 48

Simple Platform Pruning

◮ Example of simple pruning:

[Figure: example platform whose edges have costs $T_{u,v}$ of 1, 2, 2, 3, 4 and 6. The edge of maximum weight is chosen and pruned at each step: first the edge of cost 6, then the edge of cost 4. The remaining tree has edges of cost 1, 2, 2 and 3.]

Achievable throughput: 1/8

SLIDE 53

Refined Platform Pruning

◮ Idea:
  ◮ at each step, compute the out-degree of each node
  ◮ prune an edge from a node whose out-degree is maximum

◮ Example:

[Figure: same platform, with edge costs $T_{u,v}$ of 1, 2, 2, 3, 4 and 6. At each step the node of maximum out-degree is chosen, then its edge of maximum weight is pruned: first the edge of cost 6 (node of out-degree 10), then the edge of cost 3 (node of out-degree 8). The remaining tree has edges of cost 1, 2, 2 and 4.]

Achievable throughput: 1/5

SLIDE 58

Refined Platform Pruning

REFINED-PLATFORM-PRUNING(P, Psource)

 1: TreeEdges ← all edges of E
 2: for each u ∈ V do
 3:   OutDegree(u) ← $\sum_{v,\,(u,v) \in E} T_{u,v}$
 4: while |TreeEdges| > n − 1 do
 5:   SortedNodes ← nodes sorted by non-increasing value of OutDegree(u)
 6:   for u ∈ SortedNodes do
 7:     L ← edges sorted by decreasing weight $T_{u,v}$
 8:     for each edge e = (u, v) ∈ L do
 9:       if the graph (V, TreeEdges \ {e}) is still connected then
10:         TreeEdges ← TreeEdges \ {e}
11:         OutDegree(u) ← OutDegree(u) − $T_{u,v}$
12:         goto 4
13: return (V, TreeEdges)
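
The numbered steps above can be sketched in Python as follows. As assumptions, "still connected" is read as "every node is reachable from the source along directed edges", the `goto 4` is rendered with `break` statements, and ties are broken by name. The helper `all_reachable` and the five-node platform are hypothetical; the platform only reuses the edge costs of the example slides (1, 2, 2, 3, 4, 6), not their unknown topology.

```python
def all_reachable(nodes, edges, source):
    """True if every node can be reached from source along directed edges."""
    successors = {u: [] for u in nodes}
    for u, v in edges:
        successors[u].append(v)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in successors[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def refined_platform_pruning(nodes, weights, source):
    """Prune one edge at a time from the node of largest weighted out-degree."""
    tree_edges = set(weights)
    out_degree = {u: 0 for u in nodes}
    for (u, _v), t_uv in weights.items():
        out_degree[u] += t_uv
    while len(tree_edges) > len(nodes) - 1:
        pruned = False
        for u in sorted(nodes, key=lambda u: (out_degree[u], u), reverse=True):
            # Remaining edges leaving u, heaviest first.
            candidates = sorted((e for e in tree_edges if e[0] == u),
                                key=lambda e: (weights[e], e), reverse=True)
            for e in candidates:
                if all_reachable(nodes, tree_edges - {e}, source):
                    tree_edges.remove(e)
                    out_degree[u] -= weights[e]
                    pruned = True
                    break
            if pruned:
                break  # restart from the while test, like "goto 4"
        if not pruned:
            raise ValueError("platform has no spanning tree rooted at the source")
    return tree_edges


# Hypothetical platform: keys are directed edges, values are costs T_uv.
nodes = ["s", "a", "b", "c", "d"]
weights = {("s", "a"): 1, ("s", "b"): 2, ("s", "c"): 2, ("s", "d"): 3,
           ("a", "d"): 4, ("a", "b"): 6}

tree = refined_platform_pruning(nodes, weights, "s")
print(sorted(tree))
# [('a', 'd'), ('s', 'a'), ('s', 'b'), ('s', 'c')]
```

On this platform the first pruned edge has cost 6 (node of out-degree 10) and the second has cost 3 (node of out-degree 8), leaving a largest weighted out-degree of 5.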

SLIDE 59

Growing a Minimum Weighted Out-Degree Tree

◮ Idea: grow a tree as in Prim’s algorithm
◮ At each step, choose an edge optimizing metric
◮ Our metric:
  ◮ minimize the weighted out-degree of each node in the tree

◮ Example:

[Figure: same platform, with edge costs $T_{u,v}$ of 1, 2, 2, 3, 4 and 6. The tree is grown one edge at a time; the final tree uses the edges of cost 1, 2, 2 and 4 and leaves out the edges of cost 3 and 6.]

Achievable throughput: 1/5

SLIDE 64

Growing a Minimum Weighted Out-Degree Tree

GROWING-MINIMUM-WEIGHTED-OUT-DEGREE-TREE(P, Psource)
  TreeEdges ← ∅
  TreeVertices ← {Psource}
  for each edge e = (u, v) do
    cost(u, v) ← $T_{u,v}$
  while TreeVertices ≠ V do
    choose the link (u, v) such that u ∈ TreeVertices, v ∉ TreeVertices and (u, v) has minimum value cost(u, v)
    TreeVertices ← TreeVertices ∪ {v}
    TreeEdges ← TreeEdges ∪ {(u, v)}
    for each edge (u, w) ∉ TreeEdges do
      cost(u, w) ← cost(u, w) + cost(u, v)
  return (TreeVertices, TreeEdges)
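
A short Python sketch of the growing procedure above. It applies the cost update literally as written on the slide (add the current cost of the chosen edge to the other edges leaving the same node); breaking ties by edge name is an added assumption. The function name and the five-node platform are hypothetical; the platform only reuses the edge costs of the example slides (1, 2, 2, 3, 4, 6).

```python
def grow_min_weighted_out_degree_tree(nodes, weights, source):
    """Grow a spanning tree from the source, Prim-style, on updated costs."""
    cost = dict(weights)          # cost(u, v) starts at T_uv
    tree_vertices = {source}
    tree_edges = []
    while len(tree_vertices) < len(nodes):
        # Edges leaving the current tree towards a node not yet covered.
        frontier = [e for e in cost
                    if e[0] in tree_vertices and e[1] not in tree_vertices]
        if not frontier:
            raise ValueError("some node is unreachable from the source")
        u, v = min(frontier, key=lambda e: (cost[e], e))
        tree_vertices.add(v)
        tree_edges.append((u, v))
        # Other edges leaving u become more expensive (update rule of the slide).
        for e in cost:
            if e[0] == u and e not in tree_edges:
                cost[e] += cost[(u, v)]
    return tree_edges


# Hypothetical platform: keys are directed edges, values are costs T_uv.
nodes = ["s", "a", "b", "c", "d"]
weights = {("s", "a"): 1, ("s", "b"): 2, ("s", "c"): 2, ("s", "d"): 3,
           ("a", "d"): 4, ("a", "b"): 6}

print(grow_min_weighted_out_degree_tree(nodes, weights, "s"))
# [('s', 'a'), ('s', 'b'), ('a', 'd'), ('s', 'c')]
```

The selected edges have costs 1, 2, 4 and 2, so the largest weighted out-degree in the resulting tree is 5.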

SLIDE 65

Binomial tree heuristic

◮ For sake of comparison
◮ Close to MPI_Bcast
◮ Construct a binomial tree (without topological information)
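
For reference, one common way to index a binomial tree over ranks 0 to n − 1 is sketched below in Python: the parent of a rank is obtained by clearing its highest set bit. This is an illustration of the tree shape only; the function name is hypothetical and actual MPI implementations may number or build their trees differently.

```python
def binomial_tree_edges(n):
    """Edges (parent, child) of a binomial tree over ranks 0..n-1, rooted at 0."""
    edges = []
    for child in range(1, n):
        highest_bit = 1 << (child.bit_length() - 1)
        edges.append((child - highest_bit, child))  # clear the highest set bit
    return edges


print(binomial_tree_edges(8))
# [(0, 1), (0, 2), (1, 3), (0, 4), (1, 5), (2, 6), (3, 7)]
```

The shape depends only on the ranks, not on the link costs, which is what "without topological information" refers to.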

SLIDE 68

Multi-port

◮ Adapt the growing-tree heuristic to the multi-port model
◮ Congestion may come from:
  ◮ the number of send operations from $P_u$,
  ◮ the length of a transfer $P_u \to P_v$
◮ New computation of out-degree:

[Figure: node u with children v1, v2 and v3; timeline in which the send u operations are serialized while the link occupations $T_{u,v_1}$, $T_{u,v_2}$, $T_{u,v_3}$ and the corresponding receptions overlap]

$$T_{\text{period}} = \max\left\{ \delta_{\text{out}}(P_u) \times \mathrm{send}_u,\ \max_i \left( T_{u,v_i} \right) \right\}$$
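
The period formula above can be evaluated directly. The Python sketch below does so for a single node and, as an added assumption not stated on the slide, takes the period of a whole tree to be the maximum over its sending nodes. Function names and numeric values are hypothetical.

```python
def node_period(child_times, send_u):
    """max(number of children * send_u, longest link occupation T_{u,v_i})."""
    return max(len(child_times) * send_u, max(child_times))


def multiport_tree_period(tree_edges, send):
    """Largest node period over all sending nodes (assumption, see lead-in).

    tree_edges maps (u, v) to T_uv; send maps u to its per-message send time.
    """
    children = {}
    for (u, _v), t_uv in tree_edges.items():
        children.setdefault(u, []).append(t_uv)
    return max(node_period(times, send[u]) for u, times in children.items())


# Hypothetical node u with three children.
edges = {("u", "v1"): 4.0, ("u", "v2"): 5.0, ("u", "v3"): 6.0}

print(multiport_tree_period(edges, {"u": 1.5}))  # 6.0, the longest link dominates
print(multiport_tree_period(edges, {"u": 3.0}))  # 9.0, the serialized sends dominate
```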

SLIDE 73

Multi-port

◮ Case where throughput is bounded by the serialized $\mathrm{send}_u$
◮ Case where throughput is bounded by the longest link occupation $T_{u,v}$

[Figure: timelines of send u, the link occupations $T_{u,v_1}$, $T_{u,v_2}$, $T_{u,v_3}$ and the receptions recv v1, recv v2, recv v3 for the two cases]

SLIDE 76

LP formulation

◮ Solving the MTP problem with LP formulation:
  ◮ variables: average number of messages going through each link
  ◮ constraints: one-port model constraints, link occupation
◮ Solution of LP ⇒ network utilization to reach best throughput
◮ Complicated algorithm to reconstruct optimal set of trees for MTP, not needed here
◮ Use results output by LP, optimal solution $S_{\text{opt}}$:
  ◮ TP = optimal throughput
  ◮ $n_{u,v}$ = number of messages through edge $e_{u,v}$ in one time-unit in $S_{\text{opt}}$

SLIDE 80

LP-based heuristics

◮ Communication graph pruning:
  ◮ similar to the previous pruning heuristic
  ◮ based on the communication graph, labeled with $n_{u,v}$ values
  ◮ prune edges carrying the fewest messages in $S_{\text{opt}}$
◮ Growing a spanning tree over the communication graph:
  ◮ start from the communication graph of $S_{\text{opt}}$
  ◮ grow a tree, selecting edges with maximal number of messages
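
The second heuristic can be sketched in Python as a Prim-like growth that always takes the frontier edge carrying the most messages. The sketch assumes the $n_{u,v}$ values are already available from an LP solver (no solver is called here), ignores edges with $n_{u,v} = 0$, and uses made-up values on a hypothetical four-node platform.

```python
def grow_tree_on_communication_graph(nodes, n_msgs, source):
    """Grow a spanning tree, preferring edges with the largest n_uv."""
    tree_vertices = {source}
    tree_edges = []
    while len(tree_vertices) < len(nodes):
        # Used edges (n_uv > 0) leaving the tree towards an uncovered node.
        frontier = [e for e, n_uv in n_msgs.items()
                    if n_uv > 0 and e[0] in tree_vertices
                    and e[1] not in tree_vertices]
        if not frontier:
            raise ValueError("communication graph does not span all nodes")
        u, v = max(frontier, key=lambda e: (n_msgs[e], e))
        tree_vertices.add(v)
        tree_edges.append((u, v))
    return tree_edges


# Made-up n_uv values standing in for the output of the linear program.
nodes = ["s", "a", "b", "c"]
n_msgs = {("s", "a"): 1.0, ("s", "b"): 0.4, ("a", "b"): 0.6,
          ("a", "c"): 0.7, ("b", "c"): 0.3}

print(grow_tree_on_communication_graph(nodes, n_msgs, "s"))
# [('s', 'a'), ('a', 'c'), ('a', 'b')]
```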

SLIDE 83

Platform

Simulations using both the one-port and multi-port models

1. random generation of platforms, with parameters:
   number of nodes: 10, 20, ..., 50
   density: 0.04, 0.08, ..., 0.20
   $T_{u,v}$: Gaussian distribution (mean = 100 MB/s, deviation = 20 MB/s)
   $\mathrm{send}_{u,v}$: $0.80 \cdot \min_{w,\,(u,w) \in E} \{T_{u,w}\}$
   (for each set of parameters, 10 different configurations generated)
2. realistic platforms generated by Tiers:
   ◮ 100 platforms with 30 nodes
   ◮ 100 platforms with 65 nodes
   ◮ density between 0.05 and 0.15
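
A rough Python sketch of the random generation described in item 1. Several choices are assumptions, since the slide does not give them: density is read as the probability that each directed edge exists, nothing guarantees that every node is reachable from a source, and the Gaussian samples are used as link parameters exactly as listed. The function name and the `seed` parameter are hypothetical.

```python
import random


def random_platform(n, density, seed=0):
    """Draw a random directed platform with n nodes.

    Returns (nodes, T, send): T maps each edge (u, v) to a Gaussian sample
    (mean 100, deviation 20); send maps each node u that has outgoing edges
    to 0.80 times its smallest outgoing T value.
    """
    rng = random.Random(seed)
    nodes = list(range(n))
    T = {}
    for u in nodes:
        for v in nodes:
            if u != v and rng.random() < density:
                T[(u, v)] = rng.gauss(100.0, 20.0)
    send = {}
    for u in nodes:
        outgoing = [t for (a, _b), t in T.items() if a == u]
        if outgoing:
            send[u] = 0.80 * min(outgoing)
    return nodes, T, send


# One configuration for each node count listed on the slide, at density 0.20.
for n in (10, 20, 30, 40, 50):
    nodes, T, send = random_platform(n, 0.20, seed=n)
    print(n, len(T))
```

A real experiment would additionally reject or repair platforms in which some node cannot be reached from the source.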

SLIDE 85

Results, one-port, random platforms

◮ Performance versus number of nodes

[Figure: relative performance (Y axis, 0.1 to 1) versus number of nodes (X axis, 10 to 50) for Prune Platform Simple, Prune Platform Degree, Grow Tree, LP Grow Tree, LP Prune and Binomial Tree]

Y axis: relative average performance compared to the optimal solution for MTP

SLIDE 86

Results, one-port, random platforms

◮ Performance versus density

[Figure: relative performance (Y axis, 0.1 to 1) versus density (X axis, 0.02 to 0.2) for Prune Platform Simple, Prune Platform Degree, Grow Tree, LP Grow Tree, LP Prune and Binomial Tree]

Y axis: relative average performance compared to the optimal solution for MTP

SLIDE 87

Results, multi-port, random platforms

◮ Performance versus number of nodes

[Figure: performance (Y axis, 0.2 to 1.6) versus number of nodes (X axis, 10 to 50) for Multi Port Prune Degree, Multi Port Grow Tree, LP Grow Tree, LP Prune and Binomial Tree]

SLIDE 88

Results, one-port, realistic platforms

◮ Performance of the one-port heuristics on two types of platforms generated by Tiers

[Figure: chart for the 30-node and 65-node platforms (axis ticks from 20 to 100) comparing prune simple, refined prune, grow tree, LP grow tree, LP prune and binomial]

SLIDE 89

Analysis

◮ For the one-port model:
  ◮ small platforms: results close to the optimal
  ◮ large platforms: “advanced” heuristics within 60% of the optimal
  ◮ simple pruning heuristic: not scalable
  ◮ binomial heuristic: very poor results
◮ Under multi-port assumption:
  ◮ binomial heuristic performs slightly better
  ◮ adapted heuristic (Growing-Tree): much better results
  ◮ LP-based heuristics perform well

SLIDE 92

Conclusion

◮ Designing efficient algorithms to broadcast data
◮ Use pipelining techniques, focus on steady-state
◮ Using multiple trees (MTP): polynomial algorithm, but difficult to enforce in practice
◮ Using a single tree (STP): NP-Complete
◮ Design heuristics for STP, possibly using MTP linear program
◮ Avoid binomial approach (MPI)