On the Impact of Platform Models
EPIT 2007 Arnaud Legrand
CNRS/INRIA, LIG laboratory
June 6, 2007
- A. Legrand (CNRS) INRIA-MESCAL
On the Impact of Platform Models 1 / 108
◮ Scientific computing : large needs in computation or storage
◮ Need to use systems with “several processors”:
◮ Parallel computers with shared/distributed memory
◮ Clusters
◮ Heterogeneous clusters
◮ Clusters of clusters
◮ Network of workstations
◮ The Grid
◮ Desktop Grids
◮ When modeling platform, communications modeling seems to
◮ Two kinds of people produce communication models: those
◮ All these models are imperfect and intractable.
On the Impact of Platform Models Topology 5 / 108
On the Impact of Platform Models P2P Communication 7 / 108
◮ Communications are not “splitable” and each communication
◮ Communications are “splitable” but latency is considered to be
◮ Communications are “splitable” and latency cannot be neglected
◮ L is the network latency
◮ o is the middleware overhead (message splitting and packing,
◮ g is the gap (the minimum time between two packets commu-
◮ P is the number of processors/modules
[Figure: timeline of successive packets through the sender, its card, the network, the receiver's card and the receiver; packets are spaced by the gap g and each one crosses the network with latency L]
◮ Sending m bytes with packets of size w: $2o + L + \lceil m/w \rceil \cdot \max(o, g)$
◮ Occupation on the sender and on the receiver: $o + L + (\lceil m/w \rceil - 1) \cdot \max(o, g)$
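To see how the LogP parameters combine, here is a minimal sketch. It assumes the cost expressions $2o + L + \lceil m/w \rceil \max(o, g)$ for a transfer and $o + L + (\lceil m/w \rceil - 1)\max(o, g)$ for the occupation of an endpoint. The function names and the sample values are hypothetical.

```python
import math


def logp_send_time(m, w, L, o, g):
    """Time to deliver m bytes in packets of w bytes (assumed LogP-style cost)."""
    packets = math.ceil(m / w)
    # Overhead on both ends, one network latency, one slot per packet.
    return 2 * o + L + packets * max(o, g)


def logp_occupation(m, w, L, o, g):
    """Time during which the sender (or the receiver) is busy (assumed cost)."""
    packets = math.ceil(m / w)
    return o + L + (packets - 1) * max(o, g)


if __name__ == "__main__":
    # Hypothetical values: 10 bytes, 4-byte packets, L=5, o=1, g=2.
    print(logp_send_time(10, 4, 5, 1, 2))
    print(logp_occupation(10, 4, 5, 1, 2))
```

With these sample values the message is cut into three packets, so the gap g is paid three times in the transfer time and twice in the occupation.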
Bandwidth as a function of the message size $m$: $\frac{m}{L + m/B}$.
[Figure: bandwidth (Mbit/s) as a function of message size, from 1 B to 16 MB, for MPICH 1.2.6 without optimization and with optimization]
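The affine model can be explored with a few lines of code. This is a sketch under the assumption that a message of size m takes L + m/B; the function names and the numbers are hypothetical and are not measurements from the slides.

```python
def affine_time(m, L, B):
    """Transfer time of m units under the affine model L + m/B."""
    return L + m / B


def effective_bandwidth(m, L, B):
    """Bandwidth actually observed for a message of size m: m / (L + m/B)."""
    return m / affine_time(m, L, B)


if __name__ == "__main__":
    # Hypothetical link: latency 1.0, peak bandwidth 1000 units per time unit.
    for size in (1.0, 1000.0, 1e6, 1e9):
        print(size, effective_bandwidth(size, 1.0, 1000.0))
```

Small messages are dominated by the latency and see only a small fraction of B, while large messages approach B without ever reaching it.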
◮ Probing for m = 1b and m = 1Mb leads to bad results.
◮ The whole middleware layers should be benchmarked (theoret-
On the Impact of Platform Models Modeling Concurrency 14 / 108
◮ A given processor can communicate with as many other pro-
◮ This model is widely used by scheduling theoreticians (think
◮ Using MPI and synchronous communica-
◮ Assume now that we have threads or multi-core processors.
◮ Remember, the bounds due to the round-trip-time must not be
◮ A process can communicate with only one other process at a
◮ This model makes sense when using non-threaded versions of
r∈R ρr ATM
◮ Note that this model is a multi-port model with capacity-constraints
◮ When latencies are large, using multiple connections enables to
◮ Therefore many people enforce a sometimes artificial (but less
On the Impact of Platform Models Imperfection 22 / 108
◮ The previous sharing models are nice but you generally do not
◮ Communications use the memory bus and hence interfere with
◮ Interference between communications is sometimes... surprising.
On the Impact of Platform Models Divisible Workload 25 / 108
◮ Point to point communication model (homogeneous/heterogeneous,
◮ Concurrency impact.
◮ Model of the inner structure
◮ The model is validated by comparing the propagation time of
◮ Set of all seismic events of the year 1999: 817101
◮ Original program written for a parallel computer:
◮ The workers have different computational power.
◮ Communications from the master to the workers can be done
◮ A set $P_1, \dots, P_p$ of processors
◮ $P_1$ is the master processor: initially, it holds all the data.
◮ The overall amount of work: $W_{total}$.
◮ Processor $P_i$ receives an amount of work $\alpha_i W_{total}$, with $\sum_i \alpha_i = 1$.
◮ Time needed to send a unit-message from $P_1$ to $P_i$: $c_i$.
i
ci+wi cj+wj
◮ A set $P_1, \dots, P_p$ of processors
◮ $P_1$ is the master processor: initially, it holds all the data.
◮ The overall amount of work: $W_{total}$.
◮ Processor $P_i$ receives an amount of work $\alpha_i W_{total}$, with $\sum_i \alpha_i = 1$.
◮ Time needed to send a unit-message from $P_1$ to $P_i$: $c$.
i
1ip
i
$\alpha_2 = \frac{w_1}{c + w_2}\,\alpha_1$.
$\alpha_3 = \frac{w_2}{c + w_3}\,\alpha_2$.
$\alpha_i = \frac{w_{i-1}}{c + w_i}\,\alpha_{i-1}$ for $i \ge 2$.
$\sum_{i=1}^{p} \alpha_i = 1$.
$\alpha_1 \left( 1 + \sum_{j=2}^{p} \prod_{k=2}^{j} \frac{w_{k-1}}{c + w_k} \right) = 1$.
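The bus recurrence can be checked numerically. The sketch below assumes the recurrence $\alpha_i = \frac{w_{i-1}}{c + w_i}\alpha_{i-1}$ with $\sum_i \alpha_i = 1$, a master $P_1$ that computes without receiving anything, and $W_{total}$ normalized to 1. The function names and the sample platform are hypothetical.

```python
def bus_distribution(w, c):
    """Shares alpha_i on a bus; w[0] is the master's time per unit of work."""
    ratios = [1.0]  # alpha_i / alpha_1
    for i in range(1, len(w)):
        ratios.append(ratios[-1] * w[i - 1] / (c + w[i]))
    total = sum(ratios)
    return [r / total for r in ratios]


def finish_times(alpha, w, c):
    """Completion time of each processor when the bus serves them in order."""
    times = [alpha[0] * w[0]]  # the master starts computing immediately
    sent = 0.0                 # time at which the bus becomes free
    for i in range(1, len(w)):
        sent += alpha[i] * c
        times.append(sent + alpha[i] * w[i])
    return times


if __name__ == "__main__":
    w, c = [1.0, 2.0, 3.0], 1.0  # hypothetical platform
    alpha = bus_distribution(w, c)
    print(alpha)
    print(finish_times(alpha, w, c))
```

On this sample platform every processor finishes at the same instant, which is the property the recurrence is built on.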
$\alpha_i = \frac{1}{c + w_i}\,\frac{T}{W_{total}}$.
$\alpha_{i+1} = \frac{1}{c + w_{i+1}} \left( \frac{T}{W_{total}} - \alpha_i c \right) = \frac{w_i}{(c + w_i)(c + w_{i+1})}\,\frac{T}{W_{total}}$.
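A quick numerical check shows why the order of two consecutive workers does not matter on a bus. The sketch assumes the two expressions $\alpha_i = \frac{1}{c + w_i}\frac{T}{W_{total}}$ and $\alpha_{i+1} = \frac{1}{c + w_{i+1}}\left(\frac{T}{W_{total}} - \alpha_i c\right)$, with $W_{total}$ normalized to 1. The function name and the values are hypothetical.

```python
def pair_volume(w_first, w_second, c, T=1.0):
    """Work processed within time T by two workers served in this order."""
    a_first = T / (c + w_first)
    # The second worker only starts receiving once the first transfer is done.
    a_second = (T - a_first * c) / (c + w_second)
    return a_first + a_second


if __name__ == "__main__":
    # Hypothetical workers with w = 2 and w = 5 on a bus with c = 1.
    print(pair_volume(2.0, 5.0, 1.0))
    print(pair_volume(5.0, 2.0, 1.0))
```

Both orders process the same volume, because the sum $\alpha_i + \alpha_{i+1}$ is symmetric in $w_i$ and $w_{i+1}$.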
$\alpha_1 = \frac{1}{w_1}\,\frac{T}{W_{total}}$.
$\alpha_2 = \frac{1}{c + w_2}\,\frac{T}{W_{total}}$.
◮ Closed-form expressions for the execution time and the distri-
◮ Choice of the master.
◮ The ordering of the processors has no impact.
◮ All processors take part in the work.
◮ The workers have different computational power.
◮ A set $P_1, \dots, P_p$ of processors
◮ $P_1$ is the master processor: initially, it holds all the data.
◮ The overall amount of work: $W_{total}$.
◮ Processor $P_i$ receives an amount of work $n_i = \alpha_i W_{total}$, with $\sum_i n_i = W_{total}$, $\alpha_i W_{total} \in \mathbb{Q}$ and $\sum_i \alpha_i = 1$.
◮ Time needed to send a unit-message from $P_1$ to $P_i$: $c_i$.
◮ All
◮ All
[Figure: Gantt chart of a star schedule: the network carries $\alpha_1 c_1, \alpha_2 c_2, \dots, \alpha_p c_p$ one after the other; each $P_i$ then computes for $\alpha_i w_i$ and finishes at $T_i$; overall completion time $T_f$]
$\sum_{k=1}^{i} \beta_k c_k + \beta_i w_i \le T_f$
[Figure: feasible region in the $(\beta_1, \beta_2)$ plane, with the point $(\alpha_1, \alpha_2)$]
[Figure: two schedules (A) and (B) of workers $P_1$ and $P_2$ over a period $T$, with communication times $\alpha_i c_i$ and computation times $\alpha_i w_i$; the two schedules serve the workers in opposite orders]
◮ The processors must be ordered by decreasing bandwidths
◮ All processors are working
◮ All processors end their work at the same time
◮ Formulas for the execution time and the distribution of data
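The ordering rule can be tested by brute force on a small star. The sketch below assumes a one-port master that does not compute, linear costs $\alpha_i c_i$ and $\alpha_i w_i$, and workers that all finish at the same time $T$, so that $\alpha_1 (c_1 + w_1) = T$ and $\alpha_{i+1}(c_{i+1} + w_{i+1}) = \alpha_i w_i$. The function names and the sample workers are hypothetical.

```python
from itertools import permutations


def star_load(order, T=1.0):
    """Total work done within T when workers (c, w) are served in this order."""
    total = 0.0
    prev_alpha, prev_w = None, None
    for c, w in order:
        if prev_alpha is None:
            alpha = T / (c + w)
        else:
            # Same finish time as the previous worker.
            alpha = prev_alpha * prev_w / (c + w)
        total += alpha
        prev_alpha, prev_w = alpha, w
    return total


def best_order(workers):
    """Exhaustive search over all service orders (small platforms only)."""
    return max(permutations(workers), key=star_load)


if __name__ == "__main__":
    workers = [(2.0, 1.0), (1.0, 1.0), (3.0, 2.0)]  # hypothetical (c, w) pairs
    print(best_order(workers))
```

On this example the exhaustive search returns the workers sorted by increasing $c_i$, that is by decreasing bandwidth.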
[Figure: single-round schedule (the network carries $\alpha_1 g, \alpha_2 g, \dots, \alpha_p g$; each $P_i$ computes for $\alpha_i w_i$ and finishes at $T_i$, overall $T_f$) next to a multi-round schedule with rounds $R_0, R_1, \dots, R_k$]
◮ A linear communication model leads to an absurd solution
◮ Resource selection
◮ Number of rounds
◮ Size of each round
◮ A set $P_1, \dots, P_p$ of processors
◮ $P_1$ is the master processor: initially, it holds all the data.
◮ The overall amount of work: $W_{total}$.
◮ Processor $P_i$ receives an amount of work $n_i = \alpha_i W_{total}$, with $\sum_i n_i = W_{total}$, $\alpha_i W_{total} \in \mathbb{Q}$ and $\sum_i \alpha_i = 1$.
◮ Time needed to send a message of size $\alpha_i$ from $P_1$ to $P_i$: $L_i + \alpha_i c_i$.
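With affine costs, the makespan of a given sequence of activations can be computed by a direct simulation. The sketch below assumes a one-port master, a cost $L_i + \alpha c_i$ for each message, and workers that overlap reception with computation and process their chunks in arrival order. The function name and the sample data are hypothetical.

```python
def makespan(activations, L, c, w):
    """Completion time of a sequence of (worker, amount) activations."""
    clock = 0.0  # time at which the master's port becomes free
    ready = {}   # worker -> time at which its pending computations end
    for i, alpha in activations:
        clock += L[i] + alpha * c[i]            # one-port: transfers are serialized
        start = max(clock, ready.get(i, 0.0))   # wait for the data and for the CPU
        ready[i] = start + alpha * w[i]
    return max(ready.values(), default=0.0)


if __name__ == "__main__":
    # Hypothetical platform with two workers and three activations.
    L, c, w = [1.0, 1.0], [1.0, 2.0], [3.0, 1.0]
    print(makespan([(0, 1.0), (1, 1.0), (0, 1.0)], L, c, w))
```

Every activation pays its latency again, which is why the number of rounds becomes a real trade-off as soon as latencies are taken into account.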
1 Number of activations: $N_{act}$;
2 Whether $P_i$ is the processor used during activation $j$: $\chi_i^{(j)}$.
$\sum_{j=1}^{N_{act}} \sum_{i=1}^{p} \chi_i^{(j)} \alpha_i^{(j)} = 1$
$\forall k \le N_{act},\ \forall l:\ \sum_{j=1}^{k} \sum_{i=1}^{p} \chi_i^{(j)} \left(L_i + \alpha_i^{(j)} c_i\right) + \sum_{j=k}^{N_{act}} \chi_l^{(j)} \alpha_l^{(j)} w_l \le T$
$\forall j:\ \sum_{i=1}^{p} \chi_i^{(j)} \le 1$
$\chi_i^{(j)} \in \{0, 1\}, \quad \alpha_i^{(j)} \ge 0$
[Figure: multi-round schedule over time for workers 1 to p, showing rounds j, j+1 and j+2. In round j, worker i receives its chunk (latency $L_i$, then transfer $\alpha_i^{(j)} c_i$) and computes for $\alpha_i^{(j)} w_i$; within a round all workers compute for the same duration, $\alpha_i^{(j)} w_i = \alpha_1^{(j)} w_1$ for all $i$.]
[Figure: periodic schedule with period $T_p$: in each period the master serves workers 1 to n in turn (latency $L_i$, then transfer $\alpha_i c_i$) and each worker computes for $\alpha_i w_i$]
◮ Divide the total execution time $T$ into $k$ periods of duration $T_p$.
◮ $I \subset \{1, \dots, p\}$: participating processors.
◮ Bandwidth limitation:
◮ No overlap:
◮ $\beta_i$: average number of tasks processed by $P_i$ during one time unit
◮ Linear program:
Maximize $\sum_{i=1}^{p} \beta_i$ subject to
$\beta_i (c_i + w_i) \le 1 - \frac{L_i}{T_p}$ for all $i \in I$,
$\sum_{i \in I} \beta_i c_i \le 1 - \frac{\sum_{i \in I} L_i}{T_p}$.
Relaxed version: maximize $\sum_{i=1}^{p} x_i$ subject to
$x_i (c_i + w_i) \le 1 - \frac{\sum_{k=1}^{p} L_k}{T_p}$ for all $i$,
$\sum_{i=1}^{p} x_i c_i \le 1 - \frac{\sum_{i=1}^{p} L_i}{T_p}$.
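◮ Sort: $c_1 \le c_2 \le \dots \le c_p$.
◮ Let $q$ be the largest index such that $\sum_{i=1}^{q} \frac{c_i}{c_i + w_i} \le 1$.
◮ If $q < p$, $\varepsilon = 1 - \sum_{i=1}^{q} \frac{c_i}{c_i + w_i}$.
◮ Optimal solution to relaxed program: $x_i = \frac{1}{c_i + w_i}\left(1 - \frac{\sum_{k=1}^{p} L_k}{T_p}\right)$ for $i \le q$; if $q < p$, $x_{q+1} = \frac{\varepsilon}{c_{q+1}}\left(1 - \frac{\sum_{k=1}^{p} L_k}{T_p}\right)$; $x_i = 0$ for $i > q + 1$.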
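The selection rule for the relaxed program is easy to turn into code. The sketch below assumes that the constraints are $x_i (c_i + w_i) \le \rho$ and $\sum_i x_i c_i \le \rho$, where $\rho = 1 - \sum_k L_k / T_p$ is passed in through a hypothetical `latency_share` argument. The function name and the sample workers are hypothetical as well.

```python
def select_resources(workers, latency_share):
    """Per-time-unit loads x_i for workers given as (c, w) pairs.

    latency_share stands for sum(L_k) / T_p and is assumed to be < 1.
    """
    budget = 1.0 - latency_share
    order = sorted(range(len(workers)), key=lambda i: workers[i][0])
    x = [0.0] * len(workers)
    used = 0.0  # sum of c_i / (c_i + w_i) over fully used workers
    for i in order:
        c, w = workers[i]
        ratio = c / (c + w)
        if used + ratio <= 1.0:
            x[i] = budget / (c + w)           # worker i is kept busy all the time
            used += ratio
        else:
            x[i] = budget * (1.0 - used) / c  # leftover bandwidth: epsilon / c
            break
    return x


if __name__ == "__main__":
    workers = [(1.0, 1.0), (1.0, 3.0), (2.0, 2.0)]  # hypothetical (c, w) pairs
    print(select_resources(workers, 0.0))
```

Workers are taken by increasing $c_i$ until the master's port is saturated; the last selected worker only receives the leftover bandwidth and the remaining ones stay idle.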
◮ Let $T_p = \sqrt{T^*_{max}}$ and $\alpha_i = x_i T_p$ for all $i$.
◮ Then $T \le T^*_{max} + O\left(\sqrt{T^*_{max}}\right)$.
◮ Closed-form expressions for resource selection and task assignment.
◮ Still sort resources according to the $c_i$.
◮ Greedily select resources until the sum of the ratios $\frac{c_i}{w_i}$ (instead of $\frac{c_i}{c_i + w_i}$) exceeds 1.
◮ NP-hardness comes from the one-port model and latencies.
◮ The problem is however rather easy to approximate.
◮ Communications are much more important than computations.
On the Impact of Platform Models Iterative Algorithms 66 / 108
◮ Heterogeneity of processors (computational power, memory,
◮ Heterogeneity of communication links.
◮ Irregularity of interconnection network.
◮ A set of data (typically, a matrix)
◮ Structure of the algorithms:
◮ Which processors should be used?
◮ What amount of data should we give them?
◮ How do we cut the set of data?
◮ Data: a 2-D array
◮ Unidimensional cutting into vertical slices
◮ Consequences:
◮ Processors: $P_1, \dots, P_p$
◮ Processor $P_i$ executes a unit task in a time $w_i$
◮ Overall amount of work $D_w$; $P_i$ processes a share $\alpha_i D_w$ ($\sum_j \alpha_j = 1$)
◮ Cost of a unit-size communication from $P_i$ to $P_j$: $c_{i,j}$
◮ Cost of a sending from $P_i$ to its successor in the ring: $D_c\, c_{i,succ(i)}$
◮ send at most one message at any time;
◮ receive at most one message at any time;
◮ send and receive a message simultaneously.
1 Select q processors among p
2 Order them into a ring
3 Distribute the data among them
$\max_{1 \le i \le p} \mathbb{I}\{i\} \left[ \alpha_i D_w w_i + D_c \left( c_{i,pred(i)} + c_{i,succ(i)} \right) \right]$
1 There exists a communication link between any two processors
2 All links have the same capacity
◮ Either the most powerful processor performs all the work, or all the processors participate.
◮ If all processors participate, all end their share of work simultaneously ($\exists \tau,\ \alpha_i D_w w_i = \tau$, so $1 = \sum_i \frac{\tau}{D_w w_i}$)
◮ Time of the optimal solution: $\min \left\{ D_w w_{min},\ D_w \frac{1}{\sum_i \frac{1}{w_i}} + 2 D_c c \right\}$
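For the complete homogeneous network the two candidate solutions can be compared directly. The sketch below assumes that every link has the same unit cost `c`, so that a participating processor pays $2 D_c c$ per step, and that the alternative is to let the fastest processor work alone without communicating. The function names and the numbers are hypothetical.

```python
def ring_shares(w):
    """Shares alpha_i that make all processors compute for the same time."""
    inv = [1.0 / wi for wi in w]
    total = sum(inv)
    return [v / total for v in inv]


def homogeneous_ring_time(w, Dw, Dc, c):
    """Best of 'fastest processor alone' and 'everybody participates'."""
    alone = Dw * min(w)
    together = Dw / sum(1.0 / wi for wi in w) + 2 * Dc * c
    return min(alone, together)


if __name__ == "__main__":
    w = [1.0, 2.0, 2.0]  # hypothetical cycle times
    print(ring_shares(w))
    print(homogeneous_ring_time(w, Dw=100.0, Dc=1.0, c=1.0))
    print(homogeneous_ring_time(w, Dw=100.0, Dc=100.0, c=1.0))
```

When messages are cheap the whole ring is used; when they are expensive the fastest processor working alone wins.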
1 There exists a communication link between any two processors
[Figure: Gantt chart over time for processors $P_1$ to $P_5$ arranged in a ring: each $P_i$ computes for $\alpha_i D_w w_i$ and exchanges messages of cost $D_c c_{i,pred(i)}$ and $D_c c_{i,succ(i)}$ with its two neighbors]
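◮ All processors end simultaneously: $\alpha_i D_w w_i + D_c \left( c_{i,pred(i)} + c_{i,succ(i)} \right) = T$ for all $i$.
◮ $\sum_{i=1}^{p} \alpha_i = 1$, hence $\sum_{i=1}^{p} \frac{T - D_c \left( c_{i,pred(i)} + c_{i,succ(i)} \right)}{D_w w_i} = 1$, that is $\frac{T}{D_w w_{cumul}} = 1 + \frac{D_c}{D_w} \sum_{i=1}^{p} \frac{c_{i,pred(i)} + c_{i,succ(i)}}{w_i}$ with $w_{cumul} = \frac{1}{\sum_i \frac{1}{w_i}}$.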
$T$ is minimal when $\sum_{i=1}^{p} \frac{c_{i,pred(i)} + c_{i,succ(i)}}{w_i}$ is minimal.
Weight of the edge between $P_i$ and $P_j$: $d_{i,j} = \frac{c_{i,j}}{w_i} + \frac{c_{j,i}}{w_j}$.
Minimize $\sum_{i=1}^{p} \sum_{j=1}^{p} d_{i,j}\, x_{i,j}$, subject to
$\sum_{j=1}^{p} x_{i,j} = 1$ for all $i$,
$\sum_{i=1}^{p} x_{i,j} = 1$ for all $j$.
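For a handful of processors the best ring can be found by enumeration. The sketch below assumes edge weights $d_{i,j} = c_{i,j}/w_i + c_{j,i}/w_j$ and searches all rings that go through every processor; it is a brute-force illustration, not the integer program itself. The function names and the sample platform are hypothetical.

```python
from itertools import permutations


def edge_weight(c, w, i, j):
    """Weight of the ring edge between processors i and j."""
    return c[i][j] / w[i] + c[j][i] / w[j]


def best_ring(c, w):
    """Return (cost, ring) for the cheapest ring through all processors."""
    p = len(w)
    best = None
    for rest in permutations(range(1, p)):
        ring = (0,) + rest  # processor 0 is fixed to skip rotations
        cost = sum(edge_weight(c, w, ring[k], ring[(k + 1) % p]) for k in range(p))
        if best is None or cost < best[0]:
            best = (cost, ring)
    return best


if __name__ == "__main__":
    # Hypothetical platform: cheap links along 0-1-2-3-0, expensive diagonals.
    c = [[0, 1, 5, 1],
         [1, 0, 1, 5],
         [5, 1, 0, 1],
         [1, 5, 1, 0]]
    w = [1.0, 1.0, 1.0, 1.0]
    print(best_ring(c, w))
```

The enumeration visits (p-1)! rings, which is consistent with exhaustive search being practical only for about a dozen processors.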
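$\sum_{i=1}^{p} x_{i,j} \le 1$
$\sum_{i=1}^{p} \sum_{j=1}^{p} x_{i,j} = q$
$\sum_{i=1}^{p} x_{i,j} = \sum_{i=1}^{p} x_{j,i}$
$\sum_{i=1}^{p} \alpha_i = 1$
$\alpha_i \le \sum_{j=1}^{p} x_{i,j}$
$\alpha_i w_i + \frac{D_c}{D_w} \sum_{j=1}^{p} \left( x_{i,j} c_{i,j} + x_{j,i} c_{j,i} \right) \le T$
$\sum_{i=1}^{p} y_i = 1$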
◮ Problems with rational variables: can be solved in polynomial
◮ Problems with integer variables: solved in exponential time in
◮ No relaxation in rationals seems possible here. . .
1 Exhaustive search: feasible up to a dozen processors...
2 Greedy heuristic: initially we take the best pair of processors;
◮ A set of communication links: $e_1, \dots, e_n$
◮ Bandwidth of link $e_m$: $b_{e_m}$
◮ There is a path $S_i$ from $P_i$ to $P_{succ(i)}$ in the network
◮ $S_i$ uses a fraction $s_{i,m}$ of the bandwidth $b_{e_m}$ of link $e_m$
◮ $P_i$ needs a time $D_c \cdot \frac{1}{\min_{e_m \in S_i} s_{i,m}}$ to send a message to its successor
◮ Constraints on the bandwidth of $e_m$:
◮ Symmetrically, there is a path $\mathcal{P}_i$ from $P_i$ to $P_{pred(i)}$ in the network
◮ 7 processors and 8 bidirectional communication links
◮ We choose a ring of 5 processors:
From $P_1$: to $P_2$, $S_1 = \{a, b\}$ and to $P_5$, $\mathcal{P}_1 = \{h\}$
From $P_2$: to $P_3$, $S_2 = \{c, d\}$ and to $P_1$, $\mathcal{P}_2 = \{b, g, h\}$
From $P_3$: to $P_4$, $S_3 = \{d, e\}$ and to $P_2$, $\mathcal{P}_3 = \{d, e, f\}$
From $P_4$: to $P_5$, $S_4 = \{f, b, g\}$ and to $P_3$, $\mathcal{P}_4 = \{e, d\}$
From $P_5$: to $P_1$, $S_5 = \{h\}$ and to $P_4$, $\mathcal{P}_5 = \{g, b, f\}$
$c_{1,2} = \frac{1}{\min(s_{1,a}, s_{1,b})}$.
$c_{1,5} = \frac{1}{p_{1,h}}$.
Link a: $s_{1,a} \le b_a$
Link b: $s_{1,b} + s_{4,b} + p_{2,b} + p_{5,b} \le b_b$
Link c: $s_{2,c} \le b_c$
Link d: $s_{2,d} + s_{3,d} + p_{3,d} + p_{4,d} \le b_d$
Link e: $s_{3,e} + p_{3,e} + p_{4,e} \le b_e$
Link f: $s_{4,f} + p_{3,f} + p_{5,f} \le b_f$
Link g: $s_{4,g} + p_{2,g} + p_{5,g} \le b_g$
Link h: $s_{5,h} + p_{1,h} + p_{2,h} \le b_h$
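The per-link constraints amount to summing, on every link, the bandwidth fractions of all the paths that cross it. The sketch below assumes that each path is described by a dictionary mapping a link name to the bandwidth it uses on that link, and that the cost of a path is the inverse of its smallest share. The function names, the data layout and the numbers are hypothetical.

```python
def link_loads(routes):
    """Total bandwidth requested on each link by all the paths."""
    loads = {}
    for shares in routes.values():
        for link, share in shares.items():
            loads[link] = loads.get(link, 0.0) + share
    return loads


def feasible(routes, capacity):
    """True if no link is asked for more than its bandwidth."""
    return all(load <= capacity[link] for link, load in link_loads(routes).items())


def comm_cost(shares):
    """Unit communication cost of a path: limited by its smallest share."""
    return 1.0 / min(shares.values())


if __name__ == "__main__":
    # Hypothetical excerpt: path S1 uses links a and b, paths P1 and S5 share link h.
    routes = {"S1": {"a": 2.0, "b": 1.0}, "P1": {"h": 4.0}, "S5": {"h": 1.0}}
    capacity = {"a": 2.0, "b": 3.0, "h": 5.0}
    print(link_loads(routes))
    print(feasible(routes, capacity))
    print(comm_cost(routes["S1"]))
```

Because the shares are unknowns that also appear inverted in the communication costs, optimizing them together with the $\alpha_i$ is no longer a linear problem.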
$\sum_{i=1}^{5} \alpha_i = 1$
1 The processors are selected;
2 The processors are ordered into a ring;
3 The communication paths between the processors are known.
◮ Complete graph: closed-form expression;
◮ General graph: quadratic system.
1 Initially: best pair of processors
2 For each processor Pk (not already included in the ring)
  ◮ For each pair (Pi, Pj) of neighbors in the ring
3 We keep the best solution found at step 2 and we start again
◮ No guarantee, neither theoretical, nor practical
◮ Simple solution:
[Figure: platform topology: hosts moby, canaria, mryi0, popc0, sci0 to sci6, myri1 and myri2, connected through hubs, a switch, a router (routlhpc) and a backbone]
P0 P1 P2 P3 P4 P5 P6 P7 P8
0.0206 0.0206 0.0206 0.0206 0.0291 0.0206 0.0087 0.0206 0.0206
P9 P10 P11 P12 P13 P14 P15 P16
0.0206 0.0206 0.0206 0.0291 0.0451
◮ Processors have different characteristics
◮ Communication links have different characteristics
◮ There is an irregular interconnection network with complex
On the Impact of Platform Models Data Redistribution 100 / 108
[Figure: redistribution example with links of 100 Mb/s, 100 Mb/s and 200 Mb/s and messages of 200 Mb, 100 Mb and 100 Mb]
◮ $b_1$ is the bandwidth of the sending cluster
◮ $b_2$ is the bandwidth of the receiving cluster
◮ $b_b$ is the bandwidth of the backbone
◮ $\beta$ is the latency of communications
◮ The redistribution is modeled by a bipartite graph G =
◮ $\beta$ is the latency of communications
◮ The redistribution is modeled by a bipartite graph G =
◮ At most k simultaneous communications can be done.
e∈E wl(e) + β
h
h
e∈E wl(e)
◮ The trade-off between the number of steps and the latency.
◮ We look for bounded-size matchings.
◮ KPBS is strongly NP-hard.
◮ PBS cannot be approximated with a ratio smaller than $\frac{7}{6}$.
◮ PBS can be approximated with a ratio $2 - \frac{1}{\beta + 1}$.
◮ KPBS can be approximated with a ratio $\frac{8}{3}$.
◮ The k bound is somehow artificial but is due to the 1-port
◮ By getting rid of the latencies, you get a polynomial fractional
◮ With a few “standard tricks” you can even introduce release
◮ However, taking the whole topology into account is more tricky.
◮ Indeed, under a bounded multiport model, the problem is trivial.
◮ However, if you want to keep the 1-port constraint, you need