6th SOS Workshop on Distributed Supercomputing: Data Intensive Computing March 4-6, 2002, Badehotel Bristol, Leukerbad, Valais, Switzerland
Network Topology-aware Traffic Scheduling
Emin Gabrielyan
École Polytechnique Fédérale de Lausanne
[Figure: network of links l1 to l12 connecting nodes T1 to T5 with nodes R1 to R5]
25-transfer data exchange
[Figure: five panels, each showing nodes T1 to T5 and R1 to R5]
Round-robin schedule
Round-robin Throughput
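$T_{roundrobin} = \frac{25}{7} \cdot 100\ \mathrm{MB/s} = 357\ \mathrm{MB/s}$
where $T_{roundrobin}$ is the total throughput, 25 is the number of transfers, 7 is the number of timeframes (1, 2, 3.1, 3.2, 4.1, 4.2, 5), $25/7$ is the mean number of connections per timeframe, and 100 MB/s is the connection throughput.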
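The count of 7 timeframes can be reproduced with a small simulation. The sketch below is illustrative only: the routing rule in `route` is inferred from the 25-transfer list in these slides, while the round-robin order (in step k, T_i sends to R_((i-1+k) mod 5 + 1)) and the greedy packing of congesting transfers into timeframes are assumptions, not details given by the slides.

```python
LINK_MBPS = 100  # throughput of a single link, as used in the slides

def route(i, j):
    """Links used by the transfer T_i -> R_j (rule inferred from the transfer list)."""
    links = {f"l{i}", f"l{5 + j}"}
    if i <= 3 and j >= 4:
        links.add("l12")   # T1..T3 reach R4, R5 over l12
    elif i >= 4 and j <= 3:
        links.add("l11")   # T4, T5 reach R1..R3 over l11
    return frozenset(links)

def timeframes_for_step(transfers):
    """Greedily pack the transfers of one step into congestion-free timeframes."""
    frames = []  # each frame holds transfers that share no link
    for t in transfers:
        for frame in frames:
            if all(t.isdisjoint(u) for u in frame):
                frame.append(t)
                break
        else:
            frames.append([t])
    return frames

total_frames = 0
for k in range(5):  # assumed round-robin order: T_i -> R_((i-1+k) mod 5 + 1)
    step = [route(i, (i - 1 + k) % 5 + 1) for i in range(1, 6)]
    total_frames += len(timeframes_for_step(step))

throughput = 25 / total_frames * LINK_MBPS
print(total_frames, round(throughput))  # 7 357
```

Under these assumptions, the third and fourth round-robin steps each place two transfers on l11 and two on l12, so each splits into two timeframes.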
Liquid Schedule
[Figure: the transfers scheduled in six steps, step 1 to step 6]
$T_{liquid} = \frac{25}{6} \cdot 100\ \mathrm{MB/s} = 417\ \mathrm{MB/s}$
where $25/6$ is the mean number of connections per step.
The 25 transfer traffic
[Figure: nodes T1 to T5 and R1 to R5 joined by links l1 to l12. Links l1 to l10 each carry 5 transfers; the bottlenecks l11 and l12 each carry 6 transfers.]
Load of Links and Transfers
Each transfer is the set of links it uses, for example {l1, l6} or {l1, l12, l9}.
Transfers:
X = { {l1, l6}, {l1, l7}, {l1, l8}, {l1, l12, l9}, {l1, l12, l10}, {l2, l6}, {l2, l7}, {l2, l8}, {l2, l12, l9}, {l2, l12, l10}, {l3, l6}, {l3, l7}, {l3, l8}, {l3, l12, l9}, {l3, l12, l10}, {l4, l11, l6}, {l4, l11, l7}, {l4, l11, l8}, {l4, l9}, {l4, l10}, {l5, l11, l6}, {l5, l11, l7}, {l5, l11, l8}, {l5, l9}, {l5, l10} }
The load $\lambda(l, X)$ of a link $l$ is the number of transfers of X that use it:
$\lambda(l_1, X) = 5$, $\lambda(l_2, X) = 5$, ..., $\lambda(l_{11}, X) = 6$, $\lambda(l_{12}, X) = 6$
The duration of the traffic is the load of its bottlenecks: $\Lambda(X) = 6$
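The link loads can be checked by counting. This is a minimal sketch: `build_traffic` is a hypothetical helper that regenerates the 25 transfers from a routing rule inferred from the list above (T_i uses l_i, R_j uses l_(5+j), and the two inter-switch directions use l12 and l11).

```python
from collections import Counter

def build_traffic():
    """Rebuild the 25 transfers T_i -> R_j as sets of link names."""
    traffic = []
    for i in range(1, 6):          # sender T_i uses link l_i
        for j in range(1, 6):      # receiver R_j uses link l_(5+j)
            links = {f"l{i}", f"l{5 + j}"}
            if i <= 3 and j >= 4:
                links.add("l12")   # T1..T3 reach R4, R5 over l12
            elif i >= 4 and j <= 3:
                links.add("l11")   # T4, T5 reach R1..R3 over l11
            traffic.append(frozenset(links))
    return traffic

X = build_traffic()
load = Counter(link for transfer in X for link in transfer)  # lambda(l, X)
duration = max(load.values())                                # Lambda(X)
bottlenecks = sorted(l for l, n in load.items() if n == duration)

print(load["l1"], load["l2"], load["l11"], load["l12"])  # 5 5 6 6
print(duration, bottlenecks)                             # 6 ['l11', 'l12']
```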
Duration of the Traffic
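Liquid Throughput
$T_{liquid} = \frac{\#(X)}{\Lambda(X)} \cdot T_{link} = \frac{25}{6} \cdot 100\ \mathrm{MB/s} = 417\ \mathrm{MB/s}$
where $\#(X)$ is the total number of transfers, $\Lambda(X)$ is the duration of the traffic (the load of its bottlenecks), and $T_{link}$ is the throughput of a single link.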
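[Figure: three nodes T and three nodes R connected by links l1 to l9]
X = { {l1, l7, l8, l6}, {l2, l8, l9, l4}, {l3, l9, l7, l5} }
$\#(X) = 3$, $\Lambda(X) = 2$
$T_{liquid} = \frac{\#(X)}{\Lambda(X)} \cdot T_{link} = \frac{3}{2} \cdot 100\ \mathrm{MB/s} = 150\ \mathrm{MB/s}$
No liquid schedule: any two of the three transfers share a link, so they must run in three separate steps rather than in $\Lambda(X) = 2$ steps.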
The 5 transfers block the access to bottlenecks.
Swiss-T1 Cluster
[Figure: Swiss-T1 cluster topology with nodes N00 to N31, processors PR00 to PR63 and numbered switches. Legend: sending processor, receiving processor, node, switch, network link, routing information.]
363 Test Traffics
[Figure: liquid throughput (MB/s) versus number of contributing nodes (4 to 32), with upper bound and lower bound]
363-Topology Test-bed
[Figure: aggregate throughput (MB/s) across the test-bed topologies. Series: crossbar throughput, liquid throughput.]
Round-robin throughput
[Figure: T1 Cluster, throughput (MB/s) versus numbers of nodes for the 363 sub-topologies. Series: measured round-robin throughput, liquid throughput.]
α = {
  { {l1, l12, l9}, {l2, l7}, {l3, l8}, {l4, l11, l6}, {l5, l10} },
  { {l1, l12, l10}, {l2, l6}, {l4, l11, l7}, {l5, l9} },
  { {l1, l8}, {l2, l12, l9}, {l3, l6}, {l4, l10}, {l5, l11, l7} },
  { {l1, l7}, {l2, l8}, {l3, l12, l9}, {l5, l11, l6} },
  { {l1, l6}, {l2, l12, l10}, {l3, l7}, {l4, l11, l8} },
  { {l3, l12, l10}, {l4, l9}, {l5, l11, l8} }
}
$\text{schedule } \alpha \text{ is liquid} \Leftrightarrow \#(\alpha) = \Lambda(X) \Leftrightarrow \forall (A \in \alpha)\ A \text{ is a team of } X$
where $\#(\alpha)$ is the number of steps and $\Lambda(X)$ is the load of the bottlenecks.
Team: set of non-congesting transfers using all bottlenecks
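Both sides of the equivalence can be checked mechanically for the schedule α. The following is a sketch only: the helper names `T` and `is_team` are introduced for illustration, and the traffic X is taken to be the union of the steps of α.

```python
from collections import Counter

def T(*numbers):
    """A transfer written as the set of its link names, e.g. T(1, 12, 9)."""
    return frozenset(f"l{n}" for n in numbers)

alpha = [
    [T(1, 12, 9), T(2, 7), T(3, 8), T(4, 11, 6), T(5, 10)],
    [T(1, 12, 10), T(2, 6), T(4, 11, 7), T(5, 9)],
    [T(1, 8), T(2, 12, 9), T(3, 6), T(4, 10), T(5, 11, 7)],
    [T(1, 7), T(2, 8), T(3, 12, 9), T(5, 11, 6)],
    [T(1, 6), T(2, 12, 10), T(3, 7), T(4, 11, 8)],
    [T(3, 12, 10), T(4, 9), T(5, 11, 8)],
]
X = [t for step in alpha for t in step]       # the traffic carried by alpha

load = Counter(link for t in X for link in t)
duration = max(load.values())                 # Lambda(X)
bottlenecks = {l for l, n in load.items() if n == duration}

def is_team(step):
    """True if the transfers share no link and together use every bottleneck."""
    used = [link for t in step for link in t]
    no_congestion = len(used) == len(set(used))
    return no_congestion and bottlenecks <= set(used)

print(len(X), len(set(X)))                # 25 25
print(len(alpha) == duration)             # True
print(all(is_team(s) for s in alpha))     # True
```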
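Traffic without a team
[Figure: network with links l1 to l9]
X = { {l1, l7, l8, l6}, {l2, l8, l9, l4}, {l3, l9, l7, l5} }
The bottlenecks are l7, l8 and l9. Each transfer uses only two of them and any two transfers congest, so no set of transfers forms a team.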
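A brute-force check over every non-empty subset of this traffic confirms that none is a team. This is an illustrative sketch; `is_team` is a hypothetical helper that applies the team definition (non-congesting transfers using all bottlenecks).

```python
from collections import Counter
from itertools import combinations

X = [frozenset(t) for t in (
    {"l1", "l7", "l8", "l6"},
    {"l2", "l8", "l9", "l4"},
    {"l3", "l9", "l7", "l5"},
)]

load = Counter(link for t in X for link in t)
duration = max(load.values())                          # Lambda(X) = 2
bottlenecks = {l for l, n in load.items() if n == duration}

def is_team(transfers):
    """True if the transfers share no link and together use every bottleneck."""
    used = [link for t in transfers for link in t]
    return len(used) == len(set(used)) and bottlenecks <= set(used)

# Try every non-empty subset of the three transfers.
teams = [c for r in range(1, len(X) + 1)
         for c in combinations(X, r) if is_team(c)]

print(sorted(bottlenecks))  # ['l7', 'l8', 'l9']
print(teams)                # []
```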
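Liquid schedule search tree
$X \rightarrow \mathrm{Choice}(X) = \{A_1, A_2, A_3, \ldots, A_n\}$
$X_1 = X - A_1 \rightarrow \mathrm{Choice}(X_1) = \{A_{1,1}, A_{1,2}, \ldots\}$
    $X_{1,1} = X_1 - A_{1,1}$
    $X_{1,2} = X_1 - A_{1,2}$
    ...
$X_2 = X - A_2 \rightarrow \mathrm{Choice}(X_2) = \{A_{2,1}, A_{2,2}, \ldots\}$
    $X_{2,1} = X_2 - A_{2,1}$
    $X_{2,2} = X_2 - A_{2,2}$
    ...
$X_3 = X - A_3 \rightarrow \mathrm{Choice}(X_3) = \{A_{3,1}, A_{3,2}, \ldots\}$
    $X_{3,1} = X_3 - A_{3,1}$
    ...
$\mathrm{Choice}(X_{i_1, i_2, \ldots, i_n}) = \{A \in \Im(X) \mid A \subset X_{i_1, i_2, \ldots, i_n}\}$
where Choice gives the possible steps to the next layer and $\Im(X)$ is the set of all possible teams of X.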
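The tree can be walked depth-first, backtracking whenever a reduced traffic is non-empty but contains no team. The sketch below is not the author's implementation. It rebuilds the 25-transfer traffic from a routing rule inferred from the transfer list, enumerates the teams of X once, and tries larger teams first, an ordering heuristic assumed here for illustration. All function names are hypothetical.

```python
from collections import Counter

def build_traffic():
    """Rebuild the 25 transfers T_i -> R_j as sets of link names."""
    traffic = []
    for i in range(1, 6):
        for j in range(1, 6):
            links = {f"l{i}", f"l{5 + j}"}
            if i <= 3 and j >= 4:
                links.add("l12")
            elif i >= 4 and j <= 3:
                links.add("l11")
            traffic.append(frozenset(links))
    return traffic

def all_teams(X):
    """Enumerate the teams of X: non-congesting subsets using every bottleneck."""
    load = Counter(link for t in X for link in t)
    peak = max(load.values())
    bottlenecks = frozenset(l for l, n in load.items() if n == peak)
    found = []

    def extend(start, chosen, used):
        if bottlenecks <= used:
            found.append(frozenset(chosen))
        for k in range(start, len(X)):
            if X[k].isdisjoint(used):          # no shared link, no congestion
                extend(k + 1, chosen + [X[k]], used | X[k])

    extend(0, [], frozenset())
    found.sort(key=len, reverse=True)          # heuristic: large teams first
    return found

def search(Y, teams_of_X):
    """Depth-first walk of the search tree; returns a list of steps or None."""
    if not Y:
        return []
    for A in teams_of_X:                       # Choice(Y): teams of X inside Y
        if A <= Y:
            rest = search(Y - A, teams_of_X)
            if rest is not None:
                return [A] + rest
    return None                                # dead end: backtrack

X = build_traffic()
schedule = search(frozenset(X), all_teams(X))
print(len(schedule), [len(step) for step in schedule])
```

Every team uses both bottlenecks l11 and l12, so any schedule this search returns has exactly as many steps as the bottleneck load. The printed values are the number of steps and the size of each step.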
Additional bottlenecks
$X$ (25 transfers): $\Lambda(X) = 6$, 2 bottlenecks
$X_1 = X - A_1$ (20 transfers): $\Lambda(X_1) = 5$, 2 bottlenecks
$X_{1,1} = X_1 - A_{1,1}$ (16 transfers): $\Lambda(X_{1,1}) = 4$, 4 bottlenecks
$X_{1,1,1}$: $\Lambda(X_{1,1,1}) = 3$
In the 16-transfer traffic, links other than the two original bottlenecks also have a load of 4.
[Figure: search path through the reduced traffics; further labels in the figure read 4 bottlenecks, 6 bottlenecks and 8 bottlenecks]
Prediction of Dead-ends
Liquid schedule search optimization
Y: reduced traffic
$\mathrm{Choice}(Y) = \{A \in \Im(X) \mid A \subset Y\}$: the original traffic's teams formed from the reduced traffic
$\Im(Y) \subset \{A \in \Im(X) \mid A \subset Y\}$: teams of the reduced traffic
$\mathrm{Choice}(Y) = \Im(Y)$: decrease of the search space without affecting the solution space
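The effect of restricting the choice to teams of the reduced traffic can be seen by counting both sets for one reduced traffic. In this sketch, Y is obtained by removing a 5-transfer team and then a 4-transfer team from the 25-transfer traffic. Taking these to be the first two steps listed in the schedule α is an assumption made for illustration, and the helper names are hypothetical.

```python
from collections import Counter

def links(*numbers):
    """A transfer written as the set of its link names."""
    return frozenset(f"l{n}" for n in numbers)

def bottlenecks(traffic):
    """Links whose load equals the maximum load of the traffic."""
    load = Counter(link for t in traffic for link in t)
    peak = max(load.values())
    return frozenset(l for l, n in load.items() if n == peak)

def teams(traffic, must_use):
    """Non-congesting subsets of `traffic` whose links cover `must_use`."""
    found = []

    def extend(start, chosen, used):
        if must_use <= used:
            found.append(frozenset(chosen))
        for k in range(start, len(traffic)):
            if traffic[k].isdisjoint(used):
                extend(k + 1, chosen + [traffic[k]], used | traffic[k])

    extend(0, [], frozenset())
    return found

# The 25-transfer traffic, rebuilt from the routing rule inferred from the list.
X = [links(i, 5 + j, *([12] if i <= 3 and j >= 4 else
                       [11] if i >= 4 and j <= 3 else []))
     for i in range(1, 6) for j in range(1, 6)]

# Assumed A1 and A1,1: the first two steps of the schedule alpha.
A1 = {links(1, 12, 9), links(2, 7), links(3, 8), links(4, 11, 6), links(5, 10)}
A11 = {links(1, 12, 10), links(2, 6), links(4, 11, 7), links(5, 9)}
Y = [t for t in X if t not in A1 | A11]        # reduced traffic, 16 transfers

candidates = teams(Y, bottlenecks(X))  # teams of X formed from Y
reduced = teams(Y, bottlenecks(Y))     # teams of Y, which has more bottlenecks

print(len(Y), sorted(bottlenecks(Y)))  # 16 ['l11', 'l12', 'l3', 'l8']
print(len(reduced), len(candidates))
```

The second line printed gives the sizes of the two choice sets. Teams of Y must also cover the additional bottlenecks l3 and l8, so the first number is the smaller one.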
[Figure: liquid schedule search tree, shown as far as Choice(X1) and Choice(X2)]
$\Im_{full}(Y) \subset \Im(Y)$, where $\Im_{full}(Y)$ is the set of full teams of the reduced traffic
Replacing $\mathrm{Choice}(Y) = \Im(Y)$ with $\mathrm{Choice}(Y) = \Im_{full}(Y)$: decrease of the search space without affecting the solution space
Liquid schedules construction
- For more than 90% of the test-bed topologies, the search for liquid schedules took less than 0.1 s on a single 500 MHz processor.
- For 8 topologies out of 363, a solution was not found within 24 hours.
[Figure: all-to-all throughput (MB/s) versus number of nodes for the 363 sub-topologies]
Results
Conclusion
- Data exchanges relying on liquid schedules may be carried out several times faster than with topology-unaware schedules.
- Our method may be applied to applications requiring high network efficiency, such as video or voice traffic management, and high energy physics data acquisition and event assembling.
- At present we consider only a static routing scheme. Dynamic routing could possibly also be incorporated into the algorithms.
- Fixed packet size transfers are considered.
- The network latency is neglected in compari-