Randomized Network Algorithms: An Overview and Recent Results
Balaji Prabhakar
Departments of EE and CS, Stanford University
Network algorithms are algorithms implemented in networks, e.g. in:
– switches/routers: scheduling algorithms, route lookup, packet classification, security
– memory/buffer managers: maintaining statistics, active queue management, bandwidth partitioning
– load balancers
– web caches: eviction schemes, placement of caches in a network
Network algorithms face severe constraints:
– line speeds in the Internet core are 10 Gbps (40 Gbps in the near future); i.e. packets arrive roughly every 40 ns
– there are a large number of distinct flows in the Internet core, and of requests arriving per second at large server farms
– space and heat dissipation constraints are rigid
– but simple algorithms may perform poorly if not well designed
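The 40 ns figure can be checked with back-of-the-envelope arithmetic (assuming a small, roughly 50-byte packet; the exact minimum size depends on the link layer):

```python
line_rate_bps = 10e9              # 10 Gb/s core line rate
packet_bits = 50 * 8              # a small, ~50-byte packet
packet_time_ns = packet_bits / line_rate_bps * 1e9
# 400 bits / 10^10 b/s = 40 ns between back-to-back small packets
```

At 40 Gbps the budget shrinks to about 10 ns per packet, which is why per-packet work must be so simple.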
4
6ft 19” Capacity: 160Gb/s Power: 4.2kW 3ft 2.5ft 19” Capacity: 80Gb/s Power: 2.6kW
2ft
[Figure: router architecture. Line cards, each with a network processor, lookup engine, and packet buffers, are connected through an interconnection fabric (switch) and an output scheduler to the outputs.]
The randomized approach: base decisions upon a small, randomly chosen sample of the state/input, instead of the complete state/input.

Internet packet traces exhibit power-law distributions: 80% of the packets belong to 20% of the flows. That is, most flows are small (mice), while most work is brought by a few large flows (elephants). Identifying the large flows cheaply can therefore significantly simplify an implementation.

Examples in this talk:
– switch scheduling
– bandwidth partitioning
Example: find the youngest person in a population of 1 billion.
– a linear search has a complexity of 1 billion
– the randomized version, which samples 30 people at random and picks the youngest of them, has a complexity of 30
– linear search will find the absolute youngest person (rank = 1)
– if R is the person found by the randomized algorithm, we can only bound R's rank in probability: R is among the youngest 10% of the population with probability 1 - 0.9^30, roughly 0.96
Randomization with memory: suppose that in 2007 you find the youngest person
– and store the identity of the youngest person in memory
– in 2008 you choose 29 new people at random
– let R be the youngest person from these 29 + 1 = 30 people
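The memoryless version of the scheme can be sketched directly (a minimal sketch; `sample_youngest` is an illustrative name, and uniform sampling from the population is assumed):

```python
import random

def sample_youngest(ages, k=30, rng=random):
    """Index of the youngest among k people sampled uniformly at random."""
    picks = rng.sample(range(len(ages)), k)
    return min(picks, key=lambda i: ages[i])

# Guarantee: the sampled minimum misses the youngest 10% of the
# population only if all 30 samples miss it, so
p_top_decile = 1 - 0.9 ** 30   # roughly 0.96
```

The memory-augmented variant simply feeds last year's winner back in as one of the 30 candidates, which is exactly the structure reused by the switch schedulers below.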
Switch scheduling: joint work with Paolo Giaccone and Devavrat Shah
[Figure: a 3x3 input-queued crossbar switch, shown over several time slots; in each slot the fabric connects inputs 1-3 to outputs 1-3 through a matching.]
An algorithm is stable if, under any admissible arrival process, the average backlog is bounded.
[Figure: VOQ backlogs (19, 3, 4, 21, 18, 7, 1) at the inputs of a switch; the scheduler chooses a schedule, or matching, of inputs to outputs.]
[Figure: the same VOQ backlogs under three schedulers. Max Weight Matching (serving the queues of length 19 and 18) is stable (Tassiulas-Ephremides 92, McKeown et al. 96, Dai-Prabhakar 00); Max Size Matching (serving the queues of length 19, 1 and 7) is not stable (McKeown-Ananthram-Walrand 96); practical maximal matchings are not stable.]
The Max Weight Matching (MWM) algorithm performs very well:
– throughput: stable (Tassiulas-Ephremides 92; McKeown et al. 96; Dai-Prabhakar 00)
– backlogs: very low on average (Leonardi et al. 01; Shah-Kopikare 02)
But it is hard to implement:
– it has cubic worst-case complexity (approx. 27,000 iterations for a 30-port switch)
– MWM algorithms involve backtracking: edges laid down in one iteration may be removed in a subsequent iteration
[Figure: the performance vs. implementation tradeoff. Max Weight Matching: stable and low backlogs, better performance. Max Size Matching and maximal matchings: not stable, but easier to implement.]
Pure randomization does not suffice: if d matchings are sampled at random each time slot and the heaviest is used, then even when d = N, the throughput is at most
Tassiulas' scheme: at each time t, generate a random matching R(t) and let the current matching S(t) be the heavier of R(t) and the previous matching S(t-1).
[Figure: the previous matching S(t-1) and the random matching R(t) feed a MAX operation producing the current matching S(t), which is carried forward to the next time.]
[Figure: a worked example over queue lengths 10, 50, 10, 10, 70, 60. The previous matching S(t-1) has weight W(S(t-1)) = 160; the random matching R(t), over queue lengths 40, 30, 10, 20, has weight W(R(t)) = 150. MAX keeps S(t-1) as S(t).]
Theorem (Tassiulas 98): The above scheme is stable under any admissible Bernoulli IID inputs.
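The scheme in the theorem can be sketched in a few lines (a sketch, not the paper's pseudocode; it assumes matchings are stored as input-to-output permutations and Q[i][j] is the backlog of the VOQ from input i to output j):

```python
import random

def weight(matching, Q):
    """Weight of a matching: total backlog of the VOQs it serves."""
    return sum(Q[i][j] for i, j in enumerate(matching))

def random_matching(n, rng=random):
    """A uniformly random input -> output permutation."""
    outputs = list(range(n))
    rng.shuffle(outputs)
    return outputs

def tassiulas_step(prev, Q, rng=random):
    """One time slot: compare a fresh random matching against the
    matching used last slot, and keep whichever is heavier."""
    cand = random_matching(len(Q), rng)
    return cand if weight(cand, Q) > weight(prev, Q) else prev
```

Because a random matching is maximum-weight with small but positive probability, and a heavy matching, once found, is retained, the weight of S(t) tracks the MWM weight closely enough for stability.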
[Figure: simulation of Tassiulas' scheme vs. MWM. Mean input-queue length (log scale, 0.01 to 10000) is plotted against normalized load (0.2 to 1.0); Tassiulas' scheme shows much larger backlogs than MWM.]
[Figure: the Merge idea. Overlay S(t-1) (weight W(S(t-1)) = 160) and R(t) (weight W(R(t)) = 150); their union splits into alternating cycles, and on each cycle the two matchings' edge weights are compared: 30 vs. 120 on one cycle, 130 vs. 30 on the other.]
[Figure: the result of the Merge. Taking the heavier side of each cycle (120 from R(t), 130 from S(t-1)) yields a matching S(t) with W(S(t)) = 250, heavier than both S(t-1) at 160 and R(t) at 150.]
Theorem (GPS): The Merge scheme is stable under any admissible Bernoulli IID inputs.
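The Merge operation itself is simple to write down (a sketch under the assumption that matchings are input-to-output permutations and Q[i][j] is the VOQ backlog; the function name is mine). The merged matching is always at least as heavy as either input, and can be strictly heavier, as in the 160/150-to-250 example above:

```python
def merge(S, R, Q):
    """Merge two matchings over VOQ backlogs Q[i][j]. Their union splits
    into alternating cycles; on each cycle, keep whichever matching
    contributes more total weight."""
    n = len(Q)
    R_inv = [0] * n
    for i, o in enumerate(R):
        R_inv[o] = i
    result, seen = [None] * n, [False] * n
    for start in range(n):
        if seen[start]:
            continue
        cycle, i = [], start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = R_inv[S[i]]        # follow S's edge, return along R's
        wS = sum(Q[i][S[i]] for i in cycle)
        wR = sum(Q[i][R[i]] for i in cycle)
        winner = S if wS >= wR else R
        for i in cycle:
            result[i] = winner[i]
    return result
```

On each alternating cycle, S's edges and R's edges cover the same inputs and outputs, so mixing winners cycle by cycle still yields a valid matching.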
[Figure: simulation of mean input-queue length vs. normalized load for Tassiulas' scheme, Merge, and MWM; Merge tracks MWM much more closely.]
[Figure: the Serena idea, in three frames. The previous matching S(t-1) has edges of weight 89, 3, 5, 23, 47, 11, 31, 97, with W(S(t-1)) = 209. New packet arrivals define the arrival graph, which is pruned to a matching of weight W = 121. Merging it with S(t-1) yields S(t) with edges of weight 89, 3, 23, 31, 97 and W(S(t)) = 243.]
Theorem (GPS): The Serena algorithm is stable under any admissible Bernoulli IID inputs.
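Serena replaces the explicit random matching with the randomness of the arrivals themselves. A sketch of one slot (the prune-then-merge structure follows the description above; the helper details and names are assumptions, not the paper's pseudocode):

```python
def serena_step(prev, Q, arrivals):
    """One slot of a Serena-style scheduler: turn the arrival graph into
    a matching, then merge it with last slot's matching."""
    n = len(Q)
    # 1. Prune the arrival graph: each output keeps the arriving input
    #    with the longest VOQ, and each input is used at most once.
    partial = [None] * n                         # input -> output
    for j in range(n):
        cands = [i for i in range(n) if arrivals[i][j] and partial[i] is None]
        if cands:
            partial[max(cands, key=lambda i: Q[i][j])] = j
    # 2. Complete it to a full matching with the unused outputs.
    used = {o for o in partial if o is not None}
    free = iter([o for o in range(n) if o not in used])
    full = [o if o is not None else next(free) for o in partial]
    # 3. Merge with the previous matching: the union splits into
    #    alternating cycles; keep the heavier side of each cycle,
    #    exactly as in the Merge scheme.
    full_inv = [0] * n
    for i, o in enumerate(full):
        full_inv[o] = i
    result, seen = [None] * n, [False] * n
    for start in range(n):
        if seen[start]:
            continue
        cycle, i = [], start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = full_inv[prev[i]]
        w_prev = sum(Q[i][prev[i]] for i in cycle)
        w_arr = sum(Q[i][full[i]] for i in cycle)
        winner = prev if w_prev >= w_arr else full
        for i in cycle:
            result[i] = winner[i]
    return result
```

The appeal is that the arrival graph is information the switch already has, so no random-number generation is needed in the data path.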
[Figure: simulation of mean input-queue length vs. normalized load for Tassiulas' scheme, Merge, Serena, and MWM; Serena performs close to MWM.]
Bandwidth partitioning: joint work with R. Pan, C. Psounis, C. Nair, B. Yang
Goals of queue management:
– allocate bandwidth fairly
– control queue size, and hence delay
The question: how can we prevent a rogue source from hogging all the bandwidth, without explicitly identifying the rogue source?
A first idea: when the buffer is congested, evict a randomly chosen packet. A rogue flow occupies more of the buffer, so its packets are evicted more often, while responsive (green) flows treat the loss as a congestion signal and back off.
The CHOKe idea: if two randomly chosen packets in a congested buffer agree on their ids, then it is quite unlikely that both will be green. So: choose two packets at random and drop them both if their ids agree. The flows that consume the most buffer space are penalized the most.
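The comparison step described above can be sketched as follows (a minimal sketch; the `Packet` record and function name are illustrative, not from the CHOKe paper):

```python
import random
from collections import namedtuple

Packet = namedtuple("Packet", "flow_id seq")

def choke_compare(queue, rng=random):
    """Pick two buffered packets at random; if they carry the same flow
    id, drop both. A flow hogging the buffer occupies many slots, so
    its packets are the likeliest to collide."""
    if len(queue) < 2:
        return queue
    a, b = rng.sample(range(len(queue)), 2)
    if queue[a].flow_id == queue[b].flow_id:
        return [p for k, p in enumerate(queue) if k not in (a, b)]
    return queue
```

If a flow holds a fraction f of the buffer, two random picks land on it together with probability roughly f squared, so the penalty grows sharply with the flow's share, with no per-flow state.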
[Figure: simulation setup. TCP sources S(1)...S(m) and UDP sources S(m+1)...S(m+n) feed router R1 over 10 Mbps links; R1 connects to R2 over a 1 Mbps bottleneck; R2 delivers to the corresponding TCP sinks D(1)...D(m) and UDP sinks D(m+1)...D(m+n).]
[Figure: UDP throughput (log scale, 100 to 10000 Kbps) vs. UDP arrival rate (200 to 1000 Kbps) under RED and CHOKe; CHOKe holds the UDP flow's throughput down where RED lets it grow.]
[Figure: fluid model of the queue, tracking the packets of a flow i through the buffer to departure D.]
[Figure: CHOKe fluid-model predictions compared against ns simulation.]
CHOKe keeps a rogue flow from taking all the bandwidth
– but approaching ideal bandwidth partitioning seems very costly (recall the fair queueing algorithm)
It suffices to identify the large (elephant) flows: simply partition the bandwidth amongst them.
Sampling flows cheaply:
– flip a coin with bias p (= 0.1, say) for heads on each arriving packet, independently from packet to packet
– a flow is "sampled" if one of its packets has a head on it
– flows with fewer than 5 packets are sampled with probability below 0.5
– flows with more than 10 packets are sampled with probability above 0.65
[Figure: a packet stream with its coin flips: H H T T T T T T T T T T H H]
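The sampling probabilities above follow from a one-line formula: a flow of n packets escapes only if every packet comes up tails.

```python
def p_sampled(n_packets, p=0.1):
    """Probability that a flow of n packets gets at least one head when
    each packet is flipped independently with bias p."""
    return 1 - (1 - p) ** n_packets
```

With p = 0.1, p_sampled(4) is about 0.34, p_sampled(11) about 0.69, and p_sampled(50) about 0.995: most mice are never sampled, while elephants are caught almost surely.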
[Figure: packets enter the data buffer; sampled flows are recorded in a flow table, the "elephant trap".]
Randomized algorithms are well suited to network problems, mainly because of the very tight constraints their implementations face.