competitive analysis in buffer management . Sergey I. Nikolenko - - PowerPoint PPT Presentation

SLIDE 1

competitive analysis in buffer management

Sergey I. Nikolenko (1,2,3)
Summer School on Operational Research and Applications, Nizhny Novgorod, May 25, 2016

(1) NRU Higher School of Economics, St. Petersburg
(2) Steklov Institute of Mathematics at St. Petersburg
(3) Deloitte Analytics Institute, Moscow

SLIDE 2

intro and problem setting .

SLIDE 3

problem setting .

  • A buffer B that handles a sequence of arriving packets.
  • Discrete time, each time slot contains:

(1) arrival: new packets arrive, and the buffer management unit performs admission control and, possibly, push-out;
(2) assignment and processing: a single packet is selected for processing by the scheduling module;
(3) transmission: packets with zero required processing left are transmitted and leave the queue.

  • The goal is to transmit as many packets as possible (i.e., drop as few as possible).
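The three phases above can be sketched as a minimal discrete-time loop (a hypothetical sketch, not the slides' notation: a single FIFO queue with greedy tail-drop admission):

```python
from collections import deque

def run_slots(arrivals, B):
    """Simulate the three-phase time slot for a single FIFO queue
    with a greedy tail-drop admission policy.

    arrivals: list of lists; arrivals[t] holds the required processing
    cycles of each packet arriving in slot t.  B is the buffer size.
    Returns the number of transmitted packets.
    """
    buf = deque()              # residual processing cycles, HOL first
    transmitted = 0
    for slot in arrivals:
        # (1) arrival: admit greedily, drop when the buffer is full
        for r in slot:
            if len(buf) < B:
                buf.append(r)
        # (2) assignment and processing: one cycle on the HOL packet
        if buf:
            buf[0] -= 1
        # (3) transmission: zero-residual packets leave the queue
        while buf and buf[0] == 0:
            transmitted += 1
            buf.popleft()
    return transmitted
```

With unit packets this greedy loop already matches OPT, which is the point of the "simplest setting" slide below.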

SLIDE 4

packets, buffers, processing orders .

  • The structure of the buffer can be different:
  • single queue: all packets go to a single output port;
  • multiple queues: clearly separated queues leading to different output ports;
  • shared memory: different output ports but the memory is shared (and has to be balanced);
  • CIOQ (combined input-output queued) switches: several inputs, several outputs, one queue per output at every input;
  • crossbar switches: buffers at intersections.

SLIDE 5

packets, buffers, processing orders .

  • Packets can differ in various characteristics:
  • value v(p) ∈ {1, … , V}: how much the packet contributes to the objective function;
  • required processing cycles r(p) ∈ {1, … , k}: how long a CPU must work on the packet before transmission;
  • output port: where a packet is headed;
  • size in bytes (buffer slots).

SLIDE 6

packets, buffers, processing orders .

  • Finally, processing and transmission orders can be different too:
  • in FIFO order, packets should be processed and transmitted in the order they arrived;
  • in semi-FIFO order, processing is free but transmission should follow arrivals;
  • or the order does not matter, so we are free to construct priority queues.

SLIDE 7

competitiveness .

  • The goal is to transmit as many packets as possible (i.e., drop as few as possible).

Definition. An online algorithm A is said to be α-competitive (for some α ≥ 1) if for any arrival sequence σ the number of packets successfully transmitted by A is at least 1/α times the number of packets successfully transmitted by an optimal solution (denoted OPT) obtained by an offline clairvoyant algorithm.

  • This is a worst-case definition, with guarantees over all traffic distributions.
  • Lower bounds are counterexamples, upper bounds are uniform guarantees (often interesting theorems).

SLIDE 8

plan .

  • Our plan:
  • try to study buffers, packets, processing orders in different combinations;
  • a lot of different combinations and works, we only look at some of the simplest and most interesting (best interest/technicality ratio);
  • start with uniform packets (shared memory);
  • then look at packets with heterogeneous processing;
  • and finally at packets with multiple characteristics.

SLIDE 9

uniform packets .

SLIDE 10

simplest setting .

  • Simplest setting: single queue, all packets are identical.
  • What is the competitive ratio?

SLIDE 11

simplest setting .

  • Simplest setting: single queue, all packets are identical.
  • What is the competitive ratio?
  • Naturally, 1: the greedy algorithm is optimal.
  • Hasn’t been too hard, has it? Well, there’s more...
  • Based on this paper:
  • W. Aiello, A. Kesselman, Y. Mansour. Competitive buffer management for shared-memory switches. ACM Transactions on Algorithms, vol. 5, no. 1, 2008.

SLIDE 12

shared memory switch .

  • Let us now consider a shared memory switch.
  • N × N switch: each of N output ports has a queue, and the total number of packets in all queues is bounded by B.
  • N input ports send in at most N packets per time slot (not necessarily).
  • Non-preemptive (non-push-out) algorithms decide what to accept and cannot later drop accepted packets.
  • Preemptive (push-out) algorithms can push out already accepted packets.

SLIDE 13

push-out algorithms .

  • Push-out algorithms: there are N queues of total size B, each packet is labeled with its output port.
  • The queues can be assumed to be FIFO (doesn’t matter since packets are uniform).
  • A new packet comes in; if there is free space, we lose nothing by accepting.
  • If the buffer is congested (full), we have to decide: where do we push out from?
  • What policies can you propose?

SLIDE 14

push-out algorithms .

  • Push-out algorithms: there are N queues of total size B, each packet is labeled with its output port.
  • The queues can be assumed to be FIFO (doesn’t matter since packets are uniform).
  • A new packet comes in; if there is free space, we lose nothing by accepting.
  • If the buffer is congested (full), we have to decide: where do we push out from?
  • What policies can you propose?
  • We’ll study LQD (longest queue drop): drop a packet from the longest queue.
  • What is its competitive ratio?
  • We will demonstrate:
  • a lower bound for LQD by a specific counterexample;
  • an upper bound for LQD by matching (a common technique);
  • but first...
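The LQD admission decision can be sketched as follows (a minimal sketch with hypothetical names; uniform packets, total occupancy bounded by B):

```python
def lqd_admit(queues, B, port):
    """LQD (longest queue drop) admission for a shared-memory switch
    with uniform packets.

    queues: dict mapping output port -> list of buffered packets;
    B: bound on the total number of packets across all queues;
    port: output port of the arriving packet.
    Mutates `queues` and returns True iff the arrival is admitted.
    """
    queues.setdefault(port, [])
    if sum(len(q) for q in queues.values()) < B:
        queues[port].append("pkt")     # free space: accepting loses nothing
        return True
    # congestion: drop from the longest queue (possibly the arrival's own,
    # in which case the arrival itself is effectively dropped)
    longest = max(queues, key=lambda p: len(queues[p]))
    if len(queues[longest]) > len(queues[port]):
        queues[longest].pop()          # push out from the longest queue
        queues[port].append("pkt")
        return True
    return False
```

Rejecting when the arrival's own queue is (one of) the longest is equivalent to admitting and then dropping from the longest queue.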

SLIDE 15

general lower bound .

  • There is one more flavor of results: general lower bounds.
  • How can we prove a lower bound for all online algorithms?

SLIDE 16

general lower bound .

  • There is one more flavor of results: general lower bounds.
  • How can we prove a lower bound for all online algorithms?
  • We have to show an adversarial bound, constructing a hard example for an online algorithm on the fly.

SLIDE 17

general lower bound .

  • In this case: consider 2 active output ports with queues Q1 and Q2 (and N input ports).
  • For 2B/(N−2) time slots, N/2 packets arrive for each port.
  • At time 2B/(N−2), either Q1 or Q2 has ≤ B/2 packets (B in total).
  • Next, for B time slots we send 1 packet per time slot to the other queue.
  • Now ALG can transmit at most 4B/(N−2) + 3B/2 packets, and OPT can do the full 4B/(N−2) + 2B.
  • The sequence can be repeated, getting the ratio of 4B/(3B+2) → 4/3.
  • Hence, we have shown a general lower bound of 4/3 for any online algorithm.
  • This is how many lower bounds for many settings work.
  • But there are more interesting cases too...

SLIDE 18

lqd: lower bound .

  • Lower bound for LQD: √2.
  • Proof by construction of a specific counterexample:
  • consider a 3A × 3A switch with B = L(A−1)/2 + A, L > A;
  • output ports:
  • A overloaded: each receives 2 packets per time slot;
  • A idle: needed only to get enough inputs;
  • A active: L packets arrive over L/A time slots, then for L − L/A slots nothing happens, then again; A input ports can keep an active port busy in this way.
  • OPT accepts one of two packets for each overloaded port and keeps all that comes for active ports (which amounts to exactly B), processing 2A packets per time slot;
  • LQD accepts more for overloaded ports...

SLIDE 19

lqd: lower bound .

  • LQD accepts more for overloaded ports:
  • the max queue length is ≈ constant, say xL;
  • when a burst arrives to an active port, it stabilizes at xL;
  • the total number of packets in active queues is xL + (xL − L/A) + (xL − 2L/A) + … ≈ x²AL/2;
  • in overloaded ports, ≈ xAL;
  • and in total they should add up to B ≈ LA/2, so we get (1/2)x² + x ≈ 1/2, x ≈ √2 − 1;
  • each active port over L time slots transmits ≈ L/A + xL ≈ xL, so the total throughput rate is ≈ xL/(L/A) + A = √2·A, hence the bound.
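The quadratic for x and the resulting throughput rate can be checked numerically (an illustrative sketch; the value of A is arbitrary):

```python
import math

# x is defined by the buffer-balance equation (1/2)x^2 + x = 1/2
x = math.sqrt(2) - 1
assert abs(x * x / 2 + x - 0.5) < 1e-12

# total throughput rate: xL/(L/A) + A = (x + 1)*A = sqrt(2)*A
A = 7
assert abs((x + 1) * A - math.sqrt(2) * A) < 1e-9
```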

SLIDE 20

lqd: upper bound .

  • Upper bound: 2.
  • Proof: we construct a matching between the extra packets processed by OPT and LQD packets.
  • An extra packet is a packet transmitted by OPT when the corresponding LQD queue is idle.
  • A potential extra packet is a packet further in OPT’s queue than the entire corresponding LQD queue: post_OPT(p) > L^t_LQD(q).
  • Matching idea:

SLIDE 21

lqd: upper bound .

  • Matching routine: on each time slot t,

(1) if during arrival a matched LQD packet p is preempted by p′, replace p by p′ in the matching;
(2) at the end of arrivals, match each unmatched OPT packet p in queue q for which post_OPT(p) > L^t_LQD(q) as follows:

  • if p arrived at this time slot and was accepted by both OPT and LQD, match p to itself;
  • else match p to an arbitrary unmatched packet in LQD’s buffer.
  • So the idea is to match all potential extra packets as soon as they appear.
  • We have to prove that:
  • all extra packets are matched before they are transmitted;
  • the matching is feasible (there are always enough unmatched packets).
  • If so, we get a competitive ratio of 2.

SLIDE 22

lqd: upper bound .

  • Sequence of lemmas:

(1) all extra packets are matched (if matching works at all);
(2) if p ∈ OPT from queue q is matched to p′ ∈ LQD and p′ ≠ p, then LQD is congested and some packets have been dropped in queue q;
(3) if p ∈ OPT is matched to p′ ∈ LQD at time t, then post_OPT(p) ≥ post_LQD(p′) (check all cases);
(4) matching works (if there are unmatched packets in OPT then there are at least as many unmatched packets in LQD).

SLIDE 23

single queue with heterogeneous processing .

SLIDE 24

intro .

  • Modern network processors have to perform increasingly heterogeneous tasks.
  • Existing infrastructure does not always support it.
  • We study various settings of heterogeneous packet characteristics and compare different online algorithms with competitive analysis.
  • Based on these papers:
  • K. Kogan, A. Lopez-Ortiz, S.I. Nikolenko, A.V. Sirotkin. Online Scheduling FIFO Policies with Admission and Push-Out. Theory of Computing Systems, vol. 58, no. 2, 2016, pp. 322–344.
  • K. Kogan, A. Lopez-Ortiz, S.I. Nikolenko, A.V. Sirotkin, D. Tugaryov. FIFO Queueing Policies for Packets with Heterogeneous Processing. Proc. 1st Mediterranean Conference on Algorithms (MedAlg 2012), LNCS vol. 7659, Springer, 2012, pp. 248–260.

SLIDE 25

problem setting .

  • A buffer B that handles a sequence of arriving packets.
  • Each packet p has a number of required processing cycles, denoted r(p) ∈ {1, … , k}.
  • Discrete time, each time slot contains:

(1) arrival: new packets arrive, and the buffer management unit performs admission control and, possibly, push-out;
(2) assignment and processing: a single packet is selected for processing by the scheduling module;
(3) transmission: packets with zero required processing left are transmitted and leave the queue.

SLIDE 26

a sample time slot .

SLIDE 27

basic definitions .

  • Notation:
  • k is the maximal number of required processing cycles;
  • B is the buffer size;
  • C is the number of processing cores (C = 1 for now).
  • Common properties: an algorithm is
  • greedy if it accepts all arrivals whenever there is buffer space available;
  • preemptive if it allows packets to push out (preempt) currently stored packets.

SLIDE 28

simple algorithms .

  • Non-preemptive greedy NPO: for an incoming packet p, if buffer occupancy is less than B then accept p, else drop p.
  • Preemptive greedy PO: for an incoming packet p,
  • if buffer occupancy is less than B then accept p;
  • else let q be the first (from HOL) packet with the maximal number of residual processing cycles; if rt(p) < rt(q) then drop q and accept p according to FIFO order, else drop p.
  • What are their competitive ratios?
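The two admission rules can be transcribed directly (a hypothetical sketch; `buf` lists residual processing cycles from HOL to tail):

```python
def npo_admit(buf, B, r):
    """NPO: accept an arriving packet (r residual cycles) iff there is room."""
    if len(buf) < B:
        buf.append(r)
        return True
    return False

def po_admit(buf, B, r):
    """PO: on congestion, push out the first (closest to HOL) packet with
    the maximal residual processing, provided the arrival needs strictly
    less work than it."""
    if len(buf) < B:
        buf.append(r)
        return True
    worst = max(buf)
    if r < worst:
        buf.remove(worst)    # list.remove drops the first maximal packet
        buf.append(r)        # the arrival joins the tail in FIFO order
        return True
    return False
```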

SLIDE 29

simple algorithms .

  • Obvious upper bound: any reasonable greedy work-conserving algorithm (even NPO) is k-competitive.
  • What is the lower bound for NPO?

SLIDE 30

simple algorithms .

  • Lower bound for NPO is also k:
  • fill NPO’s buffer with k-packets (packets requiring k processing cycles);
  • keep NPO’s buffer full of k-packets by adding one more every k time slots;
  • at the same time, feed OPT with 1-packets (OPT does not accept all the k-packets and leaves room for the 1-packets).
  • This concludes our theoretical analysis of NPO.
  • Exercise: can you prove a lower bound of ≈ 2 for PO?
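The adversarial sequence above can be simulated directly (a sketch with hypothetical names; OPT's side is hard-coded to its known behaviour on this sequence):

```python
def npo_adversary(k, B, T):
    """Simulate the NPO lower-bound sequence for T time slots.

    NPO's buffer is filled with k-packets and refilled with one more
    every k slots, so the 1-packets arriving each slot are all dropped;
    OPT rejects the k-packets and transmits one 1-packet per slot.
    Returns (npo_transmitted, opt_transmitted).
    """
    buf = [k] * B              # NPO's buffer right after the initial burst
    npo = opt = 0
    for _ in range(T):
        opt += 1               # OPT serves the 1-packet that NPO must drop
        buf[0] -= 1            # NPO works one cycle on its HOL k-packet
        if buf[0] == 0:
            npo += 1
            buf.pop(0)
            buf.append(k)      # the adversary refills with a fresh k-packet
    return npo, opt
```

The ratio opt/npo approaches k as T grows, matching the slide's bound.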

SLIDE 31

lower bounds for PO .

  • Lower bound for PO is at least 2(1 − 1/B) for k ≥ B.
  • [Table: the adversarial sequence, with columns t, arriving packets, IB^t for PO/LPO, #PO, IB^t for OPT, #OPT; after 2B − 1 time slots PO has transmitted B packets while OPT has transmitted 2B − 2.]
  • Next exercise: what if k < B?

SLIDE 32

lower bounds for PO .

  • Lower bound for PO is at least 2k/(k+1) for k < B:
  • on step 1, there arrive (1 − α)B k-packets followed by αB 1-packets; PO accepts all, OPT rejects the k-packets;
  • on step αB, there arrive αB/k 1-packets; on step αB(1 + 1/k), αB/k² 1-packets, and so on;
  • when PO is out of packets with k processing cycles, its queue is full of 1-packets, and OPT’s queue is empty; now B 1-packets arrive, they are processed, and the sequence is repeated.
  • In order for this sequence to work, we need αB(1 + 1/k + 1/k² + …) = k(1 − α)B, so we get α = 1 − 1/k.
  • During the sequence, OPT has processed αB(1 + 1/k + 1/k² + …) + B = 2B packets, while PO has processed (1 − α)B + B = (1 + 1/k)B packets, so the competitive ratio is 2/(1 + 1/k) = 2k/(k+1).
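The geometric-series bookkeeping on this slide can be checked numerically (illustrative values of k and B; names hypothetical):

```python
# Numeric check of the k < B lower-bound arithmetic for PO.
k, B = 8, 100.0
alpha = 1 - 1 / k                   # solves aB(1 + 1/k + 1/k^2 + ...) = k(1-a)B
geom = k / (k - 1)                  # 1 + 1/k + 1/k^2 + ... = k/(k-1)
assert abs(alpha * B * geom - k * (1 - alpha) * B) < 1e-9

opt_packets = alpha * B * geom + B  # = 2B
po_packets = (1 - alpha) * B + B    # = (1 + 1/k)B
assert abs(opt_packets - 2 * B) < 1e-9
ratio = opt_packets / po_packets    # = 2k/(k+1)
assert abs(ratio - 2 * k / (k + 1)) < 1e-9
```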

SLIDE 33

lower bounds for PO .

  • For large values of k, we can have a logarithmic lower bound. First step: suppose k ≥ (B − 1)(B − 2). Then:
  • we begin with the buffer state 1, 2, 3, 4, …, B − 1, (B − 1)(B − 2);
  • OPT drops the first packet and processes the rest while PO keeps processing the first;
  • then, for B steps, one 1-packet per step arrives; PO keeps dropping its HOL;
  • then PO has a queue of 1-packets, so we flush it out with B 1-packets.
  • At the end of this iteration, PO has processed B + 1 packets; OPT, 3B packets.

SLIDE 34

lower bounds for PO .

  • We can iterate this construction for larger values of k: having proven the bound for S = Ω(B^{n−1}), on the next step we begin with the buffer state 1 + S, 2 + S, 3 + S, 4 + S, …, B − 1 + S, (B − 1)(B − 2 + S).

Theorem. The competitive ratio of PO is at least ⌊log_B k⌋ + 1 − O(1/B).

SLIDE 35

lazy policies .

  • So the lower bound that we can show for PO is ≈ log_B k, asymptotically better than k.
  • But can we show a matching upper bound? It is far from obvious how to analyze PO.
  • We do the analysis by introducing a new class of algorithms, lazy processing policies.

SLIDE 36

lazy policies .

  • Lazy push-out algorithm LPO mimics the behaviour of PO with two important differences:
  • LPO does not transmit a HOL 1-packet while it has at least one packet with r > 1, until the buffer contains only 1-packets;
  • then, LPO transmits all the 1-packets one by one, accepting new packets at the end of the queue (they cannot push out the 1-packets).

SLIDE 37

lazy policies .

  • Intuitively, LPO is a weakened version of PO since PO tends to empty its buffer faster.
  • However, in the worst case they are incomparable:
  • there exists a sequence of inputs on which PO processes ≥ 3/2 times more packets than LPO;
  • there exists a sequence of inputs on which LPO processes ≥ 5/4 times more packets than PO.

SLIDE 38

lazy policies .

  • Lower bounds on LPO almost exactly match lower bounds on PO:
  • the competitive ratio of LPO is at least 2(1 − 1/B) for k ≥ B and at least (2k − 1)/k for k < B;
  • for large k, the competitive ratio of LPO is at least ⌊log_B k⌋ + 1 − O(1/B).
  • The difference is that for LPO, we can prove an upper bound.

Theorem. LPO is at most (max{1, ln k} + 2 + o(1))-competitive.

SLIDE 39

anatomy of an iteration .

SLIDE 40

simulations: variable λon and k .

[Figure: simulated throughput of OPT*, PO, LPO, and NPO as a function of λon (panels: k = 5, B = 10, C = 1; k = 10, B = 50, C = 1; k = 10, B = 50, C = 2) and as a function of k (panels: B = 10, λon = 0.1, C = 1; B = 10, λon = 0.2, C = 1; B = 50, λon = 0.2, C = 2).]

SLIDE 41

simulations: variable B and C .

[Figure: simulated throughput of OPT*, PO, LPO, and NPO as a function of B (panels: k = 3, λon = 0.2, C = 1; k = 5, λon = 0.2, C = 1; k = 10, λon = 0.2, C = 1) and as a function of C (panels: k = 5, B = 5, λon = 0.2; k = 10, B = 10, λon = 0.2; k = 25, B = 50, λon = 0.2).]

SLIDE 42

processing orders .

  • Let us now generalize a bit.
  • Priority queueing (PQ): a packet with minimal residual work is processed first.
  • Reversed priority queueing (RevPQ): a packet with maximal residual work is processed first.
  • FIFO with recycles (RFIFO): non-fully processed packets are recycled to the back of the queue.
  • And so on; we can decouple processing order from transmission order.
  • K. Kogan, A. Lopez-Ortiz, S.I. Nikolenko, A.V. Sirotkin. A Taxonomy of Semi-FIFO Policies. Proc. 31st IEEE International Performance Computing and Communications Conference (IPCCC 2012), IEEE Press, 2012, pp. 295–304.

SLIDE 43

lazy: a definition .

Definition. A buffer processing policy LA is called lazy if it satisfies the following conditions: (i) LA greedily accepts packets if its buffer is not full; (ii) LA pushes out the first packet with the maximal number of processing cycles in case of congestion; (iii) LA does not process and transmit packets with a single processing cycle if its buffer contains at least one packet with more than one processing cycle left; (iv) once all packets in LA’s buffer (say m packets) have a single processing cycle remaining, LA transmits them over the next m time slots, even if additional packets arrive during that time.

SLIDE 44

a general upper bound on LA .

  • Ideas of the LPO upper bound can be extended to a general upper bound on all lazy policies.
  • Same as above, we define an iteration, and it comes to a logarithmic bound (≈ ln k).

Theorem. LA is at most (3 + (1/B) log_{B/(B−1)} k)-competitive.

SLIDE 45

lower bounds .

  • This upper bound is tight for some processing orders.

Theorem. LRFIFO, LRevPQ, and RFIFO are at least (1 + (1/B) log_{B/(B−1)} k)-competitive.

SLIDE 46

lower bounds .

  • Proof: denote γ = (B−1)/B.
  • First burst: (B − 1) k-packets arrive followed by a γk-packet; OPT drops all the k-packets and only keeps the γk-packet.
  • After γk steps, all three policies will have B γk-packets in the buffer, and then a γ²k-packet arrives.
  • Repeat this sequence (a γ^{i+1}k-packet arrives after γ^i k more steps) until IB_ALG consists of 1-packets.
  • We get that OPT has processed log_{1/γ} k = log_{B/(B−1)} k packets while LRevPQ (LRFIFO) has processed none.
  • Then we flush out with a new burst of B 1-packets.
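The length of this geometric burst sequence can be checked numerically (illustrative values of B and k):

```python
import math

# Bursts of work k, γk, γ²k, ... with γ = (B−1)/B; OPT completes one
# packet per burst until the residual work drops to a single cycle.
B, k = 20, 10 ** 6
gamma = (B - 1) / B
bursts, work = 0, float(k)
while work > 1:
    work *= gamma
    bursts += 1
# bursts ≈ log_{1/γ} k = log_{B/(B−1)} k
assert abs(bursts - math.log(k) / math.log(1 / gamma)) <= 1
```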

SLIDE 47

LPQ .

Theorem. (i) LPQ is at most 2-competitive. (ii) LPQ is at least (2 − (1/B)⌈B/k⌉)-competitive.

  • (i) Since PQ is optimal, during an iteration OPT cannot transmit more packets than reside in the LPQ buffer at the end of an iteration. By a previous Lemma, LPQ is at most 2-competitive.
  • (ii) For k ≥ B, consider two bursts of packets: B k-packets and then, in (k − 1)B steps, B − 1 1-packets each. After these two bursts, OPT has processed 2B − 1 packets, and LPQ has processed B packets, so we can repeat them to get the asymptotic bound. For k < B, in the same construction ⌈B/k⌉ packets are left in OPT’s queue after (k − 1)B processing steps.

SLIDE 48
other extensions .

Theorem. Any greedy Semi-FIFO policy is at least (1 + (m−1)/B)-competitive for m = min{k, B}.

Theorem. Any lazy policy LA (including LRevPQ) is incomparable with either FIFO or RFIFO in the worst case for every k > 2 and B > 2.

Theorem. Any greedy non-push-out Semi-FIFO policy NPO is at least ((k+1)/2)-competitive. Any lazy greedy non-push-out policy NLPO is at least (k − 1)-competitive.

SLIDE 49

constraints on push-out .

  • In some situations, we’d like to impose constraints on push-out; e.g., there might be a copying cost α for each admitted packet.
  • We introduce an additional constraint β: a policy ALGβ pushes out only if the new arrival has at least β times less work than the maximal residual work in the buffer.

SLIDE 50

upper bound with β-preemption .

  • The key lemma will now have Wte ≤ Wt−1 − Mt−1/β.

Theorem. LAβ is at most (3 + (1/B) log_{βB/(βB−1)} k) · (1−α)/(1 − α log_β k)-competitive for copying cost 0 < α < 1/log_β k.

SLIDE 51

results summary .

Algorithm/family    | Lower bound                  | Upper bound
Semi-FIFO           | 1 + (min{k,B}−1)/B           | open problem
Lazy                | 1 + (min{k,B}−1)/B           | 3 + (1/B) log_{B/(B−1)} k
LRFIFO, LRevFIFO    | 1 + (1/B) log_{B/(B−1)} k    | 3 + (1/B) log_{B/(B−1)} k
LPO                 | ⌊log_B k⌋ + 1                | max{1, ln k} + 2
LPQ                 | 2 − (1/B)⌈B/k⌉               | 2
LFIFO               | k − 1 + (1/B)⌊B/k⌋           | k
Lazy β-push-out     | 1 + (min{k,B}−1)/B           | (3 + (1/B) log_{βB/(βB−1)} k) · (1−α)/(1 − α log_β k)
Non-push-out        | (k+1)/2                      | k
Lazy non-push-out   | k − 1                        | k

SLIDE 52
other settings .

SLIDE 53

multiple shared queues .

  • Now for a brief review of our other results in this direction.
  • First: multiple separate queues with a shared buffer.
  • K. Kogan, A. Lopez-Ortiz, S.I. Nikolenko, A.V. Sirotkin. Multi-Queued Network Processors for Packets with Heterogeneous Processing Requirements. Proc. 5th International Conference on Communication Systems and Networks (COMSNETS 2013), IEEE Press, 2013.
SLIDE 54

multiple shared queues .

  • Let’s divide the buffer into k queues and send each packet to the corresponding queue.
  • Fairness properties become trivial, and there’s no need to implement priority queueing or move packets around a queue.

SLIDE 55

multiple shared queues .

  • Reasonable policies:
  • LQF (longest queue first);
  • SQF (shortest queue first);
  • MQF (minimal queue first);
  • fair policies: PRR (packet round robin), CRR (cycle round robin).

SLIDE 56

multiple shared queues .

  • Lower bounds:
  • LQF is at least (m/2)-competitive, m = min{k, B};
  • SQF is at least k-competitive;
  • MQF is at least (1 + (k−1)/(2k))-competitive.

Theorem. MQF is at most 2-competitive.

SLIDE 57

multiple shared queues .

  • Important special case: what if there are only two kinds of required processing, a and b.

Theorem.
  • 1. The competitiveness of MQF with two queues is at least 1 + (1 + ⌊(aB−1)/b⌋) / (B + ⌈(1/a)(b⌊(aB−1)/b⌋ + 1)⌉).
  • 2. The competitiveness of MQF with two queues is at most 1 + (1 + ⌊(aB−1)/b⌋) / (B + ⌈(1/a)(b⌊(aB−1)/b⌋ + 1)⌉).

SLIDE 58

multiple output ports .

  • Next setting: let us now have multiple output ports, each packet is labeled with an output port, and packets with the same output port share properties (like required processing).
  • P. Eugster, K. Kogan, S.I. Nikolenko, A.V. Sirotkin. Shared-Memory Buffer Management for Heterogeneous Packet Processing. Proc. 34th International Conference on Distributed Computing Systems (ICDCS 2014), IEEE Press, 2014, pp. 471–480.
SLIDE 59

multiple output ports .

  • First setting: l × n memory switch with shared buffer of size B; each FIFO queue Qi has packets with the same processing requirement ri (max k).

SLIDE 60

multiple output ports .

  • Lower bounds:
  • NHST is at least kZ-competitive, Z = ∑ 1/ri;
  • NEST is at least n-competitive;
  • NHDT is at least ((1/2)√k ln k)-competitive;
  • LQD is at least √k-competitive;
  • BPD is at least (ln k + γ)-competitive;
  • LWD is at least (4/3 − 6/B)-competitive.

Theorem. LWD is at most 2-competitive.

SLIDE 61

multiple heterogeneous characteristics .

SLIDE 62

multiple characteristics .

  • Next setting: what if the packets have multiple different characteristics?
  • Say, required processing and value.
  • We restrict ourselves to a single queue; it’s hard enough already.
  • P. Chuprikov, K. Kogan, S.I. Nikolenko. Priority Queueing with Multiple Packet Characteristics. Proc. IEEE INFOCOM 2015, pp. 1418–1426.

SLIDE 63

multiple characteristics .

  • This time, it makes sense to concentrate on priority queues since there are many different versions.

SLIDE 64

multiple characteristics .

  • Lower bounds:
  • PQ−w,v is at least V-competitive and at most V-competitive;
  • PQv,−w is at least (((V−1)/V)W − o(1))-competitive;
  • PQv/w is at least min(V, W)-competitive.
  • General lower bound:

Theorem. Every online deterministic algorithm ALG is at least (5/4 − O(1/W))-competitive.

SLIDE 65

multiple characteristics .

  • Two-valued case: what if required processing can be arbitrary, but there are only two values, 1 and V? Lower bounds:
  • PQ−w,v is at least V-competitive;
  • if W ≥ V then PQv/w is at least V-competitive;
  • PQv,−w is at least (W/V + o(1))-competitive.
  • General lower bound:

Theorem. Every online deterministic algorithm ALG is at least (1 + (V−1)/V² − O(1/W))-competitive.

  • Upper bound:

Theorem. PQv,−w is at most (1 + (W+2)/V)-competitive.

SLIDE 66

multiple characteristics .

  • Summary of our results:

Processing policy          | General case, lower bound | Two-valued, lower bound | Two-valued, upper bound
Any                        | 5/4                       | 1 + (V−1)/V² − O(1/W)   |
PQ−w,v, PQβ−w,v            | V                         | V                       | V
PQv,−w, PQβv,−w            | ((V−1)/V)W − o(1)         | W/V + o(1)              | 1 + (W+2)/V
PQv/w, PQβv/w, βW ≥ V      | V                         | V                       |
PQv/w, W < V               | W                         | W/V + o(1)              | 2 + 2/V

SLIDE 67

summary .

  • Competitive analysis provides worst-case guarantees for buffer management.
  • Worst-case competitive upper bounds are important since traffic distributions can be uneven and unpredictable.
  • In modern networking, heterogeneous characteristics (required processing, value, size etc.) abound.
  • This leads to new challenges in buffer management.
  • We consider buffer management policies in many different contexts, proving bounds on their competitiveness:
  • a single queue, where we have introduced lazy algorithms to prove upper bounds;
  • buffer with multiple shared queues;
  • buffer with a separate queue for each output port;
  • queue with packets with multiple heterogeneous characteristics.
  • There are many more important contexts and settings to come.

SLIDE 68

thank you! .

Thank you for your attention!

SLIDE 69

upper bound on the competitiveness of LPO .

  • Idea – we define an iteration:
  • the first iteration begins with the first arrival;
  • an iteration ends when all packets in the LPO buffer have a single processing pass left;
  • each subsequent iteration starts after the transmission of all LPO packets from the previous iteration.
  • The plan is to count how many packets LPO can lose to OPT on each iteration.

SLIDE 70

upper bound on the competitiveness of LPO .

  • Wlog, OPT never pushes out packets and is work-conserving.
  • Further, we give OPT an additional property for free:

(1) at the start of each iteration, OPT flushes out all packets remaining in its buffer from the previous iteration (for free, with extra gain to its throughput).

  • Notation:
  • A, the number of non-HOL packets in OPT’s buffer at time tcon;
  • WA, their total required processing;
  • Mt, the maximal number of residual processing cycles among all packets in LPO’s buffer at time t in the current iteration;
  • Wt, the total residual work of all packets in LPO’s buffer at time t.

SLIDE 71

upper bound on the competitiveness of LPO .

  • Consider an iteration I that begins at time tbeg and ends at time tend; tcon is the time when LPO’s buffer is first congested.
  • The following statements hold:

(1) during I, the buffer occupancy of LPO is at least the buffer occupancy of OPT;
(2) if during a time interval [t, t′], tbeg ≤ t ≤ t′ ≤ tcon, there is no congestion in LPO’s buffer, then during [t, t′] OPT transmits at most |IB^{t′}_LPO| packets and LPO does not transmit any packets.

SLIDE 72

upper bound on the competitiveness of LPO .

Lemma.
  • 1. During [tbeg, tcon], OPT processes at most B − 1 packets.
  • 2. For every packet p in OPT’s buffer at time tcon, except perhaps the HOL packet, there is a corresponding packet q in LPO’s buffer with r(q) ≤ r(p).

Proof.
  • 1. During [tbeg, tcon], there arrive exactly B packets (because LPO does not transmit any packets and becomes congested at tcon). Moreover, OPT cannot process all B packets because then LPO would also have time to process them, and the iteration would be uncongested.
  • 2. Every packet in OPT’s buffer also resides in LPO’s buffer because LPO has not dropped anything yet at time tcon; r(q) ≤ r(p) because LPO may have processed some packets partially.

SLIDE 73

upper bound on the competitiveness of LPO .

  • By the previous Lemma, LPO’s buffer at time tcon contains A corresponding packets, so Wtcon ≤ WA + (B − A)k.
  • Moreover, over the next WA time slots OPT will be processing these A packets, and LPO, being congested, will also not be idle, so at time tcon + A we will have Wtcon+A ≤ (B − A)k (we give OPT its HOL packet for free, so OPT processes A + 1 packets over [tcon, tcon + A]).

SLIDE 74

upper bound on the competitiveness of LPO .

Lemma. For every packet accepted by OPT at time t ∈ [tcon, tend] and processed by OPT during time interval [t′, t″], tcon ≤ t′ ≤ t″ ≤ tend, Wt″ ≤ Wt−1 − Mt.

Proof. If LPO’s buffer is full then a packet p accepted by OPT either pushes out a packet in LPO’s buffer or is rejected by LPO. If p pushes a packet out, then the total work Wt−1 is immediately reduced by Mt − rt(p). Moreover, after processing p, Wt″ ≤ Wt−1 − (Mt − rt(p)) − rt(p) = Wt−1 − Mt. If, on the other hand, p is rejected by LPO then rt(p) ≥ Mt, and thus Wt″ ≤ Wt−1 − rt(p) ≤ Wt−1 − Mt.

SLIDE 75

upper bound on the competitiveness of LPO .

We denote by f(B, W) the maximal number of packets that OPT can accept and process during [t, t′], tcon ≤ t ≤ t′ ≤ tend, where W = Wt−1. The next lemma is crucial for the proof.

Lemma. For every ε > 0, f(B, W) ≤ ((B−1)/(1−ε)) ln(W/B) for B sufficiently large.

  • Proof: all packets LPO transmits, it transmits at the end of an iteration; hence, if the buffer of LPO is full, it will remain full until tend − B.
  • At any time t, Mt ≥ Wt/B: the maximal required processing is no less than the average.

SLIDE 76

upper bound on the competitiveness of LPO .

  • We know that for every packet p accepted by OPT at time t, the total work W = Wt−1 is reduced by Mt after OPT has processed p.
  • Therefore, after OPT processes a packet at time t′, Wt′ is at most W(1 − 1/B).
  • Now by induction on W; for W = B the base is trivial.

SLIDE 77

upper bound on the competitiveness of LPO .

  • The induction hypothesis is that after a packet is processed by OPT, there cannot be more than f(B, W(1 − 1/B)) ≤ ((B−1)/(1−ε)) ln[(W/B)(1 − 1/B)] packets left, and for the induction step we have to prove that ((B−1)/(1−ε)) ln[(W/B)(1 − 1/B)] + 1 ≤ ((B−1)/(1−ε)) ln(W/B).
  • This is equivalent to ln(W/B) ≥ ln[(W/B) · ((B−1)/B) · e^{(1−ε)/(B−1)}], and this holds asymptotically because for every ε > 0, we have e^{(1−ε)/(B−1)} ≤ B/(B−1) for B sufficiently large.

SLIDE 78

upper bound on the competitiveness of LPO .

  • Applying Lemma 15 to the time tcon + A, we get the following.

Corollary. For every ε > 0, the total number of packets processed by OPT between tcon and tend in a congested iteration does not exceed A + 1 + (B + o(B)) ln((B − A)k/B).

  • And the final result is as follows.

Theorem. LPO is at most (max{1, ln k} + 2 + o(1))-competitive.

SLIDE 79

anatomy of an iteration .

SLIDE 80

upper bound on the competitiveness of LPO .

  • Consider an iteration I over time [tbeg, tend].
  • If I is uncongested then OPT cannot transmit more than |IB^t_LPO| packets during I.
  • Consider an iteration I first congested at time tcon:
  • by a lemma, during [tbeg, tcon) OPT can transmit at most B − 1 packets, leaving A + 1 packets in its buffer;
  • by the corollary, OPT processes at most A + 1 + ((B−1)/(1−ε)) ln((B−A)k/B) + o(B ln((B−A)k/B)) packets during [tcon, tend] and flushes out ≤ B packets at time tend;
  • thus, the total number of packets transmitted by OPT over a congested iteration is at most 2B + A + (B + o(B)) ln((B − A)k/B).
  • It is now easy to check that for every 1 ≤ A ≤ B − 1 the theorem’s statement is satisfied.

SLIDE 81

a general upper bound on LA .

  • Ideas of the LPO upper bound can be extended to a general upper bound on all lazy policies.

Lemma. Consider an iteration I that has started at time t′ and ended at time t. The following statements hold.

(1) During I, the buffer occupancy of LA is at least the buffer occupancy of OPT.
(2) Between two consecutive iterations I and I′, OPT transmits at most |IB^t_LA| packets.
(3) If during an interval of time [t′, t″], t′ ≤ t″ ≤ t, there is no congestion, then during [t′, t″] OPT transmits at most |IB^{t″}_LA| packets.

SLIDE 82

a general upper bound on LA .

  • Same as above.

Lemma. For any packet accepted by OPT at time t and processed by OPT during [ts, te], t ≤ ts ≤ te, if |IB^{t−1}_LA| = B and |IB^{t−1}_OPT| = 0 then Wte ≤ Wt−1 − Mt−1.

SLIDE 83

a general upper bound on LA .

  • And this comes to a logarithmic bound (≈ ln k).

Theorem. LA is at most (3 + (1/B) log_{B/(B−1)} k)-competitive.

SLIDE 84

a general upper bound on LA .

  • In a congested iteration, any packet processed by OPT decreases the total LA work by Mt, i.e., by at least W/B.
  • After n transmission rounds, the residual number of processing cycles in LA’s buffer is W(1 − 1/B)^n.
  • Since initially W ≤ kB, n ≤ log_{B/(B−1)} k.
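This counting argument can be checked numerically (a sketch with a hypothetical function name and illustrative values of k and B):

```python
import math

def rounds_to_drain(k, B):
    """Count processing rounds until LA's residual work, starting from
    W <= k*B and shrinking by a factor (1 - 1/B) per round, drops to
    at most B (one cycle per buffered packet on average)."""
    W, n = float(k * B), 0
    while W > B:
        W *= 1 - 1 / B
        n += 1
    return n

k, B = 1000, 50
bound = math.log(k) / math.log(B / (B - 1))   # log_{B/(B-1)} k
assert rounds_to_drain(k, B) <= bound + 1
```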
