SLIDE 1

Filtered Gibbs Sampler for Estimating Blocking Probabilities in WDM Optical Networks

Felisa J. Vázquez-Abad and Lachlan L. H. Andrew

Département d'informatique et recherche opérationnelle, Université de Montréal, Québec, CANADA. Courriel: vazquez@IRO.UMontreal.CA

Department of Electronic and Electrical Engineering, The University of Melbourne. Email: {fva,lha}@ee.mu.oz.au

European Simulation Multiconference, 25 May 2000.

SLIDE 2

Outline of Presentation

  1. Motivation
     • WDM optical networks
  2. Clique packing
     • Stationary measure
     • Blocking probability
  3. Monte Carlo simulation
     • Accept/reject Monte Carlo
     • Markov chain Monte Carlo
  4. The Gibbs sampler
     • Periodic Gibbs
     • Filtered sequential Gibbs sampler
  5. Future work

SLIDES 3–4

1.1 Motivation: WDM Optical Networks

  • Optical bandwidth >> electronic bandwidth.
  • Wavelength division multiplexing (WDM):
     – Λ independent wavelengths per fibre
     – Each wavelength modulated separately
  • Crossconnects at nodes act as space switches; they can also switch wavelengths.

[Figure: a multi-hop lightpath (Hop 1, Hop 2, Hop 3); lightpaths are shown in different shades of colour.]

  • Optical carriers within fibres are wavelengths.
  • Calls are connected using optical carriers along the links on their paths: a lightpath.
  • Connected calls use the bandwidth of each carrier wavelength along the lightpath.

SLIDE 5

1.1 Motivation: Crossconnects

Full wavelength conversion ⇒ standard circuit-switched loss network.

[Figure: (a) a space switch; (b) a wavelength converter, with wavelengths λ1, λ2, λ3.]

M input and output fibres with W wavelengths on each; requirements:

  • wavelength-continuous crossconnect: W different M × M space switches,
  • wavelength-conversion crossconnect: a single MW × MW space switch.

VERY EXPENSIVE!

SLIDE 6

1.1 Motivation: Resource Allocation

Demand model: call arrivals to lightpath i follow a point process N_i(t) with intensity λ_i (e.g. Poisson). Call durations: i.i.d. holding times with mean 1/µ.

Resources: no (or partial) wavelength conversion ⇒ wavelength continuity constraints. Calls compete for bandwidth.

  • Dynamic allocation of lightpaths:
     – Several methods available to assign lightpaths to incoming calls
     – Problem: analysis and evaluation are difficult (unless there is full conversion)

SLIDES 7–9

1.1 Motivation: Maximum and Clique Packing

Dynamic lightpath allocation under the wavelength continuity constraint (no conversion). How to assign lightpaths to incoming calls on route i?

  • A call arrives: search for an available wavelength (say, First Fit assignment).
  • No wavelength available on the path... reject???

Fast tuning devices: optical carriers can (in principle) change the wavelength of on-going connections without affecting QoS.

Maximum packing: fast tuning devices ⇒ rearrangement of wavelengths. Calls on route i are connected if, upon rearrangement, there is a wavelength available.

State description: occupancy vector with complex coupling equations.

Analysis: the model is too complex for analytical results; the state space is too large.

Simplified model: clique packing yields simple linear constraints.

SLIDE 10

2.0 Clique Packing in WDM Optical Networks

  • R = number of routes in the network (the number of O/D pairs if routing is fixed)
  • n_i = number of calls currently using route i

Cliques: graph G = (V, E)

  • V: vertices = routes
  • E: edge if two routes share a link
  • Clique C_l: fully connected subgraph of G.

[Figure: example conflict graph on routes 1, 2, 3.]

Maximum packing with fast tuning devices: allocate incoming calls whenever possible, allowing rearrangement ⇒ (n-colouring of G).

Clique packing assumes that an incoming call on route i can be connected iff

Σ_{j∈C_l} n_j < Λ  for all l with i ∈ C_l.

Simplified model: the occupancy vector n(t) follows a stochastic process with independent Poisson arrivals and i.i.d. holding times (not an M/G/∞ server... boundaries!). A sketch of the admission test follows.
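To make the rule concrete, here is a minimal Python sketch of the clique-packing admission test. The helper name, the toy clique list, and the parameter values are illustrative assumptions, not from the talk.

```python
def can_admit(n, route, cliques, Lam):
    """Clique-packing admission: a call on `route` is accepted iff
    sum_{j in C_l} n_j < Lam for every clique C_l containing the route."""
    return all(sum(n[j] for j in cl) < Lam for cl in cliques if route in cl)

# Hypothetical 3-route example in which all routes pairwise share a link,
# so {0, 1, 2} is the single clique:
cliques = [{0, 1, 2}]
n = [1, 2, 0]                            # current calls per route
print(can_admit(n, 0, cliques, Lam=4))   # True: 1 + 2 + 0 = 3 < 4
print(can_admit(n, 0, cliques, Lam=3))   # False: the clique is already full
```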

SLIDE 11

2.1 Analysis of Clique Packing: Stationary Measure

Model: arrivals ∼ Poisson(λ_i), holding times ∼ exp(µ); the occupancy process {n(t)} is, in each dimension, a birth-and-death chain with a state-dependent reflecting boundary.

[Figure: birth-and-death chain for n_i with up-rate λ_i, down-rate n_i µ, and state-dependent boundary Λ_i(n).]

Result: the limiting occupancy distribution (stationary probabilities) is

π(n) = (1/G) ∏_{i=1}^{R} ρ_i^{n_i} / n_i!,  n ∈ S,  with ρ_i = λ_i/µ,

S = { n ∈ ℕ^R : Σ_{j∈C_l} n_j ≤ Λ, l = 1, …, L },  G = Σ_{n∈S} ∏_{i=1}^{R} ρ_i^{n_i} / n_i!.

Remark: this result may be generalised to other renewal arrival processes and holding-time distributions.
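To see why G itself is the bottleneck, a brute-force computation is shown below; it enumerates all (Λ+1)^R candidate states, so it is feasible only for toy systems. The loads, cliques, and Λ are made-up values.

```python
from itertools import product
from math import factorial

def normalising_constant(rho, cliques, Lam):
    """Brute-force G = sum over feasible n of prod_i rho_i^{n_i} / n_i!.
    Enumerates (Lam+1)^R states, illustrating the exponential blow-up."""
    R = len(rho)
    G = 0.0
    for n in product(range(Lam + 1), repeat=R):
        if all(sum(n[j] for j in cl) <= Lam for cl in cliques):
            w = 1.0
            for i in range(R):
                w *= rho[i] ** n[i] / factorial(n[i])
            G += w
    return G

print(normalising_constant([0.5, 0.8, 0.3], [{0, 1}, {1, 2}], Lam=3))
```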

SLIDES 12–14

2.2 Blocking Probability

B = lim_{t→∞} Σ_{i=1}^{R} Y_i(t) / A(t),

where Y_i(t) = number of lost arrivals on route i by time t and A(t) = total number of arrivals by time t.

Blocking states on route i: states n ∈ B_i in which incoming calls at i are lost:

B_i = { n ∈ S : max_{l: i∈C_l} Σ_{j∈C_l} n_j = Λ },  B = Σ_{i=1}^{R} (λ_i/λ) π(B_i).

...solved the problem?

Realistic network sizes: > 20 nodes, 8–64 wavelengths, R = n²/2 + o(n²), so the number of states is ≈ O(Λ^R). For 10 nodes and 8 wavelengths, computing G requires ≈ 8^45 ≈ 10^40 multiplications, which takes 10^21 years of CPU time on a 1 TFlops computer...
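The blocking sets B_i translate directly into a state test. A minimal sketch in the same style as the earlier helpers (it assumes every route lies in at least one clique):

```python
def is_blocking(n, route, cliques, Lam):
    """State n is in B_route iff the fullest clique containing the route
    is at capacity: max over {l : route in C_l} of sum_{j in C_l} n_j == Lam."""
    return max(sum(n[j] for j in cl) for cl in cliques if route in cl) == Lam
```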

SLIDE 15

3. Simulation Methods: Monte Carlo

Idea: estimate B directly, rather than finding G and then B:

B = Σ_{i=1}^{R} (λ_i/λ) E[1{X ∈ B_i}],  X ∼ π.

Simulation:

  • Generate a sample {X_1, …, X_N} i.i.d., X_s ∼ π.
  • Use the sample average:

Ŷ(N) = (1/N) Σ_{s=1}^{N} Σ_{i=1}^{R} (λ_i/λ) 1{X_s ∈ B_i}.
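Given i.i.d. samples from π (however they are produced; that is the subject of the next slides), the estimator is a plain average. A sketch reusing the hypothetical is_blocking helper above:

```python
def mc_estimate(samples, lam, cliques, Lam):
    """Sample average of sum_i (lambda_i / lambda) * 1{X_s in B_i}
    over i.i.d. samples X_s ~ pi."""
    total = sum(lam)                 # lambda = sum_i lambda_i
    R = len(lam)
    acc = 0.0
    for n in samples:
        acc += sum(lam[i] / total * is_blocking(n, i, cliques, Lam)
                   for i in range(R))
    return acc / len(samples)
```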

SLIDE 16

3. Simulation Methods: Efficiency

Ŷ(N) = (1/N) Σ_{s=1}^{N} Σ_{i=1}^{R} (λ_i/λ) 1{X_s ∈ B_i}

The LLN and CLT ⇒ confidence intervals can be estimated, giving the approximate error

ε = z_{1−α/2} √Var(Ŷ(N))  ⇒  relative error ≈ √( Var(Ŷ(N)) / B² ).

Definition: the relative efficiency of the estimator Ŷ(N) is

E_r(Ŷ(N)) = B² / ( CPU[Ŷ(N)] · Var[Ŷ(N)] ).

There is a trade-off between relative error and CPU time.
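The definition can be estimated empirically from independent replications. A rough sketch (function and argument names are ours; in practice B is unknown, so a long control run or the estimate itself would stand in for B_true):

```python
import time
from statistics import variance

def relative_efficiency(run_estimator, B_true, reps=20):
    """Empirical E_r = B^2 / (CPU * Var), with CPU the mean time per
    replication and Var the sample variance across replications."""
    t0 = time.process_time()
    estimates = [run_estimator() for _ in range(reps)]
    cpu = (time.process_time() - t0) / reps
    return B_true ** 2 / (cpu * variance(estimates))
```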

SLIDES 17–18

4. Monte Carlo Simulation: Acceptance/Rejection

Generate X_k ∼ π(·), {X_k} i.i.d., and use

Y_k = Σ_{i=1}^{R} (λ_i/λ) 1{X_k ∈ B_i}.

π is a state-dependent truncated Poisson:

π(n) = (1/G) ∏_{i=1}^{R} ρ_i^{n_i} / n_i!,  n ∈ S.

Accept/Reject:

  • Repeat
     – Generate (m_1, …, m_R) independent, m_i ∼ Poisson(ρ_i)
  • until m ∈ S
  • Set X_k(i) = m_i

The X_k are i.i.d. ∼ π.

  ρ_i     R = 7     R = 19    R = 37
  12      0.05      0.19      0.5
  15      1.10      11.00     167.0
  18      17.00     16.66     5.3 × 10^8

Problem: E(iterations) = 1/G.
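A minimal numpy sketch of the accept/reject sampler; the clique structure and loads are hypothetical, and the expected number of proposals per accepted sample is the 1/G noted above.

```python
import numpy as np

def accept_reject_sample(rho, cliques, Lam, rng):
    """Propose independent Poisson(rho_i) components and return the first
    feasible state; the result is an exact draw from pi."""
    while True:
        m = rng.poisson(rho)                      # one candidate count per route
        if all(m[list(cl)].sum() <= Lam for cl in cliques):
            return m

rng = np.random.default_rng(0)
x = accept_reject_sample(np.array([0.5, 0.8, 0.3]),
                         [{0, 1}, {1, 2}], Lam=3, rng=rng)
```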
SLIDE 19

5. The Gibbs Sampler

Periodic Gibbs sampler: change one component of the vector at a time, using the conditional distribution (a one-dimensional truncated Poisson) to generate it.

[Figure: step k to step k+1, redrawing one component m from π(m | the other components).]

After R steps the vector has been updated in all components. Writing p_j for the Poisson(ρ_j) pmf and P_j for its CDF, the conditional distribution of component j is

π(m | ζ_k) = p_j(m) / P_j(Λ_j(ζ_k)),  Λ_j(ζ_k) = Λ − max_{i: j∈C_i} Σ_{c∈C_i} ζ_k(c) 1{c ≠ j}.

  • All samples are accepted: the conditional truncates to the feasible region.
  • One component at a time: no need to evaluate G.

Remark: {ζ_{kR}}_{k=0}^{∞} is a Markov chain.
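A sketch of one component update and one periodic sweep, sampling the truncated Poisson by inversion through its CDF. scipy.stats.poisson supplies pmf/cdf/ppf; the helper names are ours, and every route is assumed to lie in at least one clique.

```python
from scipy.stats import poisson

def residual_cap(zeta, j, cliques, Lam):
    """Lambda_j(zeta): Lam minus the largest total, over cliques containing j,
    of the components other than j."""
    return Lam - max(sum(zeta[c] for c in cl if c != j)
                     for cl in cliques if j in cl)

def gibbs_update(zeta, j, rho, cliques, Lam, rng):
    """Redraw zeta[j] from Poisson(rho_j) truncated to {0, ..., Lambda_j(zeta)}."""
    cap = residual_cap(zeta, j, cliques, Lam)
    u = rng.random() * poisson.cdf(cap, rho[j])    # inversion on the truncated range
    zeta[j] = max(0, int(poisson.ppf(u, rho[j])))  # max() guards the u == 0.0 corner

def gibbs_sweep(zeta, rho, cliques, Lam, rng):
    """One periodic-Gibbs pass: update all R components in turn."""
    for j in range(len(rho)):
        gibbs_update(zeta, j, rho, cliques, Lam, rng)
```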

SLIDE 20

5.2 Markov Chain Monte Carlo: Gibbs Sampler

Idea: when direct generation of X ∼ π is difficult or impractical, simulate a Markov chain {ζ_k} with stationary probabilities

π(n) = lim_{k→∞} P{ζ_k = n},  n ∈ S.

The sample average satisfies a CLT.

ζ_{k+1}(i) = ζ_k(i) if i ≠ σ_k = (k mod R) + 1;  ζ_{k+1}(σ_k) ∼ π(· | ζ_{k+1}^{−σ_k}) otherwise,

Y_k^{(r)} = Σ_{i=1}^{R} (λ_i/λ) 1{ζ_k^{(r)} ∈ B_i}  (periodic, 1 ≤ r ≤ R),  Y_k = Σ_{i=1}^{R} (λ_i/λ) 1{ζ_k ∈ B_i}  (sequential).

Result: the periodic Gibbs sampler has stationary probability π(·). The sequential Gibbs sampler is strongly consistent: calculate Y_k at each step instead of only every R (≈ 300) steps.
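The sequential variant in code, continuing the earlier sketch (gibbs_update and is_blocking are the hypothetical helpers defined above): one component update per step, with Y_k accumulated at every step.

```python
def sequential_gibbs_estimate(rho, lam, cliques, Lam, N, rng):
    """Sequential Gibbs sampler for B: sigma_k cycles through the routes,
    and the full blocking sum is evaluated after every single update."""
    R, total = len(rho), sum(lam)
    zeta = [0] * R                     # start from the empty (feasible) state
    acc = 0.0
    for k in range(N):
        gibbs_update(zeta, k % R, rho, cliques, Lam, rng)
        acc += sum(lam[i] / total * is_blocking(zeta, i, cliques, Lam)
                   for i in range(R))
    return acc / N
```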

SLIDE 21

5.2 Markov Chain Monte Carlo: Gibbs Sampler

Efficiency against offered load for direct simulation and for the random, sequential, and periodic Gibbs samplers applied to a 5 × 5 mesh-torus.

[Two plots: relative efficiency (1/s) vs. load per route ρ_i (left panel: loads 0.05–0.30; right panel: loads 0.8–2.0), comparing accept/reject with the periodic, sequential, and randomised Gibbs samplers.]

...except at very high loads, Gibbs is not performing much better.

SLIDE 22

5.3 Filtered Gibbs Sampler – Reduction of CPU Time

Sequential Gibbs: {ζ_k} ∈ S is a Markov chain. At each step use

Y_k = Σ_{i=1}^{R} (λ_i/λ) 1{ζ_k ∈ B_i}.

Many terms are highly correlated between samples: component σ_k = (k mod R) + 1 is the only one that changes, so 1{ζ_k ∈ B_i} ∈ {0, 1} is identical from k − 1 to k for many routes i. For any i ∈ {1, …, R}:

Y_{k,i} = 1{ζ_{k+1} ∈ B_i}  ⇒  E[Y_{k,i}] = B_i,  lim_{N→∞} (1/N(i)) Σ_{k=1}^{N} Y_{k,i} 1{σ_k = i} = B_i.

With R ≈ 300, many computations are wasted. Dedicate each iteration k to only one component i = σ_k...?

SLIDE 23

5.3 Filtered Gibbs Sampler – Reduction of CPU Time (cont.)

Dedicate each iteration k to only one component i = σ_k. The effect is greater on routes sharing more cliques. Determine blocking only on the updated route:

Y_k = R (λ_{σ_k}/λ) 1{ζ_k ∈ B_{σ_k}},  σ_k = (k mod R) + 1.

Result: the sample average Ŷ(N) is strongly consistent for B and converges at rate O(N^{−1/2}).

SLIDES 24–25

5.3 Filtered Gibbs Sampler – Reduction of Variance

Y_k = R (λ_{σ_k}/λ) 1{ζ_k ∈ B_{σ_k}},  σ_k = (k mod R) + 1.

The usual conditioning argument:

Var(Y) = Var(E(Y|X)) + E(Var(Y|X)),

so Z = E(Y|X) satisfies E(Z) = B and Var(Z) ≤ Var(Y).

Ŷ′(N) = (1/N) Σ_{k=1}^{N} E[Y_k | ζ_{k−1}]

conditions on the natural filtration of the process (hence "filtered"). Instead of using 1{ζ_k ∈ B_{σ_k}}, estimate the blocking with the conditional probability:

Y_k = R (λ_{σ_k}/λ) P(ζ_k ∈ B_{σ_k} | ζ_{k−1}),  P[ζ_{k+1} ∈ B_j | ζ_k] = ( P_j(Λ_j(ζ_k)) − P_j(Λ_j(ζ_k) − 1) ) / P_j(Λ_j(ζ_k)).

Result: the filtered Gibbs sampler is strongly consistent for B and converges at rate O(N^{−1/2}).
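Combining the two ideas (one route per step, indicator replaced by its conditional expectation) gives the filtered sequential Gibbs sampler. A sketch reusing the hypothetical residual_cap and gibbs_update helpers; for a Poisson(ρ_j) component the conditional probability above is pmf(Λ_j)/CDF(Λ_j).

```python
from scipy.stats import poisson

def filtered_gibbs_estimate(rho, lam, cliques, Lam, N, rng):
    """Filtered sequential Gibbs: step k scores route j = sigma_k with
    R * (lambda_j / lambda) * P(component j would land on its cap),
    computed from zeta_{k-1} before the component is redrawn."""
    R, total = len(rho), sum(lam)
    zeta = [0] * R
    acc = 0.0
    for k in range(N):
        j = k % R
        cap = residual_cap(zeta, j, cliques, Lam)
        p_block = poisson.pmf(cap, rho[j]) / poisson.cdf(cap, rho[j])
        acc += R * lam[j] / total * p_block
        gibbs_update(zeta, j, rho, cliques, Lam, rng)
    return acc / N
```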

SLIDE 26

5.3 Markov Chain Monte Carlo: Gibbs Sampler

Efficiency against offered load for direct simulation and for the filtered randomised and filtered sequential Gibbs samplers applied to a 5 × 5 mesh-torus.

[Two plots: relative efficiency (1/s) vs. load per route ρ_i (left panel: loads 0.05–0.30; right panel: loads 0.8–2.0), comparing accept/reject with the filtered sequential and filtered randomised Gibbs samplers.]

Realistic network sizes and loads: at Λ = 12 and B = 0.12 the gain factor is 93,931: if our method requires 1 minute, the usual A/R would need 65.2 days to complete the simulation. As network size grows, the method overcomes the curse of dimensionality.

SLIDE 27

On-going and Future Work

  • Use stratification to choose σ_k judiciously, for variance reduction.
  • Apply the filtered Gibbs sampler to more complex networks:
     – partial conversion, asymmetric networks,
     – efficient clique-determination algorithms from graph theory.
  • Apply similar techniques to realistic route assignment.
  • Relative efficiency ↓ as B → 0: "rare events"
     – Use fast simulation with importance sampling: direct simulation of the process.
     – Can we apply fast simulation to the Gibbs sampler directly?