SLIDE 1

Asymptotic Robustness of Estimators in Rare-Event Simulation

  • P. L’Ecuyer, Université de Montréal, Canada
  • J. H. Blanchet, Columbia University, USA
  • P. W. Glynn, Stanford University, USA
  • B. Tuffin, IRISA, Rennes, France

RESIM 2008, Rennes

SLIDE 2

Outline

  ◮ Rare-event setting and motivation.
  ◮ Asymptotic robustness properties: definitions.
  ◮ Markov chain model and zero-variance approximation.
  ◮ Examples: highly reliable Markovian systems.
  ◮ Examples: random walks.

SLIDES 3–5

Rare-event setting

We want to estimate a small quantity γ = γ(ε) > 0, where γ(ε) → 0 when ε → 0. We have an unbiased estimator (r.v.) Y = Y(ε) ≥ 0.

Example: Y(ε) is an indicator function with P[Y(ε) = 1] = γ(ε). Then the relative variance (squared relative error) blows up:

  Var[Y]/γ²(ε) = (1 − γ(ε))/γ(ε) ≈ 1/γ(ε) → ∞  when ε → 0.

Standard Monte Carlo (MC): estimate γ by Ȳn, the average of n i.i.d. copies of Y. For a meaningful estimate, we need n = O(1/γ(ε)). If γ = 10⁻¹⁰, for example, we need n = 10¹⁴ for 1% relative error.

Two popular cases: γ(ε) ≈ e^(−η/ε) and γ(ε) ≈ poly(ε).
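The n = O(1/γ(ε)) cost above can be checked directly. A minimal Python sketch (ours, not from the slides; function names are illustrative):

```python
import random

def crude_mc(gamma, n, seed=1):
    """Standard MC: average of n i.i.d. Bernoulli(gamma) indicators."""
    rng = random.Random(seed)
    hits = sum(rng.random() < gamma for _ in range(n))
    return hits / n

def relative_error(gamma, n):
    """Theoretical relative error of the mean:
    sqrt(Var[Y]/(n*gamma^2)) = sqrt((1 - gamma)/(n*gamma))."""
    return ((1 - gamma) / (n * gamma)) ** 0.5

# For 1% relative error we need n ~ 10^4/gamma: with gamma = 10^-10,
# that is n = 10^14, exactly as on the slide.
```

At a fixed sample size the relative error grows like 1/√γ, which is the blow-up stated above.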

SLIDE 6

Some applications where this type of problem arises

  ◮ Expected amount of radiation that crosses a given protection shield.
  ◮ Probability of a large loss from an investment portfolio.
  ◮ Value-at-risk (quantile estimation).
  ◮ Ruin probability for an insurance firm.
  ◮ Probability that the completion time of a large project exceeds a given threshold.
  ◮ Probability of buffer overflow, or mean time to overflow, in a queueing system.
  ◮ Proportion of packets lost in a communication system.
  ◮ Air traffic control.
  ◮ Mean time to failure, or other reliability or availability measure, for a highly reliable system (e.g., fault-tolerant computers, safety systems).

SLIDES 7–8

Major techniques to improve this type of situation: importance sampling (IS) and splitting.

Commonly used robustness characterizations of Y(ε):

It has bounded relative error (BRE) (bounded relative variance) if

  lim sup_{ε→0} Var[Y(ε)]/γ²(ε) < ∞.

It is logarithmically efficient (LE), or asymptotically optimal, if

  lim_{ε→0} ln E[Y²(ε)] / (2 ln γ(ε)) = 1.

This means (roughly) that if γ(ε) → 0 at an exponential rate, then the standard deviation converges to 0 at least at the same exponential rate.

SLIDE 9

Are there other useful characterizations?

To get a meaningful variance estimator when ε → 0, we would like to control the relative error of the empirical variance. This involves the fourth moment of Y(ε).

If we use a CLT, we may want to show that the quality of the normal approximation remains good when ε → 0, by bounding the Berry-Esseen bound on the approximation error. This involves the third moment.

In some settings, we may be interested in bounding the relative moment of order 2 + δ for some small δ > 0. And so on.

SLIDES 10–12

BRM-k

For k ∈ [1, ∞), the relative moment of order k for Y(ε) is

  mk(ε) = E[Y^k(ε)] / γ^k(ε).

For any fixed ε, mk(ε) is nondecreasing in k (from Jensen).

Y(ε) has a bounded relative moment of order k (BRM-k) if

  lim sup_{ε→0} mk(ε) < ∞.

Some properties:
  (i) BRE is equivalent to BRM-2.
  (ii) BRM-k implies BRM-k′ for 1 ≤ k′ < k.
  (iii) For positive real numbers k, ℓ, m, if Y(ε) = X^ℓ(ε) is BRM-mk, then Y′(ε) = X^{mℓ}(ε) is BRM-k.
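For the indicator estimator these quantities are explicit: E[Y^k(ε)] = γ(ε), so mk(ε) = γ(ε)^{1−k}. A tiny numeric check (ours, not from the slides):

```python
def rel_moment_indicator(k, gamma):
    """m_k = E[Y^k]/gamma^k for an indicator Y with P[Y = 1] = gamma.
    Since Y^k = Y, E[Y^k] = gamma and m_k = gamma**(1 - k)."""
    return gamma ** (1 - k)

# m_k is nondecreasing in k, and m_2 = 1/gamma blows up as gamma -> 0,
# i.e., crude MC has no BRE (= BRM-2).
```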

SLIDE 13

LE-k

Y(ε) has logarithmic efficiency of order k (LE-k) if

  lim_{ε→0} ln E[Y^k(ε)] / (k ln γ(ε)) = 1.

SLIDES 14–15

Example: γ(ε) = exp[−η/ε] and E[Y^k(ε)] = q(1/ε) exp[−kη/ε], for some constant η > 0 and a slowly increasing function q (e.g., a polynomial). Then we have LE-k but not BRM-k.

Example: γ^k(ε) = q1(ε) = ε^{t1} + o(ε^{t1}) and E[Y^k(ε)] = q2(ε) = ε^{t2} + o(ε^{t2}), where t2 ≤ t1. Here,

  lim_{ε→0} ln E[Y^k(ε)] / (k ln γ(ε)) = t2/t1.

We have BRM-k iff t2 = t1 iff LE-k.
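The first example can be verified numerically by working in logs to avoid underflow. A sketch (ours), with the illustrative assumption q(t) = t:

```python
import math

def le_ratio(eps, k=2, eta=1.0):
    """ln E[Y^k(eps)] / (k ln gamma(eps)) for gamma = exp(-eta/eps) and
    E[Y^k] = q(1/eps) exp(-k*eta/eps), with q(t) = t assumed here."""
    log_moment = math.log(1.0 / eps) - k * eta / eps
    log_gamma = -eta / eps
    return log_moment / (k * log_gamma)

def rel_moment(eps):
    """m_k = E[Y^k]/gamma^k = q(1/eps): unbounded, so LE-k but not BRM-k."""
    return 1.0 / eps
```

The LE-k ratio tends to 1 while the relative moment diverges, which is exactly the gap between the two notions.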

SLIDE 16

Bounded normal approximation

Berry-Esseen Theorem (one version): Let Y1, …, Yn be i.i.d. r.v.’s with E[Y1] = 0, Var[Y1] = σ², and E[|Y1|³] = β³. Let Ȳn and S²_n be the empirical mean and variance, and Fn the distribution function of √n Ȳn/Sn. Then there is an absolute constant a < ∞ such that

  sup_{n≥2, x∈R} |Fn(x) − Φ(x)| ≤ a β³ / (σ³ √n).

Y(ε) is said to have bounded normal approximation (BNA) if

  lim sup_{ε→0} E[|Y(ε) − γ(ε)|³] / σ³(ε) < ∞   (Tuffin 1999).

This requires that the Berry-Esseen bound remains O(n^{−1/2}) when ε → 0.
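For the crude indicator estimator the BNA ratio is available in closed form; this check (ours, not from the slides) shows it grows like 1/√γ(ε), so crude MC does not have BNA:

```python
import math

def bna_ratio(gamma):
    """E[|Y - gamma|^3] / sigma^3 for Y ~ Bernoulli(gamma):
    E|Y - gamma|^3 = gamma*(1-gamma)*((1-gamma)^2 + gamma^2),
    sigma^3 = (gamma*(1-gamma))**1.5."""
    num = gamma * (1 - gamma) * ((1 - gamma) ** 2 + gamma ** 2)
    den = (gamma * (1 - gamma)) ** 1.5
    return num / den

# For small gamma this is ~ 1/sqrt(gamma), which is unbounded as eps -> 0.
```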

SLIDE 17

BNA is not equivalent to BRM-3, because we divide by σ³(ε) here. One can have BNA and not BRM-3, or vice versa.

There are more general versions of the Berry-Esseen inequality that require only a bounded moment of order 2 + δ for any δ ∈ (0, 1] instead of the third moment β³; see, e.g., Petrov (1995). But the bound on |Fn(x) − Φ(x)| is then only O(n^{−δ/2}) instead of O(n^{−1/2}).

SLIDES 18–19

Robustness of the empirical variance

Let X1(ε), …, Xn(ε) be an i.i.d. sample of X(ε). We consider the empirical variance Y(ε) = S²_n(ε) as an estimator of the variance σ²(ε). Let γ(ε) = E[X(ε)].

BRM-4 for X(ε) and BRE for S²_n(ε) are both linked to E[X⁴(ε)], but they are not equivalent. We have

  Var[S²_n(ε)] / σ⁴(ε) = Θ( E[(X(ε) − γ(ε))⁴] / σ⁴(ε) ),

which differs in general from Θ( E[X⁴(ε)] / γ⁴(ε) ).

Proposition: If σ²(ε) = Θ(γ²(ε)), then BRM-2k for X(ε) implies BRM-k for S²_n(ε), for any k ≥ 1.

A similar observation applies to the equivalence between LE-4 for X(ε) and LE for S²_n(ε).

SLIDES 20–22

Vanishing relative centered moments

The relative centered moment of order k, for Y(ε), is

  ck(ε) = E[|Y(ε) − γ(ε)|^k] / γ^k(ε).

Y(ε) has a vanishing relative centered moment of order k (VRCM-k) if

  lim sup_{ε→0} ck(ε) = 0.

This holds if and only if lim sup_{ε→0} mk(ε) = 1.

It has vanishing relative variance, or vanishing relative error (VRE), if

  lim sup_{ε→0} σ(ε)/γ(ε) = 0.

VRCM-k implies VRCM-k′ for 1 ≤ k′ ≤ k. It also implies BRM-k.

SLIDE 23

VRCM implies convergence to the zero-variance IS

Suppose γ(ε) = E_{Pε}[Y(ε)] = ∫ Y(ε, ω) dPε(ω). Here we can get zero variance (in principle) by doing importance sampling with the measure Q*_ε defined by

  dQ*_ε(ω) = [Y(ε, ω)/γ(ε)] dPε(ω).

Proposition: If Y(ε) is VRCM-(1 + δ) for some δ > 0, then

  lim_{ε→0} sup_A |Pε(A) − Q*_ε(A)| = 0.

SLIDE 24

Proof: For any measurable set A,

  sup_A |Q*_ε(A) − Pε(A)|
    ≤ sup_A |E_{Pε}[(dQ*_ε/dPε) I(A)] − E_{Pε}[I(A)]|
    ≤ E_{Pε}|dQ*_ε/dPε − 1|
    ≤ E^{1/(1+δ)}[ |dQ*_ε/dPε − 1|^{1+δ} ]   (by Jensen’s inequality)
    = E^{1/(1+δ)}[ |Y(ε)/γ(ε) − 1|^{1+δ} ]
    = [c_{1+δ}(ε)]^{1/(1+δ)} → 0 as ε → 0.

SLIDE 25

First-passage probability in a Markov chain

Markov chain X = {Xj, j ≥ 0} with state space S and transition kernel K = {K(x, C) : x ∈ S, C ⊆ S}. For C ⊂ S, define τC = inf{j ≥ 0 : Xj ∈ C}. Given A and B, two disjoint subsets of S, and an initial state x0 ∈ (A ∪ B)^c, we want to estimate γ(x0), where

  γ(x) = γ(x, ε) = P[τB < τA | X0 = x].

We have γ(x) = 1 for x ∈ B and γ(x) = 0 for x ∈ A. Here, K, A, and B may depend on ε.

SLIDES 26–27

Importance sampling (IS) replaces K by another kernel. We get zero variance if we use the kernel

  K*(x, dy) = K(x, dy) γ(y) / γ(x).

This is not practical, but it can be used to design a good IS scheme of the form

  Kv(x, dy) = K(x, dy) v(y) / w(x),

where v : S → [0, ∞) is a good approximation of γ(·) and

  w(x) = ∫_S K(x, dy) v(y)

is the appropriate normalizing constant, assumed finite for x ∈ (A ∪ B)^c. (When v = γ, we have w = v.)
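As a concrete illustration (ours, not from the slides), consider a gambler's-ruin birth-death chain with A = {0} and B = {M}, where γ(x) is known in closed form, so K* can be written out explicitly:

```python
def gamma_ruin(x, M, p):
    """gamma(x) = P[tau_B < tau_A | X0 = x] for a walk that moves up
    with prob p and down with prob 1-p, with A = {0}, B = {M}."""
    if p == 0.5:
        return x / M
    r = (1 - p) / p
    return (1 - r ** x) / (1 - r ** M)

def zero_variance_kernel(M, p):
    """K*(x, y) = K(x, y) * gamma(y) / gamma(x) for interior states x."""
    g = [gamma_ruin(x, M, p) for x in range(M + 1)]
    K = {}
    for x in range(1, M):
        K[(x, x + 1)] = p * g[x + 1] / g[x]
        K[(x, x - 1)] = (1 - p) * g[x - 1] / g[x]
    return K, g
```

Under K* every path reaches B, and the product of the likelihood-ratio factors p(x, y)/K*(x, y) along any path equals γ(x0) exactly: the estimator is a constant.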

SLIDES 28–29

IS estimator of γ(x0): Y = Y(ε) = I[τB < τA] · L, where

  L = ∏_{k=1}^{τB} w(X_{k−1})/v(X_k) = [w(X0)/v(X_{τB})] ∏_{k=1}^{τB−1} w(X_k)/v(X_k).

We can take v(x) = 1 for x ∈ B and v(x) = 0 for x ∈ A.

To establish robustness properties such as LE-k, BRM-k, and VRCM-k, we need an asymptotic bound on E[Y^k(ε)]/γ^k(x0, ε).

SLIDE 30

Proposition (bounds via Lyapunov inequalities). Suppose there are positive constants κ1 and κ2 and a function hk : S → [0, ∞) such that v(x) ≥ κ1 and hk(x) ≥ κ2 for each x ∈ B, and

  E_{v,x}[ (w(x)/v(x))^k hk(X1) ] ≤ hk(x)   for all x ∈ (A ∪ B)^c.

Then, for all x ∈ (A ∪ B)^c,

  E_{v,x}[Y^k] ≤ v^k(x) hk(x) / (κ1 κ2).

SLIDE 31

Corollary. Under the proposition’s conditions:

(i) If lim_{ε→0} ln[v^k(x0, ε) hk(x0, ε)] / (k ln γ(x0, ε)) = 1, then Y(ε) is LE-k.

(ii) If lim sup_{ε→0} [v(x0, ε)/γ(x0, ε)]^k hk(x0, ε) < ∞, then Y(ε) is BRM-k.

(iii) If lim_{ε→0} [v(x0, ε)/γ(x0, ε)]^k hk(x0, ε) / (κ1 κ2) = 1, then Y(ε) is VRCM-k.

SLIDE 32

HRMS-type framework

Suppose S is finite and {Xj, j ≥ 0} has transition probabilities

  p(x, y, ε) = P[Xj = y | Xj−1 = x] = a(x, y) ε^{b(x,y)},

where a(x, y) and b(x, y) are non-negative constants.

Example: multicomponent highly reliable Markovian system (HRMS). A = {x0} is the single state where all components are up, and B is the set of states where the system is failed. We typically have b(x, y) > 0 for “failure” transitions and b(x, y) = 0 for “repair” transitions (Shahabuddin, Nakayama, ...). In this setting, γ(ε) = Θ(ε^r) for some r > 0.

SLIDES 33–36

Some proposed IS heuristics.

Balanced failure biasing (BFB) (Shahabuddin 1994) changes p to q as follows, for x ∉ B, where F(x) and R(x) are the sets of failure and repair transitions out of x, and pR(x) is the total repair probability:

  q(x, y) = 1/|F(x)|                if y ∈ F(x) and pR(x) = 0;
            ρ/|F(x)|                if y ∈ F(x) and pR(x) > 0;
            (1 − ρ) p(x, y)/pR(x)   if y ∈ R(x);
            0                       otherwise.

Simple failure biasing (SFB) (Shahabuddin 1988): replace 1/|F(x)| above by p(x, y) / Σ_{y′∈F(x)} p(x, y′).

SBLR (Alexopoulos and Shultes 2001) changes the probabilities in a way that over any cycle in the visited states during the simulation, the cumulated likelihood ratio remains bounded.

These methods do not attempt to mimic zero-variance sampling.
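A minimal sketch (ours, not from the slides) of the BFB change of measure at a single state x; the dictionary encoding and names are our assumptions:

```python
def bfb(p_fail, p_repair, rho=0.5):
    """Balanced failure biasing at one state x.
    p_fail:   dict y -> p(x, y) over failure transitions F(x).
    p_repair: dict y -> p(x, y) over repair transitions R(x)."""
    pR = sum(p_repair.values())
    q = {}
    for y in p_fail:                 # failures: uniform over F(x),
        q[y] = 1.0 / len(p_fail) if pR == 0 else rho / len(p_fail)
    for y in p_repair:               # repairs: keep relative proportions
        q[y] = (1 - rho) * p_repair[y] / pR
    return q
```

Failure transitions get total probability ρ under q regardless of how small they are under p; this is what keeps the likelihood ratios under control.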

SLIDES 37–39

Proposed approximation (ZVA)

Approximate γ by some easily computable function v, and plug it into the zero-variance formula.

For any state y ∉ B, let Γ(y) be the set of all paths π = (y = y0 → y1 → · · · → yk) where y1, …, y_{k−1} ∉ B ∪ {0}, yk ∈ B, and having positive probability

  p(π) = ∏_{j=1}^{k} p(y_{j−1}, yj) > 0.

Because these paths represent disjoint events, we have

  γ(y) = Σ_{π∈Γ(y)} p(π).

This last sum may contain a huge (perhaps infinite) number of terms.

SLIDES 40–41

A very crude approximation is to just take the path with the largest probability, i.e., approximate γ(y) = Σ_{π∈Γ(y)} p(π) by its lower bound

  v0(y) = max_{π∈Γ(y)} p(π).

Computing v0(y) amounts to computing a shortest path from y to B, where the length of a link y′ → y′′ is −log p(y′, y′′). Easy.

This would work fine if a single path dominates the sum (which may happen when failure transitions have very small probabilities), but this v0 may underestimate γ(y) significantly. Slight improvement: take the sum of the probabilities of a few disjoint paths.

SLIDES 42–43

Refinements

Typically, the farther we are from B, the more v0 underestimates γ. Close to B, things are fine, but not close to 0. First simple correction:

  • 1. Estimate γ(0) in preliminary runs with a crude IS strategy;
  • 2. Find a constant α ≤ 1 such that (v0(0))^α equals this estimate;
  • 3. Use v1(y) = (v0(y))^α for all y ∉ B as the approximation of γ(y).

This v1 matches γ for y ∈ B and matches its estimate at y = 0.

Second refinement: replace α by a state-dependent exponent

  α(y) = 1 + [α(0) − 1] log v0(y) / log v0(0),

where α(0) = α as above. This α(y) changes progressively from 1 near B to α(0) < 1 in state 0. The correction here is milder than in the previous case when we are close to B. Let v2(y) = (v0(y))^{α(y)} be the resulting approximation.
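The two corrections can be sketched as follows (ours, not from the slides); `alpha0` solves (v0(0))^α = the estimate of γ(0):

```python
import math

def alpha0(v0_at_0, gamma_hat):
    """Constant alpha with (v0(0))**alpha = gamma_hat."""
    return math.log(gamma_hat) / math.log(v0_at_0)

def v1(v0_y, a0):
    """First correction: a single exponent everywhere."""
    return v0_y ** a0

def v2(v0_y, v0_at_0, a0):
    """Second correction: state-dependent exponent
    alpha(y) = 1 + (alpha(0) - 1) * log v0(y) / log v0(0)."""
    a_y = 1.0 + (a0 - 1.0) * math.log(v0_y) / math.log(v0_at_0)
    return v0_y ** a_y
```

Near B (where v0(y) → 1) the exponent α(y) → 1, so v2 barely corrects there; at y = 0 both v1 and v2 reproduce the estimate of γ(0).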

SLIDES 44–45

Example: three types of components

c = 3 component types, with n1 = n2 = n3 = ni components of each type. Exponential repair times with mean 1. Failure rate λi for component type i, with λ1 = ε, λ2 = 1.5ε, and λ3 = 2ε², for some small real number ε. We will try different values of (ni, ε).

B = the set of states where at least one component type has fewer than 2 operational units.

To define v0(y), we consider all three paths to B that result from failures of a single component type, and sum their probabilities.

The tables contain results with n = 2²⁰ runs. The best estimate of γ(0) is obtained from a large number of runs with our best IS strategies.
SLIDE 46

Mean
  ni   ε       γ(0)         v0(0)        BFB            SBLR
  3    0.001   2.6 × 10−3   1.3 × 10−3   2.7 × 10−3     2.6 × 10−3
  6    0.01    1.8 × 10−7   3.4 × 10−8   1.9 × 10−7     [9.9 × 10−7]
  6    0.001   1.7 × 10−11  3.4 × 10−12  1.8 × 10−11    (1.8 × 10−16)
  12   0.1     6.0 × 10−8   3.2 × 10−9   4.8 × 10−8     1.3 × 10−8
  12   0.001   3.9 × 10−28  3.5 × 10−29  (1.8 × 10−40)  (2.9 × 10−45)

Variance
  ni   ε       BFB            SBLR
  3    0.001   1.8 × 10−2     8.0 × 10−3
  6    0.01    6.3 × 10−11    (4.5 × 10−16)
  6    0.001   8.8 × 10−19    (2.0 × 10−26)
  12   0.1     8.1 × 10−10    1.7 × 10−10
  12   0.001   (3.2 × 10−74)  (3.5 × 10−84)

SLIDE 47

Mean
  ni   ε       γ(0)         ZVA(v0)      ZVA(v1)      ZVA(v2)
  3    0.001   2.6 × 10−3   2.6 × 10−3   2.6 × 10−3   2.6 × 10−3
  6    0.01    1.8 × 10−7   1.8 × 10−7   1.8 × 10−7   1.8 × 10−7
  6    0.001   1.7 × 10−11  1.7 × 10−11  1.7 × 10−11  1.7 × 10−11
  12   0.1     6.0 × 10−8   6.0 × 10−8   6.2 × 10−8   6.7 × 10−8
  12   0.001   3.9 × 10−28  3.9 × 10−28  3.9 × 10−28  3.9 × 10−28

Variance
  ni   ε       α      ZVA(v0)      ZVA(v1)      ZVA(v2)      RE(v2)
  3    0.001   0.906  6.5 × 10−4   2.7 × 10−3   9.3 × 10−9   0.04
  6    0.01    0.903  2.0 × 10−14  1.2 × 10−14  7.7 × 10−15  0.48
  6    0.001   0.939  1.2 × 10−23  1.1 × 10−23  7.6 × 10−24  0.16
  12   0.1    0.851  1.6 × 10−10  2.9 × 10−10  1.5 × 10−11  64.50
  12   0.001   0.963  1.4 × 10−55  9.3 × 10−56  9.4 × 10−56  0.78

SLIDES 48–50

• Proposition. In this HRMS framework, with SFB, BRM-k and LE-k are equivalent. They are also equivalent for the g-th empirical moment.

• Proposition. Suppose we have an IS scheme such that p(x, y, ε) = Θ(ε^d) implies q(x, y, ε) = Θ(ε^ℓ) for some ℓ ≤ d. Let E[Y^g(ε)] = Θ(ε^{sg}). Then we have BRM-k of the g-th empirical moment if and only if, for all integers m such that r ≤ m < k·sg and all sample paths (x0, …, xn) leading to B and having probability Θ(ε^m),

  P*{(X0, …, Xτ) = (x0, …, xn)} = Θ(ε^ℓ)

for some ℓ ≤ k(mg − sg)/(kg − 1).

• Proposition. With SFB or BFB, one cannot achieve VRCM-k.
SLIDES 51–52

ZVA

• Proposition. With our ZVA scheme, if we just take v(y) as the probability of the most probable path to failure from y, then v(y) = Θ(γ(y)) for all y and we have BRE.

• Proposition. If v(y)/γ(y) → 1 for each y when ε → 0, then we have VRCM-k. This holds if v(y) is the sum of the probabilities of all the dominant paths from y (those with the smallest power of ε).

SLIDE 53

Probability of a large deviation for a random walk

Let D1, D2, … be i.i.d. random variables, and Sj = D1 + · · · + Dj for j ≥ 0. Let ℓ > E[Dj], n = n(ε) = ⌈1/ε⌉, and γ(ε) = P[Sn/n ≥ ℓ]. We have γ(ε) → 0 when ε → 0.

SLIDES 54–55

State-independent exponential twisting

Here, it is well known that an LE-2 estimator can be obtained via IS with exponential twisting, if Dj has a light-tailed distribution. Idea: multiply the density of Dj by e^{θx} and normalize. The IS estimator is

  Y(θ, ε) = I[Sn ≥ nℓ] · L(θ, Sn).

Let θ*_ℓ be a positive root of Ψ′(θ) = ℓ, where Ψ is the cumulant generating function of Dj, and let I(ℓ) = ℓθ*_ℓ − Ψ(θ*_ℓ), the large-deviation rate function.

Results of Sadowsky (1993) directly imply the following:

• Proposition. For any integer k ≥ 2 and any θ, Y(θ, ε) is not BRM-k. It is LE-k if and only if θ = θ*_ℓ.

• Proposition. The sample variance of m copies of Y(θ, ε), as an estimator of the true variance, is not BRM-k. It is LE-k if and only if θ = θ*_ℓ.
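A runnable sketch (ours, not from the slides) of exponential twisting for N(0, 1) increments, where Ψ(θ) = θ²/2, so θ*_ℓ = ℓ and the twisted increment law is N(ℓ, 1); the parameter values are illustrative:

```python
import math
import random

def twisted_is(n, ell, m, seed=12345):
    """IS estimate of gamma = P[S_n >= n*ell] for N(0,1) increments:
    sample under the twisted measure (increments N(ell, 1)) and weight by
    the likelihood ratio L(theta, S_n) = exp(-theta*S_n + n*Psi(theta))."""
    theta = ell                  # theta*_ell for standard normal increments
    psi = theta * theta / 2.0    # Psi(theta) = theta^2 / 2
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        s = sum(rng.gauss(ell, 1.0) for _ in range(n))
        if s >= n * ell:
            total += math.exp(-theta * s + n * psi)
    return total / m
```

In this Gaussian case the exact value is P[N(0, n) ≥ nℓ], so the estimate can be checked directly.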

SLIDES 56–57

A state-dependent IS scheme can achieve BRM-k

We know that for n → ∞,

  γ(ε) = P[Sn ≥ nℓ] = exp[−nI(ℓ)] / (θ*_ℓ [2πn Ψ″(θ*_ℓ)]^{1/2}) · [1 + o(1)].

When the Markov chain is in state Xj = (j, Sj) = (j, s) = x, we can approximate γ(x) = γ(j, s) = P[Sn − Sj ≥ nℓ − s] by

  v(j, s) = exp[−(n − j) I(ℓj)] / (θ*_{ℓj} [2π(n − j) Ψ″(θ*_{ℓj})]^{1/2}),

where ℓj = (nℓ − s)/(n − j). Then we use the corresponding zero-variance approximation. By verifying the Lyapunov conditions, we can show that this IS estimator has BRM-k for all k.
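For N(0, 1) increments everything in v(j, s) is explicit (θ*_ℓ = ℓ, Ψ″ = 1, I(ℓ) = ℓ²/2). A sketch (ours, not from the slides), checked against the exact normal tail:

```python
import math

def v_approx(j, s, n, ell):
    """Approximate gamma(j, s) = P[S_n - S_j >= n*ell - s] for N(0,1)
    increments, with ell_j = (n*ell - s)/(n - j):
    v = exp[-(n-j) I(ell_j)] / (theta*_{ell_j} [2 pi (n-j) Psi'']^(1/2))."""
    lj = (n * ell - s) / (n - j)
    return math.exp(-(n - j) * lj * lj / 2.0) / (lj * math.sqrt(2.0 * math.pi * (n - j)))
```

The approximation is accurate when the remaining standardized distance √(n−j)·ℓj is large, which is exactly the rare-event regime.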

SLIDE 58

Increments with heavy tails

Random walk with negative drift; we want the probability that the walk exceeds a given positive threshold. One can obtain state-dependent IS having BRM-k for all k.