SLIDE 1

Introduction Establishing drift Using drift Applications References

State-dependent Foster-Lyapunov criteria

Stephen Connor

stephen.connor@york.ac.uk

Joint work with Gersende Fort, CNRS-TELECOM ParisTech; supported by CRiSM and the French National Research Agency

March 2012

SLIDE 2

Outline

1. Introduction: the general problem; drift conditions
2. Establishing a subsampled drift condition: examples
3. Using a subsampled drift condition: interplay of subsampling rates and moments
4. Applications

SLIDE 3

The general problem

Let $\Phi = \{\Phi_n, n \ge 0\}$ be a time-homogeneous Markov chain on a state space $X$. (Assume, for simplicity, that $\Phi$ is $\phi$-irreducible and aperiodic.)

We're interested in what can be said about the long-term behaviour of $\Phi$. For example:

  • does $\Phi$ converge to an equilibrium distribution?
  • if so, in what norm does this convergence take place, and how fast?
  • and how is this related to the average time spent between successive visits to certain sets?

SLIDE 4

Notation

  • $P^n$: the $n$-step transition kernel of $\Phi$;
  • for a non-negative function $f$ and a measure $\mu$: $P^n f(x) = \mathbb{E}_x[f(\Phi_n)]$, $\mu(f) = \int f(y)\,\mu(dy)$;
  • norm $\|\mu\|_g$: $\|\mu\|_g = \sup_{f : |f| \le g} |\mu(f)|$, with $\|\cdot\|_{TV} \equiv \|\cdot\|_1$;
  • first return time to a set $A$: $\tau_A = \inf\{n \ge 1 : \Phi_n \in A\}$;
  • $C$ is a small set if there exist $\varepsilon > 0$ and a measure $\nu$ such that $P(x, \cdot) \ge \varepsilon\,\nu(\cdot)$ for all $x \in C$.


SLIDE 10

Ergodicity

$\Phi$ is called ergodic if it has a finite invariant measure $\pi$ ($\pi = \pi P$). In this case, for any $x$,

  $\|P^n(x, \cdot) - \pi\|_{TV} \to 0$ as $n \to \infty$.   (1)

Equivalently, we can find a small set $C$ with $\sup_{x \in C} \mathbb{E}_x[\tau_C] < \infty$.

But how fast does the convergence in (1) take place?
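To make the convergence in (1) concrete, here is a minimal sketch for a two-state chain. The chain and the values of `p` and `q` are hypothetical choices made for illustration; they do not come from the slides. The norm follows the slides' convention $\|\cdot\|_{TV} \equiv \|\cdot\|_1$, so it is the sum of absolute differences.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def n_step_kernel(p_matrix, n):
    """Return P^n by repeated multiplication (n >= 0)."""
    size = len(p_matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p_matrix)
    return result


def tv_norm(row, pi):
    """Distance in the convention sup over |f| <= 1, i.e. the L1 distance."""
    return sum(abs(r - s) for r, s in zip(row, pi))


# Hypothetical two-state chain: leave state 0 w.p. p, leave state 1 w.p. q.
p, q = 0.3, 0.1
P = [[1 - p, p], [q, 1 - q]]
pi = [q / (p + q), p / (p + q)]  # invariant distribution, pi = pi P

times = list(range(0, 30, 5))
distances = [tv_norm(n_step_kernel(P, n)[0], pi) for n in times]
for n, d in zip(times, distances):
    print(n, d)
```

For this particular chain the distance started from state 0 equals $2\,\frac{p}{p+q}\,|1 - p - q|^n$, so the decay is geometric with $r = 1/|1 - p - q|$.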


SLIDE 12

Definition
$\Phi$ is geometrically ergodic if there exists $r > 1$ with

  $\|P^n(x, \cdot) - \pi(\cdot)\|_{TV} \le M_x r^{-n}$.

Equivalently: there exist a scale function $V : X \to [1, \infty)$, a small set $C$, and constants $\beta \in (0, 1)$, $b < \infty$, with

  $\mathbb{E}_x[V(\Phi_1)] = PV(x) \le \beta V(x) + b\,\mathbf{1}_C(x)$;
  $\sup_{x \in C} \mathbb{E}_x[\beta^{-\tau_C}] < \infty$.


SLIDE 14

Drift conditions

The inequality

  $PV(x) \le \beta V(x) + b\,\mathbf{1}_C(x)$

is called a Foster-Lyapunov drift condition.

  • often the easiest way of showing that $\Phi$ is geometrically ergodic;
  • if $V$ is bounded then $\Phi$ is uniformly ergodic.

Subgeometric ergodicity is implied by a weaker drift condition:

  $PV(x) \le V(x) - \phi \circ V(x) + b\,\mathbf{1}_C(x)$

for some concave non-negative function $\phi$.
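As an illustration of the geometric drift inequality $PV \le \beta V + b\,\mathbf{1}_C$, the sketch below checks it on a grid for a Gaussian AR(1) chain $\Phi_{n+1} = a\Phi_n + Z_{n+1}$ with $V(x) = 1 + x^2$. The chain, the choice of $V$, the value `a = 0.8`, and the helper names are all assumptions made for this example and are not taken from the slides. For this chain $PV$ is available in closed form, and compact intervals are small sets because the Gaussian transition density is bounded below on them.

```python
import math


def V(x):
    """Scale function V(x) = 1 + x^2, which takes values in [1, inf)."""
    return 1.0 + x * x


def PV(x, a):
    """Exact E_x[V(Phi_1)] for Phi_1 = a*x + Z, Z ~ N(0, 1)."""
    return 1.0 + a * a * x * x + 1.0  # E[(a x + Z)^2] = a^2 x^2 + 1


def drift_constants(a):
    """Pick beta in (a^2, 1), a radius r for C = [-r, r], and a constant b."""
    beta = 0.5 * (1.0 + a * a)
    # Outside C we need a^2 x^2 + 2 <= beta (1 + x^2),
    # i.e. x^2 >= (2 - beta) / (beta - a^2).
    r = math.sqrt((2.0 - beta) / (beta - a * a))
    # On C the excess PV - beta V = (a^2 - beta) x^2 + (2 - beta) peaks at x = 0.
    b = 2.0 - beta
    return beta, r, b


a = 0.8
beta, r, b = drift_constants(a)
grid = [i / 10.0 for i in range(-200, 201)]
holds = all(
    PV(x, a) <= beta * V(x) + (b if abs(x) <= r else 0.0) + 1e-12 for x in grid
)
print(beta, r, b, holds)
```

The constants are not unique: any $\beta \in (a^2, 1)$ works here, with a larger set $C$ needed as $\beta$ approaches $a^2$.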


SLIDE 18

Questions:

Drift conditions tend to look at only one step of $\Phi$, though: it is sometimes more convenient to work with a subsampled chain . . .

1. When can we find a function $n : X \to \mathbb{N}$ such that

     $P^{n(x)} V(x) \le \beta V(x) + b\,\mathbf{1}_C(x)$ ?   (2)

2. Alternatively, if (2) holds for some $n$ and $V$, what can be said about moments of the return time to $C$?
3. And when is this useful?!

SLIDE 19

Establishing a subsampled drift condition

Theorem (1)
Assume that there exist a small set $D$, a function $V : X \to [1, \infty)$ and a continuously differentiable increasing concave function $\phi : [1, \infty) \to (0, \infty)$, such that $\sup_D V < \infty$, $\inf_{[1,\infty)} \phi > 0$, and

  $PV \le V - \phi \circ V + b\,\mathbf{1}_D$.

Fix $\beta \in (0, 1)$ and let $n : X \to \mathbb{N}$ satisfy

  $n(x) \sim \frac{1}{\beta} \left( \frac{V}{\phi \circ V} \right)(x)$.

Then for any $\beta < \beta' < 1$,

  $P^{n(x)} W \le \beta' W + b'\,\mathbf{1}_C$, where $W = \phi \circ V$.

SLIDE 20

Comments

  • Theorem (1) says that we can deduce a (state-dependent) subsampled geometric drift condition from a one-step drift, but on a different scale;
  • more general results (not requiring a drift condition for $V$) can be stated.


SLIDE 22

Examples

  $PV \le V - \phi \circ V + b\,\mathbf{1}_D \;\Rightarrow\; P^{n(x)} W \le \beta' W + b'\,\mathbf{1}_C$

Polynomially ergodic: if $\phi(t) \sim c\,t^{1-\alpha}$ for some $\alpha \in (0, 1)$, then $n \sim V^{\alpha}$ and $W = V^{1-\alpha}$.

Motivation: deterministic chain $\Phi_{n+1} = \Phi_n - \Phi_n^{0.4}$ (so $V(x) = x$).

[Figure, two panels. Left: decay of $W(\Phi_n) = \Phi_n^{0.4}$. Right: $n : W(\Phi_n) \le 0.2\,W(\Phi_0)$ (blue line $\propto \Phi_0^{0.6}$).]
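The motivating deterministic chain can be reproduced with a few lines of code. The sketch below is an illustrative reconstruction and not the code behind the original figure: the function name `steps_to_shrink` and the starting values are assumptions. It counts the first $n$ with $W(\Phi_n) \le 0.2\,W(\Phi_0)$ and compares it with $\Phi_0^{0.6}$.

```python
def steps_to_shrink(x0, exponent=0.4, factor=0.2):
    """Smallest n with W(Phi_n) <= factor * W(Phi_0), where W(x) = x**exponent
    and Phi_{k+1} = Phi_k - Phi_k**exponent, started from Phi_0 = x0."""
    target = factor * x0 ** exponent
    x, n = x0, 0
    while x ** exponent > target:
        x -= x ** exponent
        n += 1
    return n


# Starting values are large enough that the chain stays positive until it stops.
starts = [1e3, 1e4, 1e5]
ratios = [steps_to_shrink(x0) / x0 ** 0.6 for x0 in starts]
for x0, ratio in zip(starts, ratios):
    print(x0, ratio)
```

Under these assumptions the printed ratios should be roughly constant in $\Phi_0$, which is consistent with a subsampling time $n \propto \Phi_0^{0.6} = V^{\alpha}$ for $\alpha = 0.6$.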


SLIDE 24

Examples

  $PV \le V - \phi \circ V + b\,\mathbf{1}_D \;\Rightarrow\; P^{n(x)} W \le \beta' W + b'\,\mathbf{1}_C$

  • Polynomially ergodic: if $\phi(t) \sim c\,t^{1-\alpha}$ for some $\alpha \in (0, 1)$, then $n \sim V^{\alpha}$ and $W = V^{1-\alpha}$.
  • Subgeometrically ergodic: if $\phi(t) \sim t[\ln t]^{-\alpha}$ for some $\alpha > 0$, then $n \sim [\ln V(x)]^{\alpha}$ and $W = V[\ln V]^{-\alpha}$.
  • Logarithmically ergodic: if $\phi(t) \sim [1 + \ln t]^{\alpha}$ for some $\alpha > 0$, then $n \sim V / [1 + \ln V]^{\alpha}$ and $W = [1 + \ln V]^{\alpha}$.
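The three examples share one recipe: the subsampling time is of order $V / \phi(V)$ and the new scale is $W = \phi \circ V$. The sketch below evaluates both for each family at a single level of $V$. The helper name `subsampling_scale`, the value of `alpha`, and the level `v` are illustrative assumptions, and the constants $c$ and $1/\beta$ are dropped.

```python
import math


def subsampling_scale(v, phi):
    """Return (V / phi(V), phi(V)): the order of n(x) and the scale W at V = v."""
    return v / phi(v), phi(v)


alpha = 0.3
families = {
    "polynomial": lambda t: t ** (1 - alpha),
    "subgeometric": lambda t: t * math.log(t) ** (-alpha),
    "logarithmic": lambda t: (1 + math.log(t)) ** alpha,
}

v = 1e6
results = {name: subsampling_scale(v, phi) for name, phi in families.items()}
for name, (n_order, w) in results.items():
    print(name, n_order, w)
```

The slower the one-step drift $\phi$, the longer the subsampling time: it grows like a power of $V$ in the polynomial case but only like a power of $\ln V$ in the subgeometric case.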


SLIDE 27

Using a subsampled drift condition

Question 2
Suppose we know that

  $P^{n(x)} V(x) \le \beta V(x) + b\,\mathbf{1}_C(x)$.

What can be said about moments of $\tau_C$?

  • If $n(x) = c$ then $\Phi$ is geometrically ergodic;
  • the alternative drift condition $P^{n(x)} V(x) \le \beta^{n(x)} V(x) + b\,\mathbf{1}_C(x)$ (with no relation assumed between $n$ and $V$) was shown by Meyn & Tweedie (1994) to also imply geometric ergodicity.

SLIDE 28

Theorem (2)
Assume that $P^{n(x)} V(x) \le \beta V(x) + b\,\mathbf{1}_C(x)$. If there exists a strictly increasing function $R : (0, \infty) \to (0, \infty)$ satisfying one of the following conditions

  (i) $t \mapsto R(t)/t$ is non-increasing and $R \circ n \le V$,
  (ii) $R$ is a convex continuously differentiable function such that $R'$ is log-concave and $R^{-1}(V) - R^{-1}(\beta V) \ge n$,

then there exists a constant $M$ such that

  $\mathbb{E}_x[R(\tau_C)] \le M\,(V(x) + b\,\mathbf{1}_C(x))$.


SLIDE 31

  • If $n(x) \le V(x)$ then taking $R(t) = t$ in (i) we obtain $\sup_{x \in C} \mathbb{E}_x[\tau_C] < \infty$;
  • part (i) can be applied when $V = \xi \circ n$ for some increasing concave function $\xi$ (i.e. useful when $n \gg V$);
  • alternatively, if $n = \xi \circ V$ then $R^{-1}(t) \sim \int_1^t \frac{\xi(u)}{u}\,du$ satisfies (ii) (i.e. this is useful when $n/V$ is decreasing).

SLIDE 32

Interplay of subsampling rates and moments

Geometric rates: if $n(x) = 1$, take $R(t) = \kappa^t$, with $1 < \kappa \le \beta^{-1}$. Then

  • $R$ is convex and log-concave;
  • $R^{-1}(V) - R^{-1}(\beta V) = (\ln \beta^{-1})/(\ln \kappa) \ge 1 = n$.

Thus (ii) shows that

  $\mathbb{E}_x[R(\tau_C)] = \mathbb{E}_x[\kappa^{\tau_C}] \le M\,(V(x) + b\,\mathbf{1}_C(x))$,

and so $\mathbb{E}_x[\beta^{-\tau_C}] < \infty$.


SLIDE 34

Interplay of subsampling rates and moments

Polynomial rates: suppose $n(x) \sim V^{\alpha/(1-\alpha)}(x)$, for some $\alpha \in (0, 1)$. Letting $R(t) \sim t^{1/\alpha - 1}$, we see that

  • when $\alpha \le 1/2$ then $R$ satisfies (ii);
  • when $\alpha \ge 1/2$ then $R(t)/t$ is non-increasing, and $R \circ n \sim (V^{\alpha/(1-\alpha)})^{1/\alpha - 1} = V$, and so $R$ satisfies (i).

In either case, we obtain

  $\mathbb{E}_x[\tau_C^{1/\alpha - 1}] \le c\,V(x)$.

Logarithmic and subgeometric rates can be dealt with similarly.
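The two polynomial cases can be checked numerically. The sketch below is an illustration under stated assumptions: all proportionality constants are set to 1, and the values of `beta`, `alpha_i`, `alpha_ii` and the test levels are arbitrary choices, none of them from the slides. It evaluates $R \circ n$ against $V$ for condition (i), and $R^{-1}(V) - R^{-1}(\beta V)$ against $n$ for condition (ii).

```python
def n_of(v, alpha):
    """Subsampling time n ~ V^(alpha/(1-alpha)), with the constant taken to be 1."""
    return v ** (alpha / (1 - alpha))


def R(t, alpha):
    """Rate function R(t) = t^(1/alpha - 1)."""
    return t ** (1 / alpha - 1)


def R_inv(s, alpha):
    """Inverse of R: R^{-1}(s) = s^(alpha/(1-alpha))."""
    return s ** (alpha / (1 - alpha))


beta = 0.5
levels = [10.0, 1e3, 1e5]

# Case alpha >= 1/2: condition (i) needs R(n(x)) <= V(x); here it is an equality.
alpha_i = 0.75
gaps_i = [R(n_of(v, alpha_i), alpha_i) / v for v in levels]

# Case alpha <= 1/2: condition (ii) compares R^{-1}(V) - R^{-1}(beta V) with n.
alpha_ii = 0.25
ratios_ii = [
    (R_inv(v, alpha_ii) - R_inv(beta * v, alpha_ii)) / n_of(v, alpha_ii)
    for v in levels
]
print(gaps_i, ratios_ii)
```

In the second case the ratio is the same constant $1 - \beta^{\alpha/(1-\alpha)}$ at every level, which suggests that (ii) can be met after rescaling the argument of $R$ by a constant; this is the kind of adjustment the $\sim$ in $R(t) \sim t^{1/\alpha - 1}$ allows.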


SLIDE 37

Corollary (to Theorem (2))
If $\pi(V) < \infty$ then there exists a small set $D$ with

  $\sup_{x \in D} \mathbb{E}_x\left[ \sum_{k=0}^{\tau_D} R(k) \right] < \infty$.

Polynomial chains: if $PV \le V - cV^{1-\alpha} + b\,\mathbf{1}_C$ then (Thm (1))

  $P^{n(x)} W(x) \le \beta' W(x) + b'\,\mathbf{1}_D$,

where $W \sim V^{1-\alpha}$ and $n \sim V^{\alpha}$; furthermore, $\pi(W) < \infty$. Using $R(t) \sim t^{1/\alpha - 1}$ in the Corollary we obtain

  $\mathbb{E}_x[\tau_C^{1/\alpha}] < \infty$.
SLIDE 38

Application: tame chains

Class of Markov chains introduced by SBC & Kendall (2007).

Definition
$\Phi$ is tame if the following two conditions hold:

  (i) there exist $\delta \in (0, 1)$ and a deterministic function $n$ satisfying $n(x) \le W^{\delta}(x)$ such that $\mathbb{E}_x[W(\Phi_{n(x)})] \le \beta W(x) + b\,\mathbf{1}_C(x)$;
  (ii) the constant $\delta$ satisfies $\ln \beta < \delta^{-1} \ln(1 - \delta)$.

i.e. $\Phi$ satisfies a subsampled geometric drift condition, where the subsampling time $n$ is not too large.
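Condition (ii) of the definition is a simple inequality between the two constants, so it is easy to test for given values. The helper names below are hypothetical and the numerical inputs are arbitrary illustrations; only the inequality itself comes from the definition.

```python
import math


def satisfies_tame_constants(beta, delta):
    """Check condition (ii) of the definition: ln(beta) < ln(1 - delta) / delta."""
    if not (0.0 < beta < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("beta and delta must both lie in (0, 1)")
    return math.log(beta) < math.log(1.0 - delta) / delta


def largest_admissible_beta(delta):
    """Supremum of beta allowed by (ii) for a given delta: (1 - delta)^(1/delta)."""
    return (1.0 - delta) ** (1.0 / delta)


print(satisfies_tame_constants(0.2, 0.5), satisfies_tame_constants(0.3, 0.5))
print(largest_admissible_beta(0.5))
```

Rearranged, (ii) says $\beta < (1 - \delta)^{1/\delta}$: the longer the permitted subsampling time (larger $\delta$), the stronger the contraction $\beta$ has to be.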

SLIDE 39

Theorem (SBC & Kendall, 2007)
If $\Phi$ is tame then there exists a perfect simulation algorithm for $\Phi$ (using Dominated Coupling from the Past).

Idea of proof:

  • there exists a simple dominating process for any geometrically ergodic chain (Kendall, 2004);
  • delay this (using $n$) to produce a dominating process for $\Phi$.


SLIDE 42

When is a chain tame?

All geometrically ergodic chains are tame.

Proposition
If $PV \le V - cV^{1-\alpha} + b\,\mathbf{1}_C$, with $\alpha \in (0, 1/2)$, then $\Phi$ is tame.

Proof easy, using Theorem (1).

  • It follows that chains with subgeometric drift ($\phi(t) \sim t[\ln t]^{-\alpha}$) are tame.
  • Logarithmically ergodic chains ($\phi(t) \sim [1 + \ln t]^{\alpha}$) are not covered by this result.
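One way to see where the restriction $\alpha < 1/2$ comes from: in the polynomial case Theorem (1) gives $n \sim V^{\alpha}$ on the scale $W \sim V^{1-\alpha}$, so $n \sim W^{\alpha/(1-\alpha)}$, and tameness needs this exponent to be below 1. The sketch below only tabulates that exponent; the function name is hypothetical, constants are ignored, and it is not a substitute for the proof.

```python
def tame_exponent(alpha):
    """Exponent delta with n ~ W^delta when n ~ V^alpha and W ~ V^(1 - alpha)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    return alpha / (1.0 - alpha)


for alpha in (0.1, 0.25, 0.4, 0.5, 0.75):
    delta = tame_exponent(alpha)
    # The final column reports whether delta lies in (0, 1), as tameness requires.
    print(alpha, delta, delta < 1.0)
```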


SLIDE 44

Dominating process

We can also use the above theory to determine ergodic properties of the dominating process $D$ for $\Phi$ in the perfect simulation algorithm.

  • $D$ does not satisfy a simple one-step drift condition;
  • but establishing state-dependent drift is simple!
  • Theorem (2) provides information about ergodic properties of $D$.

E.g. if $n(x) \sim W(x)^{\gamma}$ (with $\gamma \le \delta$), then $D$ is ergodic and converges to $\pi_D$ polynomially fast (in total variation). But note that this isn't enough to guarantee that the mean run-time of the domCFTP algorithm is finite . . .


SLIDE 46

Other applications/extensions

Sufficient conditions for ergodicity of strong Markov processes

  • applications in queueing and network stability
  • continuum range of rates of convergence
  • explicit norm of convergence

Yüksel & Meyn (2012) use random-time, state-dependent drift criteria to prove stability results, but not convergence rates . . .

SLIDE 47

References

Connor, S. B. and G. Fort (2009). State-dependent Foster-Lyapunov criteria for subgeometric convergence of Markov chains. Stochastic Processes and their Applications 119(12), 4176–4193.

Connor, S. B. and W. S. Kendall (2007). Perfect Simulation for a Class of Positive Recurrent Markov Chains. Ann. Appl. Probab. 17, 781–808.

Kendall, W. S. (2004). Geometric Ergodicity and Perfect Simulation. Electron. Comm. Probab. 9, 140–151.

Meyn, S. P. and R. L. Tweedie (1994). State-dependent criteria for convergence of Markov chains. Ann. Appl. Probab. 4, 149–168.

Yüksel, S. and S. Meyn (2012). Random-Time, State-Dependent Stochastic Drift for Markov Chains and Application to Stochastic Stabilization Over Erasure Channels. Preprint.