

SLIDE 1

Stochastic Processes

MATH5835, P. Del Moral, UNSW, School of Mathematics & Statistics. Lecture Notes 3. Consultations (RC 5112): Wednesday 3.30–4.30 pm & Thursday 3.30–4.30 pm

1/24


SLIDE 4

Citations of the day

The art of doing mathematics consists in finding that special case which contains all the germs of generality. – David Hilbert (1862-1943)

  • David Hilbert - CSIRO Atherton, QLD

3/24

SLIDE 5

Google PageRank algorithm

Stanford University patent [Larry Page ⊕ Sergey Brin] 1996

◮ Counts the number and quality of page links → importance index.
◮ Hyp.: important sites receive more links from others.

4/24

SLIDE 7

Google PageRank - Some information

Using the web-spider bot Googlebot:

◮ d ≃ 25 × 10^9 web pages (March 2014).
◮ d_i outgoing links from each website i ∈ {1, . . . , d}.
◮ How to use this data?
◮ Ranking stochastic model?

5/24

SLIDE 9

Google PageRank - Stochastic model 1/4

A stochastic (sparse) matrix on {1, . . . , d}:

$$P(i,j) = \begin{cases} 1/d_i & \text{if } j \text{ is one of the } d_i \text{ outgoing links of } i \\ 1/d & \text{if } d_i = 0 \text{ (a.k.a. a dangling node)} \end{cases}$$

Markov chain model?

6/24

SLIDE 14

Google PageRank - Stochastic model 2/4

More regular Markov transitions:

$$M(i,j) = \epsilon\, P(i,j) + (1-\epsilon)\, \mu(j)$$

with

◮ Damping factor ε ∈ (0, 1) (restart rate).
◮ μ(j) = 1/d, uniform on {1, . . . , d}.

WHY? → M(i, j) ≥ (1 − ε) μ(j)

Consequences for 2 independent surfers (X_n, X'_n) (arbitrary starting sites), with p_n(i) := P(X_n = i) and p'_n(i) := P(X'_n = i):

$$p_n(i) - p'_n(i) = \;??? \qquad P(X_n = X'_n) = \;???$$

⊕ Lecture slides 2!

7/24
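The two-surfer question above can be explored numerically. Below is a minimal sketch (not from the lecture notes): it builds the damped transition M on a small hypothetical 5-page web, runs two independent surfers from different starting sites, and estimates P(X_n = X'_n) by Monte Carlo. The link matrix and all numerical choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical link structure on d = 5 pages (illustration only).
d = 5
P = np.array([
    [0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 0, 0],
], dtype=float)
P /= P.sum(axis=1, keepdims=True)   # row-stochastic: P(i, j) = 1/d_i

eps = 0.85                          # damping factor
mu = np.full(d, 1.0 / d)            # uniform restart distribution
M = eps * P + (1 - eps) * mu        # M(i, j) = eps P(i, j) + (1 - eps) mu(j)

def run_surfer(start, n):
    """Simulate n steps of the Markov chain M from the given start site."""
    x = start
    for _ in range(n):
        x = rng.choice(d, p=M[x])
    return x

# Two independent surfers started at different sites; estimate the
# probability that they sit on the same page after n steps.
n, trials = 20, 5000
meet = sum(run_surfer(0, n) == run_surfer(3, n) for _ in range(trials)) / trials
print(f"estimated P(X_n = X'_n) ~ {meet:.3f}")
```

Because M(i, j) ≥ (1 − ε)μ(j), both marginals forget their starting sites, so the meeting probability stabilises regardless of the two start pages.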

SLIDE 15

Google PageRank - Stochastic model 3/4

Surfer X_n starting at X_0 = i:

$$p_0(j) = P(X_0 = j) = 1_i(j) \iff p_0 := \big(0, \ldots, 0, \underbrace{1}_{i\text{-th}}, 0, \ldots, 0\big)$$

⇓ [Forgetting the initial condition]

$$P(X_n = j) = P(X_n = j \mid X_0 = i) = (p_0 M^n)(j) = M^n(i,j) \;\longrightarrow_{n\uparrow\infty}\; p_\infty(j)$$

8/24

SLIDE 16

Google PageRank - Stochastic model 4/4

More general situations (i.e. for any p_0):

$$p_n = p_0 M^n \;\Longrightarrow\; p_n(j) = \sum_k p_0(k)\, M^n(k,j) \;\longrightarrow_{n\uparrow\infty}\; p_\infty(j)$$

⇓ Fixed point equation = invariant/stationary measure

$$p_n \;\longrightarrow_{n\uparrow\infty}\; p_\infty = p_\infty M$$

Wolfram - Mathworld

9/24
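The fixed-point equation p_∞ = p_∞M is exactly what the power iteration computes: repeatedly multiplying p_0 by M. A minimal sketch (the 5-page web is again a made-up example, not Google's data):

```python
import numpy as np

# Hypothetical 5-page web; M built exactly as on the previous slides.
d = 5
P = np.array([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
], dtype=float)
P /= P.sum(axis=1, keepdims=True)
eps = 0.85
M = eps * P + (1 - eps) / d

# Power iteration: p_n = p_0 M^n converges to the fixed point p_inf = p_inf M.
p = np.zeros(d)
p[0] = 1.0                          # surfer starts at site 0
for _ in range(100):
    p = p @ M

# p is now numerically invariant, and sorting it gives the PageRank order.
assert np.allclose(p @ M, p, atol=1e-10)
ranking = np.argsort(-p)            # highest stationary mass first
print("p_inf ~", np.round(p, 4), "ranking:", ranking)
```

Row-vector multiplication `p @ M` matches the left-eigenvector convention p_n = p_0 M^n used on the slide.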

SLIDE 18

Google PageRank - Ranking

◮ Rate of convergence to equilibrium (admitted):

$$\|p_n - p_\infty\|_{tv} := \frac{1}{2} \sum_{i=1}^{d} |p_n(i) - p_\infty(i)| \;\le\; \ldots ??$$

◮ How to rank sites using the surfer exploration?

Lecture notes ⊕ next slide

10/24

SLIDE 19

Google PageRank, ε = .85

[Figure: ‖p_n − p_∞‖_tv against the number of iterations (x-axis 10–50, y-axis 0.2–0.8).]

11/24
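A curve like the one in the figure can be reproduced with a few lines. This sketch uses a randomly generated link matrix (an assumption, since the real web graph is not in the slides) and measures the total variation distance ‖p_n − p_∞‖_tv along the iteration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random hypothetical link matrix on d pages, damped with eps = 0.85.
d, eps = 20, 0.85
A = (rng.random((d, d)) < 0.2).astype(float)
A[A.sum(axis=1) == 0, :] = 1.0      # dangling nodes link uniformly
P = A / A.sum(axis=1, keepdims=True)
M = eps * P + (1 - eps) / d

# Reference p_inf from a long run, then the TV distance along the way.
p_inf = np.full(d, 1.0 / d)
for _ in range(500):
    p_inf = p_inf @ M

p = np.zeros(d); p[0] = 1.0
tv = []
for _ in range(50):
    p = p @ M
    tv.append(0.5 * np.abs(p - p_inf).sum())   # ||p_n - p_inf||_tv

# The minorisation M(i, j) >= (1 - eps) mu(j) forces geometric decay,
# consistent with the rapidly falling curve in the figure.
print([round(t, 4) for t in tv[:10]])
```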

SLIDE 20

From Monte Carlo to Los Alamos

An introduction to simulation

◮ 3 simple ways to sample elementary random variables.
◮ The Metropolis-Hastings model (≃ 1960 [Metropolis-Rosenbluth (×2)-Teller (×2), cf. lecture notes]): one of the top-10 algorithms of the 20th century.
◮ In the 21st century . . .

12/24

SLIDE 23

The inverse method

[Figure: density p(x) and CDF F(x); uniforms U_1, . . . , U_n, . . . mapped through F^{-1} to samples X_1, . . . , X_n, . . .]

Formula:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} P(X \in dy) \;\Rightarrow\; X \stackrel{\mathrm{def}}{=} F^{-1}(U)$$

Examples: Exp(λ), discrete, binomial, multinomial, . . .

Wolfram - Mathworld ⊕ Section 4.1 pp. 51-53

13/24
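A minimal sketch of the inverse method for two of the listed examples. For Exp(λ), F(x) = 1 − e^{−λx} inverts in closed form; for a discrete law the inverse CDF is a search through the cumulative probabilities. The value λ = 2 and the probability vector are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

# Exp(lam): F(x) = 1 - exp(-lam x), so F^{-1}(u) = -log(1 - u) / lam.
lam = 2.0
U = rng.random(100_000)             # U_1, ..., U_n ~ Unif[0, 1]
X = -np.log(1.0 - U) / lam          # X = F^{-1}(U) ~ Exp(lam)

print(X.mean())                     # should be close to 1/lam = 0.5

# Discrete case: invert the step CDF by locating the first jump above U.
p = np.array([0.2, 0.5, 0.3])       # hypothetical probabilities on {0, 1, 2}
Y = np.searchsorted(np.cumsum(p), rng.random(100_000), side="right")
print(np.bincount(Y) / Y.size)      # empirical frequencies ~ p
```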

SLIDE 27

The change of variables

Some formulae (U_i independent Unif[0, 1]):

◮ Uniform on [a_1, b_1] × [a_2, b_2]:

$$(X_1, X_2) = \big(a_1 + (b_1 - a_1)\, U_1,\; a_2 + (b_2 - a_2)\, U_2\big) \;??$$

◮ and:

$$Y_1 := \sqrt{-2\log(U_1)}\,\cos(2\pi U_2), \qquad Y_2 := \sqrt{-2\log(U_1)}\,\sin(2\pi U_2) \;??$$

◮ Uniform on the unit circle ??

Lecture notes section 4.2 pp. 54-55

14/24
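The second formula is the Box-Muller transform: two independent uniforms yield two independent standard normals. A minimal sketch with basic sanity checks (sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Box-Muller: Y1 = sqrt(-2 log U1) cos(2 pi U2), Y2 = sqrt(-2 log U1) sin(2 pi U2).
n = 200_000
U1 = 1.0 - rng.random(n)            # in (0, 1], keeps log(U1) finite
U2 = rng.random(n)
R = np.sqrt(-2.0 * np.log(U1))      # radius
Y1 = R * np.cos(2.0 * np.pi * U2)
Y2 = R * np.sin(2.0 * np.pi * U2)

# Sanity checks: standard normal moments and (near-)zero correlation.
print(Y1.mean(), Y1.std(), np.corrcoef(Y1, Y2)[0, 1])
```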

SLIDE 31

Rejection technique

Some formulae (U_i independent Unif[0, 1]):

◮ Uniform on [a_1, b_1] × [a_2, b_2]:

$$(X_1, X_2) = \big(a_1 + (b_1 - a_1)\, U_1,\; a_2 + (b_2 - a_2)\, U_2\big) \;??$$

◮ and:

$$Y_1 := \sqrt{-2\log(U_1)}\,\cos(2\pi U_2), \qquad Y_2 := \sqrt{-2\log(U_1)}\,\sin(2\pi U_2) \;??$$

◮ Uniform on the unit circle ??

Wolfram - Mathworld ⊕ Section 4.2 pp. 54-55

15/24

SLIDE 35

Boltzmann-Gibbs measures

$$\pi(dx) := \frac{1}{Z_\beta}\, e^{-\beta V(x)}\, \lambda(dx)$$

Some examples (see also section 6.4):

◮ Ising/Sherrington-Kirkpatrick model: x ∈ {−1, +1}^{\{1,...,L\}^2} with λ(x) = 2^{−L²} and

$$V(x) = h \sum_{i \in E} x(i) \;-\; J \sum_{i \sim j} \theta_{i,j}\, x(i)\, x(j)$$

◮ Traveling Salesman with m cities e_i: x ∈ G_m with λ(x) = 1/m! and

$$V(x) = \sum_{p=1}^{m} d\big(e_{x(p)}, e_{x(p+1)}\big)$$

◮ Black Box problems: Inputs = X → Numerical codes F → Outputs = Y = F(X), and

$$e^{-\beta V(x)} \simeq 1_{F^{-1}(A)}(x) \;\Rightarrow\; \pi = \mathrm{Law}(X \mid X \in A)$$

16/24
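On a grid small enough to enumerate, the Boltzmann-Gibbs measure can be computed exactly, which makes the definition concrete. A minimal sketch for the Ising example on a 2×2 grid, with the values of h, J, β (and θ_{i,j} = 1) chosen purely for illustration:

```python
import itertools
import numpy as np

# Exact Boltzmann-Gibbs measure pi(x) = exp(-beta V(x)) lambda(x) / Z_beta
# on a 2x2 Ising grid; h, J, beta and theta_{i,j} = 1 are illustrative.
L, h, J, beta = 2, 0.1, 1.0, 0.5
sites = [(i, j) for i in range(L) for j in range(L)]
edges = [(s, t) for s in sites for t in sites
         if s < t and abs(s[0] - t[0]) + abs(s[1] - t[1]) == 1]

def V(x):
    # V(x) = h * sum_i x(i) - J * sum_{i~j} x(i) x(j)
    return h * sum(x[s] for s in sites) - J * sum(x[s] * x[t] for s, t in edges)

lam = 2.0 ** (-L * L)               # uniform reference measure lambda(x) = 2^{-L^2}
states = [dict(zip(sites, vals))
          for vals in itertools.product([-1, +1], repeat=L * L)]
weights = np.array([np.exp(-beta * V(x)) * lam for x in states])
Z = weights.sum()                   # normalising constant Z_beta
pi = weights / Z

# pi is a probability measure; the lowest-energy (aligned) state gets max mass.
print(Z, pi.max())
```

For larger L the sum over 2^{L²} states is intractable, which is precisely why the Metropolis-Hastings sampler of the next slides is needed.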

SLIDE 36

The Metropolis-Hastings sampler

Markov chain X_{n−1} ⇝ X_n with 2 steps:

◮ Propose a transition X_{n−1} = x ⇝ y with some probability density P(x, dy) ∼ π(dy).

◮ Accept X_n = y, or reject and keep X_n = x, with acceptance probability

$$a(x,y) = \min\left(1,\; \frac{\pi(dy)\, P(y, dx)}{\pi(dx)\, P(x, dy)}\right)$$

⇒ πM = π

17/24

SLIDE 38

The Metropolis-Hastings sampler

The Markov transition:

$$M(x, dy) := P(x, dy)\, a(x, y) \;+\; \left(1 - \int P(x, dz)\, a(x, z)\right) \delta_x(dy)$$

⇓ Master equation ⇔ π-reversible property of M:

$$\pi(dx)\, M(x, dy) = \pi(dy)\, M(y, dx) \;\Rightarrow\; \pi M = \pi$$

Wolfram - Mathworld

18/24
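The two-step recipe above can be sketched in a few lines. This toy version targets π = N(0, 1) with a symmetric random-walk proposal, so the acceptance ratio π(dy)P(y, dx)/(π(dx)P(x, dy)) collapses to π(y)/π(x); the step size, chain length and starting point are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(5)

# Random-walk Metropolis-Hastings targeting pi = N(0, 1).
def log_pi(x):
    return -0.5 * x * x             # log-density up to a constant

sigma, n = 1.0, 50_000
chain = np.empty(n)
x = 3.0                             # deliberately bad starting point
for i in range(n):
    y = x + sigma * rng.standard_normal()        # propose X_{n-1} = x ~> y
    if np.log(rng.random()) < log_pi(y) - log_pi(x):
        x = y                                    # accept: X_n = y
    # else reject: X_n = x (the chain stays put)
    chain[i] = x

burned = chain[5_000:]              # discard burn-in
print(burned.mean(), burned.std())  # should approach 0 and 1
```

Note that rejected proposals still advance the clock: the "stay put" mass is exactly the δ_x(dy) term in the transition M above.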

SLIDE 39

Reversible Proposals

Reversible proposals: π(dx) P(x, dy) = π(dy) P(y, dx)

⇓ Maximal acceptance rate:

$$a(x,y) = \min\left(1,\; \frac{\pi(dy)\, P(y, dx)}{\pi(dx)\, P(x, dy)}\right) = 1$$

19/24

SLIDE 41

Ex.- Gibbs Sampler on x = (x_1, x_2) ∈ S = (S_1 × S_2)

$$\pi(d(x_1, x_2)) = \pi_1(dx_1)\, L_{1,2}(x_1, dx_2) = \pi_2(dx_2)\, L_{2,1}(x_2, dx_1)$$

i.e. (X_1, X_2) ∼ π ⇒ π_1 = Law(X_1) and L_{1,2}(x_1, dx_2) = P(X_2 ∈ dx_2 | X_1 = x_1).

Example (uniform on the unit disk):

$$p(x_1, x_2) = \frac{1}{\pi}\, 1_{x_1^2 + x_2^2 \le 1} = \frac{1}{\pi}\, 1_{-1 \le x_1 \le 1} \times 1_{|x_2| \le \sqrt{1 - x_1^2}}$$

20/24

SLIDE 42

Gibbs sampling types of proposals

$$P = K_1 K_2, \qquad P = K_2 K_1, \qquad P = \tfrac{1}{2} K_1 + \tfrac{1}{2} K_2$$

with the "matrix-like" compositions:

$$(K_1 K_2)(x_1, dx_3) := \int_{x_2} K_1(x_1, dx_2)\, K_2(x_2, dx_3)$$

⊕ the conditional transitions with a fixed coordinate:

$$K_1((x_1, x_2), d(y_1, y_2)) := \delta_{x_1}(dy_1)\, L_{1,2}(y_1, dy_2)$$
$$K_2((x_1, x_2), d(y_1, y_2)) := \delta_{x_2}(dy_2)\, L_{2,1}(y_2, dy_1)$$

$$(x_1, x_2) \xrightarrow{\;K_2\;} (y_1, x_2) \xrightarrow{\;K_1\;} (y_1, y_2) \qquad \text{(overall: } K_2 K_1\text{)}$$

The unit disk example!!

21/24
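The unit disk example can be run directly: for the uniform law on the disk, each conditional L_{1,2}, L_{2,1} is uniform on an interval, so the K_2K_1 composition is two interval draws per step. A minimal sketch (chain length and start point are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)

# Systematic-scan Gibbs sampler (the composition K2 K1) for the uniform
# distribution on the unit disk: each conditional is uniform on an interval.
n = 50_000
x1, x2 = 0.0, 0.0
samples = np.empty((n, 2))
for i in range(n):
    # K2: freeze x2, draw x1 ~ L_{2,1}(x2, .) = Unif[-sqrt(1-x2^2), sqrt(1-x2^2)]
    r = np.sqrt(1.0 - x2 * x2)
    x1 = rng.uniform(-r, r)
    # K1: freeze x1, draw x2 ~ L_{1,2}(x1, .) = Unif[-sqrt(1-x1^2), sqrt(1-x1^2)]
    r = np.sqrt(1.0 - x1 * x1)
    x2 = rng.uniform(-r, r)
    samples[i] = (x1, x2)

# Every state stays in the disk; for the uniform disk law E[X1^2] = 1/4.
print((samples ** 2).sum(axis=1).max(), (samples[:, 0] ** 2).mean())
```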

SLIDE 45

Reversibility check - Back to K_i!

Proposition: The "frozen first coordinate" transition

$$K_1((y_1, y_2), d(x_1, x_2)) := \delta_{y_1}(dx_1)\, L_{1,2}(x_1, dx_2)$$

is π-reversible.

Proof/Exercise:

$$\pi(d(y_1, y_2)) \times K_1((y_1, y_2), d(x_1, x_2)) = \pi_1(dy_1)\, L_{1,2}(y_1, dy_2) \times \delta_{y_1}(dx_1)\, L_{1,2}(x_1, dx_2)$$
$$= \underbrace{\pi_1(dy_1)\, \delta_{y_1}(dx_1)}_{=\,\pi_1(dx_1)\, \delta_{x_1}(dy_1)} \times \underbrace{\big(L_{1,2}(y_1, dy_2)\, L_{1,2}(x_1, dx_2)\big)}_{(x,y)\text{-symmetry}}$$

⇓ (with x = (x_1, x_2) & y = (y_1, y_2))

Reversibility property: π(dy) × K_1(y, dx) = π(dx) × K_1(x, dy)

22/24

SLIDE 47

Exercise 1

Exercise/Proposition: If M_1 and M_2 are two π-reversible Markov transitions on S, i.e.

$$\forall i = 1, 2 \qquad \pi(dx)\, M_i(x, dy) = \pi(dy)\, M_i(y, dx),$$

then

$$\pi(dx_1)\, M_1(x_1, dx_2)\, M_2(x_2, dx_3) = \pi(dx_3)\, M_2(x_3, dx_2)\, M_1(x_2, dx_1).$$

Proof:

$$\pi(dx_1)\, M_1(x_1, dx_2)\, M_2(x_2, dx_3) = \pi(dx_2)\, M_1(x_2, dx_1)\, M_2(x_2, dx_3)$$
$$= M_1(x_2, dx_1)\, \pi(dx_2)\, M_2(x_2, dx_3) = M_1(x_2, dx_1)\, \pi(dx_3)\, M_2(x_3, dx_2)$$
$$= \pi(dx_3)\, M_2(x_3, dx_2)\, M_1(x_2, dx_1).$$

23/24

SLIDE 50

Exercise 2

Exercise/Proposition: The transition X = x ⇝ Y ∈ dy given by

$$Y = \sqrt{1-\epsilon}\, X + \sqrt{\epsilon}\, W \quad \text{with} \quad W \sim N(0, 1)$$

is N(0, 1)-reversible for any ε ∈ [0, 1].

Proof:

$$M(x, dy) \propto \exp\left(-\frac{1}{2\epsilon}\big(y - \sqrt{1-\epsilon}\, x\big)^2\right) dy$$

and

$$x^2 + \frac{1}{\epsilon}\big(y - \sqrt{1-\epsilon}\, x\big)^2 = x^2 + \frac{1}{\epsilon}\big(y^2 - 2\sqrt{1-\epsilon}\, yx + (1-\epsilon)\, x^2\big) = \frac{1}{\epsilon}\big(y^2 - 2\sqrt{1-\epsilon}\, yx + x^2\big)$$

which is (x, y)-symmetric.

Consequences?

24/24
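The consequences are easy to check empirically: if X ∼ N(0, 1) then Y has the same law (invariance), and the joint Gaussian law of (X, Y) is symmetric (reversibility), with correlation √(1 − ε). A minimal sketch with an arbitrary ε:

```python
import numpy as np

rng = np.random.default_rng(7)

# Empirical check of Exercise 2: Y = sqrt(1-eps) X + sqrt(eps) W, W ~ N(0,1).
eps = 0.3
X = rng.standard_normal(200_000)                 # X ~ N(0, 1)
Y = np.sqrt(1.0 - eps) * X + np.sqrt(eps) * rng.standard_normal(X.size)

print(Y.std())                      # ~1: N(0, 1) is invariant
print(np.corrcoef(X, Y)[0, 1])      # ~sqrt(1 - eps): symmetric joint law
```

This autoregressive move is itself a valid π-reversible proposal for N(0, 1), tying Exercise 2 back to the reversible-proposal slide (acceptance rate 1).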