
Mixing Time

CS 70, Summer 2019 Bonus Lecture, 8/9/19

1 / 17

Disclaimer:

Much handwaving today!!
Goal is to get a high-level intuition / picture for the concept of mixing time and its applications.
Emphasis is on heuristics rather than rigor.

2 / 17

What We Know...

Every irreducible, aperiodic Markov chain has a unique stationary distribution.

Table of πm(1), πm(2), πm(3) for increasing m, where πm = π0 P^m and P is the transition matrix of the 3-state chain (surviving entries: 0.8, 0.2, 0.3, 0.7, 0.6, 0.4).

Q: How long does it take to get close to the stationary distribution?
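To get a feel for the question, here is a minimal sketch that repeatedly applies a transition matrix to a starting distribution. The matrix `P`, the start `pi0`, and the helper names `step` and `evolve` are illustrative assumptions, not taken from the slide.

```python
def step(dist, P):
    """One step of the chain: the row vector dist times the matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def evolve(dist, P, m):
    """Distribution after m steps, i.e. dist * P^m."""
    for _ in range(m):
        dist = step(dist, P)
    return dist


# Hypothetical 3-state chain, chosen for illustration (not the slide's matrix).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi0 = [1.0, 0.0, 0.0]  # start in the first state with probability 1

for m in (0, 1, 2, 5, 10, 20):
    print(m, [round(x, 4) for x in evolve(pi0, P, m)])
```

For this particular chain the printed rows approach (0.25, 0.5, 0.25), which is its stationary distribution.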

3 / 17

lim_{m→∞} πm = π

Total Variation Distance

Just one of many ways to measure how close two distributions are. Let P1, P2 be two PMFs. Their TV distance is:

4 / 17

Over states {1, 2, . . . , n}:

d_TV(P1, P2) = (1/2) Σ_{states i} |P1(i) − P2(i)|
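A minimal sketch of the TV distance between two PMFs given as lists over the same states. It assumes the usual convention with the factor 1/2 (so the distance lies in [0, 1]); the function name `tv_distance` is hypothetical.

```python
def tv_distance(p1, p2):
    """Total variation distance between two PMFs on the same finite state space.

    Assumes the convention d_TV = (1/2) * sum_i |p1(i) - p2(i)|.
    """
    if len(p1) != len(p2):
        raise ValueError("PMFs must be defined over the same states")
    return 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))


print(tv_distance([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(tv_distance([1.0, 0.0], [0.0, 1.0]))  # disjoint supports
print(tv_distance([0.5, 0.5], [1.0, 0.0]))
```

Identical distributions are at distance 0 and distributions with disjoint supports are at distance 1, the two extremes under this convention.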

Mixing Time: Definition

I have an irreducible, aperiodic Markov chain.
Notation: µ(n) is the distribution at time n, and π is its (unique) stationary distribution.
I want to keep running my chain until:
The mixing time tmix(ε) is the first time this happens.
(Omitted fact: The TV distance between µ(n) and π decreases as n increases.)

5 / 17

d_TV(µ(n), π) ≤ ε, where ε is a small positive number, µ(n) is the distribution at time n, and π is the stationary distribution.
Take the worst-case starting distribution.
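For a small chain, the definition can be checked by brute force: track the distribution from every possible starting state and stop at the first time the worst TV distance to π drops to ε or below. This is only a sketch; the chain `P`, its stationary distribution `pi`, and the helper names are assumptions for illustration. It relies on the fact that the worst-case starting distribution is always a single state, because TV distance to π is convex in the starting distribution.

```python
def step(dist, P):
    """One step of the chain: the row vector dist times the matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def tv_distance(p1, p2):
    """TV distance with the 1/2 convention."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))


def mixing_time(P, pi, eps, max_steps=10_000):
    """First time t at which every starting state is within eps of pi in TV."""
    n = len(P)
    # One point-mass distribution per possible starting state.
    dists = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for t in range(max_steps + 1):
        if max(tv_distance(d, pi) for d in dists) <= eps:
            return t
        dists = [step(d, P) for d in dists]
    raise RuntimeError("chain did not mix within max_steps")


# Hypothetical 3-state chain with stationary distribution (0.25, 0.5, 0.25).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = [0.25, 0.5, 0.25]

for eps in (0.2, 0.1, 0.01):
    print(eps, mixing_time(P, pi, eps))
```

Shrinking ε can only increase the reported time, which matches the intuition that a stricter closeness requirement takes longer to meet.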

Complete Graph With Loops

Mixing time analysis is sometimes direct! Take a random walk on a complete graph with loops.

What is the stationary distribution?

What does the transition matrix look like? Mixing time? Dependence on ε?

6 / 17

Kn: n vertices. By symmetry, π = [1/n, . . . , 1/n].
Every row of P is [1/n, . . . , 1/n], so µ(0) P = π for any starting distribution µ(0).
tmix(ε) = 1 for all ε.
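The one-step claim for the complete graph with loops can be checked exactly with rational arithmetic. In this sketch the helper names and the choice n = 5 are assumptions for illustration.

```python
from fractions import Fraction


def complete_graph_with_loops(n):
    """Transition matrix of the walk on K_n with a self-loop at every vertex."""
    return [[Fraction(1, n)] * n for _ in range(n)]


def step(dist, P):
    """One step of the chain: the row vector dist times the matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


n = 5
P = complete_graph_with_loops(n)
uniform = [Fraction(1, n)] * n

# From every possible starting vertex, a single step lands exactly on uniform.
for start in range(n):
    point_mass = [Fraction(int(i == start)) for i in range(n)]
    print(start, step(point_mass, P) == uniform)
```

Because the distribution after one step is exactly π, the TV distance is 0 from time 1 onward, independent of ε.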

Random-To-Top Shuffling I

S: All orderings of n cards in a deck.
Transitions: Choose a card uniformly at random from the deck. Move it to the top.
A different strategy, called coupling:

7 / 17

n! configurations. π here is uniform over all states (assumed).
Deck 1: start at an arbitrary state.
Deck 2: start at a state chosen u.a.r. If we keep applying P, its distribution stays at π.

Random-To-Top Shuffling II

At each time, each card is labeled coupled or not. Initially, all cards start uncoupled.
Pick a random card C. In both decks, take card C and move it to the top. If C isn’t already coupled, mark it as coupled.
What happens when we look at each deck individually?

8 / 17

Looks like a random-to-top move!
Once a card is labeled "coupled", it has the same position in both decks!

Random-To-Top Shuffling III

Time until all cards get coupled = time until Deck 1 is fully random.
For all ε: tmix(ε) =

9 / 17

Time to 1st couple: T1. Time from then to 2nd couple: T2. Then to 3rd couple: T3, and so on.
T1 is constant (T1 = 1); T2 ~ Geom((n − 1)/n), . . . , Tn ~ Geom(1/n). This is coupon collector.
E[T1 + T2 + . . . + Tn] ≈ n log n.
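The coupling argument can be simulated directly. The sketch below runs two decks with the same randomly chosen card moved to the top in both, stops once every card has been picked at least once, and compares the average stopping time with the coupon collector value n·H_n (roughly n log n). Function names, the seed, and the parameters n = 20 and 500 trials are assumptions for illustration.

```python
import random


def move_to_top(deck, card):
    """Move the given card to the top (index 0) of the deck, in place."""
    deck.remove(card)
    deck.insert(0, card)


def coupling_time(n, rng):
    """Number of steps until every card has been picked (coupled) at least once."""
    deck_1 = list(range(n))   # arbitrary starting order
    deck_2 = list(range(n))
    rng.shuffle(deck_2)       # uniformly random starting order
    coupled = set()
    t = 0
    while len(coupled) < n:
        card = rng.randrange(n)      # same card is used in both decks
        move_to_top(deck_1, card)
        move_to_top(deck_2, card)
        coupled.add(card)
        t += 1
    # Once all cards are coupled the two decks agree card for card.
    assert deck_1 == deck_2
    return t


def harmonic(n):
    return sum(1.0 / k for k in range(1, n + 1))


rng = random.Random(0)
n, trials = 20, 500
average = sum(coupling_time(n, rng) for _ in range(trials)) / trials
print("simulated average:", average)
print("n * H_n          :", n * harmonic(n))
```

The internal assertion checks the key property from the slides: once all cards are coupled, Deck 1 is identical to Deck 2, whose distribution has been uniform the whole time.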


Random Hypercube Walk I

Take an n-dimensional hypercube.
Stationary distribution of hypercube walk:
Try coupling again:

10 / 17

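Vertices: n-length bit strings. π = [1/2^n, . . . , 1/2^n], since all vertices have the same degree.
Lazy walk: every vertex has a self-loop w.p. 1/2 and transitions to each neighbor w.p. 1/(2n).
Coupling: one walk starts at an arbitrary vertex, the other at a truly random vertex (randomize all bits).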

Random Hypercube Walk II

Each coordinate is labeled coupled or not. Initially all coordinates are uncoupled.
Pick a random coordinate i. Flip a coin to set the i-th coordinate to 0 or 1 in both walks. If the i-th coordinate is uncoupled, mark it as coupled.

11 / 17

E[time until all coordinates are randomized] ≈ n log n

Random Hypercube Walk III

Use Coupon Collector again!
For all ε, tmix(ε) =

12 / 17

Same as before.
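Two things in the hypercube argument can be checked in a few lines: that the "pick a coordinate, set it to a coin flip" update is the lazy walk when viewed on one walker, and that two coupled walkers agree once every coordinate has been picked. The sketch below is illustrative only; function names, the seed, and n = 8 are assumptions.

```python
import random
from collections import Counter
from fractions import Fraction


def one_step_distribution(x):
    """Exact distribution after one update from vertex x (a tuple of bits).

    Enumerates all (coordinate, coin) pairs, each with probability 1/(2n).
    """
    n = len(x)
    out = Counter()
    for i in range(n):
        for b in (0, 1):
            y = list(x)
            y[i] = b
            out[tuple(y)] += Fraction(1, 2 * n)
    return out


def coupled_walk(n, rng):
    """Run two walkers with shared randomness until all coordinates are coupled."""
    x = [0] * n                               # arbitrary start
    y = [rng.randrange(2) for _ in range(n)]  # uniformly random start
    coupled = set()
    t = 0
    while len(coupled) < n:
        i = rng.randrange(n)   # same coordinate for both walkers
        b = rng.randrange(2)   # same coin flip for both walkers
        x[i] = b
        y[i] = b
        coupled.add(i)
        t += 1
    return t, x, y


dist = one_step_distribution((0, 1, 0))
print("stay put   :", dist[(0, 1, 0)])   # self-loop probability
print("flip bit 0 :", dist[(1, 1, 0)])   # probability of one specific neighbor

t, x, y = coupled_walk(8, random.Random(1))
print("coupled after", t, "steps; walkers agree:", x == y)
```

The waiting time until all n coordinates have been picked is again a coupon collector time, which is where the n log n estimate comes from.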

Conditions for Fast Mixing?

Complete graph Kn? Path on n vertices? Dumbbell?

13 / 17

Kn: mixing time = 1.
Path: start in the middle. Slow mixing, time ≈ n^2.
Dumbbell: mixing time ≈ n^2. There is only a low chance of getting to the other side from the pink dot.

Bottlenecks

We use a measurement called the conductance to quantify the notion of a bottleneck.
The conductance of a set A ⊆ S is: Φ(A) =
The conductance of the chain M is: Φ(M) =

14 / 17

Φ(A) = ( Σ_{i ∈ A, j ∉ A} π(i) P(i, j) ) / vol(A), where vol(A) = Σ_{i ∈ A} π(i)

Φ(M) = min_{A ⊆ S, vol(A) ≤ 1/2} Φ(A)
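For tiny chains the conductance can be computed by enumerating every subset, which makes the bottleneck idea concrete. This sketch assumes vol(A) means the stationary mass of A and that the minimum runs over nonempty sets with vol(A) ≤ 1/2; the helper names and the two example chains (a lazy walk on a 4-vertex path and the complete graph with loops on 4 vertices) are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations


def conductance_of_set(A, P, pi):
    """Stationary flow leaving A divided by the stationary mass of A."""
    inside = set(A)
    flow = sum(pi[i] * P[i][j]
               for i in inside for j in range(len(P)) if j not in inside)
    vol = sum(pi[i] for i in inside)
    return flow / vol


def chain_conductance(P, pi):
    """Minimum conductance over nonempty sets A with stationary mass <= 1/2."""
    n = len(P)
    best = None
    for k in range(1, n):
        for A in combinations(range(n), k):
            if sum(pi[i] for i in A) <= Fraction(1, 2):
                phi = conductance_of_set(A, P, pi)
                if best is None or phi < best:
                    best = phi
    return best


def lazy_path(n):
    """Lazy walk on a path: stay w.p. 1/2, else move to a uniform neighbor."""
    P = [[Fraction(0)] * n for _ in range(n)]
    pi = []
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        P[i][i] = Fraction(1, 2)
        for j in nbrs:
            P[i][j] = Fraction(1, 2 * len(nbrs))
        pi.append(Fraction(len(nbrs), 2 * (n - 1)))  # proportional to degree
    return P, pi


n = 4
path_P, path_pi = lazy_path(n)
complete_P = [[Fraction(1, n)] * n for _ in range(n)]
complete_pi = [Fraction(1, n)] * n

print("lazy path     :", chain_conductance(path_P, path_pi))
print("complete graph:", chain_conductance(complete_P, complete_pi))
```

The path has the smaller conductance of the two, reflecting the bottleneck at its middle edge.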

How To Measure Conductance?

Measuring conductance = looking at all subsets of states with vol(A) ≤ 1/2.

How many subsets, potentially?
Alternative: Get a lower bound on Φ(M) using the second largest eigenvalue of the transition matrix. Φ(M) ≥
Eigenvalues are much faster to compute!

15 / 17

(1 − λ2) / 2, where λ2 is the second largest eigenvalue of the transition matrix.
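The following sketch compares an eigenvalue-based lower bound with a brute-force conductance on one small example. It assumes the standard Cheeger-type inequality Φ(M) ≥ (1 − λ2)/2 for reversible chains, which may or may not be the exact form written on the slide. The example chain (a lazy walk on an 8-cycle), the helper names, and the use of power iteration in place of a linear algebra library are all illustrative choices; the power iteration here is only valid for a symmetric transition matrix with nonnegative eigenvalues.

```python
import math
from itertools import combinations


def lazy_cycle(n):
    """Lazy walk on an n-cycle: stay w.p. 1/2, step left or right w.p. 1/4 each."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 0.5
        P[i][(i + 1) % n] += 0.25
        P[i][(i - 1) % n] += 0.25
    return P


def second_eigenvalue_symmetric(P, iters=500):
    """Second largest eigenvalue of a symmetric lazy chain via power iteration.

    The all-ones vector is the top eigenvector, so it is projected out
    before every multiplication.
    """
    n = len(P)
    v = [float(i) for i in range(n)]
    lam = 0.0
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]                  # remove all-ones component
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        w = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(a * b for a, b in zip(v, w))     # Rayleigh quotient
        v = w
    return lam


def conductance_uniform(P):
    """Brute-force conductance when the stationary distribution is uniform."""
    n = len(P)
    best = 1.0
    for k in range(1, n // 2 + 1):                 # vol(A) = k/n <= 1/2
        for A in combinations(range(n), k):
            inside = set(A)
            flow = sum(P[i][j] for i in inside
                       for j in range(n) if j not in inside) / n
            best = min(best, flow / (k / n))
    return best


P = lazy_cycle(8)
lam2 = second_eigenvalue_symmetric(P)
phi = conductance_uniform(P)
print("lambda_2          :", lam2)
print("(1 - lambda_2) / 2:", (1 - lam2) / 2)
print("conductance       :", phi)
```

The brute-force search visits exponentially many subsets, while the eigenvalue estimate needs only repeated matrix-vector products, which is the point of the slide's last line.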

Markov Chain Monte Carlo

Monte Carlo: randomized algorithm where the output is allowed to be incorrect

Use cases:

  • sampling from complicated distributions
  • counting combinatorial objects
  • Bayesian inference
  • statistical physics
  • volume estimation, integration

16 / 17

Markov Chain Monte Carlo

Key idea: Design a Markov chain so that its stationary distribution is the distribution that you want to sample from.
Run the chain, wait for it to mix.
Runtime depends on...

17 / 17

tmix(ε)!
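One well-known way to design such a chain is the Metropolis rule, which the slides do not name; it is used here only as an example of the key idea. The sketch builds the exact transition matrix for a Metropolis chain on a small cycle of states with unnormalized integer weights, checks that the normalized weights are stationary, and then draws one sample by running the chain. The weights, function names, seed, and step count are assumptions for illustration.

```python
import random
from fractions import Fraction


def metropolis_matrix(weights):
    """Exact transition matrix of a Metropolis chain on a cycle of states.

    Proposal: move to the left or right neighbor w.p. 1/2 each.
    Acceptance: min(1, weights[y] / weights[x]). Requires at least 3 states.
    """
    n = len(weights)
    P = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        for y in ((x - 1) % n, (x + 1) % n):
            accept = min(Fraction(1), Fraction(weights[y], weights[x]))
            P[x][y] += Fraction(1, 2) * accept
        P[x][x] = 1 - sum(P[x])   # rejected proposals stay put
    return P


def metropolis_sample(weights, steps, rng, start=0):
    """Run the same chain for a fixed number of steps and return the final state."""
    n = len(weights)
    x = start
    for _ in range(steps):
        y = (x + rng.choice((-1, 1))) % n
        if rng.random() < min(1.0, weights[y] / weights[x]):
            x = y
    return x


weights = [1, 2, 3, 4, 2]            # hypothetical unnormalized target
total = sum(weights)
target = [Fraction(w, total) for w in weights]

P = metropolis_matrix(weights)
after_one_step = [sum(target[i] * P[i][j] for i in range(len(P)))
                  for j in range(len(P))]
print("target is stationary:", after_one_step == target)
print("one sample:", metropolis_sample(weights, steps=200, rng=random.Random(0)))
```

How many steps are enough before the sample is trustworthy is exactly the mixing time question; the value 200 above is an arbitrary placeholder, not a derived bound.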