Perfect sampling using dynamic programming
Séminaire Équipe CALIN Christelle Rovetta
Équipe AMIB(io), Laboratoire LIX - Inria Saclay - LRI - RNALand
◮ Arithmetic mean: empirical frequencies of the visited states
(0.5 0.5, then 0.33 0.67, . . . , then 0.05 0.05 0.10 0.20 0.60, then 0.03 0.04 0.12 0.16 0.63)
◮ State space: |S| < ∞
◮ Ergodic Markov chain (Xn)n∈Z on S
◮ Stationary distribution π (solution of πP = π)
◮ GOAL: sample a random object according to π
◮ Very large S
◮ (Un)n∈Z an i.i.d. sequence of random variables
[Figure: coupled trajectories over states 1, 2, 3 through time]
◮ How to detect the stopping criterion?
◮ Perfect sampling algorithm [Propp and Wilson, 1996]
◮ Produces y ∼ π
◮ Stopping criterion automatically detected
◮ Uses coupling from the past
◮ Starts from all states, complexity at least in O(|S|)
◮ Find strategies (monotone chains, envelopes, . . . )
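Coupling from the past can be sketched in a few lines. The 3-state transition matrix and the inverse-CDF update function below are illustrative assumptions, not taken from the talk; the point is the scheme itself: run all states from time −n to 0 reusing the same (Un), and double n until the trajectories coalesce.

```python
import random

# Illustrative 3-state ergodic chain (not from the talk).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]

def update(x, u):
    """Deterministic update function: next state from x, driven by u ~ U(0,1)
    via the inverse CDF of row P[x]."""
    acc = 0.0
    for y, p in enumerate(P[x]):
        acc += p
        if u < acc:
            return y
    return len(P) - 1

def cftp(seed=0):
    """Coupling from the past: start every state at time -n with shared
    randomness (us[k] plays U_{-(k+1)}, reused across rounds), double n
    until all trajectories coalesce at time 0, return the common state."""
    rng = random.Random(seed)
    us = []
    n = 1
    while True:
        while len(us) < n:
            us.append(rng.random())
        states = set(range(len(P)))
        for k in range(n - 1, -1, -1):   # from time -n up to time 0
            states = {update(x, us[k]) for x in states}
        if len(states) == 1:
            return states.pop()          # exactly distributed according to π
        n *= 2
```

Reusing the past randomness (never redrawing `us[k]`) is what makes the returned state an exact sample rather than an approximate one.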
◮ Introduced by Erlang in 1917 to describe the Copenhagen telephone exchange
◮ Queues are everywhere in computing systems
◮ Analyze various kinds of system performance (average waiting time, . . . )
◮ Usually modeled by an ergodic Markov chain
◮ Computer simulation
[Figure: network of 5 queues with routing probabilities]
◮ K = 5 queues, M = 4 customers ◮ State: x = (2, 0, 1, 1, 0) ◮ Sample π
◮ Gordon-Newell networks
◮ Product form: π(x) = ρ1^x1 · · · ρK^xK / G(K, M)
◮ G(K, M): normalization constant (partition function)
◮ Compute G(K, M) in O(KM) by dynamic programming
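The O(KM) dynamic program for G(K, M) is the classical convolution recursion G(k, m) = G(k−1, m) + ρk · G(k, m−1) (Buzen's algorithm). A minimal sketch, assuming `rho` holds the per-queue visit ratios:

```python
def buzen_G(rho, M):
    """Normalization constant G(K, M) of a Gordon-Newell network,
    via the O(K*M) recursion G(k, m) = G(k-1, m) + rho_k * G(k, m-1)."""
    g = [1.0] + [0.0] * M          # column k = 0: G(0, 0) = 1, G(0, m) = 0
    for r in rho:
        for m in range(1, M + 1):  # in-place update: g[m-1] already holds G(k, m-1)
            g[m] += r * g[m - 1]
    return g[M]
```

With all ρk = 1, G(K, M) is just the number of states, which gives a quick sanity check: buzen_G([1.0, 1.0], 3) counts the 4 ways to place 3 customers in 2 queues.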
◮ K queues ·/M/1 (exponential service rate)
◮ Finite capacity Ck in each queue
◮ Blocking policy: repetitive service – random destination
◮ Strongly connected network
◮ K = 5 queues, M = 3 customers, capacity C = (2, 1, 3, 1, 2) ◮ x = (1, 0, 1, 1, 0)
◮ K queues, M customers, capacity C = (C1, . . . , CK)
◮ State space: S = {x ∈ N^K : x1 + · · · + xK = M, 0 ≤ xk ≤ Ck}
◮ Number of states (M ≫ K): O(M^(K−1))
◮ Transition on a state: ti,j(x) moves one customer from queue i to queue j (when xi > 0 and xj < Cj; otherwise x is unchanged)
◮ Transition on a set of states: ti,j(S) = {ti,j(x) : x ∈ S}
◮ (Un)n∈Z := (in, jn)n∈Z an i.i.d. sequence of random variables
◮ System described by an ergodic Markov chain: Xn = tin,jn(Xn−1)
◮ Unique stationary distribution π that is unknown
◮ GOAL: sample π with the perfect sampling algorithm
◮ Perfect sampling
[Figure: coupling from the past with starting times −1, −2, −4, −8, . . . , −n/2, −n]
◮ PROBLEM: |S| in O(M^(K−1))
◮ Find a strategy!
◮ Fixed number of customers (x1 + · · · + xK = M)
◮ No lattice structure
◮ Reduce complexity: O(M^(K−1)) to O(KM²)
◮ Represent states as paths in a graph
◮ Realize transitions directly on the graph
◮ States: x = (0, 0, 2, 0, 1), then y = (1, 0, 1, 1, 0)
◮ Diagram: 5 queues, 3 customers
◮ x1 + · · · + x5 = 3
◮ ∀k, 0 ≤ xk ≤ Ck
[Figure: diagram built node by node; x gives the path (0,0), (1,0), (2,0), (3,2), (4,2), (5,3); adding y adds the nodes (1,1), (2,1), (4,3)]
◮ A diagram is a graph that encodes a set of states
◮ A diagram is complete if it encodes all the states
◮ Number of arcs in a diagram: O(KM²)
[Figure: the complete diagram vs. the diagram encoding {x, y}]
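A complete diagram can be built column by column. In this sketch (function names and the arc representation are mine) a node (k, s) means "s customers in queues 1..k", and a node is kept only if the remaining queues can still absorb the remaining M − s customers, so every node lies on a path from (0, 0) to (K, M); counting those paths is the CardStates operation.

```python
def complete_diagram(C, M):
    """Complete diagram for capacities C and M customers.
    Nodes (k, s); an arc (k, s) -> (k+1, s+x) encodes x customers in queue k+1.
    Nodes that cannot reach (K, M) are pruned."""
    K = len(C)
    suffix = [0] * (K + 1)                 # suffix[k] = total capacity of queues k+2..K (0-indexed tail)
    for k in range(K - 1, -1, -1):
        suffix[k] = suffix[k + 1] + C[k]
    nodes, arcs = {(0, 0)}, []
    for k in range(K):
        for _, s in sorted(n for n in nodes if n[0] == k):
            for x in range(min(C[k], M - s) + 1):
                if M - (s + x) <= suffix[k + 1]:   # target can still reach (K, M)
                    nodes.add((k + 1, s + x))
                    arcs.append(((k, s), (k + 1, s + x), x))
    return nodes, arcs

def card_states(arcs, K, M):
    """CardStates-style count: number of (0,0) -> (K,M) paths, i.e. of
    encoded states, in one pass over the O(K*M^2) arcs."""
    cnt = {(0, 0): 1}
    for u, v, _ in sorted(arcs):           # sorted => sources before targets
        cnt[v] = cnt.get(v, 0) + cnt[u]
    return cnt[(K, M)]
```

For the talk's example (K = 5, M = 3, C = (2, 1, 3, 1, 2)) the diagram encodes 23 states, which matches a direct inclusion-exclusion count.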
◮ Galois connection (φ, ψ) between sets of states and diagrams
◮ Transition on a diagram: Ti,j(D)
◮ Good properties for perfect sampling:
◮ Preserves inclusion
◮ Preserves coupling
◮ Efficient algorithm to compute transitions Ti,j in O(KM²)
◮ Parameters: K = 5 queues, M = 3 customers, capacity C = (2, 1, 3, 1, 2)
◮ Transition t4,2(S)?
◮ Two cases: x4 = 0 OR x2 = C2 (the move is blocked); x4 > 0 AND x2 < C2 (one customer moves from queue 4 to queue 2)
◮ t4,2(S) = {(0, 1, 1, 0, 0), (0, 1, 1, 1, 0), (0, 1, 0, 0, 2), (1, 1, 1, 0, 0)}
[Figure: transition T4,2 on a diagram; arcs are split into Stay, Full, Transit, Transit′ and merged back]
◮ Complexity in O(KM²) compared to O(M^(K−1)) (transition on a set of states)
◮ ψ(D) = S
◮ Transitions preserve inclusion and coupling
◮ There exists a finite sequence of transitions that reduces the complete diagram to a single state
◮ Monoclass: Clones, Matlab toolbox [Best Tool Paper Award]
◮ Multiclass: MClones, Python package [ROADEF ’16]
◮ Generalize the diagram notation (monoclass and multiclass)
◮ Define generic algorithms (ψ, |ψ(D)|)
◮ Diagrams can be used in a more general context than closed queueing networks
◮ Link with dynamic programming
◮ Columns 1, . . . , i, i + 1, . . . , K
◮ Starting at the same node
◮ Ending at the same node
◮ Prefix (and suffix) factorization
◮ Path of x: arc from (i − 1, f(i−1)(x)) to (i, f(i)(x))
◮ f(K)(x) = M (ending at the same node)
◮ A diagram encodes a set of states S ⊆ Ω(K, M)
◮ Memoryless sequence: f(0)sum = 0, f(k)sum(x) = f(k−1)sum(x) + xk
◮ Diagram K = 5, M = 3, C = (2, 1, 3, 1, 2)
[Figure: the complete diagram]
◮ Partition of 5: x = (1, 0, 0, 1, 0), y = (3, 1, 0, 0, 0), . . .
◮ 5 = 1 × 1 + 1 × 4 = 3 × 1 + 1 × 2 = . . .
◮ Memoryless sequence: f(0)part = 0, f(k)part(x) = f(k−1)part(x) + k · xk
◮ Complete diagram K = M = 5
[Figure: the complete diagram of the partitions of 5]
◮ Diagram representation O(K² log(K))
◮ Update function (Markov chain): Xn = fmc(Xn−1, un)
◮ Memoryless sequence: f(n)mc(u) = fmc(f(n−1)mc(u), un)
◮ K = 5, M = 0
◮ Diagram
[Figure: diagram of the trajectories (X0, . . . , X5) with X0 = 0 and X5 = 0; arcs labelled by conditions on un with thresholds a and 2a]
◮ StatesToDiagram transforms a set of states into a diagram (φ)
◮ DiagramToStates transforms a diagram into a set of states (ψ)
◮ CardStates computes the number of states represented by a diagram (|ψ(D)|)
◮ RandState samples a state according to a product-form distribution
◮ Weights wk : Ek → R+ (given)
◮ RandState produces x ∈ ψ(D) according to:
P(x) = w1(x1) · · · wK(xK) / Σy∈ψ(D) w1(y1) · · · wK(yK)
◮ 2 steps:
◮ Assign weights: arcs and nodes
◮ Random walk on the diagram
[Figure: weighted diagram and one random walk]
◮ Probability: 13/27 · 7/13 · 7/7 · 6/7 · 2/3 = 4/27
◮ Complexity: assign weights O(A), random walk O(K)
◮ Sample states of a Gordon-Newell network
◮ Uniform generation
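The two steps of RandState can be sketched generically: a backward pass assigns each node the total weight of the paths from it to (K, M), then a forward random walk picks each arc with probability weight(arc) · W(target) / W(current). The diagram encoding (a dict of arc lists) and the names are assumptions of this sketch:

```python
import random

def rand_state(arcs, K, M, weight, rng=None):
    """RandState sketch.  `arcs[node]` lists (target, x) pairs;
    `weight(k, x)` is the arc weight w_k(x) for an arc entering column k.
    Step 1: backward weights W(node).  Step 2: forward random walk."""
    rng = rng or random.Random()
    W = {(K, M): 1.0}
    for k in range(K - 1, -1, -1):                 # step 1: backward pass, O(A)
        for node in [n for n in arcs if n[0] == k]:
            W[node] = sum(weight(k + 1, x) * W[t] for t, x in arcs[node] if t in W)
    node, xs = (0, 0), []
    while node != (K, M):                          # step 2: random walk, O(K)
        r = rng.random() * W[node]
        for t, x in arcs[node]:
            if t not in W:
                continue
            r -= weight(node[0] + 1, x) * W[t]
            if r <= 0:
                break
        node, xs = t, xs + [x]
    return tuple(xs)
```

With all weights equal to 1, every path gets the same probability, which is the uniform-generation case mentioned above.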
◮ [Application 1]
◮ Python package, 2 main classes: MLS and Diagram
◮ Using DiagramS:
◮ Define E, ⊕ and 0G in MLS
◮ Define parameters K and M in Diagram
◮ Random generation of combinatorial objects
◮ Introduced in 2004 by Duchon, Flajolet, Louchard and Schaeffer
◮ Based on analytic combinatorics
◮ Draws uniformly random objects of size n
◮ Without enumerating
◮ Size of the object is random
◮ Statistical control of the size of the objects
◮ Set of objects A ◮ Size function |.| : A → N ◮ an: number of objects of size n, ∀n ∈ N, an < ∞
◮ W = {ε, 0, 1, 00, 01, 000, 001, . . .}
◮ |.| gives the length of the word (|0010| = 4)
◮ Generating function: W(z) = Σn an z^n = 1/(1 − 2z)
◮ Atomic classes:
◮ (E, |.|): E = {ε}, |ε| = 0
◮ (Z, |.|): Z = {ζ}, |ζ| = 1
◮ Disjoint union: C = A ∪ B, C(z) = A(z) + B(z)
◮ Cartesian product: C = A × B, C(z) = A(z) · B(z)
◮ . . .
◮ Boltzmann sampler ΓA(x): draws γ ∈ A with probability Px(γ) = x^|γ| / A(x)
◮ Uniform generator for objects of the same size
◮ Parameter x ∈ ]0, RA[ controls the size
◮ C := A ∪ B
◮ Generating function: C(z) = A(z) + B(z)
◮ Ber(p) returns a r.v. distributed according to a Bernoulli law
◮ ΓC(x): if Ber(A(x)/C(x)) then return ΓA(x), else return ΓB(x)
◮ Px(γ) = x^|γ| / C(x)
[Figure: network of 5 queues; multiset example {1, 5, 4, 3, 2, 3, 2, 2, 5} of size 9]
◮ B = {1, 2, 3, 4, 5}, M = 9
◮ Multiset of fixed size: A = msetM(B)
◮ Generating function [Flajolet et al. ’07] uses partitions of M
◮ RandState can be used on a complete diagram of partitions to sample msetM(B)
◮ 2 secondary structures of RNA: sA and sB
UUCUUAUCAAGAGAAGCAGAGGGAC
[Figure: the two secondary structures sA and sB drawn on the sequence]
◮ GOAL: estimate the distribution of the hitting time DA→B
◮ The set of all secondary structures is huge
◮ (Yn)n∈N an ergodic Markov Chain where Y0 = sA ◮ Paths of size N: p = (y0, y1, . . . , yN−1, yN) ∈ ΩN s.t.
◮ Goal: estimate DA→B(N) ◮ Use DiagramToStates to extract the paths ◮ Compute the distribution from the set of paths
◮ Huge number of paths ◮ Select the "good nodes"
◮ Each node has at most d successors
◮ There are at most m nodes in a column
◮ Perfect sampling for closed queueing network ◮ Diagram data structure ◮ 2 examples of applications
◮ Perfect sampling using dynamic programming: other networks, . . .
◮ Diagram:
◮ Generalization: E a discrete set; continuous case?
◮ Parallel computing
◮ DiagramS
◮ RNA folding kinetic