Analysis of patterns and minimal embeddings of non-Markovian sequences


  1. Analysis of patterns and minimal embeddings of non-Markovian sequences. Manuel.Lladser@Colorado.EDU, Department of Applied Mathematics, University of Colorado, Boulder. AofA, April 13, 2008.

  2. NOTATION & TERMINOLOGY. $\mathcal{A}$ is a finite alphabet; $\mathcal{A}^*$ is the set of all words of finite length; a language is a set $\mathcal{L} \subset \mathcal{A}^*$; $X = (X_n)_{n \ge 1}$ is a sequence of $\mathcal{A}$-valued random variables; $X$ may be non-Markovian; $X_1 \cdots X_l$ models a random word of length $l$.

  3. PARADIGM. For various probabilistic models for $X$ and languages $\mathcal{L}$, the frequency statistics of $\mathcal{L}$,
     $$S_n^{\mathcal{L}} := \big(\text{number of prefixes of } X_1 \cdots X_n \text{ that belong to the language } \mathcal{L}\big),$$
     are asymptotically normal. The paradigm applies for:
     • generalized patterns ⊕ i.i.d. models [BenKoch93]
     • simple patterns ⊕ stationary Markovian models [RegSzp98]
     • primitive patterns ⊕ k-th order Markovian models [NicSalFla02, Nic03]
     • primitive patterns ⊕ nice dynamical sources [BouVal02, BouVal06]
     • hidden patterns ⊕ i.i.d. models [FlaSpaVal06]

  4. THE MARKOV CHAIN EMBEDDING TECHNIQUE.
     IF $X$ is a homogeneous Markov chain,
     IF $\mathcal{L}$ is a regular language,
     IF $G = (V, \mathcal{A}, f, q, T)$ is a DFA that recognizes $\mathcal{L}$,
     IF the embedding of $X$ into $G$, i.e. the stochastic process $X_n^G := f(q, X_1 \cdots X_n)$, is a first-order homogeneous Markov chain,
     THEN $S_n^{\mathcal{L}} = \big(\text{number of visits the embedded process } X^G \text{ makes to } T \text{ in the first } n \text{ steps}\big)$.
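To make the technique concrete, here is a minimal Python sketch (mine, not the talk's): iterate the DFA's transition function along the word and count visits to the terminal set. The names delta, q0, and terminals are illustrative.

```python
def embedded_chain(delta, q0, terminals, word):
    """Compute the embedded process X^G_n = f(q, X_1...X_n) by iterating
    the DFA transition function, and count visits to terminal states;
    by the embedding technique, that visit count is S_n^L."""
    q, path, visits = q0, [], 0
    for a in word:
        q = delta[(q, a)]          # one step of the embedded chain
        path.append(q)
        visits += q in terminals
    return path, visits
```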

  5. EXAMPLE. Consider a 1st-order Markov chain $X$ such that
     $$P[X_1 = a] = \mu; \quad P[X_1 = b] = 1 - \mu;$$
     $$P[X_{n+1} = a \mid X_n = a] = p; \quad P[X_{n+1} = b \mid X_n = a] = 1 - p;$$
     $$P[X_{n+1} = a \mid X_n = b] = q; \quad P[X_{n+1} = b \mid X_n = b] = 1 - q.$$
     Then the embedding of $X$ into the Aho-Corasick automaton that recognizes matches with the regular expression $\{a,b\}^*\{ba, abba\}$, i.e. all words of the form $x = \ldots ba$ or $x = \ldots abba$, is a 1st-order Markov chain. [Figure: the Aho-Corasick automaton with states $\epsilon$, $a$, $ab$, $abb$, $abba$, $b$, $ba$.]

  6. [Figure: the embedded Markov chain on the automaton's states, with transition probabilities $p$, $(1-p)$, $q$, $(1-q)$ and initial probabilities $\mu$, $(1-\mu)$.] A simulation sketch of this embedded chain follows below.
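The transition table below is my reconstruction of the Aho-Corasick automaton for the patterns ba and abba (each state is the longest suffix of the input read so far that is still a prefix of a pattern); treat the table and the parameter values as assumptions for illustration, not the talk's exact figure.

```python
import random

# Reconstructed Aho-Corasick DFA for {a,b}*{ba, abba} (an assumption:
# states are pattern prefixes, "e" is the empty prefix).
delta = {
    ("e", "a"): "a",      ("e", "b"): "b",
    ("a", "a"): "a",      ("a", "b"): "ab",
    ("ab", "a"): "ba",    ("ab", "b"): "abb",
    ("abb", "a"): "abba", ("abb", "b"): "b",
    ("abba", "a"): "a",   ("abba", "b"): "ab",
    ("b", "a"): "ba",     ("b", "b"): "b",
    ("ba", "a"): "a",     ("ba", "b"): "ab",
}
T = {"ba", "abba"}  # terminal states: a match just ended

def sample_X(n, mu, p, q, rng):
    """First-order Markov chain over {a, b} from the slide."""
    x = ["a" if rng.random() < mu else "b"]
    while len(x) < n:
        to_a = p if x[-1] == "a" else q
        x.append("a" if rng.random() < to_a else "b")
    return x

rng = random.Random(1)
word = sample_X(10_000, mu=0.5, p=0.6, q=0.4, rng=rng)
state, S = "e", 0
for a in word:
    state = delta[(state, a)]   # the embedded chain X^G
    S += state in T             # S_n^L: visits to T
print(S / len(word))            # long-run matching rate
```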

  7. What about a completely general sequence $X$?

  8. EXAMPLE. A seemingly unbiased coin. Let $0 < p < 1/2$ and consider the random binary sequence $X = (X_n)_{n \ge 1}$ such that
     $$X_{n+1} \overset{d}{=} \begin{cases} \text{Bernoulli}(p), & \frac{1}{n}\sum_{i=1}^{n} X_i > \frac{1}{2},\\ \text{Bernoulli}(1/2), & \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{2},\\ \text{Bernoulli}(1-p), & \frac{1}{n}\sum_{i=1}^{n} X_i < \frac{1}{2}. \end{cases}$$
     Question. Is there a Markovian structure into which $X$ can be embedded for analyzing the asymptotic distribution of the frequency statistics of a given language?
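A quick simulation sketch (not part of the talk) of this sequence. The slide leaves the law of $X_1$ unspecified, so the code assumes $X_1 \sim$ Bernoulli(1/2); the running mean hugs 1/2 even though the bits are far from i.i.d.

```python
import random

def seemingly_unbiased_coin(n_flips, p, seed=0):
    """Simulate the slide's sequence: each flip is biased away from the
    current empirical majority.  The slide leaves X_1 unspecified; this
    sketch assumes X_1 ~ Bernoulli(1/2)."""
    rng = random.Random(seed)
    x, ones = [], 0
    for n in range(n_flips):
        gap = 2 * ones - n          # sign of (empirical mean - 1/2)
        if gap > 0:
            q1 = p                  # mean > 1/2: next bit favors 0
        elif gap < 0:
            q1 = 1 - p              # mean < 1/2: next bit favors 1
        else:
            q1 = 0.5
        bit = int(rng.random() < q1)
        x.append(bit)
        ones += bit
    return x

x = seemingly_unbiased_coin(10_000, p=0.3)
print(sum(x) / len(x))              # hovers near 1/2, yet X is not i.i.d.
```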

  9. GENERAL SETTING. Given
     • a possibly non-Markovian sequence $X$,
     • a possibly non-regular language $\mathcal{L}$,
     • a transformation $R : \mathcal{A}^* \to S$,
     define $X^R$ to be the stochastic process $X_n^R := R(X_1 \cdots X_n)$.
     Question 1. What conditions are necessary and sufficient in order for $X^R$ to be Markovian?
     Question 2. Given a pattern $\mathcal{L}$, is there a transformation $R$ such that $X^R$ is Markovian but also informative of the distribution of the frequency statistics of $\mathcal{L}$?

  10. REMARK. The Markovianity or non-Markovianity of $X_n^R := R(X_1 \cdots X_n)$, $n \ge 1$, does not really depend on the range of $R$. This motivates thinking of $R : \mathcal{A}^* \to S$ as an equivalence relation over $\mathcal{A}^*$: $u \mathrel{R} v \iff R(u) = R(v)$.
     • $R(u)$ is the unique equivalence class of $R$ that contains $u$.
     • $c \in R$ means that $c$ is an equivalence class of $R$.

  11. DEFINITION. $X$ is embeddable w.r.t. $R$ provided that for all $u, v \in \mathcal{A}^*$ and $c \in R$, if $u \mathrel{R} v$ then
      $$\sum_{\alpha \in \mathcal{A}:\, R(u\alpha) = c} P[X = u\alpha\ldots \mid X = u\ldots] \;=\; \sum_{\alpha \in \mathcal{A}:\, R(v\alpha) = c} P[X = v\alpha\ldots \mid X = v\ldots]$$

  12.-15. [Figure. Schematic partition of $\{0,1,2\}^*$ into equivalence classes, built up over four slides: two equivalent words $u$ and $v$, their one-letter extensions $u0, u1, u2$ and $v0, v1, v2$, and matching class-transition weights such as .3, .4, .7. Slides 12-15 repeat the definition above alongside the figure.]
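As an illustration (mine, not the talk's), the displayed condition can be verified by brute force for the seemingly unbiased coin of slide 8, taking $R(x) = (\#1\text{s in } x) - (\#0\text{s in } x)$, the relation the talk introduces later in slide 25. For this coin, the one-step conditional law depends only on the sign of $R(u)$, which makes the check exact.

```python
from fractions import Fraction
from itertools import product

P = Fraction(1, 3)                       # any 0 < p < 1/2

def R(w):                                # candidate relation: R(x) = #1s - #0s
    return sum(2 * b - 1 for b in w)

def step_prob(u, a):
    """P[X = u a ... | X = u ...] for the seemingly unbiased coin: the
    next bit's law depends only on the sign of R(u)."""
    r = R(u)
    q1 = P if r > 0 else (1 - P if r < 0 else Fraction(1, 2))
    return q1 if a == 1 else 1 - q1

words = [w for n in range(1, 7) for w in product((0, 1), repeat=n)]
embeddable = all(
    sum(step_prob(u, a) for a in (0, 1) if R(u + (a,)) == c)
    == sum(step_prob(v, a) for a in (0, 1) if R(v + (a,)) == c)
    for u in words for v in words if R(u) == R(v)
    for c in (R(u) - 1, R(u) + 1)        # the only reachable classes
)
print(embeddable)                        # True on all words up to length 6
```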

  16. THEOREM A. $X$ is embeddable w.r.t. $R$ if and only if, for every $x \in \mathcal{A}^*$, conditioned on $X = x\ldots$ the stochastic process $X_n^R := R(X_1 \cdots X_n)$, $n \ge |x|$, is a first-order homogeneous Markov chain with transition probabilities that do not depend on $x$.
      THEOREM B. For each equivalence relation $R$ on $\mathcal{A}^*$, there exists a unique coarsest refinement $R'$ of $R$ w.r.t. which $X$ is embeddable.

  17. APPLICATION/QUESTION. What is the smallest state space for studying the frequency statistics of a language $\mathcal{L}$ in $X$?
      → $X$      = a b b a b ...    (original sequence)
      → $X^R$    = 1 0 0 1 0 ...    (non-Markovian encoding)
        $X^{R'}$ = 0 4 6 3 4 ...    (optimal Markovian encoding)
        $X^Q$    = 6 3 18 15 10 ... (any other Markovian encoding)
      Figure. Partition $R = \{\mathcal{L}, \mathcal{A}^* \setminus \mathcal{L}\}$ s.t. $X^R$ is non-Markovian; the sketch places the words a, ab, abb, abba, abbab across the two classes.

  18. APPLICATION/QUESTION. What is the smallest state space for studying the frequency statistics of a language $\mathcal{L}$ in $X$?
        $X$      = a b b a b ...    (original sequence)
        $X^R$    = 1 0 0 1 0 ...    (non-Markovian encoding)
      → $X^{R'}$ = 0 4 6 3 4 ...    (optimal Markovian encoding)
        $X^Q$    = 6 3 18 15 10 ... (any other Markovian encoding)
      Figure. Coarsest refinement $R'$ of $R$ w.r.t. which $X$ is embeddable; its classes are labeled (0)-(6).

  19. APPLICATION/QUESTION. What is the smallest state space for studying the frequency statistics of a language $\mathcal{L}$ in $X$?
        $X$      = a b b a b ...    (original sequence)
        $X^R$    = 1 0 0 1 0 ...    (non-Markovian encoding)
        $X^{R'}$ = 0 4 6 3 4 ...    (optimal Markovian encoding)
      → $X^Q$    = 6 3 18 15 10 ... (any other Markovian encoding)
      Figure. Arbitrary refinement $Q$ of $R$ w.r.t. which $X$ is embeddable; its classes are labeled (0)-(18).

  20. REMARK. The optimal refinement $R'$ of $R$ w.r.t. which $X$ is embeddable is obtained through a limiting process: this makes it almost impossible to characterize the equivalence classes of $R'$. Motivated by this, we introduce an embedding which, while not optimal, is analytically tractable (!)

  21. DEFINITION. The Markov relation induced by $X$ on $\mathcal{A}^*$ is the equivalence relation defined as
      $$u \mathrel{R_X} v \iff (\forall w \in \mathcal{A}^*):\; P[X = uw\ldots \mid X = u\ldots] = P[X = vw\ldots \mid X = v\ldots]$$

  22. [Figure. Weighted tree visualization of the definition above with $\mathcal{A} = \{0,1\}$: nodes $\epsilon, 0, 1, 00, 01, 10, 11$, edge weights such as .4, .6, .8, .2, .5, .5, and equivalent words $u$ and $v$ marked.]

  23. An equivalence relation $R$ is said to be right-invariant if for all $u, v \in \mathcal{A}^*$ and $\alpha \in \mathcal{A}$:
      $$R(u) = R(v) \implies R(u\alpha) = R(v\alpha)$$
      THEOREM C. $X$ is embeddable w.r.t. any right-invariant equivalence relation that is a refinement of $R_X$; in particular, $X$ is embeddable w.r.t. $R_X$.
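A brute-force sketch (not from the talk) of checking right-invariance for the counting map $R(x) = (\#1\text{s}) - (\#0\text{s})$ used in the upcoming proposition; right-invariance, together with refining $R_X$, is what Theorem C needs.

```python
from itertools import product

def R(w):                      # R(x) = (#1s) - (#0s)
    return sum(2 * b - 1 for b in w)

words = [w for n in range(1, 8) for w in product((0, 1), repeat=n)]
right_invariant = all(
    R(u + (a,)) == R(v + (a,))
    for u in words for v in words if R(u) == R(v)
    for a in (0, 1)
)
print(right_invariant)         # True: appending a letter preserves equality
```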

  24. EXAMPLE. Back to the seemingly unbiased coin. For $0 < p < 1/2$, define
      $$X_{n+1} \overset{d}{=} \begin{cases} \text{Bernoulli}(p), & \frac{1}{n}\sum_{i=1}^{n} X_i > \frac{1}{2},\\ \text{Bernoulli}(1/2), & \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{2},\\ \text{Bernoulli}(1-p), & \frac{1}{n}\sum_{i=1}^{n} X_i < \frac{1}{2}. \end{cases}$$
      We aim to understand, within $X$, the frequency statistics of
      $$\mathcal{L}_1 = \{0,1\}^*\{1\}, \qquad \mathcal{L}_2 = \{0\}^*\{1\}\{0\}^*\big(\{1\}\{0\}^*\{1\}\{0\}^*\big)^*$$
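A small sketch (Python's re module; not from the talk) computing the two frequency statistics by brute force over prefixes. Note that $\mathcal{L}_2$'s regular expression matches exactly the binary words with an odd number of 1s.

```python
import re

# Membership tests over the alphabet {0, 1}:
#   L1 = {0,1}*{1}                       (words ending in 1)
#   L2 = {0}*{1}{0}*({1}{0}*{1}{0}*)*    (words with an odd number of 1s)
L1 = re.compile(r"[01]*1")
L2 = re.compile(r"0*10*(10*10*)*")

def frequency_statistic(word, lang):
    """S_n^L: how many prefixes of `word` belong to the language."""
    return sum(1 for k in range(1, len(word) + 1)
               if lang.fullmatch(word[:k]))

w = "0110100"
print(frequency_statistic(w, L1))   # 3 prefixes end in 1
print(frequency_statistic(w, L2))   # 4 prefixes have an odd number of 1s
```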

  25. PROPOSITION. $R : \{0,1\}^* \to \mathbb{Z}$ defined as
      $$R(x) = 2\sum_{i=1}^{|x|} x_i - |x| = \sum_{i=1}^{|x|} x_i - \sum_{i=1}^{|x|} (1 - x_i)$$
      is a right-invariant refinement of $R_X$. In particular, $X_n^R := R(X_1 \cdots X_n)$ is a first-order homogeneous Markov chain.
      [Figure: the embedded chain is a walk on $\mathbb{Z}$: from states $n > 0$ it steps up w.p. $p$ and down w.p. $(1-p)$; from states $n < 0$ it steps up w.p. $(1-p)$ and down w.p. $p$; from $0$ it steps each way w.p. $1/2$.]
      $X^R$ is recurrent, with period 2. Because $0 < p < 1/2$, $X^R$ is positive recurrent; in particular, there exists a stationary distribution $\pi$. Observe that
      $$S_n^{\mathcal{L}_1} = \sum_{i=1}^{n} X_i$$
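The stationary distribution $\pi$ can be computed from detailed balance, since the embedded walk is a birth-death chain; that is a standard Markov chain fact, not something stated on the slide, and the sketch below truncates the state space at $\pm N$ (the truncation error is geometrically small because $p < 1/2$).

```python
from fractions import Fraction

def stationary(p, N=40):
    """Stationary distribution of the embedded walk X^R, truncated to
    {-N, ..., N}.  X^R is a birth-death chain, so detailed balance
    pi(n) P(n, n+1) = pi(n+1) P(n+1, n) fixes pi up to normalization."""
    up = lambda n: p if n > 0 else (Fraction(1, 2) if n == 0 else 1 - p)
    down = lambda n: 1 - up(n)
    pi = {0: Fraction(1)}
    for n in range(0, N):               # build the positive side
        pi[n + 1] = pi[n] * up(n) / down(n + 1)
    for n in range(0, -N, -1):          # and, symmetrically, the negative side
        pi[n - 1] = pi[n] * down(n) / up(n - 1)
    Z = sum(pi.values())
    return {n: q / Z for n, q in pi.items()}

pi = stationary(Fraction(1, 3))
print(float(pi[0]), float(pi[1]), float(pi[-1]))
# for p = 1/3: approximately 0.25, 0.1875, 0.1875 (truncation error ~ 2^-40)
```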

  26. Recall $S_n^{\mathcal{L}_1} = \sum_{i=1}^{n} X_i$.
      COROLLARY A. If $U$ and $V$ are $\mathbb{Z}$-valued random variables such that
      $$P[U = n] = 2\,\pi(n), \; n \equiv 0 \!\!\pmod 2; \qquad P[V = n] = 2\,\pi(n), \; n \equiv 1 \!\!\pmod 2;$$
      then for $\mathcal{L}_1 := \{0,1\}^*\{1\}$ it holds that
      $$\lim_{\substack{n \to \infty \\ n \equiv 0 \ (\mathrm{mod}\ 2)}} 2n\left(\frac{S_n^{\mathcal{L}_1}}{n} - \frac{1}{2}\right) \overset{d}{=} U; \qquad \lim_{\substack{n \to \infty \\ n \equiv 1 \ (\mathrm{mod}\ 2)}} 2n\left(\frac{S_n^{\mathcal{L}_1}}{n} - \frac{1}{2}\right) \overset{d}{=} V.$$
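A Monte Carlo sanity check (mine, not the talk's). Since $2n(S_n^{\mathcal{L}_1}/n - 1/2) = 2 S_n^{\mathcal{L}_1} - n = X_n^R$, sampling $X_n^R$ at a large even $n$ should approximate $P[U = k] = 2\pi(k)$ on even $k$:

```python
import random
from collections import Counter

def walk_endpoint(n, p, rng):
    """One sample of X^R_n = 2*S_n^{L1} - n for the unbiased-looking coin."""
    r = 0                                # R(X_1 ... X_k) = (#1s) - (#0s)
    for _ in range(n):
        q1 = p if r > 0 else (0.5 if r == 0 else 1 - p)
        r += 1 if rng.random() < q1 else -1
    return r

rng = random.Random(0)
p, n, trials = 1/3, 1000, 5000           # n even, so X^R_n is even
counts = Counter(walk_endpoint(n, p, rng) for _ in range(trials))
for k in (0, 2, -2, 4, -4):
    print(k, counts[k] / trials)
# with p = 1/3, the sketch above gives 2*pi(0) = 0.5 and 2*pi(+/-2) ~ 0.19
```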
