Phonology and speech applications with weighted automata


  1. Phonology and speech applications with weighted automata Natural Language Processing LING/CSCI 5832 Mans Hulden Dept. of Linguistics mans.hulden@colorado.edu Feb 19 2014

  2. Overview (1) Recap unweighted finite automata and transducers (2) Extend to probabilistic weighted automata/transducers (3) See how these can be used in natural language applications + a brief look at speech applications

  3. RE: anatomy of a FSA. Regular expression: L = a b* c. Formal definition: Q = {0, 1, 2} (set of states); Σ = {a, b, c} (alphabet); q0 = 0 (initial state); F = {2} (set of final states); δ(0, a) = 1, δ(1, b) = 1, δ(1, c) = 2 (transition function). Graph representation: 0 --a--> 1, 1 --b--> 1 (self-loop), 1 --c--> 2. An FSA defines a set of strings.
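As a concrete illustration (not from the slides), here is a minimal Python sketch of this exact FSA; the dictionary-based encoding is an assumption, but the states, alphabet, and transitions are the ones defined above.

    Q = {0, 1, 2}                 # set of states
    SIGMA = {"a", "b", "c"}       # alphabet
    Q0 = 0                        # initial state
    F = {2}                       # set of final states
    DELTA = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}  # transition function

    def accepts(word):
        """Run the deterministic automaton; reject on a missing transition."""
        state = Q0
        for symbol in word:
            if (state, symbol) not in DELTA:
                return False
            state = DELTA[(state, symbol)]
        return state in F

    assert accepts("ac") and accepts("abbc") and not accepts("ab")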

  4. RE: anatomy of an FST. Formal definition: Q = {0, 1, 2, 3} (set of states); Σ = {a, b, c, d} (alphabet); q0 = 0 (initial state); F = {0, 1, 2} (set of final states); δ (transition function). The graph representation pairs an input and an output symbol on each arc (e.g. <a:b> maps a to b). An FST defines a string-to-string mapping.
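To make the string-to-string mapping concrete, here is a hedged Python sketch of applying an FST to an input; the three arcs are hypothetical (only the <a:b> arc is legible on the slide), but the mechanics are standard nondeterministic application.

    TRANS = [(0, "a", "b", 1), (1, "c", "c", 2), (1, "d", "d", 2)]  # (src, in, out, dst)
    INITIAL, FINALS = 0, {2}

    def apply(word):
        """Return the set of output strings the FST relates to `word`."""
        configs = {(INITIAL, "")}            # (state, output-so-far) pairs
        for symbol in word:
            configs = {(dst, out + o)
                       for (state, out) in configs
                       for (src, i, o, dst) in TRANS
                       if src == state and i == symbol}
        return {out for (state, out) in configs if state in FINALS}

    print(apply("ac"))   # {'bc'}: the <a:b> arc rewrites a to b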

  5. RE: composition. [Figure: a lexicon transducer composed with rule transducers maps NEG+possible+ity+NOUN+PLURAL to in+possible+ity+s, then (via an <n:m> assimilation rule) to im+possible+ity+s, then (via <l:i>, <e:l>, <+:i>, <i:t>, <t:y>, <y:0> arcs) to im+possibility+s, and finally (via <+:0>, deleting the remaining morpheme boundaries) to impossibilities.]

  6. Orthographic vs. phonetic representation. [Figure: the same cascade, but with a grapheme-to-phoneme (G2P) transducer at the bottom, so NEG+possible+ity+NOUN+PLURAL is realized phonetically as [ɪmpɑsəbɪlətis] rather than orthographically as impossibilities.]

  7. Noisy channel models. [Figure: a SOURCE produces the original word, a NOISY CHANNEL corrupts it into a noisy word, and a DECODER recovers a guess at the original word.] A general framework for thinking about spell checking, speech recognition, and other problems that involve decoding in probabilistic models. A similar problem to morphological 'decoding'.

  8. Example: spell checking. [Figure: the noisy channel diagram again.] Problem formulation: given an observation O and a vocabulary V, find ŵ = argmax_{w ∈ V} P(w | O). The function argmax_x f(x) returns the x for which f(x) is largest, so here it picks the vocabulary word most probable given the observation.

  9. Noisy channel models. [Figure: the noisy channel diagram again.] Problem formulation: ŵ = argmax_{w ∈ V} P(w | O). We can break P(w | O) down into other probabilities using Bayes' Rule: P(x | y) = P(y | x) P(x) / P(y). We can see how this helps by substituting it into the problem formulation, as the next slide shows.

  10. Noisy channel models. [Figure: the noisy channel diagram again.] Substituting Bayes' Rule into ŵ = argmax_{w ∈ V} P(w | O) gives ŵ = argmax_{w ∈ V} P(O | w) P(w) / P(O). The denominator P(O) is the same for every candidate w, so it can be dropped from the argmax.

  11. Noisy channel models. [Figure: the noisy channel diagram again.] To summarize, the most probable word ŵ given some observation O is ŵ = argmax_{w ∈ V} P(O | w) P(w), where P(O | w) is the likelihood (the error model) and P(w) is the prior (the language model). The following slides show how to compute these two terms.
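A toy Python sketch of this decoder (all probabilities below are invented for illustration): it scores each vocabulary word by error model times language model and returns the argmax.

    PRIOR = {"impossibilities": 1e-7, "impossibility": 1e-6}   # language model P(w)
    ERROR = {("impssblity", "impossibilities"): 1e-3,          # error model P(O|w)
             ("impssblity", "impossibility"): 1e-5}

    def decode(observation, vocab):
        """Return argmax over w in vocab of P(O|w) * P(w)."""
        return max(vocab, key=lambda w: ERROR.get((observation, w), 0.0) * PRIOR[w])

    print(decode("impssblity", PRIOR))   # -> impossibilities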

  12. Decoding. [Figure: the morphology/phonology cascade from slide 5 viewed as a noisy channel: NEG+possible+ity+NOUN+PLURAL → in+possible+ity+s → im+possible+ity+s → im+possibility+s → impossibilities, with the observed noisy word impssblity to be decoded back to impossibilities.]

  13. Decoding. [Figure: the channel now has two stages: non-probabilistic changes (morphology/phonology) followed by probabilistic changes (errors). Decoding maps the noisy word impssblity back to impossibilities.]

  14. Decoding/speech processing. [Figure: the same two-stage channel, illustrating that decoding is the central problem in speech processing as well: the word impossibilities must be recovered from a noisy signal.]

  15. Probabilistic automata. Intuition: probabilistic automata define probability distributions over strings; symbols have transition probabilities; states have final/halting probabilities; probabilities are multiplied along paths and summed across several parallel paths.
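The intuition can be made concrete in a few lines of Python (the automaton below is invented for illustration): probabilities multiply along a path and parallel paths are summed, which is exactly a forward pass.

    ARCS = [(0, "a", 0.3, 0), (0, "a", 0.2, 1), (1, "b", 0.5, 1)]  # (src, sym, prob, dst)
    FINAL = {0: 0.5, 1: 0.5}   # final/halting probability per state

    def string_prob(word):
        """Sum over all paths of the product of arc probs times the halting prob."""
        alpha = {0: 1.0}                      # probability mass per state
        for symbol in word:
            nxt = {}
            for (src, sym, p, dst) in ARCS:
                if sym == symbol and src in alpha:
                    nxt[dst] = nxt.get(dst, 0.0) + alpha[src] * p
            alpha = nxt
        return sum(mass * FINAL[q] for q, mass in alpha.items())

    print(string_prob("a"))   # 0.3*0.5 + 0.2*0.5 = 0.25: two parallel paths summed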

  16. Probabilistic automata. Intuition (continued). [Figure: an example probabilistic automaton.]

  17. Aside: HMMs and prob. automata. They are equivalent (though automata may be more compact). [Figure: a two-state HMM with per-state emission probabilities (e.g. [a 0.2] [b 0.8]) is converted to an equivalent probabilistic automaton by folding each state's emission probabilities into its incoming transition probabilities; e.g. a 0.9 transition into a state emitting a with probability 0.2 becomes an arc a/0.18.]
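A sketch of the conversion the figure illustrates, with invented state names and numbers chosen to reproduce two of the figure's arc weights:

    TRANSITION = {("q1", "q1"): 0.9, ("q1", "q2"): 0.1,
                  ("q2", "q1"): 0.3, ("q2", "q2"): 0.7}
    EMISSION = {("q1", "a"): 0.2, ("q1", "b"): 0.8,
                ("q2", "a"): 0.9, ("q2", "b"): 0.1}

    # Fold each destination state's emission probabilities into the incoming
    # transition: arc weight = transition(q, q') * emission(q', symbol).
    arcs = [(q, sym, round(TRANSITION[q, q2] * EMISSION[q2, sym], 4), q2)
            for (q, q2) in TRANSITION for sym in ("a", "b")]
    print(sorted(arcs))   # includes ('q1', 'a', 0.18, 'q1') and ('q1', 'b', 0.72, 'q1')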

  18. Probabilistic automata: from probabilistic to weighted. As always, we would prefer using (negative) log probabilities, since this makes calculations easier: -log(0.16) ≈ 1.8326, -log(0.84) ≈ 0.1744, -log(1) = 0, -log(0) = ∞. Since more probable events are now numerically smaller, we call these values weights.
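The conversion in Python, reproducing the slide's numbers (natural log, as the values indicate):

    import math

    for p in (0.16, 0.84, 1.0):
        print(p, "->", round(-math.log(p), 4))   # 1.8326, 0.1744, 0.0
    # -log(0) would be +infinity, so impossible events get weight float("inf").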

  19. Semirings. A semiring (K, ⊕, ⊗, 0, 1) = a ring that may lack negation.
  • Sum (⊕): to compute the weight of a sequence (sum of the weights of the paths labeled with that sequence).
  • Product (⊗): to compute the weight of a path (product of the weights of constituent transitions).

     Semiring      Set               ⊕       ⊗    0     1
     Boolean       {0, 1}            ∨       ∧    0     1
     Probability   R+                +       ×    0     1
     Log           R ∪ {−∞, +∞}      ⊕log    +    +∞    0
     Tropical      R ∪ {−∞, +∞}      min     +    +∞    0
     String        Σ* ∪ {∞}          ∧       ·    ∞     ε

  ⊕log is defined by x ⊕log y = −log(e^−x + e^−y), and ∧ is longest common prefix. The string semiring is a left semiring. Additional constraints: 0 and 1 must be identities for ⊕ and ⊗ respectively (s ⊕ 0 = s and s ⊗ 1 = s), and 0 must annihilate ⊗ (s ⊗ 0 = 0).
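One way (an assumption, not the slides' code) to encode the table's numeric semirings in Python is as (plus, times, zero, one) records; the next slide's computations then work generically.

    import math
    from collections import namedtuple

    Semiring = namedtuple("Semiring", "plus times zero one")

    PROBABILITY = Semiring(lambda x, y: x + y, lambda x, y: x * y, 0.0, 1.0)
    TROPICAL = Semiring(min, lambda x, y: x + y, math.inf, 0.0)
    LOG = Semiring(lambda x, y: -math.log(math.exp(-x) + math.exp(-y)),
                   lambda x, y: x + y, math.inf, 0.0)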

  20. Semirings: example. [Figure: a weighted automaton A with arcs a/1, b/1 and a/2, b/3 (among others, including b/4, c/3, c/5) and final weights /2.] Probability semiring (R+, +, ×, 0, 1): [[A]](ab) = 1 × 1 × 2 + 2 × 3 × 2 = 14. Tropical semiring (R+ ∪ {∞}, min, +, ∞, 0): [[A]](ab) = min(1 + 1 + 2, 3 + 2 + 2) = 4.
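Using the Semiring records sketched above, both of the slide's computations fall out of one generic path-evaluation routine (the weight sequences below, two arc weights plus a final weight per path, are read off the slide):

    def weight(paths, sr):
        """⊕-sum over paths of the ⊗-product of each path's weights."""
        total = sr.zero
        for path in paths:
            w = sr.one
            for x in path:
                w = sr.times(w, x)
            total = sr.plus(total, w)
        return total

    # Two accepting paths for "ab" under each semiring's weights:
    print(weight([(1, 1, 2), (2, 3, 2)], PROBABILITY))  # 1*1*2 + 2*3*2 = 14
    print(weight([(1, 1, 2), (3, 2, 2)], TROPICAL))     # min(1+1+2, 3+2+2) = 4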

  21. Formal definition. A weighted automaton over a semiring K is a tuple A = (Σ, Q, I, F, E, λ, ρ): a finite alphabet Σ; a finite set of states Q, with initial states I ⊆ Q and final states F ⊆ Q; a finite set of transitions E ⊆ Q × (Σ ∪ {ε}) × K × Q; an initial weight function λ : I → K; and a final weight function ρ : F → K. The function associated with A: [[A]](x) = the ⊕-sum, over all accepting paths π labeled with x, of λ(origin of π) ⊗ w(π) ⊗ ρ(destination of π), where w(π) is the ⊗-product of the transition weights along π.

  22. Weighted transducers. Intuition. [Figure: an example weighted transducer.]

  23. Weighted transducers: semirings. [Figure: a weighted transducer T with arcs a:ε/1, b:r/2, a:r/3, b:ε/2, c:s/1 and final weights /2.] Probability semiring (R+, +, ×, 0, 1): [[T]](ab, r) = 1 × 2 × 2 + 3 × 2 × 2 = 16. Tropical semiring (R+ ∪ {∞}, min, +, ∞, 0): [[T]](ab, r) = min(1 + 2 + 2, 3 + 2 + 2) = 5.

  24. Weighted transducers: formal definition. A weighted transducer over a semiring K is a tuple T = (Σ, Δ, Q, I, F, E, λ, ρ): finite input and output alphabets Σ and Δ; a finite set of states Q, with a set of initial states I ⊆ Q and a set of final states F ⊆ Q; a finite set of transitions E ⊆ Q × (Σ ∪ {ε}) × (Δ ∪ {ε}) × K × Q; and initial and final weight functions λ and ρ. It defines a weighted relation on Σ* × Δ*: [[T]](x, y) = the ⊕-sum, over all accepting paths π with input x and output y, of λ(origin of π) ⊗ w(π) ⊗ ρ(destination of π).

  25. Operations on weighted automata

  26. Union: example. [Figure: two weighted automata over {a, b, c} are united by adding a fresh initial state with ε/0 arcs into each operand's old initial state; the operands' arcs (e.g. a/3, b/2, a/5, b/1, b/7, b/6, a/4, c/1) and final weights (/0) are kept unchanged.]
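A hedged sketch of the figure's construction (the dict-based automaton encoding is an assumption): a fresh initial state with ε-arcs of weight 0 (the tropical semiring's one-element, so the arcs are weight-neutral) into each operand's old initial state.

    def union(a, b):
        """Union of two weighted automata with disjoint state sets.
        Each is {'initial': q, 'arcs': [(src, label, w, dst)], 'finals': {q: w}}."""
        return {
            "initial": "new",
            "arcs": a["arcs"] + b["arcs"] + [
                ("new", "eps", 0.0, a["initial"]),   # ε/0 arc into A
                ("new", "eps", 0.0, b["initial"]),   # ε/0 arc into B
            ],
            "finals": {**a["finals"], **b["finals"]},
        }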

  27. Composition. [Figure: if T maps x to y and U maps y to z, then T ∘ U maps x directly to z.]

  28. Composition. [Figure: the same diagram.] Weights are multiplicative: the weight of mapping x to z is ~ p(y|x) · p(z|y).

  29. Composition: example. A: 0 --a:a/3--> 1 --b:ε/1--> 2 --c:ε/4--> 3 --d:d/2--> 4. B: 0 --a:d/5--> 1 --ε:e/7--> 2 --d:a/6--> 3. A ∘ B: (0,0) --a:d/15--> (1,1) --b:e/7--> (2,2) --c:ε/4--> (3,2) --d:a/12--> (4,3).
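A sketch of pairwise composition for the ε-free arcs only (handling the b:ε and ε:e arcs above requires the usual ε-filter, which this sketch omits): states pair up, T's output symbol must match U's input symbol, and weights ⊗-multiply (× here, matching the slide's probability-semiring numbers).

    def compose(t_arcs, u_arcs):
        """Arcs are (src, in, out, weight, dst); returns arcs of T ∘ U over paired states."""
        return [((p, q), a, d, w1 * w2, (p2, q2))
                for (p, a, b, w1, p2) in t_arcs
                for (q, c, d, w2, q2) in u_arcs
                if b == c]                      # T's output matches U's input

    T = [(0, "a", "a", 3, 1)]
    U = [(0, "a", "d", 5, 1)]
    print(compose(T, U))   # [((0, 0), 'a', 'd', 15, (1, 1))], cf. a:d/15 above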
