semiring parsing

Semiring parsing, Arnd Hartmanns (PowerPoint presentation)



  1. Semiring parsing Arnd Hartmanns

  2. Probabilistic grammars: Motivation
     Natural language is ambiguous.
     Sentence: "I saw the man with the telescope."


  4. Probabilistic grammars: Probabilistic context-free grammars
     Context-free grammar G = (N, T, S, R) + probability distribution on derivations,
     e.g. P(tree 1) = 0.001, but P(tree 2) = 0.000001
     [Figure: two derivation trees rooted in S, each with the leaf "telescope"]
     Use p : R → [0,1] s.t. ∀ A ∈ N : Σ_{(A → β) ∈ R} p(A → β) = 1
     and get P(tree) = Π_{(A → β) ∈ tree} p(A → β)

  5. Probabilistic grammars: Example PCFG
     A → A P
     A → P A
     A → P P
     P → I saw | the man | with the telescope
     [Figure: the two parse trees of "I saw the man with the telescope", next to the corresponding trees over "a a a"]

  6. Probabilistic grammars: Example PCFG
     A → A a    p(A → Aa) = 0.4
     A → a A    p(A → aA) = 0.1
     A → a a    p(A → aa) = 0.5
     [Figure: the two parse trees of "a a a"]
     Lower probability (uses A → aA): P = 0.1 × 0.5 = 0.05
     Higher probability (uses A → Aa): P = 0.4 × 0.5 = 0.2
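
The two products on this slide can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout and the function names `is_normalised` and `derivation_probability` are illustrative choices, not part of the slides. It checks that the rule probabilities of A sum to 1 and multiplies the probabilities of the rules used in each derivation.

```python
from math import isclose, prod

# Rule probabilities from the slide; a rule is a (lhs, rhs) pair.
p = {
    ("A", ("A", "a")): 0.4,
    ("A", ("a", "A")): 0.1,
    ("A", ("a", "a")): 0.5,
}

def is_normalised(p):
    """Check that the rule probabilities of every nonterminal sum to 1."""
    totals = {}
    for (lhs, _rhs), prob in p.items():
        totals[lhs] = totals.get(lhs, 0.0) + prob
    return all(isclose(total, 1.0) for total in totals.values())

def derivation_probability(derivation, p):
    """Multiply the probabilities of all rules used in a derivation."""
    return prod(p[rule] for rule in derivation)

# The two derivations of "a a a".
left = [("A", ("A", "a")), ("A", ("a", "a"))]    # (a a) a
right = [("A", ("a", "A")), ("A", ("a", "a"))]   # a (a a)

print(is_normalised(p))
print(derivation_probability(left, p), derivation_probability(right, p))
```

The left-branching derivation comes out at about 0.2 and the right-branching one at about 0.05, matching the "higher" and "lower" probability trees.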

  7. Probabilistic grammars: Interesting values
     Interesting values: inside probability, Viterbi, Viterbi-derivation, Viterbi-n-best, outside probability
     Example calculations
     Input: . a . a . a . (positions 1 2 3 4)
     Telescope grammar: p(A → Aa) = 0.4, p(A → aA) = 0.1, p(A → aa) = 0.5
     inside(1, A, 4) = P(tree 1) + P(tree 2) = 0.2 + 0.05 = 0.25
     viterbi(1, A, 4) = 0.2
     viterbi-derivation(1, A, 4) = [Figure: the higher-probability parse tree]
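
Given the two derivation probabilities, the values named on this slide differ only in how the derivations are combined. The following sketch assumes the two probabilities 0.2 and 0.05 from the example; the dictionary keys are descriptive labels made up for illustration.

```python
# Probabilities of the two derivations of "a a a", taken from the example.
derivations = {
    "left-branching (A → Aa, A → aa)": 0.2,
    "right-branching (A → aA, A → aa)": 0.05,
}

# Inside probability: sum over all derivations.
inside = sum(derivations.values())

# Viterbi: probability of the best derivation.
viterbi = max(derivations.values())

# Viterbi-derivation: the best derivation itself.
viterbi_derivation = max(derivations, key=derivations.get)

# Viterbi-n-best: the n highest probabilities, here with n = 2.
n_best = sorted(derivations.values(), reverse=True)[:2]

print(inside, viterbi, viterbi_derivation, n_best)
```

Enumerating derivations explicitly like this only works for tiny examples; the point of the later slides is to obtain the same values from the parser chart.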

  8. Semirings?

  9. Extending CKY: CKY parsing
     Input: w_1 … w_n; Goal item: [1,S,n+1]
     [Figure: S spanning w_1 … w_n]
     Beyond recognition
     [i,A,k] provable ⇔ V[i,A,k] = true
     [Figure: A with children B and C; the leaves are labelled w_1 … w_m … w_k]

  10. Extending CKY: CKY parsing
      Input: w_1 … w_n; Goal item: [1,S,n+1]
      Rules:
        from (A → w_i) ∈ R infer [i,A,i+1]
        from (A → BC) ∈ R, [i,B,m] and [m,C,k] infer [i,A,k]
      Beyond recognition
      Unary rule: (A → w_i) ∈ R ⇒ V[i,A,i+1] = true
      Binary rule: (A → BC) ∈ R ⇒ V[i,A,k] = V[i,A,k] ∨ (V[i,B,m] ∧ V[m,C,k])
      success = V[1,S,n+1]
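
The two update rules translate almost literally into a Boolean chart. The sketch below is one possible rendering, not the author's code: the function name `cky_recognise` and the rule encoding are made up, and the example grammar (T → a, A → AT | TT | TA) is an assumption borrowed from the later derivation slides so that all binary rules have two nonterminals on the right.

```python
from collections import defaultdict

def cky_recognise(words, unary, binary, start):
    """Boolean CKY: V[i, A, k] is True iff item [i, A, k] is provable.

    Positions are 1-based as on the slide, so the goal item is [1, start, n+1].
    """
    n = len(words)
    V = defaultdict(bool)  # items that were never derived count as False
    # Unary rule: (A -> w_i) in R  =>  V[i, A, i+1] = True
    for i, w in enumerate(words, start=1):
        for (A, terminal) in unary:
            if terminal == w:
                V[i, A, i + 1] = True
    # Binary rule, shortest spans first so that sub-items are final when used.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            k = i + length
            for m in range(i + 1, k):
                for (A, B, C) in binary:
                    V[i, A, k] = V[i, A, k] or (V[i, B, m] and V[m, C, k])
    return V[1, start, n + 1]

# Assumed example grammar: T -> a, A -> A T | T T | T A.
unary = [("T", "a")]
binary = [("A", "A", "T"), ("A", "T", "T"), ("A", "T", "A")]

print(cky_recognise(["a", "a", "a"], unary, binary, "A"))
print(cky_recognise(["a"], unary, binary, "A"))
```

With this grammar the first call succeeds and the second fails, because a single "a" can only be derived from T.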

  11. Extending CKY: CKY parsing
      Input: w_1 … w_n; Goal item: [1,S,n+1]
      Rules:
        from (A → w_i) ∈ R infer [i,A,i+1]
        from (A → BC) ∈ R, [i,B,m] and [m,C,k] infer [i,A,k]
      Beyond recognition: replace ∨ by + and ∧ by ×
      Unary rule: (A → w_i) ∈ R ⇒ V[i,A,i+1] = p(A → w_i)
      Binary rule: (A → BC) ∈ R ⇒ V[i,A,k] = V[i,A,k] + (V[i,B,m] × V[m,C,k] × p(A → BC))
      inside = V[1,S,n+1]
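
Swapping ∨ for + and ∧ for × turns the recogniser into an inside-probability computation. The following is a minimal sketch under stated assumptions: the function name `cky_inside` is hypothetical, and because the example grammar A → Aa | aA | aa has terminals inside binary rules, the code uses an assumed equivalent with a preterminal T, where p(T → a) = 1 and the probabilities 0.4, 0.1, 0.5 are attached to A → AT, A → TA, A → TT.

```python
from collections import defaultdict

def cky_inside(words, unary, binary, start):
    """Inside probabilities with CKY; items never derived have value 0.0."""
    n = len(words)
    V = defaultdict(float)
    # Unary rule: V[i, A, i+1] = p(A -> w_i)
    for i, w in enumerate(words, start=1):
        for (A, terminal), prob in unary.items():
            if terminal == w:
                V[i, A, i + 1] = prob
    # Binary rule: V[i, A, k] += V[i, B, m] * V[m, C, k] * p(A -> B C)
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            k = i + length
            for m in range(i + 1, k):
                for (A, B, C), prob in binary.items():
                    V[i, A, k] += V[i, B, m] * V[m, C, k] * prob
    return V[1, start, n + 1]

# Assumed preterminal version of the example grammar (T -> a is an addition).
unary = {("T", "a"): 1.0}
binary = {
    ("A", "A", "T"): 0.4,
    ("A", "T", "A"): 0.1,
    ("A", "T", "T"): 0.5,
}

print(cky_inside(["a", "a", "a"], unary, binary, "A"))
```

For the input a a a this yields approximately 0.25, the sum 0.2 + 0.05 of the two derivation probabilities.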

  12. Semirings: Semiring definition
      Recall: field (+, −x, ×, x⁻¹, 0, 1) → ring (no x⁻¹) → semiring (no −x either)
      Complete semiring: the infinite sum ⊕_{i=1}^{∞} x_i is well-defined
      Some semirings
      Natural numbers: ⟨ℕ[0,∞], +, ×, 0, 1⟩
      Reals with max: ⟨ℝ[0,1], max, ×, 0, 1⟩
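
A semiring can be written down as a small record of two operations and two constants, and its axioms can be spot-checked on sample values. This is a sketch for illustration only: the `Semiring` class and `satisfies_axioms` are hypothetical names, and testing a handful of samples does not prove the axioms in general.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]    # ⊕
    times: Callable[[Any, Any], Any]   # ⊗
    zero: Any                          # neutral for ⊕, annihilator for ⊗
    one: Any                           # neutral for ⊗

def satisfies_axioms(sr, samples):
    """Check the semiring axioms on all triples of the given sample values."""
    for x, y, z in product(samples, repeat=3):
        ok = (
            sr.plus(x, y) == sr.plus(y, x)
            and sr.plus(sr.plus(x, y), z) == sr.plus(x, sr.plus(y, z))
            and sr.times(sr.times(x, y), z) == sr.times(x, sr.times(y, z))
            # ⊗ distributes over ⊕ from both sides
            and sr.times(x, sr.plus(y, z)) == sr.plus(sr.times(x, y), sr.times(x, z))
            and sr.times(sr.plus(x, y), z) == sr.plus(sr.times(x, z), sr.times(y, z))
            # neutral elements and annihilation by zero
            and sr.plus(x, sr.zero) == x
            and sr.times(x, sr.one) == x == sr.times(sr.one, x)
            and sr.times(x, sr.zero) == sr.zero == sr.times(sr.zero, x)
        )
        if not ok:
            return False
    return True

# The two semirings from the slide (finite samples only).
naturals = Semiring(lambda x, y: x + y, lambda x, y: x * y, 0, 1)
max_times = Semiring(max, lambda x, y: x * y, 0.0, 1.0)

print(satisfies_axioms(naturals, [0, 1, 2, 3]))
print(satisfies_axioms(max_times, [0.0, 0.25, 0.5, 1.0]))
```

Note that neither example needs subtraction or division, which is exactly what separates a semiring from a ring or a field.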

  13. Extending CKY: Derivations
      [Figure: grammar derivation tree for "a a a" (rules A → AT, A → TT, T → a) next to the parser's item derivation tree over the items [1,T,2], [2,T,3], [3,T,4], [1,A,3], [1,A,4]]
      Derivation values
      Grammar: Multiply all rule values

  14. Extending CKY: Derivations
      [Figure: item derivation tree. Root [1,A,4] via A → AT with children [1,A,3] and [3,T,4]; [1,A,3] via A → TT with children [1,T,2] and [2,T,3]; each [i,T,i+1] via T → a]
      Derivation values
      Grammar: Multiply all rule values
      Parser: Multiply rule values recursively via item values

  15. Semiring computations: Notations
      Value of a rule: R(A → BC), taken from the semiring
      Grammar derivation: E = e_1 … e_m, a list of rules
      Item derivation tree: D = D_1 … D_m, whose leaves are rules
      Value of a derivation
        Grammar: V_G(E) = ⨂_{i=1}^{m} R(e_i)
        Parser: V(D) = R(D) if D is a leaf node, V(D) = ⨂_{i=1}^{m} V(D_i) if D is an inner node; hence V(D) = ⨂_{d leaf of D} R(d)
      Word, derivable by E_1 … E_k: V_G = ⊕_{j=1}^{k} V_G(E_j)
      Item x, heading D_1 … D_k: V(x) = ⊕_{j=1}^{k} V(D_j)
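
The grammar-side and parser-side definitions can be compared directly on one derivation. The sketch below assumes the inside semiring (⊗ is ordinary multiplication) and hypothetical rule values; the nested-list encoding of the item derivation tree, with the rule as the first leaf of every inner node, is an illustrative choice and not taken from the slides.

```python
from functools import reduce

# Hypothetical rule values R(.) in the inside semiring.
R = {"A → AT": 0.4, "A → TT": 0.5, "T → a": 1.0}

def times(x, y):
    """The semiring's ⊗, here ordinary multiplication."""
    return x * y

ONE = 1.0  # neutral element of ⊗

def grammar_value(E):
    """V_G(E): ⊗ over R(e_i) for the rule list E = e_1 … e_m."""
    return reduce(times, (R[e] for e in E), ONE)

def tree_value(D):
    """V(D): R(D) at a leaf (a rule), ⊗ over the children's values otherwise."""
    if isinstance(D, str):  # leaf node: a rule
        return R[D]
    return reduce(times, (tree_value(child) for child in D), ONE)

# Grammar derivation of "a a a" as a list of rules.
E = ["A → AT", "A → TT", "T → a", "T → a", "T → a"]

# Item derivation tree for [1,A,4]: each inner node lists its rule first,
# followed by the derivation trees of the items it combines.
D = ["A → AT", ["A → TT", ["T → a"], ["T → a"]], ["T → a"]]

print(grammar_value(E), tree_value(D))
```

Both calls multiply the same rule values in the same left-to-right order, so they agree; this ordering matters once ⊗ is not commutative, as in the derivation forest semiring.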

  16. Semirings: Useful semirings
      Recognition: ⟨{true, false}, ∨, ∧, false, true⟩
      Derivation number: ⟨ℕ[0,∞], +, ×, 0, 1⟩
      Derivation forest: ⟨2^𝔽, ∪, ∙, ∅, {⟨⟩}⟩
      Inside probability: ⟨ℝ[0,∞], +, ×, 0, 1⟩
      Viterbi: ⟨ℝ[0,1], max, ×, 0, 1⟩
      Viterbi-derivation: ⟨ℝ[0,1] × 2^𝔽, max_Vit, ×_Vit, ⟨0, ∅⟩, ⟨1, {⟨⟩}⟩⟩
      Viterbi-n-best: way too complicated…
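
Four of these semirings can be plugged into a single CKY routine that only ever calls ⊕ and ⊗. The sketch below is one way to set this up, not the implementation behind the slides: `Semiring`, `cky` and `rule_value` are hypothetical names, and the grammar with preterminal T (p(T → a) = 1) is an assumed binarised form of the example grammar.

```python
from collections import defaultdict
from dataclasses import dataclass
from operator import add, mul
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]
    zero: Any
    one: Any

def cky(words, unary, binary, start, sr, rule_value):
    """CKY over an arbitrary semiring; rule_value maps a rule to its value R(rule)."""
    n = len(words)
    V = defaultdict(lambda: sr.zero)  # items never derived have value zero
    for i, w in enumerate(words, start=1):
        for rule in unary:
            A, terminal = rule
            if terminal == w:
                V[i, A, i + 1] = sr.plus(V[i, A, i + 1], rule_value(rule))
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            k = i + length
            for m in range(i + 1, k):
                for rule in binary:
                    A, B, C = rule
                    term = sr.times(rule_value(rule), sr.times(V[i, B, m], V[m, C, k]))
                    V[i, A, k] = sr.plus(V[i, A, k], term)
    return V[1, start, n + 1]

recognition = Semiring(lambda x, y: x or y, lambda x, y: x and y, False, True)
counting = Semiring(add, mul, 0, 1)
inside = Semiring(add, mul, 0.0, 1.0)
viterbi = Semiring(max, mul, 0.0, 1.0)

# Assumed binarised example grammar with its probabilities.
P = {("T", "a"): 1.0, ("A", "A", "T"): 0.4, ("A", "T", "A"): 0.1, ("A", "T", "T"): 0.5}
unary = [("T", "a")]
binary = [("A", "A", "T"), ("A", "T", "A"), ("A", "T", "T")]
words = ["a", "a", "a"]

print(cky(words, unary, binary, "A", recognition, lambda rule: True))
print(cky(words, unary, binary, "A", counting, lambda rule: 1))
print(cky(words, unary, binary, "A", inside, P.get))
print(cky(words, unary, binary, "A", viterbi, P.get))
```

For a a a the four runs report that the input is recognised, that it has 2 derivations, an inside probability of about 0.25 and a Viterbi probability of about 0.2. Only the semiring and the rule values change between the calls.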

  17. Semiring computations: Derivation forest example
      Semiring: ⟨2^𝔽, ∪, ∙, ∅, {⟨⟩}⟩
      Input: . a . a . a . (positions 1 2 3 4)
      Rule: from (T → a) infer [i,T,i+1]
      V([1,T,2]) = {⟨T → a⟩}
      V([2,T,3]) = {⟨T → a⟩}
      V([3,T,4]) = {⟨T → a⟩}

  18. Semiring computations: Derivation forest example
      Semiring: ⟨2^𝔽, ∪, ∙, ∅, {⟨⟩}⟩
      Input: . a . a . a . (positions 1 2 3 4)
      Rule: from (A → TT), [i,T,m] and [m,T,k] infer [i,A,k]
      V([1,T,2]) = {⟨T → a⟩}
      V([2,T,3]) = {⟨T → a⟩}
      V([3,T,4]) = {⟨T → a⟩}
      V([1,A,3]) = {⟨A → TT, T → a, T → a⟩}
      V([2,A,4]) = {⟨A → TT, T → a, T → a⟩}

  19. Semiring computations: Derivation forest example
      Semiring: ⟨2^𝔽, ∪, ∙, ∅, {⟨⟩}⟩
      Input: . a . a . a . (positions 1 2 3 4)
      Rule: from (A → TA), [i,T,m] and [m,A,k] infer [i,A,k]
      V([1,T,2]) = {⟨T → a⟩}
      V([2,T,3]) = {⟨T → a⟩}
      V([3,T,4]) = {⟨T → a⟩}
      V([1,A,3]) = {⟨A → TT, T → a, T → a⟩}
      V([2,A,4]) = {⟨A → TT, T → a, T → a⟩}
      V([1,A,4]) = {⟨A → AT, A → TT, T → a, T → a, T → a⟩} ∪ {⟨A → TA, T → a, A → TT, T → a, T → a⟩}
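
The item values of this example can be recomputed with sets of rule tuples. This is a sketch under assumptions: a derivation is modelled as a Python tuple of rule names, ∪ as set union and ∙ as pairwise concatenation, and each item value is formed as rule value ∙ left item ∙ right item, which gives leftmost-derivation order. The helper names `union`, `concat` and `rule` are made up for illustration.

```python
def union(X, Y):
    """⊕ of the derivation forest semiring: set union."""
    return X | Y

def concat(X, Y):
    """⊗: every derivation in X followed by every derivation in Y."""
    return frozenset(x + y for x in X for y in Y)

ZERO = frozenset()        # ∅, no derivation at all
ONE = frozenset({()})     # {⟨⟩}, only the empty derivation

def rule(name):
    """Value of a single rule: the forest containing just that rule."""
    return frozenset({(name,)})

# Unary items.
V_1T2 = V_2T3 = V_3T4 = rule("T → a")

# Items built with A → TT.
V_1A3 = concat(rule("A → TT"), concat(V_1T2, V_2T3))
V_2A4 = concat(rule("A → TT"), concat(V_2T3, V_3T4))

# The goal item has two derivations, one via A → AT and one via A → TA.
V_1A4 = union(
    concat(rule("A → AT"), concat(V_1A3, V_3T4)),
    concat(rule("A → TA"), concat(V_1T2, V_2A4)),
)

for derivation in sorted(V_1A4):
    print(derivation)
```

Because concatenation is not commutative, the order in which rule and item values are multiplied determines the order of rules inside each derivation.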

  20. Semiring parsing: Beyond CKY
      Works for many parsers, e.g. Earley, but also for TAGs
      Omissions
      Outside values: complicated, but similar proofs
      Infinite summation (⊕ up to ∞) for A → A: semiring-dependent
      Further reading
      Joshua Goodman: Semiring parsing …and his Ph.D. thesis


  22. Semiring parsing: Summary
      Natural language processing problems
      Probabilistic grammars, e.g. p(A → Aa) = 0.4
      Inside probability, Viterbi, …
      Semiring operation substitution (⊕, ⊗)
      One parser, many values

