
Simpler & More General Minimization for Weighted Finite-State Automata
The Minimization Problem. Input: A DFA (deterministic finite-state automaton). Output: An equiv. DFA with as few states as possible. Complexity: O(|arcs| log |states|).


  1. Simpler & More General Minimization for Weighted Finite-State Automata
     Jason Eisner, Johns Hopkins University
     May 28, 2003, HLT-NAACL

     First half of talk is setup: reviews past work.
     Second half gives outline of the new results.

     The Minimization Problem
     Input: A DFA (deterministic finite-state automaton)
     Output: An equiv. DFA with as few states as possible
     Complexity: O(|arcs| log |states|) (Hopcroft 1971)
     [Figure: a DFA with arcs labeled a and b. Represents the language {aab, abb, bab, bbb}.]
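To make the input/output contract concrete, here is a minimal sketch that builds an unminimized, trie-shaped DFA for the slide's language {aab, abb, bab, bbb} and merges equivalent states. It uses simple Moore-style partition refinement, not Hopcroft's O(|arcs| log |states|) algorithm cited on the slide; the trie construction and the names `build_trie_dfa` and `minimize` are assumptions made for illustration, not the slide's own figure.

```python
def build_trie_dfa(words):
    """Return (transitions, finals) for a trie-shaped DFA; state 0 is initial."""
    trans = {0: {}}
    finals = set()
    for word in words:
        state = 0
        for sym in word:
            if sym not in trans[state]:
                new_state = len(trans)
                trans[state][sym] = new_state
                trans[new_state] = {}
            state = trans[state][sym]
        finals.add(state)
    return trans, finals


def minimize(trans, finals, alphabet):
    """Moore-style refinement; returns a map from state to equivalence class.

    A missing arc is treated as leading to an implicit dead state (None).
    """
    block = {q: int(q in finals) for q in trans}
    while True:
        # Signature: the state's own block plus the blocks its arcs lead to.
        sig = {q: (block[q],) + tuple(block.get(trans[q].get(a)) for a in alphabet)
               for q in trans}
        ids = {}
        new_block = {q: ids.setdefault(sig[q], len(ids)) for q in trans}
        if len(ids) == len(set(block.values())):
            return new_block  # no block was split, so the partition is stable
        block = new_block


trans, finals = build_trie_dfa(["aab", "abb", "bab", "bbb"])
classes = minimize(trans, finals, "ab")
print(len(trans), "states before;", len(set(classes.values())), "after")
# prints: 11 states before; 4 after
```

Merging each equivalence class into one state gives the smaller machine; the 4 classes correspond to "0, 1, 2, or 3 symbols read".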

  2. The Minimization Problem
     Input: A DFA (deterministic finite-state automaton)
     Output: An equiv. DFA with as few states as possible
     Complexity: O(|arcs| log |states|) (Hopcroft 1971)
     [Figure: the same DFA. Some states are marked "Mergeable because they have the same suffix language": {ab, bb} for one group, {b} for another.]
     An equivalence relation on states … merge the equivalence classes.
     Can't always work backward from final state like this. A bit more complicated because of cycles. Don't worry about it for this talk.

     Q: Why minimize # states, rather than # arcs?
     A: Minimizing # states also minimizes # arcs!
     Q: What if the input is an NDFA (nondeterministic)?
     A: Determinize it first. (Could yield exponential blowup.)
     Q: How about minimizing an NDFA to an NDFA?
     A: Yes, could be exponentially smaller ☺, but problem is PSPACE-complete so we don't try.

     Here's what you should worry about:

     Real-World NLP: Automata With Weights or Outputs
     Finite-state computation of functions
     - Concatenate strings: arcs a:w, b:wx, c:wz, d:ε; abd → wwx, acd → wwz
     - Add scores: arcs a:2, b:3, c:7, d:0; abd → 5, acd → 9
     - Multiply probabilities: arcs a:0.2, b:0.3, c:0.7, d:1; abd → 0.06, acd → 0.14

     Real-World NLP: Automata With Weights or Outputs
     - Want to compute functions on strings: Σ* → K
       - After all, we're doing language and speech!
     - Finite-state machines can often do the job
       - Easy to build, easy to combine, run fast
     - Build them with weighted regular expressions
     - To clean up the resulting DFA, minimize it to merge redundant portions
       - This smaller machine is faster to intersect/compose
       - More likely to fit on a hand-held device
       - More likely to fit into cache memory

     How do we minimize such DFAs?
     - Didn't Mohri already answer this question?
       - Only for special cases of the output set K!
     - Is there a general recipe?
     - What new algorithms can we cook with it?
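The three example machines above differ only in what the weights are and how they are combined. A minimal sketch, assuming the four arcs form the topology 0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3 (inferred from the listed input/output pairs, not stated on the slide); `machine` and `path_weight` are hypothetical helper names:

```python
import operator


def machine(wa, wb, wc, wd):
    # Assumed topology: 0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3.
    return {(0, "a"): (1, wa), (1, "b"): (2, wb),
            (1, "c"): (2, wc), (2, "d"): (3, wd)}


def path_weight(arcs, times, one, text, start=0):
    """Combine the arc weights along the unique path for `text` using `times`."""
    state, total = start, one
    for sym in text:
        state, weight = arcs[(state, sym)]
        total = times(total, weight)
    return total


# (machine, combining operation, identity element) for the three slide examples.
examples = {
    "strings": (machine("w", "wx", "wz", ""), operator.add, ""),  # concatenation
    "scores": (machine(2, 3, 7, 0), operator.add, 0),             # addition
    "probs": (machine(0.2, 0.3, 0.7, 1.0), operator.mul, 1.0),    # multiplication
}
results = {name: [path_weight(arcs, times, one, s) for s in ("abd", "acd")]
           for name, (arcs, times, one) in examples.items()}
print(results)
```

Because the machine is deterministic, each input follows at most one path, so a single combining operation is all the evaluation needs.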

  3. Weight Algebras
     - Specify a weight algebra (K, ⊗)
     - Define DFAs over (K, ⊗)
       - Arcs have weights in set K
       - A path's weight is also in K: multiply its arc weights with ⊗
     - Examples:
       - (strings, concatenation)
       - (scores, addition)
       - (probabilities, multiplication)
       - (score vectors, addition): OT phonology
       - (real weights, multiplication): conditional random fields, rational kernels
       - (objective func & gradient, product-rule multiplication): training the parameters of a model
       - (bit vectors, conjunction): membership in multiple languages at once

     Q: Semiring is (K, ⊕, ⊗). Why aren't you talking about ⊕ too?
     A: Minimization is about DFAs. At most one path per input. So no need to ⊕ the weights of multiple accepting paths.

     Shifting Outputs Along Paths
     - Doesn't change the function computed: abd → wwx, acd → wwz
       Original arcs:     a:w,  b:wx,  c:wz,  d:ε
       w shifted onto a:  a:ww, b:x,   c:z,   d:ε
       w shifted off a:   a:ε,  b:wwx, c:wwz, d:ε
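The claim that shifting does not change the function can be checked mechanically for the string example. This is a minimal sketch under an assumed topology (0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3); `transduce` and `machine` are illustrative names, not part of the talk.

```python
def transduce(arcs, text, start=0):
    """Concatenate the outputs along the unique path that reads `text`."""
    state, out = start, ""
    for sym in text:
        state, piece = arcs[(state, sym)]
        out += piece
    return out


def machine(a_out, b_out, c_out, d_out=""):
    # Assumed topology: 0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3.
    return {(0, "a"): (1, a_out), (1, "b"): (2, b_out),
            (1, "c"): (2, c_out), (2, "d"): (3, d_out)}


original = machine("w", "wx", "wz")
w_onto_a = machine("ww", "x", "z")     # prefix w moved from b and c onto a
w_off_a = machine("", "wwx", "wwz")    # w moved from a onto b and c

results = [[transduce(m, s) for s in ("abd", "acd")]
           for m in (original, w_onto_a, w_off_a)]
print(results)
# prints: [['wwx', 'wwz'], ['wwx', 'wwz'], ['wwx', 'wwz']]
```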

  4. Shifting Outputs Along Paths
     - Doesn't change the function computed: abd → 5, acd → 9
       Original arcs: a:2,   b:3,   c:7,   d:0
       Shift by 1:    a:2+1, b:3-1, c:7-1, d:0   (a:3, b:2, c:6)
       Shift by 2:    a:2+2, b:3-2, c:7-2, d:0   (a:4, b:1, c:5)
       Shift by 3:    a:2+3, b:3-3, c:7-3, d:0   (a:5, b:0, c:4)

     Shifting Outputs Along Paths
     - Doesn't change the function computed: abd → wwx, acd → wwz, …ebd → uwx, …ecd → uwz
       Arcs: a:w, e:u, b:wx, c:wz, d:ε
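The arithmetic behind the numeric shifts can be written once for an arbitrary shift amount. Here k is notation introduced for this note (the slides show k = 1, 2, 3):

```latex
\begin{align*}
\text{abd}: &\quad (2 + k) + (3 - k) + 0 = 5 \\
\text{acd}: &\quad (2 + k) + (7 - k) + 0 = 9
\end{align*}
```

Whatever is added on the in-arc is subtracted on every out-arc, so each path total is unchanged. When a state has a second in-arc (such as e:u in the string example), that arc presumably must receive the same shift, or paths entering through it would change.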

  5. Shifting Outputs Along Paths
     - State sucks back a prefix from its out-arcs and deposits it at end of its in-arcs.
       Before: a:w,  e:u,  b:wx, c:wz, d:ε
       After:  a:ww, e:uw, b:x,  c:z,  d:ε
       abd → wwx, acd → wwz, …ebd → uwx, …ecd → uwz

     Shifting Outputs Along Paths
     - Same pushback when the state also has a self-loop b:wx:
       Before: a:w,  e:u,  loop b:wx, b:wx, c:wz, d:ε
       After:  a:ww, e:uw, loop b:xw, b:x,  c:z,  d:ε
       …ab^n bd → u(wx)^n wx = uw(xw)^n x
       …ab^n cd → u(wx)^n wz = uw(xw)^n z
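The pushback step, including the self-loop case, can be sketched as a single function over named arcs. This is an illustration under assumptions: arcs are keyed by a made-up name rather than by input symbol (the loop is called "loop" to keep it apart from the exiting b arc), the state numbers are invented, and `push_back` is a hypothetical helper, not code from the talk.

```python
def push_back(arcs, state, prefix):
    """Strip `prefix` from every out-arc of `state`; append it to every in-arc."""
    pushed = {}
    for name, (src, dst, out) in arcs.items():
        if src == state:
            assert out.startswith(prefix), "every out-arc must start with the prefix"
            out = out[len(prefix):]
        if dst == state:
            out += prefix  # a self-loop is both an out-arc and an in-arc
        pushed[name] = (src, dst, out)
    return pushed


def output(arcs, path):
    """Concatenate arc outputs along a path given as a list of arc names."""
    return "".join(arcs[name][2] for name in path)


# name -> (source state, target state, output string); state numbers are assumed.
arcs = {
    "a": (0, 1, "w"), "e": (9, 1, "u"),
    "loop": (1, 1, "wx"),
    "b": (1, 2, "wx"), "c": (1, 2, "wz"), "d": (2, 3, ""),
}
pushed = push_back(arcs, 1, "w")

# u(wx)^n wx must equal uw(xw)^n x, and likewise for the c arc.
unchanged = all(
    output(arcs, ["e"] + ["loop"] * n + [last, "d"])
    == output(pushed, ["e"] + ["loop"] * n + [last, "d"])
    for n in range(5) for last in ("b", "c")
)
print(pushed["loop"][2], unchanged)
# prints: xw True
```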

  6. Shifting Outputs Along Paths (Mohri)
     - Here, not all the out-arcs start with w
     - But all the out-paths start with w
     - Do pushback at later states first: now we're ok!
       Before:                        a:w,  e:u,  b:wx, c:ε, d:ε; later state: b:wz, d:w
       After pushback at later state: a:w,  e:u,  b:wx, c:w, d:ε; later state: b:zw, d:ε
       After pushback at this state:  a:ww, e:uw, b:x,  c:ε, d:ε; later state: b:zw, d:ε
       [Figure also shows an arc labeled ε:ε.]
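The ordering issue can be reproduced on a small machine. The sketch below is only an illustration of why later states go first; it is not Mohri's algorithm, and the topology (state 1 with out-arcs b:wx and c:ε, a later state 3 with loop wz and exit arc w) is a guess reconstructed from the slide's arc labels. Arc names, state numbers, and `push_back` are all hypothetical.

```python
def push_back(arcs, state, prefix):
    """Move `prefix` from the out-arcs of `state` to its in-arcs, or raise ValueError."""
    pushed = {}
    for name, (src, dst, out) in arcs.items():
        if src == state:
            if not out.startswith(prefix):
                raise ValueError(f"out-arc {name!r} does not start with {prefix!r}")
            out = out[len(prefix):]
        if dst == state:
            out += prefix
        pushed[name] = (src, dst, out)
    return pushed


# name -> (source, target, output); an assumed reconstruction of the figure.
arcs = {
    "a": (0, 1, "w"), "e": (9, 1, "u"),
    "b": (1, 2, "wx"), "d": (2, 4, ""),
    "c": (1, 3, ""), "loop": (3, 3, "wz"), "d2": (3, 4, "w"),
}

# Pushing at state 1 first fails: out-arc c has empty output.
try:
    push_back(arcs, 1, "w")
    early_ok = True
except ValueError:
    early_ok = False

# Pushing at the later state 3 first turns c's output into w, so state 1 then works.
later_first = push_back(push_back(arcs, 3, "w"), 1, "w")
print(early_ok, {name: out for name, (_, _, out) in later_first.items()})
```

After both steps the arcs read a:ww, e:uw, b:x, c:ε, loop zw, and both d arcs ε, which agrees with the final labels recoverable from the slide.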
