
Simpler & More General Minimization for Weighted Finite-State Automata
Jason Eisner, Johns Hopkins University
May 28, 2003, HLT-NAACL

First half of talk is setup: reviews past work.
Second half gives outline of the new results.

The Minimization Problem
• Input: A DFA (deterministic finite-state automaton)
• Output: An equiv. DFA with as few states as possible
• Complexity: O(|arcs| log |states|) (Hopcroft 1971)
[Figure: DFA with arcs labeled a and b]
Represents the language {aab, abb, bab, bbb}
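To make the problem statement concrete, here is a minimal sketch of state merging by partition refinement. It uses the simple Moore-style refinement loop, not Hopcroft's O(|arcs| log |states|) algorithm cited on the slide. The 11-state trie for {aab, abb, bab, bbb} and the names `minimize`, `delta`, and `classes` are assumptions made for illustration, since the slide's figure is not recoverable.

```python
def minimize(states, alphabet, delta, finals):
    """Return a dict mapping each state to the id of its equivalence class.

    delta is a partial transition function: dict (state, symbol) -> state.
    """
    # Coarsest guess: accepting states vs. non-accepting states.
    block = {q: q in finals for q in states}
    while True:
        ids = {}
        refined = {}
        for q in states:
            # Two states stay together only if they are in the same block and
            # every symbol leads them to the same block (None = no arc).
            signature = (block[q],) + tuple(
                block.get(delta.get((q, a))) for a in alphabet
            )
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return refined  # no block was split, so the partition is stable
        block = refined


# Hypothetical input: an unminimized trie DFA for {aab, abb, bab, bbb}.
delta = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 3, (1, "b"): 4,
    (2, "a"): 5, (2, "b"): 6,
    (3, "b"): 7, (4, "b"): 8, (5, "b"): 9, (6, "b"): 10,
}
classes = minimize(range(11), "ab", delta, finals={7, 8, 9, 10})
num_classes = len(set(classes.values()))
```

Merging each class into a single state gives the minimized DFA. For this trie the classes are the start state, the two states reached after one symbol, the four states reached after two symbols, and the four final states.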

The Minimization Problem
• Input: A DFA (deterministic finite-state automaton)
• Output: An equiv. DFA with as few states as possible
• Complexity: O(|arcs| log |states|) (Hopcroft 1971)
[Figure: DFA with arcs labeled a and b; two groups of states are marked]
• Mergeable because they have the same suffix language: {ab, bb}
• Mergeable because they have the same suffix language: {b}
• An equivalence relation on states … merge the equivalence classes

Can't always work backward from final state like this.
A bit more complicated because of cycles.
Don't worry about it for this talk.
Here's what you should worry about:

The Minimization Problem
• Input: A DFA (deterministic finite-state automaton)
• Output: An equiv. DFA with as few states as possible
• Complexity: O(|arcs| log |states|) (Hopcroft 1971)
Q: Why minimize # states, rather than # arcs?
A: Minimizing # states also minimizes # arcs!
Q: What if the input is an NDFA (nondeterministic)?
A: Determinize it first. (could yield exponential blowup ☹)
Q: How about minimizing an NDFA to an NDFA?
A: Yes, could be exponentially smaller ☺, but problem is PSPACE-complete so we don't try. ☹

Real-World NLP: Automata With Weights or Outputs
• Finite-state computation of functions
• Concatenate strings: arcs a:w, b:wx, c:wz, d:ε; abd → wwx, acd → wwz
• Add scores: arcs a:2, b:3, c:7, d:0; abd → 5, acd → 9
• Multiply probabilities: arcs a:0.2, b:0.3, c:0.7, d:1; abd → 0.06, acd → 0.14

Real-World NLP: Automata With Weights or Outputs
• Want to compute functions on strings: Σ* → K
• After all, we're doing language and speech!
• Finite-state machines can often do the job
• Easy to build, easy to combine, run fast
• Build them with weighted regular expressions
• To clean up the resulting DFA, minimize it to merge redundant portions
• This smaller machine is faster to intersect/compose
• More likely to fit on a hand-held device
• More likely to fit into cache memory

How do we minimize such DFAs?
• Didn't Mohri already answer this question?
• Only for special cases of the output set K!
• Is there a general recipe?
• What new algorithms can we cook with it?
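The three example machines on the "Automata With Weights or Outputs" slide differ only in how arc weights are combined along a path, so one generic routine can run all of them. The sketch below assumes a four-state topology (a, then b or c, then d), which matches the listed outputs but is a guess at the lost figure. The names `machine` and `path_weight` are illustrative.

```python
import operator


def machine(a, b, c, d):
    """Hypothetical topology: 0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3."""
    return {
        (0, "a"): (1, a),
        (1, "b"): (2, b),
        (1, "c"): (2, c),
        (2, "d"): (3, d),
    }


def path_weight(arcs, word, times, one, start=0):
    """Combine the weights on the unique path for `word` using `times`."""
    state, total = start, one
    for symbol in word:
        state, weight = arcs[(state, symbol)]
        total = times(total, weight)
    return total


strings = machine("w", "wx", "wz", "")   # combine by concatenation
scores = machine(2, 3, 7, 0)             # combine by addition
probs = machine(0.2, 0.3, 0.7, 1.0)      # combine by multiplication

out_abd = path_weight(strings, "abd", operator.add, "")
cost_acd = path_weight(scores, "acd", operator.add, 0)
prob_abd = path_weight(probs, "abd", operator.mul, 1.0)
```

Only the pair (`times`, `one`) changes between the three cases. The machine and the traversal stay the same.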

Weight Algebras
• Specify a weight algebra (K, ⊗)
• Define DFAs over (K, ⊗)
• Arcs have weights in set K
• A path's weight is also in K: multiply its arc weights with ⊗
• Examples:
  • (strings, concatenation)
  • (scores, addition)
  • (probabilities, multiplication)
  • (score vectors, addition): OT phonology
  • (real weights, multiplication): conditional random fields, rational kernels
  • (objective func & gradient, product-rule multiplication): training the parameters of a model
  • (bit vectors, conjunction): membership in multiple languages at once

Q: Semiring is (K, ⊕, ⊗). Why aren't you talking about ⊕ too?
A: Minimization is about DFAs. At most one path per input. So no need to ⊕ the weights of multiple accepting paths.

Shifting Outputs Along Paths
• Doesn't change the function computed: abd → wwx, acd → wwz
  Original: a:w, b:wx, c:wz, d:ε
  Shifted back: a:ww, b:x, c:z, d:ε
  Shifted forward: a:ε, b:wwx, c:wwz, d:ε

Shifting Outputs Along Paths
• Doesn't change the function computed:
  Strings: a:w, b:wx, c:wz, d:ε; abd → wwx, acd → wwz
  Scores: a:2, b:3, c:7, d:0; abd → 5, acd → 9
  Shift by 1: a:2+1 = 3, b:3-1 = 2, c:7-1 = 6, d:0; abd → 5, acd → 9
  Shift by 2: a:2+2 = 4, b:3-2 = 1, c:7-2 = 5, d:0; abd → 5, acd → 9
  Shift by 3: a:2+3 = 5, b:3-3 = 0, c:7-3 = 4, d:0; abd → 5, acd → 9

Shifting Outputs Along Paths
• Doesn't change the function computed, even with a second in-arc e:u:
  a:w, e:u, b:wx, c:wz, d:ε
  abd → wwx, acd → wwz, …ebd → uwx, …ecd → uwz
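The score example can be checked mechanically: subtracting k from every out-arc of a state and adding k to every in-arc leaves each path total unchanged. The sketch below assumes the same four-state topology as the string example (a, then b or c, then d). The names `shift` and `score` are illustrative.

```python
def shift(arcs, state, k):
    """Move k from the out-arcs of `state` back onto its in-arcs."""
    shifted = {}
    for (src, symbol), (dst, weight) in arcs.items():
        if src == state:
            weight -= k  # out-arc gives up k
        if dst == state:
            weight += k  # in-arc absorbs k
        shifted[(src, symbol)] = (dst, weight)
    return shifted


def score(arcs, word, start=0):
    """Add the arc weights along the unique path for `word`."""
    state, total = start, 0
    for symbol in word:
        state, weight = arcs[(state, symbol)]
        total += weight
    return total


# Assumed topology: 0 -a:2-> 1, 1 -b:3-> 2, 1 -c:7-> 2, 2 -d:0-> 3.
arcs = {(0, "a"): (1, 2), (1, "b"): (2, 3), (1, "c"): (2, 7), (2, "d"): (3, 0)}
by_three = shift(arcs, state=1, k=3)
```

Shifting by 3, the smallest out-arc weight at that state, drives the b arc to 0. This is the additive counterpart of pulling a common prefix out of string outputs.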

Shifting Outputs Along Paths
• State sucks back a prefix from its out-arcs and deposits it at end of its in-arcs.
  Before: a:w, e:u, b:wx, c:wz, d:ε
  After: a:ww, e:uw, b:x, c:z, d:ε
  abd → wwx, acd → wwz, …ebd → uwx, …ecd → uwz

Shifting Outputs Along Paths
• With a cycle arc b:wx at the state:
  Before: a:w, e:u, cycle b:wx, b:wx, c:wz, d:ε
  …ab^n bd → u(wx)^n wx
  …ab^n cd → u(wx)^n wz
  After: a:ww, e:uw, cycle b:xw, b:x, c:z, d:ε
  …ab^n bd → uw(xw)^n x
  …ab^n cd → uw(xw)^n z
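The pushback step can be sketched for string outputs as follows: take the longest common prefix of a state's out-arc outputs, strip it from those arcs, and append it to the in-arcs. A self-loop is both an out-arc and an in-arc, so wx becomes xw. The topology below is an assumption, and the loop symbol f is hypothetical (the slide labels the cycle b), chosen so the example machine stays deterministic. The names `push_back` and `transduce` are illustrative.

```python
from os.path import commonprefix


def push_back(arcs, state):
    """Pull the common output prefix of `state`'s out-arcs onto its in-arcs.

    Assumes `state` is neither the start state nor a final state.
    """
    prefix = commonprefix([out for (src, _), (_, out) in arcs.items() if src == state])
    pushed = {}
    for (src, symbol), (dst, out) in arcs.items():
        if src == state:
            out = out[len(prefix):]  # out-arc loses the prefix
        if dst == state:
            out = out + prefix       # in-arc gains it at the end
        pushed[(src, symbol)] = (dst, out)
    return pushed


def transduce(arcs, word, start=0):
    """Concatenate the outputs along the unique path for `word`."""
    state, result = start, ""
    for symbol in word:
        state, out = arcs[(state, symbol)]
        result += out
    return result


# Assumed machine: in-arcs a:w and e:u, self-loop f:wx, out-arcs b:wx and c:wz.
arcs = {
    (0, "a"): (1, "w"), (0, "e"): (1, "u"),
    (1, "f"): (1, "wx"),
    (1, "b"): (2, "wx"), (1, "c"): (2, "wz"),
    (2, "d"): (3, ""),
}
pushed = push_back(arcs, state=1)
```

The function is unchanged because (wx)^n w = w(xw)^n, which is the identity behind u(wx)^n wx = uw(xw)^n x on the slide.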

Shifting Outputs Along Paths (Mohri)
• Here, not all the out-arcs start with w
• But all the out-paths start with w
• Do pushback at later states first: now we're ok!
[Figure: arc labels recovered from the slides; the topology is not recoverable]
  Before: a:w, e:u, b:wx, d:ε, c:ε, ε:ε, d:ε, b:wz
  After pushback at the later state: a:w, e:u, b:wx, d:ε, c:w, ε:ε, d:w, b:zw
  After pushback at the earlier state: a:ww, e:uw, b:x, d:ε, c:ε, ε:ε, d:ε, b:zw
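The ordering point can be illustrated on a small acyclic machine: when one out-arc carries ε, local pushback finds no common prefix until the state it leads to has been processed. The machine below is hypothetical (it is not the one in the slide's figure), and the processing order is passed in by hand rather than computed. This is a sketch of the idea for the acyclic case, not Mohri's full algorithm, which also handles cycles.

```python
from os.path import commonprefix


def push_all(arcs, order):
    """Apply local string pushback at each state in `order`.

    `order` should list non-start, non-final states, later states first.
    """
    arcs = dict(arcs)
    for state in order:
        prefix = commonprefix(
            [out for (src, _), (_, out) in arcs.items() if src == state]
        )
        for (src, symbol), (dst, out) in list(arcs.items()):
            if src == state:
                out = out[len(prefix):]  # strip prefix from out-arcs
            if dst == state:
                out = out + prefix       # append prefix to in-arcs
            arcs[(src, symbol)] = (dst, out)
    return arcs


# Hypothetical machine: state 1 has out-arcs b:wx and c:ε, so its out-arcs
# share no prefix, yet every path leaving state 1 starts with w.
arcs = {
    (0, "a"): (1, "w"),
    (1, "b"): (3, "wx"), (1, "c"): (2, ""),
    (2, "d"): (3, "wz"), (2, "f"): (3, "wy"),
}
later_first = push_all(arcs, order=[2, 1])
earlier_first = push_all(arcs, order=[1, 2])
```

With state 2 processed first, the c arc picks up w, and state 1 can then push w onto the a arc. With the reverse order, state 1 sees a ε out-arc, pushes nothing, and the a arc keeps its original output w.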
