Xerox Finite-State Tool: Building Finite-State Machines


  1. Xerox Finite-State Tool: Building Finite-State Machines
     600.465 - Intro to NLP - J. Eisner

     • You’ll use it for homework …
     • Commercial product (but we have an academic license here)
     • One of several finite-state toolkits available
     • This one is easiest to use but doesn’t have probabilities
     • Usage:
       • Enter a regular expression; it builds an FSA or FST
       • Now type in an input string
         • FSA: It tells you whether it’s accepted
         • FST: It tells you all the output strings (if any)
         • Can also invert the FST to let you map outputs to inputs
     • Could hook it up to other NLP tools that need finite-state processing of their input or output

     Common Regular Expression Operators

       Operator   Meaning                    Example
                  concatenation              E F
       * +        iteration                  E*, E+
       |          union                      E | F
       &          intersection               E & F
       ~ \ -      complementation, minus     ~E, \x, E-F
       .x.        crossproduct               E .x. F
       .o.        composition                E .o. F
       .u         upper (input) language     E.u   “domain”
       .l         lower (output) language    E.l   “range”

     What the Operators Mean
     • [blackboard discussion]
     • [Composition is the most interesting case: see following slides.]

     slide courtesy of L. Karttunen (modified)

     How to implement?
     • [the operators from the table above, in turn: concatenation, iteration, union, complementation and minus, intersection, crossproduct, composition, upper language .u, lower language .l]

     How to define transducers?
     • state set Q
     • initial state i
     • set of final states F
     • input alphabet Σ (also define Σ*, Σ+, Σ?)
     • output alphabet Δ
     • transition function δ: Q × Σ? → 2^Q
     • output function σ: Q × Σ? × Q → Δ*
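
The transducer definition above can be tried out in a few lines of Python. This is only a sketch under stated assumptions: the class name `FST`, the method `transduce`, and the toy machine are hypothetical and not part of xfst, and arcs that read the empty string are left out to keep the simulation short. It mirrors the usage described on the first slide: type in an input string and get back all output strings, if any.

```python
from dataclasses import dataclass


@dataclass
class FST:
    """Hypothetical container for (Q, i, F, delta, sigma); no epsilon-input arcs."""
    initial: int
    finals: set
    delta: dict  # (state, input symbol) -> set of next states
    sigma: dict  # (state, input symbol, next state) -> output string

    def transduce(self, word):
        """Return the set of all output strings for `word` (empty if rejected)."""
        # A configuration is (current state, output emitted so far).
        configs = {(self.initial, "")}
        for symbol in word:
            configs = {
                (nxt, out + self.sigma[(state, symbol, nxt)])
                for state, out in configs
                for nxt in self.delta.get((state, symbol), set())
            }
        return {out for state, out in configs if state in self.finals}


# Toy nondeterministic machine, invented for illustration only.
toy = FST(
    initial=0,
    finals={1},
    delta={(0, "a"): {0, 1}, (1, "b"): {1}},
    sigma={(0, "a", 0): "x", (0, "a", 1): "y", (1, "b", 1): "z"},
)

print(toy.transduce("aab"))  # one accepting path, so one output string
print(toy.transduce("b"))    # no path from the initial state: empty set
```

Because δ returns a set of states, the simulation tracks every reachable configuration at once, which is why an FST can report several outputs for one input.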

  2. Concatenation (example courtesy of M. Mohri)
     [figure: first machine + second machine = concatenated machine]

     Union (example courtesy of M. Mohri)
     [figure: first machine + second machine = union machine]

     Closure (this example has outputs too) (example courtesy of M. Mohri)
     [figure: machine * = closure machine]
     • Why add new start state 4?
     • Why not just make state 0 final?

     Upper language (domain) (example courtesy of M. Mohri)
     [figure: machine .u = acceptor for its upper language]
     • Similarly construct lower language .l
     • Also called input & output languages

     Reversal (example courtesy of M. Mohri)
     [figure: machine .r = reversed machine]

     Inversion (example courtesy of M. Mohri)
     [figure: machine .i = inverted machine]
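
One way to explore the closure question above is to simulate both constructions on a small machine. The sketch below is an assumption-laden illustration: the machine `M` is hypothetical (it is not the one in Mohri's figure), the function name `accepts` is invented, and only the state number 4 is borrowed from the slide. `M` accepts a(ba)*, and its start state 0 has an incoming arc, which is the situation where the shortcut goes wrong.

```python
def accepts(arcs, start, finals, word):
    """Simulate an NFA given as (source, label, target) arcs; label "" is epsilon."""
    def eps_closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in arcs:
                if src == q and label == "" and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = eps_closure({start})
    for symbol in word:
        step = {dst for src, label, dst in arcs
                if src in current and label == symbol}
        current = eps_closure(step)
    return bool(current & finals)


# Hypothetical machine M accepting a(ba)*; start state 0, final state 1.
M = [(0, "a", 1), (1, "b", 0)]

# Shortcut: make state 0 final and jump from the old final state back to 0.
shortcut = M + [(1, "", 0)]
# Construction with a fresh start state 4 that is itself final.
fresh = M + [(4, "", 0), (1, "", 4)]

print(accepts(shortcut, 0, {0, 1}, "ab"))  # True, yet "ab" is not in L(M)*
print(accepts(fresh, 4, {4, 1}, "ab"))     # False
print(accepts(fresh, 4, {4, 1}, "aa"))     # True
```

With the shortcut, reading "ab" returns to state 0, which is now final, so a string outside L(M)* is accepted. The fresh start state is final without making any state that sits in the middle of a path final.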

  3. Complementation
     • Given a machine M, represent all strings not accepted by M
     • Just change final states to non-final and vice-versa
     • Works only if machine has been determinized and completed first (why?)

     Intersection (example adapted from M. Mohri)
     [figure: two weighted acceptors and their intersection]
       First machine:   states 0, 1, 2/0.8; arcs fat/0.5, pig/0.3, eats/0, sleeps/0.6
       Second machine:  states 0, 1, 2/0.5; arcs fat/0.2, pig/0.4, eats/0.6, sleeps/1.3
       Result (&):      states 0,0  0,1  1,1  2,0/0.8  2,2/1.3; arcs fat/0.7, pig/0.7, eats/0.6, sleeps/1.9

     Intersection, built up over the next four slides:
     • Paths 0012 and 0110 both accept "fat pig eats". So must the new machine: along path 0,0 0,1 1,1 2,0
     • Paths 00 and 01 both accept "fat". So must the new machine: along path 0,0 0,1
     • Paths 01 and 11 both accept "pig". So must the new machine: along path 0,1 1,1
     • Paths 12 and 12 both accept "sleeps". So must the new machine: along path 1,1 2,2
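
The intersection walk-through above amounts to a product construction in which matching arcs are paired and their weights are added. The following is a minimal sketch, not the toolkit's implementation. Assumptions: the function name `intersect` is invented, the arc endpoints are reconstructed from the paths named on the slides, and state 0 of the second machine is taken to be final with weight 0 (which is what the result label 2,0/0.8 implies).

```python
def intersect(arcs1, finals1, arcs2, finals2, start=(0, 0)):
    """Product construction over reachable pair states; weights are added."""
    arcs, finals = [], {}
    agenda, seen = [start], {start}
    while agenda:
        q1, q2 = agenda.pop()
        if q1 in finals1 and q2 in finals2:
            # Rounding only hides floating-point noise in the sums.
            finals[(q1, q2)] = round(finals1[q1] + finals2[q2], 6)
        for s1, label1, w1, t1 in arcs1:
            for s2, label2, w2, t2 in arcs2:
                if (s1, s2) == (q1, q2) and label1 == label2:
                    arcs.append(((q1, q2), label1, round(w1 + w2, 6), (t1, t2)))
                    if (t1, t2) not in seen:
                        seen.add((t1, t2))
                        agenda.append((t1, t2))
    return arcs, finals


# Arcs are (source, label, weight, target); endpoints inferred from the slides.
M1_ARCS = [(0, "fat", 0.5, 0), (0, "pig", 0.3, 1),
           (1, "eats", 0.0, 2), (1, "sleeps", 0.6, 2)]
M1_FINALS = {2: 0.8}
M2_ARCS = [(0, "fat", 0.2, 1), (1, "pig", 0.4, 1),
           (1, "eats", 0.6, 0), (1, "sleeps", 1.3, 2)]
M2_FINALS = {0: 0.0, 2: 0.5}  # weight 0.0 for state 0 is an assumption

product_arcs, product_finals = intersect(M1_ARCS, M1_FINALS, M2_ARCS, M2_FINALS)
for arc in product_arcs:
    print(arc)
print(product_finals)
```

Only pair states reachable from 0,0 are built, which is why the result has five states rather than all nine combinations.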

  4. Intersection
     [figure: the same two machines and the completed intersection machine; on this slide the final states are drawn as 2,0/1.3 and 2,2/0.8]

     What Composition Means
     [figure: g maps ab?d to abcd, abed, abjd, ...; f maps abcd to αβγδ and abed to αβεδ and αβϵδ; the arcs are labelled 3, 4, 2, 2, 8, 6]

     What Composition Means
     Relation composition: f ∘ g
     [figure: f ∘ g maps ab?d directly to αβγδ, αβεδ, αβϵδ, ...]

     Relation = set of pairs
       f:      abcd ↦ αβγδ,  abed ↦ αβεδ,  abed ↦ αβϵδ,  …
               (does not contain any pair of the form abjd ↦ …)
       g:      ab?d ↦ abcd,  ab?d ↦ abed,  ab?d ↦ abjd,  …
       f ∘ g:  ab?d ↦ αβγδ,  ab?d ↦ αβεδ,  ab?d ↦ αβϵδ,  …

       f ∘ g = {x ↦ z : ∃y (x ↦ y ∈ g and y ↦ z ∈ f)}, where x, y, z are strings

     Intersection vs. Composition
       Intersection:  pig/0.3 (0 → 1)  &  pig/0.4 (1 → 1)  =  pig/0.7 (0,1 → 1,1)
       Composition:   Wilbur:pig/0.3 (0 → 1)  .o.  pig:pink/0.4 (1 → 1)  =  Wilbur:pink/0.7 (0,1 → 1,1)
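
Treating a relation literally as a set of pairs makes composition a short set comprehension. The sketch below assumes a hypothetical helper `compose(first, second)` that applies `first` and then `second`; it uses only the pairs listed on the slide (the "…" entries are left out) and writes the variant epsilon as ϵ.

```python
def compose(first, second):
    """Pairs x -> z such that `first` maps x to some y and `second` maps y to z."""
    return {(x, z) for (x, y) in first for (y2, z) in second if y == y2}


# g is applied first, then f, as in the slide's figure.
g = {("ab?d", "abcd"), ("ab?d", "abed"), ("ab?d", "abjd")}
f = {("abcd", "αβγδ"), ("abed", "αβεδ"), ("abed", "αβϵδ")}

f_after_g = compose(g, f)
for pair in sorted(f_after_g):
    print(pair)
```

The pair ab?d ↦ abjd contributes nothing because f has no pair whose input is abjd, so no intermediate string y exists for it.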

  5. Intersection vs. Composition
       Intersection mismatch:  pig/0.3 (0 → 1)  &  elephant/0.4 (1 → 1): the labels differ, so no arc 0,1 → 1,1 is built
       Composition mismatch:   Wilbur:pig/0.3 (0 → 1)  .o.  elephant:gray/0.4 (1 → 1): the intermediate symbols differ, so no arc 0,1 → 1,1 is built

     Composition (example courtesy of M. Mohri)
     [figure: first transducer .o. second transducer = composed transducer]

     Composition, built up one pair of arcs per slide:
     • a:b .o. b:b = a:b
     • a:b .o. b:a = a:a
     • a:b .o. b:a = a:a
     • b:b .o. b:a = b:a

  6. Composition, continued one pair of arcs per slide:
     • a:b .o. b:a = a:a
     • a:a .o. a:b = a:b
     • b:b .o. a:b = nothing (since intermediate symbol doesn’t match)
     • b:b .o. b:a = b:a

     Composition
     [figure: first transducer .o. second transducer = completed composed transducer]

     Relation = set of pairs
       f:      abcd ↦ αβγδ,  abed ↦ αβεδ,  abed ↦ αβϵδ,  …
       g:      ab?d ↦ abcd,  ab?d ↦ abed,  ab?d ↦ abjd,  …
       f ∘ g:  ab?d ↦ αβγδ,  ab?d ↦ αβεδ,  ab?d ↦ αβϵδ,  …

       f ∘ g = {x ↦ z : ∃y (x ↦ y ∈ g and y ↦ z ∈ f)}, where x, y, z are strings
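
The arc-by-arc slides above suggest the same pairing idea as intersection, except that the lower symbol of the first arc must equal the upper symbol of the second. A minimal sketch, assuming transducers without epsilon arcs and without weights; the function name `compose_fst` and the two tiny machines are hypothetical, not taken from the figure.

```python
def compose_fst(arcs1, arcs2):
    """Pair up arcs whose intermediate symbols match.

    Arcs are (source, upper, lower, target). The result arc keeps the upper
    symbol of the first arc and the lower symbol of the second.
    """
    return [
        ((s1, s2), upper, lower, (t1, t2))
        for s1, upper, mid1, t1 in arcs1
        for s2, mid2, lower, t2 in arcs2
        if mid1 == mid2
    ]


first = [(0, "a", "b", 1), (0, "b", "b", 1)]   # arcs a:b and b:b
second = [(0, "b", "a", 1), (0, "a", "b", 1)]  # arcs b:a and a:b

composed = compose_fst(first, second)
for arc in composed:
    print(arc)
```

Here a:b pairs with b:a to give a:a and b:b pairs with b:a to give b:a, while neither first-machine arc pairs with a:b because the intermediate symbol does not match. A fuller version would also keep only pair states reachable from the start pair and handle epsilon arcs.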

  7. Composition with Sets
     • We’ve defined A .o. B where both are FSTs
     • Now extend definition to allow one to be an FSA
     • Two relations (FSTs):
         A ∘ B = {x ↦ z : ∃y (x ↦ y ∈ A and y ↦ z ∈ B)}
     • Set and relation:
         A ∘ B = {x ↦ z : x ∈ A and x ↦ z ∈ B}
     • Relation and set:
         A ∘ B = {x ↦ z : x ↦ z ∈ A and z ∈ B}
     • Two sets (acceptors), same as intersection:
         A ∘ B = {x : x ∈ A and x ∈ B}

     Composition and Coercion
     • Really just treats a set as identity relation on set
         {abc, pqr, …} = {abc ↦ abc, pqr ↦ pqr, …}
     • Two relations (FSTs):
         A ∘ B = {x ↦ z : ∃y (x ↦ y ∈ A and y ↦ z ∈ B)}
     • Set and relation is now special case (if ∃y then y = x):
         A ∘ B = {x ↦ z : x ↦ x ∈ A and x ↦ z ∈ B}
     • Relation and set is now special case (if ∃y then y = z):
         A ∘ B = {x ↦ z : x ↦ z ∈ A and z ↦ z ∈ B}
     • Two sets (acceptors) is now special case:
         A ∘ B = {x ↦ x : x ↦ x ∈ A and x ↦ x ∈ B}

     3 Uses of Set Composition
     • Feed string into Greek transducer:
         {abed ↦ abed} .o. Greek = {abed ↦ αβεδ, abed ↦ αβϵδ}
         {abed} .o. Greek = {abed ↦ αβεδ, abed ↦ αβϵδ}
         [{abed} .o. Greek].l = {αβεδ, αβϵδ}
     • Feed several strings in parallel:
         {abcd, abed} .o. Greek = {abcd ↦ αβγδ, abed ↦ αβεδ, abed ↦ αβϵδ}
         [{abcd, abed} .o. Greek].l = {αβγδ, αβεδ, αβϵδ}
     • Filter result via Noε = {αβγδ, αβϵδ, …}:
         {abcd, abed} .o. Greek .o. Noε = {abcd ↦ αβγδ, abed ↦ αβϵδ}

     What are the “basic” transducers?
     • The operations on the previous slides combine transducers into bigger ones
     • But where do we start?
       • a:ε for a ∈ Σ
       • ε:x for x ∈ Δ
     • Q: Do we also need a:x? How about ε:ε?
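
The coercion rule and the three uses of set composition can be replayed with finite sets of pairs. This is a sketch under stated assumptions: the helper names `identity`, `compose`, and `lower` are hypothetical, `greek` contains only the pairs listed on the slides, and `no_eps` contains only the two strings the slide spells out for Noε.

```python
def identity(strings):
    """Coerce a set of strings to the identity relation on that set."""
    return {(s, s) for s in strings}


def compose(first, second):
    """Pairs x -> z such that `first` maps x to some y and `second` maps y to z."""
    return {(x, z) for (x, y) in first for (y2, z) in second if y == y2}


def lower(relation):
    """Lower (output) language of a relation, like the .l operator."""
    return {z for _, z in relation}


greek = {("abcd", "αβγδ"), ("abed", "αβεδ"), ("abed", "αβϵδ")}
no_eps = {"αβγδ", "αβϵδ"}  # only the members listed on the slide

# 1. Feed one string into the transducer and read off the outputs.
one = compose(identity({"abed"}), greek)
print(lower(one))

# 2. Feed several strings in parallel.
several = compose(identity({"abcd", "abed"}), greek)
print(lower(several))

# 3. Filter the result by composing with a set on the output side.
filtered = compose(several, identity(no_eps))
print(filtered)
```

Since a set is handled as an identity relation, the same `compose` function covers all four cases on the coercion slide, including two sets, where it reduces to intersection.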
