Finite-State Transducers in Language and Speech Processing : - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Finite-State Transducers in Language and Speech Processing

Presenter: 郭榮芳, 05/20/2003

  • 1. M. Mohri, On some applications of finite-state automata theory to natural language processing, J. Natural Language Eng. 2 (1996).
  • 2. M. Mohri, Finite-state transducers in language and speech processing, Comput. Linguistics 23 (2) (1997).
slide-2
SLIDE 2

Outline

  • Introduction
  • Sequential string-to-string transducers
  • Power series and subsequential string-to-weight transducers
  • Application to speech recognition

slide-3
SLIDE 3

Introduction

Finite-state machines have been used in many areas of computational linguistics. Their use can be justified by both linguistic and computational arguments.

slide-4
SLIDE 4

Linguistically

Finite automata are convenient since they allow one to describe easily most of the relevant local phenomena encountered in the empirical study of language.

They often lead to a compact representation of lexical rules, or idioms and clichés, that appears natural to linguists (Gross, 1989).

slide-5
SLIDE 5

Linguistically(cont.)

Graphic tools also allow one to visualize and modify automata. This helps in correcting and completing a grammar.

Other more general phenomena, such as parsing context-free grammars, can also be dealt with using finite-state machines such as RTNs (Woods, 1970).

slide-6
SLIDE 6

Computational

The use of finite-state machines is mainly motivated by considerations of time and space efficiency.

Time efficiency is usually achieved by using deterministic automata.

– Deterministic automata have a deterministic input: for every state, at most one transition is labeled with a given element of the alphabet.

– The output of deterministic machines depends, in general linearly, only on the size of the input.

slide-7
SLIDE 7

Computational(cont.)

Space efficiency is achieved with classical minimization algorithms (Aho, Hopcroft, and Ullman, 1974) for deterministic automata.

Applications such as compiler construction have shown deterministic finite automata to be very efficient in practice (Aho, Sethi, and Ullman, 1986).

slide-8
SLIDE 8

Applications in natural language processing

  • Lexical analyzers
  • The compilation of morphological and phonological rules
  • Speech processing

slide-9
SLIDE 9

The idea of deterministic automata

The idea can be extended to machines that produce output strings or weights in addition to accepting (deterministically) the input.

  • Time efficiency
  • Space efficiency
  • A large increase in the size of data

slide-10
SLIDE 10

Limitations of the corresponding techniques, however, are very often pointed out more than their advantages.

The reason for that is probably that recent work in this field is not yet described in computer science textbooks.

Sequential finite-state transducers are now used in all areas of computational linguistics.

slide-11
SLIDE 11

The case of string-to-string transducers.

These transducers have been successfully used in the representation of large-scale dictionaries, computational morphology, and local grammars and syntax.

We describe the theoretical bases for the use of these transducers. In particular, we recall classical theorems and give new ones characterizing these transducers.

slide-12
SLIDE 12

The case of sequential string-to-weight transducers

These transducers are of particular interest in speech processing: language models, phone lattices, and word lattices can be represented by them.

– Determinization
– Minimization
– Unambiguous

Some applications in speech recognition.

slide-13
SLIDE 13

Sequential string-to-string transducers

Sequential string-to-string transducers are used in various areas of natural language processing.

Both determinization (Mohri, 1994c) and minimization algorithms (Mohri, 1994b) have been defined for the class of p-subsequential transducers, which includes sequential string-to-string transducers.

In this section the theoretical basis of the use of sequential transducers is described.

Classical and new theorems help to indicate the usefulness of these devices as well as their characterization.

slide-14
SLIDE 14

Sequential transducers

Sequential transducers:

– A sequential transducer has a deterministic input: at any state there is at most one transition labeled with a given element of the input alphabet.

– Output labels might be strings, including the empty string ε.

slide-15
SLIDE 15

Sequential transducers(cont.)

The cost of applying them to a given input does not depend on the size of the transducer but only on that of the input.

The total computational time is linear in the size of the input.
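
As an illustration of why the cost depends only on the input, here is a minimal Python sketch of applying a sequential transducer. The class name `SequentialTransducer`, its dictionary-based representation, and the toy machine (which rewrites every "ab" as "x") are assumptions made for this example; they do not come from the slides.

```python
class SequentialTransducer:
    """Deterministic-input transducer: at most one transition per (state, symbol)."""

    def __init__(self, initial, finals, transitions):
        # transitions maps (state, input symbol) -> (next state, output string)
        self.initial = initial
        self.finals = set(finals)
        self.transitions = dict(transitions)

    def apply(self, text):
        """Return the output string for text, or None if text is not accepted."""
        state = self.initial
        pieces = []
        for symbol in text:  # exactly one table lookup per input symbol
            step = self.transitions.get((state, symbol))
            if step is None:
                return None  # no transition: the input is rejected
            state, out = step
            pieces.append(out)
        return "".join(pieces) if state in self.finals else None


# Hypothetical toy machine over {a, b}: rewrites "ab" as "x", copies the rest.
toy = SequentialTransducer(
    initial=0,
    finals={0},
    transitions={
        (0, "a"): (1, ""),   # hold the "a" until the next symbol is known
        (0, "b"): (0, "b"),
        (1, "a"): (1, "a"),  # emit the held "a", hold the new one
        (1, "b"): (0, "x"),  # "ab" completed
    },
)
```

In this sketch an input ending in "a" is rejected, because the held symbol can never be emitted; a final output string attached to state 1 would fix that, which is the extra ingredient subsequential transducers provide.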

slide-16
SLIDE 16

Example of a sequential transducer

slide-17
SLIDE 17

Definition of Non-sequential transducer

For a (non-sequential) transducer T1:

– V1 is the set of states,
– I1 is the set of initial states,
– F1 is the set of final states,
– A and B are finite sets corresponding respectively to the input and output alphabets of the transducer,
– δ1 is the state transition function, which maps V1 × A to 2^V1 (the set of subsets of V1),
– σ1 is the output function, which maps V1 × A × V1 to B*.

slide-18
SLIDE 18

Definition of Subsequential transducer

For the subsequential transducer T2:

– I2 is the unique initial state,
– δ2 is the state transition function, which maps V2 × A to V2,
– σ2 is the output function, which maps V2 × A to B*,
– Φ2 is the final output function, which maps F2 to B*.

slide-19
SLIDE 19

Notation

x ∧ y is the longest common prefix of two strings x and y.

x⁻¹(xy) is the string y obtained by dividing (xy) at left by x.

The states of T2 are subsets made of pairs (q, w) of a state q of T1 and a string w ∈ B*.

For a state q2 of T2 and an input symbol a:

J1(a) = {(q, w) | δ1(q, a) defined and (q, w) ∈ q2}
J2(a) = {(q, w, q') | δ1(q, a) defined and (q, w) ∈ q2 and q' ∈ δ1(q, a)}
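
The two string operations in this notation, longest common prefix and left division, can be sketched in Python as follows. The function names are hypothetical. `factor_outputs` shows one plausible way a determinization step could use them: emit the prefix shared by all candidate outputs immediately and keep the leftovers for later. It is an illustration under that assumption, not the full algorithm.

```python
import os


def longest_common_prefix(x, y):
    """x ∧ y: the longest string that is a prefix of both x and y."""
    return os.path.commonprefix([x, y])


def left_divide(x, s):
    """x⁻¹s: remove the prefix x from s (s must start with x)."""
    if not s.startswith(x):
        raise ValueError(f"{x!r} is not a prefix of {s!r}")
    return s[len(x):]


def factor_outputs(candidates):
    """Split candidate output strings into (shared prefix, residual strings)."""
    candidates = list(candidates)
    common = os.path.commonprefix(candidates)
    return common, [left_divide(common, c) for c in candidates]
```

For example, `factor_outputs(["abc", "abd"])` yields the shared prefix `"ab"` and the residuals `["c", "d"]`: the prefix can be output at once, while the residuals have to be carried along with the corresponding states.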

slide-20
SLIDE 20
slide-21
SLIDE 21

Figures: transducer T1, and the subsequential transducer T2 obtained from T1 by determinization.

slide-22
SLIDE 22

Figures: transducer T1, and the subsequential transducer T2 obtained from T1 by determinization.

slide-23
SLIDE 23
slide-24
SLIDE 24
slide-25
SLIDE 25

Definition of a sequential string-to-string transducer

More formally, a sequential string-to-string transducer T is a 7-tuple (Q, i, F, Σ, Δ, δ, σ):

– Q is the set of states,
– i ∈ Q is the initial state,
– F ⊆ Q is the set of final states,
– Σ and Δ are finite sets corresponding respectively to the input and output alphabets of the transducer,
– δ is the state transition function, which maps Q × Σ to Q,
– σ is the output function, which maps Q × Σ to Δ*.

slide-26
SLIDE 26

Subsequential and p-subsequential transducers

p: at most p final output strings at each final state.

p-subsequential transducers seem to be sufficient for describing linguistic ambiguities.

slide-27
SLIDE 27

Subsequential and p-subsequential transducers (cont.)

Figure 2: Example of a 2-subsequential transducer t1.

Example: the input string w = aa gives two distinct outputs, aaa and aab.
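
The following Python sketch shows how final output strings produce several results for one input. The three-state machine below is a hypothetical stand-in built only to reproduce the behaviour in the example (aa gives aaa and aab); it is not claimed to be the transducer drawn in Figure 2, and the function name is an assumption.

```python
def apply_p_subsequential(initial, transitions, final_outputs, text):
    """Return every output for text (an empty list if text is not accepted).

    transitions:   (state, symbol) -> (next state, output string)
    final_outputs: final state -> list of at most p final output strings
    """
    state, prefix = initial, ""
    for symbol in text:
        if (state, symbol) not in transitions:
            return []
        state, out = transitions[(state, symbol)]
        prefix += out
    # Each final output string is appended to the deterministic part.
    return [prefix + tail for tail in final_outputs.get(state, [])]


# Hypothetical 2-subsequential machine: state 2 carries two final outputs.
transitions = {(0, "a"): (1, "a"), (1, "a"): (2, "a")}
final_outputs = {2: ["a", "b"]}

outputs = apply_p_subsequential(0, transitions, final_outputs, "aa")
```

The input is still read deterministically; the ambiguity is confined to the final state, which is why look-up stays linear in the input length.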

slide-28
SLIDE 28

Composition

If t1 is a transducer from input1 to output1 and t2 is a transducer from input2 to output2, then t1 ∘ t2 maps from input1 to output2.

The composition is obtained by making the intersection of the outputs of t1 with the inputs of t2.
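
A minimal Python sketch of composing two sequential transducers by a product construction is given below. Each transducer is represented, by assumption, as a tuple `(initial, finals, transitions)` with `transitions[(state, symbol)] = (next_state, output)`; the helper names and both toy machines are hypothetical. The output string of a t1 transition is fed through t2, so only outputs of t1 that t2 can read survive.

```python
def run(transitions, state, text):
    """Feed text from state; return (end state, output) or None if stuck."""
    out = []
    for symbol in text:
        step = transitions.get((state, symbol))
        if step is None:
            return None
        state, piece = step
        out.append(piece)
    return state, "".join(out)


def apply(transducer, text):
    """Output of the transducer on text, or None if text is rejected."""
    initial, finals, transitions = transducer
    result = run(transitions, initial, text)
    if result is None or result[0] not in finals:
        return None
    return result[1]


def compose(t1, t2):
    """Build a transducer that applies t1 first and then t2."""
    (i1, f1, d1), (i2, f2, d2) = t1, t2
    start = (i1, i2)
    transitions, seen, stack = {}, {start}, [start]
    while stack:  # explore only the reachable pairs of states
        q1, q2 = stack.pop()
        for (src, symbol), (nxt1, out1) in d1.items():
            if src != q1:
                continue
            step2 = run(d2, q2, out1)  # t2 reads what t1 wrote
            if step2 is None:
                continue  # t2 cannot read this output: no transition
            nxt2, out2 = step2
            target = (nxt1, nxt2)
            transitions[((q1, q2), symbol)] = (target, out2)
            if target not in seen:
                seen.add(target)
                stack.append(target)
    finals = {(a, b) for (a, b) in seen if a in f1 and b in f2}
    return start, finals, transitions


# Hypothetical machines: t1 rewrites "ab" as "x"; t2 rewrites "x" as "yy".
t1 = (0, {0}, {(0, "a"): (1, ""), (0, "b"): (0, "b"),
               (1, "a"): (1, "a"), (1, "b"): (0, "x")})
t2 = (0, {0}, {(0, "a"): (0, "a"), (0, "b"): (0, "b"), (0, "x"): (0, "yy")})
t12 = compose(t1, t2)
```

On the input "aab", t1 alone produces "ax" and t2 maps that to "ayy"; the composed machine `t12` produces "ayy" in a single pass.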

slide-29
SLIDE 29

Theorem 1

Let f: Σ* → Δ* be a sequential (resp. p-subsequential) function and g: Δ* → Ω* be a sequential (resp. q-subsequential) function. Then g ∘ f is sequential (resp. pq-subsequential).

slide-30
SLIDE 30

Proof

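f is represented by a p-subsequential transducer, and g by a q-subsequential transducer.

The final output functions of these two transducers map each final state to its final output strings (at most p and at most q of them, respectively); the value of such a function at a final state r is, for instance, the set of final output strings at r.

Define the pq-subsequential transducer for the composition as follows.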

slide-31
SLIDE 31

Proof(cont.)

  • Transition and output functions
  • Final output function

slide-32
SLIDE 32
slide-33
SLIDE 33

Theorem 2

Let f: Σ* → Δ* be a sequential (resp. p-subsequential) function and g: Σ* → Δ* be a sequential (resp. q-subsequential) function. Then g + f is 2-subsequential (resp. (p + q)-subsequential).

slide-34
SLIDE 34
slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37

Theorem 3

Let f be a rational function mapping Σ* to Δ*.

f is sequential iff there exists a positive integer K such that:

∀u ∈ Σ*, ∀a ∈ Σ, ∃w ∈ Δ*, |w| ≤ K: f(ua) = f(u)w

slide-38
SLIDE 38

Theorem 4

Let f be a partial function mapping Σ* to Δ*.

f is rational iff there exist a left sequential function l: Σ* → Ω* and a right sequential function r: Ω* → Δ* such that f = r ∘ l.

slide-39
SLIDE 39

Figures: transducer T with no equivalent sequential representation; left-to-right sequential transducer L; right-to-left sequential transducer R.

slide-40
SLIDE 40

Theorem 5

Let T be a transducer mapping Σ* to Δ*. It is decidable whether T is sequential.

The characterizations that follow are based on the definition of a metric on Σ*. Denote by u ∧ v the longest common prefix of two strings u and v in Σ*. It is easy to verify that the following defines a metric on Σ*:

d(u, v) = |u| + |v| − 2|u ∧ v|
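
Assuming the metric meant here is the usual prefix distance d(u, v) = |u| + |v| − 2|u ∧ v|, it can be sketched in Python as follows (the function name is hypothetical). The distance counts the symbols of u and v that lie outside their longest common prefix.

```python
import os


def prefix_distance(u, v):
    """d(u, v) = |u| + |v| - 2|u ∧ v|, where u ∧ v is the longest common prefix."""
    common = os.path.commonprefix([u, v])
    return len(u) + len(v) - 2 * len(common)
```

For instance, "abc" and "abd" share the prefix "ab", so their distance is 3 + 3 − 4 = 2. Bounded variation then says, informally, that inputs at a bounded distance from each other are mapped to outputs at a bounded distance.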

slide-41
SLIDE 41

Theorem 6

Let f be a partial function mapping Σ* to Δ*.

f is subsequential iff:

– 1. f has bounded variation (according to the metric defined above).
– 2. for any rational subset Y of Δ*, f⁻¹(Y) is rational.

slide-42
SLIDE 42

Theorem 7

Let f = (f1, ..., fp) be a partial function mapping Dom(f) ⊆ Σ* to (Δ*)^p.

f is p-subsequential iff:

– 1. f has bounded variation (using the metric d on Σ* and d∞ on (Δ*)^p).
– 2. for all i (1 ≤ i ≤ p) and any rational subset Y of Δ*, fi⁻¹(Y) is rational.

slide-43
SLIDE 43

Theorem 8

Let f be a rational function mapping Σ* to (Δ*)^p. f is p-subsequential iff it has bounded variation (using a semi-metric on (Δ*)^p).

slide-44
SLIDE 44

Application to language processing

The composition, union, and equivalence algorithms for subsequential transducers are also very useful in many applications.

slide-45
SLIDE 45

Representation of very large dictionaries.

The corresponding representation offers very fast look-up, since recognition does not depend on the size of the dictionary but only on that of the input string considered.

As an example, a French morphological dictionary of about 21.2 Mb can be compiled into a p-subsequential transducer of size 1.3 Mb in a few minutes (Mohri, 1996b).
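
To make the look-up claim concrete, here is a minimal Python sketch that stores a small lexicon as a deterministic machine with final outputs. The word list, the tags, and the function names are invented for illustration. The structure is a plain, unminimized trie, so it shows only the look-up behaviour and not the compactness reported above.

```python
def build_lexicon(entries):
    """Compile {word: [analyses]} into (transitions, final_outputs)."""
    transitions = {}    # (state, character) -> next state
    final_outputs = {}  # final state -> list of analyses
    next_state = 1      # state 0 is the initial state
    for word, analyses in entries.items():
        state = 0
        for ch in word:
            key = (state, ch)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        final_outputs.setdefault(state, []).extend(analyses)
    return transitions, final_outputs


def lookup(lexicon, word):
    """Return the analyses of word; the loop runs once per character."""
    transitions, final_outputs = lexicon
    state = 0
    for ch in word:
        state = transitions.get((state, ch))
        if state is None:
            return []  # unknown word
    return final_outputs.get(state, [])


# Hypothetical entries; an ambiguous word simply has several final outputs.
lexicon = build_lexicon({
    "chat": ["noun"],
    "chats": ["noun+plural"],
    "porte": ["noun", "verb"],
})
```

Adding more entries enlarges the tables but leaves the number of steps in `lookup` equal to the length of the word being looked up.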

slide-46
SLIDE 46

Compilation of morphological and phonological rules

Similarly, context-dependent phonological and morphological rules can be represented by finite-state transducers (Kaplan and Kay, 1994).

Determinizing the resulting transducer considerably increases its time efficiency. It can be further minimized to reduce its size.

slide-47
SLIDE 47

Syntax

Finite-state machines are also currently used to represent local syntactic constraints (Silberztein, 1993; Roche, 1993; Karlsson et al., 1995; Mohri, 1994d).

Linguists can conveniently introduce local grammar transducers that can be used to disambiguate sentences.