
Machine Translation

– Classical and Statistical Approaches

Session 4: Interlingua-based MT

Jonas Kuhn
Universität des Saarlandes, Saarbrücken
The University of Texas at Austin
jonask@coli.uni-sb.de

DGfS/CL Fall School 2005, Ruhr-Universität Bochum, September 19-30, 2005

Jonas Kuhn: MT 2

Session 4: Interlingua-based MT

  • Dorr (1992, 1994): UNITRAN system
  • Classification of divergences
  • Lexical Conceptual Structure
  • Translation mappings between syntactic structure and LCS representations
  • Language-specific exceptions to translation mappings

Jonas Kuhn: MT 3

UNITRAN

Translation between Spanish, English and German (bidirectionally)

Jonas Kuhn: MT 4

Translation divergences

(1) Thematic divergence:
E: I like Mary
S: María me gusta a mí 'Mary pleases me'

(2) Promotional divergence:
E: John usually goes home
S: Juan suele ir a casa 'John tends to go home'

(3) Demotional divergence:
E: I like eating
G: Ich esse gern 'I eat likingly'

(4) Structural divergence:
E: John entered the house
S: Juan entró en la casa 'John entered in the house'


Jonas Kuhn: MT 5

Translation divergences

(5) Conflational divergence:
E: I stabbed John
S: Yo le di puñaladas a Juan 'I gave knife-wounds to John'

(6) Categorial divergence:
E: I am hungry
G: Ich habe Hunger 'I have hunger'

(7) Lexical divergence:
E: John broke into the room
S: Juan forzó la entrada al cuarto 'John forced (the) entry to the room'

Jonas Kuhn: MT 6

Lexical Conceptual Structure

Following Jackendoff (1983, 1990)

Example:

English: Bill went into the house

LCS: GO(BILL,TO(IN(HOUSE)))

Spanish: Bill entró a la casa.

Jonas Kuhn: MT 7

LCS – Definitions

Definition 1 (Dorr 1994)

  • A lexical conceptual structure (LCS) is a modified version of the representation proposed by Jackendoff (1983, 1990) that conforms to the following structural form:

  • This corresponds to the tree-like representation shown in Figure 2, in which
    (1) X' is the logical head;
    (2) W' is the logical subject;
    (3) Z'1 ... Z'n are the logical arguments; and
    (4) Q'1 ... Q'n are the logical modifiers.

[Figure 2: tree-like representation of the LCS]

  • In addition, T(φ) is the logical type (Event, State, Path, Position, etc.) corresponding to the primitive φ (CAUSE, LET, GO, STAY, BE, etc.);
  • Primitives are further categorized into fields (e.g., Possessional, Identificational, Temporal, Locational, etc.).

Jonas Kuhn: MT 8

LCS – Definitions

Example 1

John went happily to school

[Event GO_Loc
  ([Thing JOHN],
   [Path TO_Loc ([Position AT_Loc ([Thing JOHN], [Location SCHOOL])])]
   [Manner HAPPILY])]

Logical Head: GO_Loc
Logical Subject: [Thing JOHN]
Logical Argument: [Path TO_Loc ...]
Logical Modifier: [Manner HAPPILY]
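The four positions of Definition 1 can be sketched as a small record type and filled with Example 1. The class and attribute names (`LCS`, `ltype`, `domain`, and so on) are assumptions made for illustration; treating the Position as the argument of TO is likewise a modelling choice of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LCS:
    """One LCS node. Attribute names are illustrative, not Dorr's."""
    ltype: str                                             # T(phi): Event, Path, Thing, ...
    primitive: str                                         # X', the logical head
    domain: Optional[str] = None                           # field: Loc, Temp, Poss, ...
    subject: Optional["LCS"] = None                        # W', the logical subject
    arguments: List["LCS"] = field(default_factory=list)   # Z'1 ... Z'n
    modifiers: List["LCS"] = field(default_factory=list)   # Q'1 ... Q'n


# "John went happily to school"
john = LCS("Thing", "JOHN")
at = LCS("Position", "AT", "Loc", subject=john,
         arguments=[LCS("Location", "SCHOOL")])
to = LCS("Path", "TO", "Loc", arguments=[at])
went = LCS("Event", "GO", "Loc", subject=john, arguments=[to],
           modifiers=[LCS("Manner", "HAPPILY")])

print(went.primitive, went.subject.primitive,
      [a.primitive for a in went.arguments],
      [m.primitive for m in went.modifiers])
```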


Jonas Kuhn: MT 9

LCS – Definitions

Types and primitives:

Jonas Kuhn: MT 10

LCS – Definitions

Primitives must adhere to constraints on argument structure

  • Spatial dimension
  • Causal dimension

Jonas Kuhn: MT 11

LCS – Definitions

Field dimension (specialization of a primitive stating under which domain it is interpreted, e.g., GO_Loc vs. GO_Temp)

Footnote 14: Technically the second argument for each of these fields is a Path or a Position. For the purposes of the current description the column under “Argument 2” refers to the lowest leaf node embedded inside of the second argument.

Jonas Kuhn: MT 12

LCS – Definitions

LCS representation in the lexicon and as the interlingua representation

Definition 2 (Dorr 1994)

A RLCS (i.e., a root LCS) is an uninstantiated LCS that is associated with a word definition in the lexicon (i.e., a LCS with unfilled variable positions).

Definition 3 (Dorr 1994)

A CLCS (i.e., a composed LCS) is an instantiated LCS that is the result of combining two or more RLCSs by means of unification (roughly). This is the interlingua, or language-independent, form that serves as the pivot between the source and target languages.


Jonas Kuhn: MT 13

LCS – Definitions

Examples of RLCSs and CLCSs:

RLCS associated with the word go:

[Event GO_Loc ([Thing X], [Path TO_Loc ([Position AT_Loc ([Thing X], [Location Z])])])]

CLCS: composition of the RLCSs for go, John, school, and happily leads to the LCS seen previously (using a concept of “unification”)
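A rough sketch of how an RLCS with open variables could be turned into a CLCS. It uses plain type-checked substitution, which is only a stand-in for the "unification" mentioned above; the `?X` variable syntax, the tuple encoding, and the function names are assumptions.

```python
def instantiate(node, bindings):
    """Fill variable leaves (names starting with '?') with bound LCS fragments."""
    ltype, head, children = node
    if head.startswith("?"):
        bound = bindings.get(head)
        if bound is None:
            return node                      # leave the variable open
        if bound[0] != ltype:
            raise ValueError(f"type clash at {head}: {bound[0]} vs {ltype}")
        return bound
    return (ltype, head, [instantiate(child, bindings) for child in children])


def show(node):
    """Render a node in the bracketed notation of the slides."""
    ltype, head, children = node
    inner = ", ".join(show(child) for child in children)
    return f"[{ltype} {head}" + (f" ({inner})" if children else "") + "]"


# RLCS for "go": X and Z are unfilled variable positions.
RLCS_GO = ("Event", "GO_Loc", [
    ("Thing", "?X", []),
    ("Path", "TO_Loc", [
        ("Position", "AT_Loc", [("Thing", "?X", []), ("Location", "?Z", [])]),
    ]),
])

clcs = instantiate(RLCS_GO, {"?X": ("Thing", "JOHN", []),
                             "?Z": ("Location", "SCHOOL", [])})
# Attach the modifier contributed by "happily".
clcs = (clcs[0], clcs[1], clcs[2] + [("Manner", "HAPPILY", [])])
print(show(clcs))
```

The resulting term has no variables left and carries no language-specific marking, which is what lets it serve as the pivot representation.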

Jonas Kuhn: MT 14

Composition of LCSs

Notion of “unification” differs from standard unification

  • Not directly invertible
  • More “relaxed” notion (for words associated with special parameters like :INT, :EXT, :PROMOTE etc.)

Jonas Kuhn: MT 15

Composition of LCSs

  • Composition based on syntactic parse (following the GB framework, i.e., Government-and-Binding theory)

Definition 4 (Dorr 1994)

  • A syntactic phrase is a maximal projection that conforms to the following structural form:

[Figure: phrase structure with the positions Syntactic Head, External Argument, Internal Arguments, Syntactic Adjuncts]

Jonas Kuhn: MT 16

Composition of LCSs

Example

John went happily to school

Syntactic Head: went
External Argument: John
Internal Argument: to school
Syntactic Adjunct: happily


Jonas Kuhn: MT 17

The translation mappings

  • Generalized linking routine (GLR)
  • Canonical syntactic realization (CSR)

Jonas Kuhn: MT 18

The translation mappings

Generalized linking routine (GLR)

Simplified schema:

X: Syntactic Head        X’: Logical Head
W: External Argument     W’: Logical Subject
Z: Internal Argument     Z’: Logical Argument
Q: Syntactic Adjunct     Q’: Logical Modifier
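The default linking in the schema can be sketched as a simple lookup between logical and syntactic positions. The dictionary and the `link` function are illustrative names, and the fillers come from the running example "John went happily to school".

```python
# Default GLR: each logical position is linked to exactly one syntactic position.
GLR = {
    "logical head": "syntactic head",          # X' -> X
    "logical subject": "external argument",    # W' -> W
    "logical argument": "internal argument",   # Z' -> Z
    "logical modifier": "syntactic adjunct",   # Q' -> Q
}

# The same table read in the analysis direction (syntax -> LCS).
INVERSE_GLR = {syntactic: logical for logical, syntactic in GLR.items()}


def link(positions, table=GLR):
    """Relabel {position: filler} from one side of the mapping to the other."""
    return {table[position]: filler for position, filler in positions.items()}


lcs_side = {
    "logical head": "go",
    "logical subject": "John",
    "logical argument": "to school",
    "logical modifier": "happily",
}
syntax_side = link(lcs_side)
print(syntax_side)
```

Because the default table is one-to-one, the same routine can be used for analysis and for generation by swapping in `INVERSE_GLR`.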

Jonas Kuhn: MT 19

The translation mappings

X: Syntactic Head        X’: Logical Head
W: External Argument     W’: Logical Subject
Z: Internal Argument     Z’: Logical Argument
Q: Syntactic Adjunct     Q’: Logical Modifier

Generalized linking routine (GLR)

Example

Jonas Kuhn: MT 20

The translation mappings

Canonical syntactic realization (CSR)


Jonas Kuhn: MT 21

The Divergence Problem

  • There can be (language-specific) exceptions to the GLR and/or the CSR
  • Translation divergences occur when such exceptions occur in one language, but not in the other
  • Formal classification of lexical-semantic divergences

Jonas Kuhn: MT 22

Addressing the Divergence Problem

Parameters for encoding language-specific information

  • GLR, CSR: language independent
  • Parameters: language-specific information about lexical items

Seven parameters:

:INT, :EXT, :PROMOTE, :DEMOTE, *, :CAT, :CONFLATED

Jonas Kuhn: MT 23

Thematic Divergence

E: I like Mary
S: María me gusta a mí 'Mary pleases me'

Arises only where there is a logical subject

Jonas Kuhn: MT 24

Thematic Divergence

Encoded with the :INT and :EXT parameters


Jonas Kuhn: MT 25

Thematic Divergence

Translation mapping for English relies on GLR defaults
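A minimal sketch of how :INT and :EXT could override the default linking for the thematic divergence. It assumes that :INT forces a logical position into internal-argument position and :EXT forces it into external-argument position; the function and dictionary names are hypothetical.

```python
# GLR default: logical subject -> external argument, logical argument -> internal argument.
DEFAULT_POSITION = {"subject": "external", "argument": "internal"}
# Assumed reading of the two markers.
MARKER_POSITION = {":INT": "internal", ":EXT": "external"}


def realize(lcs, markers=None):
    """Place each logical position; a marker on the lexical entry overrides the default."""
    markers = markers or {}
    syntax = {}
    for role, concept in lcs.items():
        marker = markers.get(role)
        position = MARKER_POSITION[marker] if marker else DEFAULT_POSITION[role]
        syntax[position] = concept
    return syntax


lcs = {"subject": "I", "argument": "MARY"}

english = realize(lcs)                                            # like: GLR defaults
spanish = realize(lcs, {"subject": ":INT", "argument": ":EXT"})   # gustar: marked entry
print(english)
print(spanish)
```

Only the Spanish entry carries markers, so the divergence is stated once in the lexicon and the linking routine itself stays the same for both languages.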

Jonas Kuhn: MT 26

Parameter markings

  • Parameter markers such as :INT and :EXT show up only in the RLCS (for lexicon entries)
  • The CLCS does not include such markers; it is a language-independent representation

Jonas Kuhn: MT 27

Promotional Divergence

E: John usually goes home
S: Juan suele ir a casa 'John tends to go home'

[Figure labels: Logical Modifier, Logical Head, Logical Argument, Logical Head]

Jonas Kuhn: MT 28

Promotional Divergence


Jonas Kuhn: MT 29

Promotional Divergence

Jonas Kuhn: MT 30

Demotional Divergence

E: I like eating
G: Ich esse gern 'I eat likingly'

Jonas Kuhn: MT 31

Demotional Divergence

:DEMOTE parameter:

logical head and logical argument swap places

Jonas Kuhn: MT 32

Demotional Divergence


Jonas Kuhn: MT 33

Divergence Types

The difference between promotional and demotional divergences

  • In promotional divergences (e.g., soler-usually), the verb (soler) triggers the head switching, no matter what event is substituted as its argument
  • In demotional divergences (e.g., like-gern), the adverbial satellite (gern) is the trigger
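The two kinds of head switching can be contrasted in a small sketch. It assumes that :PROMOTE realizes a logical modifier as the syntactic head (the logical head becoming its internal argument), and that :DEMOTE realizes the logical head as an adjunct (the logical argument becoming the syntactic head). The function name and the concept labels such as HABITUALLY are illustrative.

```python
def realize(lcs, marker=None):
    """Map logical positions to syntactic ones, with optional head switching."""
    if marker == ":PROMOTE":        # marker sits on the modifier's word (soler)
        return {"head": lcs["modifier"], "internal argument": lcs["head"]}
    if marker == ":DEMOTE":         # marker sits on the logical head's word (gern)
        return {"head": lcs["argument"], "adjunct": lcs["head"]}
    syntax = {"head": lcs["head"]}  # GLR default
    if "argument" in lcs:
        syntax["internal argument"] = lcs["argument"]
    if "modifier" in lcs:
        syntax["adjunct"] = lcs["modifier"]
    return syntax


# Promotional: John usually goes home / Juan suele ir a casa
goes = {"head": "GO", "modifier": "HABITUALLY"}
print(realize(goes))                  # English: go is the head, usually an adjunct
print(realize(goes, ":PROMOTE"))      # Spanish: soler is the head, ir its argument

# Demotional: I like eating / Ich esse gern
likes = {"head": "LIKE", "argument": "EAT"}
print(realize(likes))                 # English: like is the head, eating its argument
print(realize(likes, ":DEMOTE"))      # German: essen is the head, gern an adjunct
```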

Jonas Kuhn: MT 34

Structural Divergence

E: John entered the house
S: Juan entró en la casa 'John entered in the house'

In structural divergence it is not the positions in the GLR mapping that are altered, but the nature of the relation between the different positions

Jonas Kuhn: MT 35

Structural Divergence

Jonas Kuhn: MT 36

Conflational Divergence

E: I stabbed John
S: Yo le di puñaladas a Juan 'I gave knife-wounds to John'

The logical argument realized as puñaladas in Spanish is suppressed in English


Jonas Kuhn: MT 37

Conflational Divergence

Not realized syntactically
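A small sketch of the effect of conflation: an argument that a lexical entry marks as conflated stays in the LCS but is left out of the syntactic realization. The concept names and the `realized_arguments` helper are assumptions for illustration.

```python
def realized_arguments(arguments, conflated=frozenset()):
    """Return the logical arguments that surface syntactically for a given entry."""
    return [arg for arg in arguments if arg not in conflated]


# Shared logical arguments of the stabbing event.
arguments = ["KNIFE-WOUND", "JOHN"]

# English "stab": the KNIFE-WOUND argument is conflated into the verb.
english = realized_arguments(arguments, conflated={"KNIFE-WOUND"})
# Spanish "dar": nothing is conflated, so the argument surfaces as "puñaladas".
spanish = realized_arguments(arguments)

print(english)
print(spanish)
```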

Jonas Kuhn: MT 38

Conflational Divergence

Jonas Kuhn: MT 39

Divergence Types

(1) Thematic divergence
(2) Promotional divergence
(3) Demotional divergence
(4) Structural divergence
(5) Conflational divergence
(6) Categorial divergence
(7) Lexical divergence

Default operation of GLR is changed

Default operation of CSR is changed

Jonas Kuhn: MT 40

Categorial Divergence

E: I am hungry
G: Ich habe Hunger 'I have hunger'
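A sketch of how a :CAT marker could override the canonical syntactic realization for this example. The CSR table excerpt below is an assumption (the slides do not reproduce the table), as are the `category` function and the idea of passing the override as a plain string.

```python
# Assumed excerpt of a CSR table: LCS type -> default syntactic category.
CSR_DEFAULT = {"Event": "V", "State": "V", "Thing": "N", "Property": "A"}


def category(lcs_type, cat_override=None):
    """Use the :CAT value from the lexical entry if present, else the CSR default."""
    return cat_override if cat_override else CSR_DEFAULT[lcs_type]


# The concept HUNGRY is taken to be of type Property in the LCS.
english = category("Property")        # hungry: default realization as an adjective
german = category("Property", "N")    # Hunger: entry marked with :CAT overriding to a noun

print(english, german)
```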


Jonas Kuhn: MT 41

Categorial Divergence

Jonas Kuhn: MT 42

Lexical Divergence

  • Arises only in the context of other divergence types
  • Choice of lexical items in any language relies on the realization and composition properties of those items
  • Since the various other divergences alter these properties, lexical divergence is viewed as a side effect of other divergences
  • No specific override markers used

Jonas Kuhn: MT 43

Lexical Divergence

E: John broke into the room
S: Juan forzó la entrada al cuarto 'John forced (the) entry to the room'

Conflational divergence forces the occurrence of a lexical divergence

Jonas Kuhn: MT 44

Lexical Divergence

“break into” subsumes two concepts


Jonas Kuhn: MT 45

Discussion

  • Full coverage constraint
  • Generation-based view of GB parsing
  • Bias in “interlingua” representation?