
4CSLL5 ’Advanced Computational Linguistics’ Phrase Based Machine Trans

Martin Emms November 4, 2020

Outline

◮ Introduction
◮ Learning the Phrase Translation Table

Motivation

◮ Word-Based Models translate words as atomic units
◮ Phrase-Based Models translate phrases as atomic units
◮ Advantages:
  ◮ many-to-many translation can handle non-compositional phrases
  ◮ use of local context in translation
  ◮ the more data, the longer phrases can be learned

Phrase-Based Model

[Figure: phrase alignment between the source English ’he does not go home’ and the observed German ’er geht ja nicht nach hause’]

◮ source is segmented in phrases
◮ each source phrase is translated into observed phrase
◮ observed phrases are reordered

Compared to IBM Model

◮ recall IBM models assumed a hidden alignment $a$ between $s$ and $o$, giving a formula $p(o, a|s)$ and so a formula for $p(o, a, s)$ as $p(o, a|s) \times p(s)$
◮ phrase-based models assume hidden segmentations of $s$ and $o$ into $K$ phrases $\bar{s}_{1:K}$ and $\bar{o}_{1:K}$
◮ phrase-based models also assume a hidden mapping $\tau$ from the phrases $\bar{s}$ to the phrases $\bar{o}$. This mapping is 1-to-1, and generally not order preserving.
◮ we will have a formula for $p(\bar{o}, \tau, \bar{s})$ as $p(\bar{o}, \tau|\bar{s}) \times p(\bar{s})$

Example

[Figure: phrase alignment between the source English ’he does not go home’ and the observed German ’er geht ja nicht nach hause’]

◮ assume $s_{1:5}$ = he does not go home and $o_{1:6}$ = er geht ja nicht nach hause
◮ possible segmentation of $s_{1:5}$ into $\bar{s}_{1:4}$ is
  $\bar{s}_1 = s_{1:1}$ = he, $\bar{s}_2 = s_{2:3}$ = does not, $\bar{s}_3 = s_{4:4}$ = go, $\bar{s}_4 = s_{5:5}$ = home
◮ possible segmentation of $o_{1:6}$ into $\bar{o}_{1:4}$ is
  $\bar{o}_1 = o_{1:1}$ = er, $\bar{o}_2 = o_{2:2}$ = geht, $\bar{o}_3 = o_{3:4}$ = ja nicht, $\bar{o}_4 = o_{5:6}$ = nach hause
◮ possible mapping $\tau$ from $\bar{s}$ to $\bar{o}$ is
  $\tau(1) = 1$, $\tau(2) = 3$, $\tau(3) = 2$, $\tau(4) = 4$
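The segmentation and mapping in this example can be written down as plain data and checked mechanically. The following is only a minimal sketch: the helper names (`phrases`, `is_segmentation`, `is_one_to_one`) are made up for illustration and are not part of the lecture.

```python
# Toy encoding of the example: sentences, segmentations and the mapping tau.
s = "he does not go home".split()
o = "er geht ja nicht nach hause".split()

# Segmentations as 1-based inclusive (start, end) spans, as on the slide.
s_spans = [(1, 1), (2, 3), (4, 4), (5, 5)]
o_spans = [(1, 1), (2, 2), (3, 4), (5, 6)]

# tau maps the index of a source phrase to the index of an observed phrase.
tau = {1: 1, 2: 3, 3: 2, 4: 4}


def phrases(words, spans):
    """Turn 1-based inclusive spans into phrase strings."""
    return [" ".join(words[a - 1:b]) for a, b in spans]


def is_segmentation(words, spans):
    """Spans must tile the sentence left to right, with no gaps or overlaps."""
    expected_start = 1
    for a, b in spans:
        if a != expected_start or b < a:
            return False
        expected_start = b + 1
    return expected_start == len(words) + 1


def is_one_to_one(mapping, k):
    """tau must be a bijection on the phrase indices 1..k."""
    indices = list(range(1, k + 1))
    return sorted(mapping) == indices and sorted(mapping.values()) == indices


s_bar = phrases(s, s_spans)
o_bar = phrases(o, o_spans)

# Pair each source phrase with the observed phrase that tau assigns to it.
pairs = [(s_bar[k - 1], o_bar[tau[k] - 1]) for k in sorted(tau)]
for source_phrase, observed_phrase in pairs:
    print(f"{source_phrase!r} <-> {observed_phrase!r}")
```

Because the mapping is stored separately from the two segmentations, reordering (here ’does not’ and ’go’ swapping places on the German side) needs no special treatment.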


Constructing a Phrase-Based Translation

◮ Task: translate a certain German ’observed’ sentence into ’source’ English

er geht ja nicht nach hause

◮ Assume a ’phrase-table’ giving, for many possible ’phrases’ $\bar{o}$ in the observed German, possible ’phrases’ $\bar{s}$ in potential source English

[Figure: the observed sentence ’er geht ja nicht nach hause’ with candidate English phrases from the phrase-table stacked beneath the German words and word sequences they translate, e.g. he, it, goes, go, yes, of course, not, does not, is not, after, to, according to, house, home, at home, return home, he goes, it goes, is after all, not after, are not, ...]

◮ the phrase-based translation will be built with these ingredients

[Figure, step 1: ’er’ covered in the observed sentence; source so far: he]

◮ Pick a phrase $\bar{o}$ = ’er’ in observed, choose ’he’ as $\bar{s}_1$ in source

[Figure, step 2: ’er’ and ’ja nicht’ covered; source so far: he does not]

◮ Pick a phrase $\bar{o}$ = ’ja nicht’ in observed, choose ’does not’ as $\bar{s}_2$ in source
◮ NB: allowed to choose $\bar{o}$ phrases out of sequence; $\bar{s}$ phrases chosen in sequence
◮ NB: phrases may have multiple words: many-to-many translation

[Figure, step 3: ’er’, ’ja nicht’ and ’geht’ covered; source so far: he does not go]

◮ Pick a phrase $\bar{o}$ = ’geht’ in observed, choose ’go’ as $\bar{s}_3$ in source

[Figure, step 4: all observed words covered; source: he does not go home]

◮ Pick a phrase $\bar{o}$ = ’nach hause’ in observed, choose ’home’ as $\bar{s}_4$ in source
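The four steps above can be replayed in a few lines of code. This is a sketch under stated assumptions: the tiny `phrase_table` and the function `build_translation` are hypothetical, and the choice of which phrase to pick at each step is supplied by hand rather than by any model.

```python
observed = "er geht ja nicht nach hause".split()

# Hypothetical mini phrase-table: observed German phrase -> candidate English phrases.
phrase_table = {
    "er": ["he", "it"],
    "geht": ["go", "goes"],
    "ja nicht": ["does not", "is not"],
    "nach hause": ["home", "at home"],
}


def build_translation(observed, steps):
    """Replay (start, end, english) steps; start/end are 0-based, end exclusive.

    Observed phrases may be picked in any order, but every observed word must
    be covered exactly once; source phrases are appended left to right.
    """
    covered = [False] * len(observed)
    source = []
    for start, end, english in steps:
        german = " ".join(observed[start:end])
        if any(covered[start:end]):
            raise ValueError(f"{german!r} overlaps already translated words")
        if english not in phrase_table.get(german, []):
            raise ValueError(f"no phrase-table entry {german!r} -> {english!r}")
        covered[start:end] = [True] * (end - start)
        source.append(english)
    if not all(covered):
        raise ValueError("some observed words were left untranslated")
    return " ".join(source)


# The order used on the slides: er, ja nicht, geht, nach hause.
steps = [(0, 1, "he"), (2, 4, "does not"), (1, 2, "go"), (4, 6, "home")]
translation = build_translation(observed, steps)
print(translation)
```

The coverage list is what lets observed phrases be consumed out of sequence while still guaranteeing that nothing is translated twice or skipped.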

◮ we have just constructed one particular translation; many, many others could have been constructed using the available phrase pairs
◮ need probabilistic model which favours one over the other
◮ need to set parameters of that model
  → these won’t be learned by EM but instead some are (heuristically) derived from IBM models, and some just set by common sense
◮ to find high scoring translations need to manage somehow an exponential search space → ’beam search’ heuristic

Learning a Phrase Translation Table

◮ Task: learn the model from a parallel corpus
◮ Three stages:
  ◮ word alignment: using IBM models or other method
  ◮ extraction of phrase pairs
  ◮ scoring phrase pairs

Learning ctd: alignment both ways

[Figure: two word-alignment grids, a : Ger → Eng and a : Eng → Ger]

do IBM model learning in both directions, and find best alignments both ways

Learning ctd: unite alignment

[Figure: merged word-alignment grid for English ’michael assumes that he will stay in the house’ and German ’michael geht davon aus , dass er im haus bleibt’]

for each training pair, merge these alignments, then extract the phrase pairs consistent with this merge: next slides show a few cases
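The slide does not say which merging rule is used, so the sketch below only shows the two simplest candidates: the intersection (links both directions agree on) and the union (links found by either direction). The two toy alignments and the name `merge_alignments` are invented for illustration.

```python
# Alignments as sets of (english_index, german_index) links, 0-based.
# English: michael(0) assumes(1) that(2)
# German:  michael(0) geht(1) davon(2) aus(3) ,(4) dass(5)
# Both alignments below are made up; real ones come from IBM model training.
align_1 = {(0, 0), (1, 1), (1, 2), (1, 3)}   # one direction
align_2 = {(0, 0), (1, 1), (2, 5)}           # the other direction


def merge_alignments(a, b):
    """Return (intersection, union) of two sets of alignment links."""
    return a & b, a | b


intersection, union = merge_alignments(align_1, align_2)
print("intersection:", sorted(intersection))
print("union:       ", sorted(union))
```

The intersection keeps only links that both directions support, while the union keeps every link either direction proposes; a practical merge may sit somewhere between the two.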

Learning ctd: extract consistent phrase pairs

[Figure: the merged alignment grid for ’michael assumes that he will stay in the house’ and ’michael geht davon aus , dass er im haus bleibt’, with extracted phrase pairs boxed]

obvious 1-to-N, N-to-1 cases, eg:

(that – dass) (assumes – geht davon aus) (in the – im)

N-to-N cases: basically taping together adjacent smaller cases, eg:

(in the – im) + (house – haus) → (in the house – im haus)
(will stay – bleibt) + (in the house – im haus) → (will stay in the house – im haus bleibt)
(michael – michael) + (assumes – geht davon aus) → (michael assumes – michael geht davon aus)
(assumes – geht davon aus) + (ε – ,) + (that – dass) → (assumes that – geht davon aus , dass) [1]

[1] here the unaligned German ’,’ is swept up; tantamount to treating it as paired with the empty string ε

Learning ctd: what you can’t extract

[Figure: the same alignment grid, with ’er im’ and ’he will stay’ marked]

◮ no pair ($\bar{e}$ – er im), for any English phrase $\bar{e}$
◮ no pair (he will stay – $\bar{g}$), for any German phrase $\bar{g}$

because the corresponding parts are not adjacent
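One way to make ’consistent with the merged alignment’ concrete is the usual consistency test: a pair of spans is kept if at least one alignment link lies inside it and no link has one end inside and the other outside. The brute-force sketch below applies that test to the example sentence; the alignment links are an assumption read off the phrase pairs listed on the slides, and the function names and `max_len` limit are illustrative.

```python
eng = "michael assumes that he will stay in the house".split()
ger = "michael geht davon aus , dass er im haus bleibt".split()

# Assumed merged alignment as (english_index, german_index) links, 0-based.
# The German comma (index 4) is left unaligned, as in the footnote.
links = {
    (0, 0),                    # michael - michael
    (1, 1), (1, 2), (1, 3),    # assumes - geht davon aus
    (2, 5),                    # that - dass
    (3, 6),                    # he - er
    (4, 9), (5, 9),            # will stay - bleibt
    (6, 7), (7, 7),            # in the - im
    (8, 8),                    # house - haus
}


def consistent(links, e1, e2, g1, g2):
    """True if some link lies inside the box and none crosses its border."""
    has_inside_link = False
    for e, g in links:
        e_in = e1 <= e <= e2
        g_in = g1 <= g <= g2
        if e_in != g_in:       # one end inside, the other outside
            return False
        if e_in:
            has_inside_link = True
    return has_inside_link


def extract_phrase_pairs(eng, ger, links, max_len=7):
    """Enumerate all span pairs up to max_len words and keep consistent ones."""
    pairs = set()
    for e1 in range(len(eng)):
        for e2 in range(e1, min(e1 + max_len, len(eng))):
            for g1 in range(len(ger)):
                for g2 in range(g1, min(g1 + max_len, len(ger))):
                    if consistent(links, e1, e2, g1, g2):
                        pairs.add((" ".join(eng[e1:e2 + 1]),
                                   " ".join(ger[g1:g2 + 1])))
    return pairs


pairs = extract_phrase_pairs(eng, ger, links)
print(("in the house", "im haus") in pairs)
print(any(g == "er im" for _, g in pairs))
```

Unaligned words such as the comma never violate the test, which is why they get swept up into neighbouring phrases. Under the assumed links, nothing pairs with ’er im’ or with ’he will stay’, matching the two negative cases above.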

Scoring Phrase Translations

◮ Preceding slides show some of the phrase pairs extracted from one sentence pair; this is done over all sentence pairs. Some pairs will be frequently extracted, others less so ...
◮ so from huge table of counts of phrase pairs, phrase-translation probabilities are simply defined by relative frequencies:

$$tr(\bar{e}|\bar{g}) = \frac{count(\bar{e}, \bar{g})}{\sum_{\bar{e}'} count(\bar{e}', \bar{g})} \qquad tr(\bar{g}|\bar{e}) = \frac{count(\bar{e}, \bar{g})}{\sum_{\bar{g}'} count(\bar{e}, \bar{g}')}$$

◮ so phrase probs acquired by exploiting the EM-learned IBM probs
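The relative-frequency definitions translate directly into counting code. In the sketch below the function name `score_phrase_pairs` and the counts are made up for illustration; only the two formulas come from the slide.

```python
from collections import Counter


def score_phrase_pairs(extracted):
    """extracted: iterable of (english, german) phrase pairs, with repeats.

    Returns two dicts keyed by (english, german):
    tr(e|g) = count(e, g) / sum over e' of count(e', g)
    tr(g|e) = count(e, g) / sum over g' of count(e, g')
    """
    pair_count = Counter(extracted)
    e_total = Counter()
    g_total = Counter()
    for (e, g), c in pair_count.items():
        e_total[e] += c
        g_total[g] += c
    tr_e_given_g = {(e, g): c / g_total[g] for (e, g), c in pair_count.items()}
    tr_g_given_e = {(e, g): c / e_total[e] for (e, g), c in pair_count.items()}
    return tr_e_given_g, tr_g_given_e


# Made-up extraction counts, purely for illustration.
extracted = (
    [("the proposal", "den Vorschlag")] * 6
    + [("a proposal", "den Vorschlag")] * 2
    + [("the proposal", "der Vorschlag")] * 2
)
tr_e_given_g, tr_g_given_e = score_phrase_pairs(extracted)
print(tr_e_given_g[("the proposal", "den Vorschlag")])
print(tr_g_given_e[("a proposal", "den Vorschlag")])
```

Both directions are computed from the same pair counts; only the normalising total differs.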

Phrase Translation Probabilities: an example

◮ below is an extract from table learnt from the Europarl corpus, giving some values of $tr(\bar{e}|\bar{g})$ (written $\phi(\bar{e}|\bar{g})$ in the table) for $\bar{g}$ = den Vorschlag and various English ’phrases’

English            φ(ē|ḡ)    English            φ(ē|ḡ)
the proposal       0.6227    the suggestions    0.0114
’s proposal        0.1068    the proposed       0.0114
a proposal         0.0341    the motion         0.0091
the idea           0.0250    the idea of        0.0091
this proposal      0.0227    the proposal ,     0.0068
proposal           0.0205    its proposal       0.0068
of the proposal    0.0159    it                 0.0068
the proposals      0.0159    ...                ...

◮ lexical variation (proposal vs suggestions)
◮ morphological variation (proposal vs proposals)
◮ included function words (the, a, ...)
◮ noise (it)

Linguistic Phrases?

◮ Phrase-table emphatically is not limited to ’linguistic’ phrases, that is, sequences which are defined by detailed language grammars (noun phrases, verb phrases, prepositional phrases, ...)
◮ Example non-linguistic phrase pair
  spass am → fun with the
◮ Prior noun often helps with translation of preposition
◮ ’phrases’ can include tacked on punctuation
◮ consensus is that if attempts are made to limit to grammatically motivated ’linguistic’ phrases, overall translation quality goes down