Chunk-based Verb Reordering in VSO Sentences for Arabic-English SMT

Arianna Bisazza, Marcello Federico, FBK-irst, Trento, Italy. WMT 2010, Uppsala, 15-16 July 2010


  1. Chunk-based Verb Reordering in VSO Sentences for Arabic-English SMT
 Arianna Bisazza, Marcello Federico
 FBK-irst, Trento, Italy
 WMT 2010, Uppsala, 15-16 July 2010

  2. Introduction
 ● English word order: Subject-Verb-Object
 ● Arabic: both SVO and VSO
 ● Common errors in phrase-based SMT outputs:
   − wrong order of syntactic constituents
   − verbless sentences
 WMT 2010, Uppsala A. Bisazza, M. Federico 2

  3. Outline
 ● Reordering patterns in Arabic-English
 ● Chunk-based verb reordering: technique and analysis
 ● Impact of VSO sentences on translation quality
 ● Chunk-based reordering lattices

  6. Reordering patterns in Arabic-English
 VSO sentence: Arabic verb anticipated wrt English
 Several local, one long reordering involving the verb
 Typical phrase-based SMT outputs:
 *The Moroccan monarch King Mohamed VI __ his support to…
 *He renewed the Moroccan monarch King Mohamed VI his support to…

  7. Previous works (Habash '07; Crego&Habash '08; Elming&Habash '09)
 • preprocess source data to approximate target word order
 • address all reorderings
 • deterministic reordering => 1 most probable permutation
 • non-deterministic => word reordering lattices
 Our work:
 • only one class of reorderings
 • mixed approach: deterministic for train, lattices for test

  9. Reordering patterns in Arabic-English
 Working hypothesis: uneven distribution of reordering phenomena
 Many local
 − adjectival modifiers following their noun
 − head-initial genitive constructions (idafa)
 [Figure: example]
 Few global
 − Verb-Subject-Object sentences

  12. Reordering patterns in Arabic-English
 VSO sentences: moving verb after subject simplifies reordering
 Other (local) reorderings: handled inside phrases or through distortion

  16. Chunk-based verb reordering
 – Simplifying assumptions:
   1) verb reordering only between shallow syntax chunks;
   2) no overlap between consecutive verb movements
 – Possible movements: move verb chunk, or verb chunk + next chunk (e.g. adverbials), by up to X chunks to the right
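The candidate movements listed on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the slides: chunks are (tag, text) pairs, the function name `candidate_movements` and the parameter `max_shift` (the slide's X) are hypothetical, and assumption 2 (no overlap between consecutive verb movements) is not modelled.

```python
from typing import List, Tuple

# (chunk tag, surface text); the tags "VP" and "NP" are illustrative labels
Chunk = Tuple[str, str]


def candidate_movements(chunks: List[Chunk], verb_idx: int, max_shift: int) -> List[List[Chunk]]:
    """Enumerate reorderings that move the verb chunk at verb_idx, alone or
    together with the chunk right after it, 1..max_shift chunks to the right."""
    results = []
    for block_len in (1, 2):  # verb chunk alone, then verb chunk + next chunk
        block_end = verb_idx + block_len
        if block_end > len(chunks):
            continue
        block = chunks[verb_idx:block_end]
        rest = chunks[:verb_idx] + chunks[block_end:]
        for shift in range(1, max_shift + 1):
            insert_at = verb_idx + shift
            if insert_at > len(rest):
                break  # cannot move past the end of the sentence
            results.append(rest[:insert_at] + block + rest[insert_at:])
    return results


if __name__ == "__main__":
    # English glosses stand in for Arabic chunks; purely illustrative
    sentence = [("VP", "renewed"), ("NP", "the Moroccan monarch"), ("NP", "his support")]
    for candidate in candidate_movements(sentence, verb_idx=0, max_shift=6):
        print(" | ".join(text for _, text in candidate))
```

Working on chunk positions rather than word positions keeps the candidate set small, which is the point of assumption 1.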

  17. Chunk-based verb reordering
 Best movement: minimizes distortion wrt English translation
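A minimal sketch of selecting the best movement follows. The slide does not define the distortion measure, so the one used here is an assumption: each source chunk is mapped to a single target-side position (for instance derived from word alignments), and distortion is the sum of absolute jumps when chunks are read in the candidate order. The names `distortion`, `best_movement` and `target_pos` are hypothetical.

```python
from typing import Dict, List, Sequence


def distortion(order: Sequence[int], target_pos: Dict[int, int]) -> int:
    """Sum of absolute jumps in target position when source chunks are read
    in `order`. An order that is monotone with the target scores 0."""
    total = 0
    prev = -1
    for chunk_id in order:
        pos = target_pos[chunk_id]
        total += abs(pos - prev - 1)
        prev = pos
    return total


def best_movement(candidates: List[Sequence[int]], target_pos: Dict[int, int]) -> Sequence[int]:
    """Return the candidate order with the lowest distortion.
    On ties, the earliest candidate in the list is kept."""
    return min(candidates, key=lambda order: distortion(order, target_pos))


if __name__ == "__main__":
    # Source chunks: 0 = verb, 1 = subject, 2 = object (VSO order).
    # Assumed target positions for an SVO translation: subject, verb, object.
    target_pos = {0: 1, 1: 0, 2: 2}
    candidates = [[0, 1, 2], [1, 0, 2], [1, 2, 0]]  # no move, shift by 1, shift by 2
    print(best_movement(candidates, target_pos))
```

With this toy alignment, moving the verb one chunk to the right makes the source order monotone with the target, so that candidate has zero distortion.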

  18. Chunk-based verb reordering: corpus analysis
 Distribution by movement length
 [Figure labels: Intersection of GIZA++ alignments; Manual alignments]

  19. Chunk-based verb reordering: corpus analysis
 Distribution by movement length
 => Good coverage (≥ 99.5%) with max movement length 6
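The coverage figure on this slide is the share of observed verb movements that fit under a given maximum length. A rough sketch of that computation is below; the function names are hypothetical and the list of lengths is toy data invented for illustration, not the corpus statistics behind the slide.

```python
from typing import List, Optional


def coverage(lengths: List[int], max_len: int) -> float:
    """Fraction of observed verb movements whose length is at most max_len."""
    return sum(1 for n in lengths if n <= max_len) / len(lengths)


def smallest_max_length(lengths: List[int], threshold: float) -> Optional[int]:
    """Smallest cap on movement length whose coverage reaches the threshold,
    or None if no cap up to the longest observed movement does."""
    for max_len in range(1, max(lengths) + 1):
        if coverage(lengths, max_len) >= threshold:
            return max_len
    return None


if __name__ == "__main__":
    # Toy movement lengths in chunks (hypothetical, for illustration only)
    toy_lengths = [1, 1, 1, 1, 2, 2, 3, 3, 4, 7]
    print(coverage(toy_lengths, 4))
    print(smallest_max_length(toy_lengths, 0.9))
```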

  23. Impact of VSO sentences on MT quality
 • Baseline: Moses, 30M words newswire from NIST09
 • Shallow syntax chunking: AMIRA (Diab&al.2004)
 • Verb-reorder training and devset, re-train whole system
 • Verb-reorder test aligned with reference (oracle)
 • Tested with different Distortion Limits (DL) from 2 to 10 and wide beam search

  27. Impact of VSO sentences on MT quality
 %BLEU scores on Eval08-NW (MERT on Dev06-NW):
 Verb reordering of training data only => positive effect (9% more phrases extracted)
 Verb reordering of training and test => further gain (+1.2 with 1/3 of sentences modified)
 Relaxing the DL to high values doesn't help

  28. Impact of VSO sentences on MT quality
 To summarize:
 • VSO sentences negatively affect phrase-based SMT
 • Specific models needed to handle verb reordering of test
