Incremental Syntactic Language Models for Phrase-based Translation (PowerPoint presentation)


SLIDE 1

Incremental Syntactic Language Models for Phrase-based Translation

Lane Schwartz, Air Force Research Lab (lane.schwartz@wpafb.af.mil)
Chris Callison-Burch, Johns Hopkins University (ccb@cs.jhu.edu)
William Schuler, Ohio State University (schuler@ling.ohio-state.edu)
Stephen Wu, Mayo Clinic (wu.stephen@mayo.edu)

Motivation Syntactic LM Decoder Integration Results Questions? Incremental Syntactic Language Models for Phrase-based Translation Lane Schwartz

SLIDE 2

Syntax in Statistical Machine Translation

Translation Model vs Language Model

SLIDE 3

Syntax in the Translation Model

Abeillé et al., 1990; Poutsma, 1998; Poutsma, 2000; Yamada & Knight, 2001; Yamada & Knight, 2002; Eisner, 2003; Gildea, 2003; Hearne & Way, 2003; Poutsma, 2003; Imamura et al., 2004; Galley et al., 2004; Graehl & Knight, 2004; Melamed, 2004; Ding & Palmer, 2005; Hearne, 2005; Quirk et al., 2005; Cowan et al., 2006; Galley et al., 2006; Huang et al., 2006; Liu et al., 2006; Marcu et al., 2006; Zollmann & Venugopal, 2006; Bod, 2007; DeNeefe et al., 2007; Liu et al., 2007; Chiang et al., 2008; Lavie et al., 2008; Mi & Huang, 2008; Mi et al., 2008; Resnik, 2008; Shen et al., 2008; Zhou et al., 2008; Chiang, 2009; Hanneman & Lavie, 2009; Liu et al., 2009; Chiang, 2010; Huang & Mi, 2010; . . .

SLIDE 4

Syntax in the Language Model

Translation Model vs Language Model

SLIDE 5

Syntax in the Language Model

Definition

An incremental syntactic language model uses an incremental statistical parser to define a probability model over the dependency or phrase structure of target language strings.
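To make the definition concrete, here is a toy sketch of an incremental parser serving as a language model: each next word is scored conditioned on the partial parses of the prefix, so the string probability is a product of incremental next-word probabilities. The parser class, its state labels, and all probabilities are invented for illustration; this is not the paper's HHMM.

```python
# Toy sketch: an incremental syntactic LM scores a target string word by
# word, P(e) = prod_t P(e_t | parses of e_1..e_{t-1}).
# All states and probabilities below are invented for illustration.

from math import prod

class ToyIncrementalParser:
    """Maintains weighted partial parses over the words seen so far."""

    def __init__(self):
        # Each entry pairs a partial-parse label with its probability mass;
        # a real parser would track incomplete constituents such as S/VP.
        self.parses = [("ROOT", 1.0)]

    def step(self, word, emission):
        """Extend every partial parse with `word`; return P(word | prefix)."""
        p_word = emission.get(word, 1e-4)  # smoothed toy emission probability
        self.parses = [(label + " " + word, p * p_word)
                       for label, p in self.parses]
        return p_word

def syntactic_lm_probability(words, emission):
    """P(e) as a product of incremental next-word probabilities."""
    parser = ToyIncrementalParser()
    return prod(parser.step(w, emission) for w in words)

emission = {"the": 0.3, "president": 0.1, "meets": 0.05}
p = syntactic_lm_probability(["the", "president", "meets"], emission)
print(p)
```

The key property the decoder relies on is that each `step` consumes exactly one word, so hypotheses can be scored left to right as they are built.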

SLIDE 6

Syntax in the Language Model

Definition

An incremental syntactic language model uses an incremental statistical parser to define a probability model over the dependency or phrase structure of target language strings.

• A phrase-based decoder produces the target-language translation incrementally, from left to right
• A phrase-based syntactic LM parser should therefore parse target-language hypotheses incrementally, from left to right

Related work:

Galley & Manning (2009) obtained 1-best dependency parse using a greedy dependency parser

SLIDE 9

Syntax in the Language Model

Definition

An incremental syntactic language model uses an incremental statistical parser to define a probability model over the dependency or phrase structure of target language strings.

We use a standard HHMM parser (Schuler et al., 2010):
• Engineering: simple model, equivalent to a PPDA (probabilistic push-down automaton)
• Engineering: linear-time parsing
• Algorithmic: elegant fit into a phrase-based decoder
• Cognitive: nice psycholinguistic properties

Other incremental parsers: Roark (2001), Henderson (2004), Huang & Sagae (2010)

SLIDE 11

Incremental Parsing

(S (NP (DT The) (NN president))
   (VP (VP (VB meets) (NP (DT the) (NN board)))
       (PP (IN on) (NP Friday))))

SLIDE 12

Incremental Parsing

[Partial incremental parse of the prefix "the president meets the …", with incomplete constituents S/VP and VP/NN marking the material still awaited]
SLIDE 13

Incremental Parsing using HHMM (Schuler et al. 2010)


Transform right-expanding sequences of constituents into left-expanding sequences of incomplete constituents (Johnson 1998)

Original (right-expanding) tree:

(S (NP (DT The) (NN president))
   (VP (VP (VB meets) (NP (DT the) (NN board)))
       (PP (IN on) (NP Friday))))

Right-corner transformed (left-expanding) tree, flattened node sequence:

S, S/NP, S/PP, S/VP, NP, NP/NN, DT The, NN president, VP, VP/NN, VP/NP, VB meets, DT the, NN board, IN on, NP Friday

Incomplete constituents can be processed incrementally using a Hierarchical Hidden Markov Model parser. (Murphy & Paskin, 2001; Schuler et al. 2010)

SLIDE 16

Incremental Parsing using HHMM (Schuler et al. 2010)

Right-corner transformed tree (flattened node sequence): S, S/NP, S/PP, S/VP, NP, NP/NN, DT The, NN president, VP, VP/NN, VP/NP, VB meets, DT the, NN board, IN on, NP Friday

Hierarchical Hidden Markov Model: circles denote hidden random variables, edges denote conditional dependencies, shaded circles denote observed values

[HHMM trellis diagram: at each of eight time steps there are hidden states s1..s3 (one per depth) and reduction variables r1..r3, with observed words e1..e7 = The, president, meets, the, board, on, Friday]
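At heart, the trellis encodes a hidden-state sequence model over the observed words. As a rough illustration (with a flattened toy state space and made-up distributions, not the factored HHMM of Schuler et al.), prefix probabilities fall out of the standard forward algorithm:

```python
# Sketch: forward-algorithm prefix probabilities over a trellis of hidden
# states, as in a (flattened) HHMM lattice. Toy numbers, illustrative only.

def forward(words, states, start, trans, emit):
    """Return P(word_1..word_t) for each prefix length t."""
    alpha = {s: start[s] * emit[s].get(words[0], 0.0) for s in states}
    prefix_probs = [sum(alpha.values())]
    for w in words[1:]:
        # Sum over predecessor states, then emit the next observed word.
        alpha = {
            s2: sum(alpha[s1] * trans[s1][s2] for s1 in states) * emit[s2].get(w, 0.0)
            for s2 in states
        }
        prefix_probs.append(sum(alpha.values()))
    return prefix_probs

states = ["NP-in-progress", "VP-in-progress"]
start = {"NP-in-progress": 0.9, "VP-in-progress": 0.1}
trans = {
    "NP-in-progress": {"NP-in-progress": 0.6, "VP-in-progress": 0.4},
    "VP-in-progress": {"NP-in-progress": 0.2, "VP-in-progress": 0.8},
}
emit = {
    "NP-in-progress": {"the": 0.4, "president": 0.2},
    "VP-in-progress": {"meets": 0.3},
}
probs = forward(["the", "president", "meets"], states, start, trans, emit)
print(probs)
```

Because each step only extends the previous column of the trellis, the computation is incremental in exactly the sense the decoder needs.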

SLIDE 18

Incremental Parsing using HHMM (Schuler et al. 2010)

Analogous to "Maximally Incremental" CCG Parsing
Equivalent to Probabilistic Push-Down Automata
Isomorphic Tree → Path

[Slides 18–34: step-by-step animation over the HHMM trellis. As each word of "The president meets the board on Friday" is consumed left to right, the parser incrementally recognizes the nodes of the right-corner transformed tree in the order: DT, NP/NN, NN, NP, S/VP, VB, VP/NP, DT, VP/NN, NN, VP, S/PP, IN, S/NP, NP, S]

SLIDE 35

Phrase-Based Translation

Der Präsident trifft am Freitag den Vorstand
The president meets the board on Friday

[Decoder stack diagram: hypothesis stacks 0–3. Stack 0 holds the empty hypothesis ⟨s⟩; subsequent stacks hold partial translations such as "⟨s⟩ the", "⟨s⟩ that", "⟨s⟩ president"; "the president", "that president", "president Friday"; "president meets", "Obama met". Filled circles mark which source words each hypothesis covers.]

SLIDE 37

Phrase-Based Translation with Syntactic LM

Definition

τ̃th represents the parses of the partial translation at node h in stack t

[Decoder stack diagram as on the previous slide: stacks 0–3 of partial translations, with coverage circles]

SLIDE 38

Phrase-Based Translation with Syntactic LM

Definition

τ̃th represents the parses of the partial translation at node h in stack t

[Decoder stack diagram, now annotating each hypothesis with its pruned parse set: ⟨s⟩ τ̃0; "⟨s⟩ the" τ̃11, "⟨s⟩ that" τ̃12, "⟨s⟩ president" τ̃13; "the president" τ̃21, "that president" τ̃22, "president Friday" τ̃23; "president meets" τ̃31, "Obama met" τ̃32]

SLIDE 39

Integrate Parser into Phrase-based Decoder

[Diagram: decoder hypotheses linked to the HHMM trellis. As hypotheses grow ("⟨s⟩ the" τ̃11 → "the president" τ̃21 → "president meets" τ̃31 → "meets the" τ̃41), the incremental parser advances over the target words The, president, meets, the, board, on, Friday.]

SLIDE 40

Integrate Parser into Phrase-based Decoder

[Diagram: extending hypothesis "president meets" τ̃31 with the phrase "the board" yields hypothesis "the board" τ̃51; the parser advances over the corresponding trellis slices for meets, the, board.]

SLIDE 41

Direct Maximum Entropy Model of Translation

ê = argmax_e exp Σ_j λ_j h_j(e, f)

λ = set of j feature weights
h = { phrase-based translation model, n-gram LM, distortion model, . . . , syntactic LM P(τ̃th) }

[Decoder stack diagram: ⟨s⟩ τ̃0 in Stack 0, "⟨s⟩ the" τ̃11 in Stack 1, "the president" τ̃21 in Stack 2, "president meets" τ̃31 in Stack 3]
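As a concrete sketch of the argmax above: each candidate translation gets a weighted sum of feature scores, and the decoder keeps the highest-scoring one. The feature values and weights below are invented for illustration, not tuned model scores.

```python
# Sketch: direct maximum-entropy (log-linear) model -- pick the translation e
# maximizing sum_j lambda_j * h_j(e, f). All numbers are illustrative.

def score(features, weights):
    """Unnormalized log-linear score: sum_j lambda_j * h_j."""
    return sum(weights[name] * value for name, value in features.items())

weights = {"tm": 1.0, "ngram_lm": 0.5, "distortion": 0.3, "syntactic_lm": 0.8}

# Hypothetical log-feature values for two candidate translations.
candidates = {
    "the president meets the board on friday":
        {"tm": -2.0, "ngram_lm": -3.1, "distortion": -0.5, "syntactic_lm": -2.2},
    "the president on friday meets the board":
        {"tm": -2.1, "ngram_lm": -3.4, "distortion": -1.5, "syntactic_lm": -2.0},
}

best = max(candidates, key=lambda e: score(candidates[e], weights))
print(best)
```

Since the argmax is unaffected by the exp and the normalization constant, the decoder can compare raw weighted sums directly.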

SLIDE 42

Does an Incremental Syntactic LM Help Translation?

That’s nice... but will it make my BLEU score go up?

SLIDE 43

Perplexity Results

Language models trained on WSJ Treebank corpus.

LM               | In-domain Perplexity | Out-of-domain Perplexity
WSJ 5-gram LM    | 232                  | 1262
WSJ Syntactic LM | 385                  | 529

SLIDE 44

Perplexity Results

Language models trained on WSJ Treebank corpus.

LM                                   | In-domain Perplexity | Out-of-domain Perplexity
WSJ 5-gram LM                        | 232                  | 1262
WSJ Syntactic LM                     | 385                  | 529
Interpolated WSJ 5-gram + WSJ SynLM  | 209                  | 225
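The interpolated rows can be illustrated with a small sketch: mix the two models' per-word probabilities with a weight λ, then compute perplexity as the exponentiated average negative log-probability. The probabilities and the weight below are toy values, not the paper's.

```python
# Sketch: linear interpolation of an n-gram LM with a syntactic LM, then
# perplexity measurement. Probabilities and the weight are toy values.

import math

def interpolate(p_ngram, p_syn, lam=0.5):
    """Mix two per-word probabilities with interpolation weight lam."""
    return lam * p_ngram + (1.0 - lam) * p_syn

def perplexity(word_probs):
    """exp of the average negative log probability per word."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

# Per-word probabilities each model assigns to some test sentence (invented).
p_ngram = [0.20, 0.05, 0.10]
p_syn   = [0.10, 0.15, 0.10]

mixed = [interpolate(a, b) for a, b in zip(p_ngram, p_syn)]
print(perplexity(p_ngram), perplexity(mixed))
```

Interpolation helps whenever the two models make errors in different places, as in the table above, where the mix beats both component models.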

SLIDE 45

Perplexity Results

Language models trained on WSJ Treebank corpus, and an n-gram model trained on the larger English Gigaword corpus.

LM                                        | In-domain Perplexity | Out-of-domain Perplexity
WSJ 5-gram LM                             | 232                  | 1262
WSJ Syntactic LM                          | 385                  | 529
Interpolated WSJ 5-gram + WSJ SynLM       | 209                  | 225
Gigaword 5-gram                           | 258                  | 312
Interpolated Gigaword 5-gram + WSJ SynLM  | 222                  | 123

SLIDE 47

Does an Incremental Syntactic LM Help Translation?

Moses with LM(s)               | BLEU
Using n-gram LM only           | 18.78
Using n-gram LM + Syntactic LM | 19.78

Experiment

• NIST OpenMT 2008 Urdu-English data set
• Moses with standard phrase-based translation model
• Tuning and testing restricted to sentences ≤ 20 words long
• Results reported on devtest set
• n-gram LM is the WSJ 5-gram LM

SLIDE 49

Summary

• Straightforward general framework for incorporating any incremental syntactic LM into phrase-based translation
• We used an incremental HHMM parser as the syntactic LM
• The syntactic LM shows a substantial decrease in perplexity on out-of-domain data over an n-gram LM trained on the same data
• The syntactic LM interpolated with an n-gram LM shows an even greater decrease in perplexity on both in-domain and out-of-domain data, even when the n-gram LM is trained on a substantially larger corpus
• +1 BLEU on the Urdu-English task with the syntactic LM
• All code is open source and integrated into Moses

SLIDE 50

Questions? Incremental Syntactic Language Models for Phrase-based Translation

Lane Schwartz, Air Force Research Lab (lane.schwartz@wpafb.af.mil)
Chris Callison-Burch, Johns Hopkins University (ccb@cs.jhu.edu)
William Schuler, Ohio State University (schuler@ling.ohio-state.edu)
Stephen Wu, Mayo Clinic (wu.stephen@mayo.edu)

SLIDE 51

This looks a lot like CCG

Our parser performs some CCG-style operations:

Forward function application:
NP/NN NN ⇒ NP

Type raising:
NP ⇒ S/VP

Type raising in conjunction with forward function composition:
DT ⇒ NP/NN
VP/NP NP/NN ⇒ VP/NN
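The forward-application rule above (A/B followed by a B completes an A) is simple enough to sketch directly as string manipulation on category labels. The function name and representation are illustrative, not from the paper's code.

```python
# Sketch: CCG-style forward function application over incomplete-constituent
# category labels, A/B + B => A. Illustrative only.

def forward_apply(functor: str, argument: str):
    """A/B + B => A; returns None if the categories do not combine."""
    if "/" in functor:
        result, expected = functor.rsplit("/", 1)
        if expected == argument:
            return result
    return None

print(forward_apply("NP/NN", "NN"))  # NP
print(forward_apply("S/VP", "VP"))   # S
print(forward_apply("NP/NN", "VB"))  # None: NN expected, VB supplied
```

Splitting on the rightmost slash means nested functors such as S/NP/NN would also resolve one argument at a time.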

SLIDE 52

Why not just use CCG?

• There is no probabilistic version of incremental CCG
• Our parser is constrained (we don't have backward composition)
• We do use those components of CCG (forward function application and forward function composition) that are useful for probabilistic incremental parsing

SLIDE 53

Speed Results

Mean per-sentence decoding time (parser beam sizes are indicated for the syntactic LM):

Sentence length | Moses   | +SynLM beam=50 | +SynLM beam=2000
10              | 0.2 sec | 9 min          | 19 min
20              | 0.5 sec | 20 min         | 43 min
30              | 0.9 sec | 29 min         | 62 min
40              | 1.1 sec | 35 min         | 76 min

The parser runs in linear time, but we're parsing all paths through the Moses lattice as they are generated by the decoder: more informed pruning, but slower decoding.
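The parser beam in the table refers to keeping only the k most probable partial parses at each word. A minimal sketch of such pruning (with hypothetical state labels and scores):

```python
# Sketch: beam pruning keeps only the k highest-probability partial parses
# at each step, trading parse coverage for speed. Illustrative only.

def beam_prune(states, beam_size):
    """states: list of (partial_parse, probability); keep the top beam_size."""
    return sorted(states, key=lambda sp: sp[1], reverse=True)[:beam_size]

states = [("S/VP ...", 0.02), ("S/NP ...", 0.005), ("NP/NN ...", 0.01)]
print(beam_prune(states, 2))  # keeps the two most probable states
```

A larger beam (2000 vs 50 in the table) keeps more parse states alive per word, which is why it roughly doubles decoding time.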

SLIDE 54

Phrase-Based Translation with Syntactic LM

Definition

e   =def  string of n target language words e1 . . . en
et  =def  the first t words in e, where t ≤ n
τt  =def  set of all incremental parses of et
τ̃t  =def  subset of the parses τt that remain after parser pruning

τ̂ = argmax_τ P(τ | e)

[Diagram: each new word et takes the pruned parse set τ̃t−1 to τ̃t via a parser transition δ]

SLIDE 55

Acknowledgments

This research was supported by NSF CAREER/PECASE award 0447685, NSF grant IIS-0713448, and the European Commission through the EuroMatrixPlus project. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the sponsors or the United States Air Force. Paper cleared for public release (Case Number 88ABW-2010-6489) on 10 Dec 2010. Presentation cleared for public release (Case Number 88ABW-2011-2970) on 26 May 2011.
