SLIDE 1

Neural Grammatical Error Correction with Finite State Transducers

Felix Stahlberg, Christopher Bryant, and Bill Byrne
Department of Engineering

SLIDE 2

Informal introduction to finite state transducers

  • FSTs are graph structures with a start state and a final state
  • Arcs are annotated with:
    • an input symbol
    • an output symbol
    • a weight
  • The FST transduces an input string 𝑡1 to an output string 𝑡2 iff there is a path from the start state to the final state such that:
    • 𝑡1 is the concatenation of all input symbols on the path
    • 𝑡2 is the concatenation of all output symbols on the path
  • The cost of this mapping is the (minimal) sum of arc weights (see the toy sketch below)
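To make the definition above concrete, here is a minimal toy sketch in plain Python (not the WFST toolkit used in the actual system): an FST is stored as a list of weighted arcs, and a small search checks whether it transduces 𝑡1 to 𝑡2 and at what minimal cost. All names are illustrative.

```python
from heapq import heappush, heappop

# Toy weighted FST: arcs are (src_state, input_sym, output_sym, weight, dst_state).
# An empty string "" would play the role of the empty symbol (epsilon).

class ToyFST:
    def __init__(self, start, final, arcs):
        self.start, self.final, self.arcs = start, final, arcs

    def min_cost(self, t1, t2):
        """Minimal sum of arc weights over all paths that read t1 and write t2;
        returns None if the FST does not transduce t1 to t2."""
        # Dijkstra over (state, prefix of t1 consumed, prefix of t2 produced).
        heap, seen = [(0.0, self.start, 0, 0)], set()
        while heap:
            cost, s, i, j = heappop(heap)
            if (s, i, j) in seen:
                continue
            seen.add((s, i, j))
            if s == self.final and i == len(t1) and j == len(t2):
                return cost
            for src, a, b, w, dst in self.arcs:
                if src == s and t1.startswith(a, i) and t2.startswith(b, j):
                    heappush(heap, (cost + w, dst, i + len(a), j + len(b)))
        return None

# The identity FST for "bcd" from the example slides: one arc per symbol.
bcd = ToyFST(start=0, final=3,
             arcs=[(0, "b", "b", 0.0, 1), (1, "c", "c", 0.0, 2), (2, "d", "d", 0.0, 3)])
print(bcd.min_cost("bcd", "bcd"))  # 0.0 -> "bcd" is mapped to itself
print(bcd.min_cost("bcd", "bcx"))  # None -> no accepting path
```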
SLIDE 3

Example FSTs

  • Maps 𝑡1 = 𝑏𝑐𝑑 to itself
SLIDE 4

Example FSTs

  • Maps 𝑡1 = 𝑏𝑐𝑑 to itself

(Diagram labels: start state, input symbol, output symbol, final state)

SLIDE 5

Example FSTs

  • Maps 𝑡1 = 𝑏𝑐𝑑 to itself

(Diagram labels: start state, input and output symbol, final state)

SLIDE 6

Example FSTs

  • Maps 𝑡1 = 𝑏𝑐𝑑 to itself
  • Maps any string consisting only of 𝑏 symbols to itself
SLIDE 7

Example FSTs

  • Represents an 𝑛-best list

𝜀: empty input/output symbol

SLIDE 8

Example FSTs

  • Represents an 𝑛-best list
  • After determinization, 𝜀-removal, minimization, and weight pushing

𝜀: empty input/output symbol

SLIDE 9

FST composition

  • Composition combines two FSTs 𝑈1 and 𝑈2 into a new FST 𝑈1 ∘ 𝑈2
  • If 𝑈1 maps 𝑡1 to 𝑡2 and 𝑈2 maps 𝑡2 to 𝑡3, then 𝑈1 ∘ 𝑈2 maps 𝑡1 to 𝑡3
  • The cost is the (minimum) sum of the path costs in 𝑈1 and 𝑈2 (sketched in code below)
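A toy illustration of this, again in plain Python rather than a real WFST library: composition pairs up states of 𝑈1 and 𝑈2 and keeps only arc pairs where the output symbol of 𝑈1 matches the input symbol of 𝑈2, adding the weights. Epsilon handling is omitted, so this is only a sketch of the idea, not a full composition algorithm.

```python
# Naive composition of two toy FSTs given as (start, final, arcs),
# with arcs (src, input_sym, output_sym, weight, dst). No epsilon handling.

def compose(u1, u2):
    start1, final1, arcs1 = u1
    start2, final2, arcs2 = u2
    arcs = []
    for s1, a, b, w1, d1 in arcs1:
        for s2, b2, c, w2, d2 in arcs2:
            if b == b2:  # output of U1 must match input of U2
                arcs.append(((s1, s2), a, c, w1 + w2, (d1, d2)))
    return ((start1, start2), (final1, final2), arcs)

# U1 maps "b" to "x" with weight 1; U2 maps "x" to "y" with weight 2.
u1 = (0, 1, [(0, "b", "x", 1.0, 1)])
u2 = (0, 1, [(0, "x", "y", 2.0, 1)])
print(compose(u1, u2))
# -> U1 ∘ U2 maps "b" to "y" with weight 3.0, as described above.
```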

SLIDE 10

FST composition examples

  • Composition and weights

(Diagrams: 𝑈1, 𝑈2, and 𝑈1 ∘ 𝑈2)

SLIDE 11

FST composition examples

  • Counting transducers

(Diagrams: 𝑈1, 𝑈2, and 𝑈1 ∘ 𝑈2)

SLIDE 12

FST composition examples

  • Language models

(Diagrams: 𝑈1, 𝑈2, and 𝑈1 ∘ 𝑈2)

SLIDE 13

FST composition examples

  • 1:1 corrections

(Diagrams: 𝑈1, 𝑈2, and 𝑈1 ∘ 𝑈2)

SLIDE 14

FST-based unsupervised grammatical error correction

(Diagram: cascade of 𝐽 (input), 𝐹 (edit), 𝑄 (penalization), 𝑀 (5-gram LM), …)

SLIDE 15

FST-based unsupervised grammatical error correction

Successive compositions: 𝐽 ∘ 𝐹, then 𝐽 ∘ 𝐹 ∘ 𝑄, then 𝐽 ∘ 𝐹 ∘ 𝑄 ∘ 𝑀: non-neural unsupervised GEC with 5-gram LM scores (sketched below)

  • 𝐽: Input
  • 𝐹: Edit
  • 𝑄: Penalization
  • 𝑀: 5-gram LM
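The snippet below sketches the shape of this cascade with the pynini Python bindings for OpenFst. This is an assumption made purely for illustration: pynini is not necessarily the toolkit the authors used, the edit transducer here is a toy with a single hand-coded correction, and the penalization transducer 𝑄 and the 5-gram LM 𝑀 are left out.

```python
import pynini  # assumed for illustration; any WFST toolkit would do

words = ["he ", "go ", "goes ", "to ", "school"]

# J: the input sentence as an acceptor.
J = pynini.accep("he go to school")

# F: a toy edit transducer. Every known word can be copied for free, and one
# hand-coded correction ("go" -> "goes") is allowed at a small cost. The real
# edit transducer in the paper is far richer.
keep = pynini.union(*[pynini.cross(w, w) for w in words])
edit = pynini.cross("go ", "goes ") + pynini.accep("", weight=0.5)
F = pynini.union(keep, edit).closure().optimize()

# Full system: lattice = J @ F @ Q @ M, with Q penalizing corrections and M a
# 5-gram LM acceptor. Both are omitted here, so the cheapest path is simply the
# unedited input; it is the LM score that would make a correction win.
lattice = J @ F
best = pynini.shortestpath(lattice).project("output").rmepsilon().string()
print(best)  # "he go to school" without M; with a good LM, "he goes to school"
```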

SLIDE 16

FST-based neural unsupervised GEC

  • Idea: use the constructed FSTs to constrain the output of a neural LM
  • Neural sequence models normally use subwords or characters rather than words
    • Build a transducer 𝑈 that maps full words to subwords (byte-pair encoding, BPE)
  • Constrain the neural LM with 𝐽 ∘ 𝐹 ∘ 𝑄 ∘ 𝑀 ∘ 𝑈
  • For constrained neural decoding we use our SGNMT decoder: http://ucam-smt.github.io/sgnmt/html/ (a toy illustration of constrained decoding follows below)

  • 𝐽: Input
  • 𝐹: Edit
  • 𝑄: Penalization
  • 𝑀: 5-gram LM
  • 𝑈: Tokenization (word → BPE)
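To show what "constraining" means operationally, here is a much-simplified toy in plain Python: at every step the decoder may only pick subword tokens that have an outgoing arc from the current state of the constraint FST, and the arc weight is combined with the (here faked) neural LM score. SGNMT realizes this idea with its predictor framework; the code below is not the SGNMT API, and all names in it are made up.

```python
# Toy deterministic constraint FST over subword tokens, as a dict:
# state -> {token: (next_state, arc_weight)}. In the real system this FST is
# J ∘ F ∘ Q ∘ M ∘ U; here it is hand-built and tiny.
constraint = {
    0: {"he": (1, 0.0)},
    1: {"go": (2, 0.0), "go@@": (3, 0.0)},
    3: {"es": (2, 0.5)},               # "go@@ es" = corrected "goes", small penalty
    2: {"to": (4, 0.0)},
    4: {"school": (5, 0.0)},
}
FINAL = 5

def toy_lm_logprob(history, token):
    # Stand-in for the neural LM; a real decoder would query the network here.
    preferred = {("he",): "go@@", ("he", "go@@"): "es"}
    return 0.0 if preferred.get(tuple(history)) == token else -1.0

def constrained_greedy_decode():
    state, history = 0, []
    while state != FINAL:
        allowed = constraint[state]            # only tokens licensed by the FST
        token = max(allowed,                   # combine LM score and arc weight
                    key=lambda t: toy_lm_logprob(history, t) - allowed[t][1])
        history.append(token)
        state = allowed[token][0]
    return history

print(constrained_greedy_decode())  # ['he', 'go@@', 'es', 'to', 'school']
```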

SLIDE 17

Results (unsupervised)

Systems are tuned with respect to the metric highlighted in grey.

SLIDE 18

FST-based neural supervised GEC

  • If annotated training data is available:
    • The input 𝐽 is a (Moses) SMT lattice rather than a single sentence
    • In addition to the <corr> token, we use an <mcorr> token to count the edits by the SMT system
    • We use an ensemble of a neural language model and a neural machine translation model

SLIDE 19

FST-based supervised grammatical error correction

𝐽 (input SMT lattice) → 𝐽 ∘ 𝐹 → 𝐽 ∘ 𝐹 ∘ 𝑄 ∘ 𝑀 ∘ 𝑈: constraint for the neural ensembles

  • 𝐽: Input (SMT lattice)
  • 𝐹: Edit
  • 𝑄: Penalization
  • 𝑀: 5-gram LM
  • 𝑈: Tokenization (word → BPE)

SLIDE 20

Results (supervised)

Systems are tuned with respect to the metric highlighted in grey.

SLIDE 21

Results (supervised)

SLIDE 22

Thanks

SLIDE 23

BACKUP

SLIDE 24