Representing Huge Translation Models Statistical Machine - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Representing Huge Translation Models

slide-2
SLIDE 2

Statistical Machine Translation

parallel text + alignment

slide-3
SLIDE 3

Statistical Machine Translation

parallel text + alignment -> extract rules

slide-4
SLIDE 4

Statistical Machine Translation

parallel text + alignment -> extract rules -> score rules

slide-5
SLIDE 5

Statistical Machine Translation

parallel text + alignment -> extract rules -> score rules -> load rules into memory -> decoder

Input sentence: 联合国 安全 理事会 的 五个 常任 理事 国都

slide-6
SLIDE 6

Statistical Machine Translation

parallel text + alignment -> extract rules -> score rules -> load rules into memory -> decoder

Input sentence: 联合国 安全 理事会 的 五个 常任 理事 国都

number of rules depends on corpus size...

slide-7
SLIDE 7

Statistical Machine Translation

parallel text + alignment -> extract rules -> score rules -> load rules into memory -> decoder

Input sentence: 联合国 安全 理事会 的 五个 常任 理事 国都

... and model complexity

slide-8
SLIDE 8

Statistical Machine Translation

parallel text + alignment -> extract rules -> score rules -> filter rules for test set -> load filtered rules into memory -> decoding algorithm

Input sentence: 联合国 安全 理事会 的 五个 常任 理事 国都

slide-9
SLIDE 9

Baseline Translation Model

  • Hierarchical Phrase-based translation (Chiang 2007)
  • 1M parallel sentences (27M words)
  • GIZA++ alignments (Och & Ney 2003, Koehn et al. 2003)
  • alignments are dense
  • Heuristics used to restrict number of extracted rules
  • 67M rules, 6.1Gb of data
  • cf. 225M (Zens & Ney 2007), 55M (DeNeefe et al. 2007)
slide-10
SLIDE 10

Some Possible Improvements

  • 3.5M sentences (2.5M out-of-domain), 100M words
  • Discriminatively trained alignments (Ayan & Dorr 2006)
  • Key difference: alignments are sparse
  • Loose phrase extraction (Ayan & Dorr 2006)
slide-15
SLIDE 15

Some Possible Improvements

  • Rule extraction time: 77 CPU days
  • does not include sorting or scoring!
  • Rules counted: 20 billion
  • 2 orders of magnitude larger than state of the art
  • Estimated unique rules: 6.6 billion
  • Estimated extract file size: 917Gb
  • Estimated phrase table size: 600Gb
slide-16
SLIDE 16

The Problem

  • Current models are bounded by resource limitations.
  • We’re already pushing the edge of what’s possible.
  • Parallel data aren’t getting any smaller.
  • Models aren’t getting any less complex.
slide-17
SLIDE 17

The Solution

  • Translation by pattern matching.
  • Novel pattern matching algorithms.
  • Exploit ideas developed in bioinformatics, IR
  • Support for tera-scale translation models.
slide-18
SLIDE 18

Idea: Translation by Pattern Matching

(Callison-Burch et al. 05, Zhang & Vogel 05)

Input sentence: 联合国 安全 理事会 的 五个 常任 理事 国都

parallel text + alignment in memory -> pattern matching algorithm -> extract and score -> sentence-specific rules -> decoding algorithm

slide-19
SLIDE 19

it persuades him and it disheartens him

Exact Pattern Matching

Input Pattern

slide-20
SLIDE 20

it persuades him and it disheartens him

Exact Pattern Matching

Input Pattern = Query Pattern

slide-21
SLIDE 21

it persuades him and it disheartens him

Pattern Matching for Phrase-Based MT

Input Pattern

slide-22
SLIDE 22

Pattern Matching for Phrase-Based MT

Input Pattern: it persuades him and it disheartens him

Query Patterns:

it | persuades | him | and | disheartens
it persuades | persuades him | him and | and it | it disheartens | disheartens him
it persuades him | persuades him and | him and it | and it disheartens | it disheartens him
it persuades him and | persuades him and it | him and it disheartens | and it disheartens him
it persuades him and it | persuades him and it disheartens | him and it disheartens him
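The query patterns for phrase-based MT are the distinct contiguous word sequences of the input. A minimal sketch of that enumeration follows; the function name and the length limit of 5 are assumptions for illustration (the listing on the slide stops at five words).

```python
def contiguous_query_patterns(tokens, max_len=5):
    """Return the distinct contiguous word sequences of `tokens`,
    shortest first, up to `max_len` words (an assumed limit)."""
    patterns = []
    seen = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            pattern = tuple(tokens[i:i + n])
            if pattern not in seen:  # "it" and "him" occur twice in the input
                seen.add(pattern)
                patterns.append(pattern)
    return patterns


if __name__ == "__main__":
    sentence = "it persuades him and it disheartens him".split()
    for pattern in contiguous_query_patterns(sentence):
        print(" ".join(pattern))
```

Each resulting pattern is then looked up independently in the training text.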

slide-23
SLIDE 23

Suffix Arrays

it makes him and it mars him , it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Text T

slide-24
SLIDE 24

Suffix Arrays

it makes him and it mars him , it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Suffix 4 of Text T:

it mars him , it sets him on and it takes him off . #

slide-25
SLIDE 25

Suffix Arrays

it makes him and it mars him , it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Suffixes of Text T in text order:

0   it makes him and it mars him , it sets him on and it takes him ...
1   makes him and it mars him , it sets him on and it takes him off . #
2   him and it mars him , it sets him on and it takes him off . #
3   and it mars him , it sets him on and it takes him off . #
4   it mars him , it sets him on and it takes him off . #
5   mars him , it sets him on and it takes him off . #
6   him , it sets him on and it takes him off . #
7   , it sets him on and it takes him off . #
...

slide-26
SLIDE 26

Suffix Arrays

it makes him and it mars him , it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Suffixes of Text T in sorted order:

3   and it mars him , it sets him on and it takes him off . #
12  and it takes him off . #
2   him and it mars him , it sets him on and it takes him off . #
15  him off . #
10  him on and it takes him off . #
6   him , it sets him on and it takes him off . #
0   it makes him and it mars him , it sets him on and it takes him ...
4   it mars him , it sets him on and it takes him off . #
...

slide-28
SLIDE 28

Suffix Arrays

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Suffix Array SA of Text T:

3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18
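A suffix array lists the start positions of all suffixes of the text in sorted order. The sketch below builds one naively for the example text. The token ordering (words before punctuation, end marker `#` last) is an assumption chosen to reproduce the order shown on the slides, and the function names are illustrative only.

```python
TEXT = "it makes him and it mars him , it sets him on and it takes him off . #".split()


def token_key(token):
    """Sort key: words first, then punctuation, then the end marker '#'.
    This ordering is an assumption chosen to match the slides."""
    if token == "#":
        return (2, token)
    return (0, token) if token.isalpha() else (1, token)


def build_suffix_array(tokens):
    """Return suffix start positions sorted by the suffix they begin.
    Naive construction by direct comparison; adequate for a toy text."""
    keys = [token_key(t) for t in tokens]
    return sorted(range(len(tokens)), key=lambda i: keys[i:])


if __name__ == "__main__":
    print(build_suffix_array(TEXT))
```

On the example text this yields 3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18. A real system would use a dedicated construction algorithm rather than sorting whole suffixes.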

slide-29
SLIDE 29

Suffix Arrays

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Query Pattern w: him and it

Suffix Array SA of Text T:

3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18

slide-33
SLIDE 33

Suffix Arrays

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Query Pattern w: him and it

Suffix Array SA of Text T:

3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18

Lookup complexity: O(|w| log |T|)

slide-34
SLIDE 34

Suffix Arrays

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Query Pattern w: him and it

Suffix Array SA of Text T:

3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18

Lookup complexity:

  • O(|w| log |T|)
  • O(|w| + log |T|) (Manber & Myers, 93)

slide-35
SLIDE 35

Suffix Arrays

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Query Pattern w: him and it

Suffix Array SA of Text T:

3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18

Lookup complexity:

  • O(|w| log |T|)
  • O(|w| + log |T|) (Manber & Myers, 93)
  • O(|w|) (Abouelhoda et al., 04)

slide-36
SLIDE 36

Suffix Arrays

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Query Pattern w: him and it

Suffix Array SA of Text T:

3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18

Lookup complexity:

  • O(|w| log |T|)
  • O(|w| + log |T|) (Manber & Myers, 93)
  • O(|w|) (Abouelhoda et al., 04)
  • Baseline model: 0.009 seconds/sentence (not including extraction/scoring)
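The simplest of the lookups above, the O(|w| log |T|) one, is a pair of binary searches over the suffix array: all suffixes that begin with w form one contiguous block. The following is a minimal sketch under stated assumptions; the helper names and the token ordering (words, then punctuation, then `#`) are illustrative and not taken from the slides.

```python
TEXT = "it makes him and it mars him , it sets him on and it takes him off . #".split()


def token_key(token):
    """Assumed ordering: words, then punctuation, then the end marker '#'."""
    if token == "#":
        return (2, token)
    return (0, token) if token.isalpha() else (1, token)


def build_suffix_array(tokens):
    """Naive suffix array construction by sorting whole suffixes."""
    keys = [token_key(t) for t in tokens]
    return sorted(range(len(tokens)), key=lambda i: keys[i:])


def find_occurrences(tokens, sa, pattern):
    """Return the sorted text positions where `pattern` occurs."""
    keys = [token_key(t) for t in tokens]
    target = [token_key(t) for t in pattern]
    m = len(target)

    def bound(strict):
        # Binary search over SA; each step compares at most m tokens.
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = keys[sa[mid]:sa[mid] + m]
            if prefix < target or (not strict and prefix == target):
                lo = mid + 1
            else:
                hi = mid
        return lo

    first = bound(strict=True)   # first suffix whose prefix is >= pattern
    last = bound(strict=False)   # first suffix whose prefix is > pattern
    return sorted(sa[first:last])


if __name__ == "__main__":
    sa = build_suffix_array(TEXT)
    print(find_occurrences(TEXT, sa, "him and it".split()))
```

For the query "him and it" the block contains a single suffix, the one starting at position 2.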

slide-37
SLIDE 37

Problem: Phrases with Gaps

  • Hierarchical phrase-based translation (Chiang 2005, 2007)
  • Quirk et al. 2005, Simard et al. 2005, DeNeefe et al. 2007

Input: it persuades him and it disheartens him

Source Phrase: it X him

slide-38
SLIDE 38

Hierarchical Phrases: Phrases with Gaps

  • Hierarchical phrase-based translation (Chiang 2005, 2007)
  • Quirk et al. 2005, Simard et al. 2005, DeNeefe et al. 2007

Input: it persuades him and it disheartens him

Source Phrase: it X him

slide-41
SLIDE 41

Hierarchical Phrases: Phrases with Gaps

  • Hierarchical phrase-based translation (Chiang 2005, 2007)
  • Quirk et al. 2005, Simard et al. 2005, DeNeefe et al. 2007

Input: it persuades him and it disheartens him

Source Phrase: it X and X him

slide-42
SLIDE 42

Problem Statement

Given an input sentence, efficiently find all hierarchical phrase-based translation rules for that sentence in the training corpus.

slide-43
SLIDE 43

it persuades him and it disheartens him

Pattern Matching for Hierarchical PBMT

Input Pattern

slide-44
SLIDE 44

Pattern Matching for Hierarchical PBMT

Input Pattern: it persuades him and it disheartens him

Query Patterns:

it | persuades | him | and | disheartens
it persuades | persuades him | him and | and it | it disheartens | disheartens him
it persuades him | persuades him and | him and it | and it disheartens | it disheartens him
it persuades him and | persuades him and it | him and it disheartens | and it disheartens him
it persuades him and it | persuades him and it disheartens | him and it disheartens him

slide-45
SLIDE 45

Pattern Matching for Hierarchical PBMT

Input Pattern: it persuades him and it disheartens him

Query Patterns:

it X and | it X it | it X disheartens | it X him
persuades X it | persuades X disheartens | persuades X him
it persuades X it | it persuades X disheartens | it persuades X him
it X and it | it X it disheartens | it X disheartens him | it X and X him
persuades him X disheartens | persuades him X him | persuades X it disheartens | persuades X disheartens him
him and X him | him X disheartens him
it persuades him X disheartens | it persuades him X him | it persuades X it disheartens | it persuades X disheartens him

slide-46
SLIDE 46

Pattern Matching for Hierarchical PBMT

Input Pattern: it persuades him and it disheartens him

Query Patterns:

it X and it disheartens | it X it disheartens him
persuades him and X him | persuades him X disheartens him | persuades X it disheartens him
it persuades him and X him | it persuades him X disheartens him | it persuades X it disheartens him
it X and it disheartens him

slide-47
SLIDE 47

Pattern Matching for Hierarchical PBMT

Input Pattern: it persuades him and it disheartens him

Query Patterns:

it X and it disheartens | it X it disheartens him
persuades him and X him | persuades him X disheartens him | persuades X it disheartens him
it persuades him and X him | it persuades him X disheartens him | it persuades X it disheartens him
it X and it disheartens him

This is a variant of approximate pattern matching (Navarro '01)
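To make the growth in query patterns concrete, here is a hedged sketch that enumerates patterns with gaps from an input sentence. The limits used (at most two gaps, at most five symbols per pattern, each gap covering at least one word, patterns beginning and ending with a word) are assumptions for illustration and are not stated on the slides; the function name is hypothetical.

```python
def gapped_query_patterns(tokens, max_gaps=2, max_symbols=5):
    """Enumerate patterns (words plus gap symbol X) that match `tokens`.
    Includes contiguous patterns (zero gaps) as well as gapped ones."""
    n = len(tokens)
    found = set()

    def extend(symbols, pos, gaps):
        # `symbols` always ends with the word at index pos - 1.
        found.add(" ".join(symbols))
        if len(symbols) >= max_symbols:
            return
        if pos < n:
            # Option 1: extend the current chunk with the adjacent word.
            extend(symbols + [tokens[pos]], pos + 1, gaps)
        if gaps < max_gaps and len(symbols) + 2 <= max_symbols:
            # Option 2: open a gap over tokens pos..k-1, resume at word k.
            for k in range(pos + 1, n):
                extend(symbols + ["X", tokens[k]], k + 1, gaps + 1)

    for start in range(n):
        extend([tokens[start]], start + 1, 0)
    return found


if __name__ == "__main__":
    sentence = "it persuades him and it disheartens him".split()
    patterns = gapped_query_patterns(sentence)
    print(len(patterns), "patterns, for example:")
    for p in sorted(patterns)[:10]:
        print("  ", p)
```

Even for this seven-word input, the gapped patterns far outnumber the contiguous ones, which is why the lookup cost per sentence matters so much.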

slide-48
SLIDE 48

Pattern Matching with Gaps

Query pattern α: him X it

Suffix array (position, suffix):

3   and it mars him , it sets him ...
12  and it takes him off . #
2   him and it mars him . it sets ...
15  him off . #
10  him on and it takes him off . #
6   him , it sets him on and it ...
0   it makes him and it mars ...
4   it mars him , it sets him on ...
8   it sets him on and it takes ...
13  it takes him off . #
1   makes him and it mars him ...
...

slide-51
SLIDE 51

Pattern Matching with Gaps

Query pattern α: him X it

Subpatterns wi: him, it

Suffix array (position, suffix):

3   and it mars him , it sets him ...
12  and it takes him off . #
2   him and it mars him . it sets ...
15  him off . #
10  him on and it takes him off . #
6   him , it sets him on and it ...
0   it makes him and it mars ...
4   it mars him , it sets him on ...
8   it sets him on and it takes ...
13  it takes him off . #
1   makes him and it mars him ...
...

slide-53
SLIDE 53

Pattern Matching with Gaps

Query pattern α: him X it

Subpatterns wi: him, it

ni: number of occurrences of subpattern wi

Suffix array (position, suffix):

3   and it mars him , it sets him ...
12  and it takes him off . #
2   him and it mars him . it sets ...
15  him off . #
10  him on and it takes him off . #
6   him , it sets him on and it ...
0   it makes him and it mars ...
4   it mars him , it sets him on ...
8   it sets him on and it takes ...
13  it takes him off . #
1   makes him and it mars him ...
...

slide-54
SLIDE 54

Pattern Matching with Gaps

Suffix array (position, suffix):

3   and it mars him , it sets him ...
12  and it takes him off . #
2   him and it mars him . it sets ...
15  him off . #
10  him on and it takes him off . #
6   him , it sets him on and it ...
0   it makes him and it mars ...
4   it mars him , it sets him on ...
8   it sets him on and it takes ...
13  it takes him off . #
1   makes him and it mars him ...
...

Occurrences of him: 2 15 10 6

Occurrences of it: 0 4 8 13

slide-55
SLIDE 55

Pattern Matching with Gaps

Occurrences of him: 2 15 10 6

Occurrences of it: 0 4 8 13

slide-56
SLIDE 56

Pattern Matching with Gaps

Occurrences of him: 2 15 10 6

Occurrences of it: 0 4 8 13

Matches of him X it: (2, 4) (2, 8) (2, 13) (6, 8) (6, 13) (10, 13)

slide-57
SLIDE 57

Pattern Matching with Gaps

Occurrences of him: 2 15 10 6

Occurrences of it: 0 4 8 13

RILMS (Rahman et al., 06)

Matches of him X it: (2, 4) (2, 8) (2, 13) (6, 8) (6, 13) (10, 13)

slide-58
SLIDE 58

Pattern Matching with Gaps

Occurrences of him: 2 15 10 6

Occurrences of it: 0 4 8 13

RILMS (Rahman et al., 06): $O(\sum_i n_i)$, linear in number of occurrences of subpatterns

Matches of him X it: (2, 4) (2, 8) (2, 13) (6, 8) (6, 13) (10, 13)
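The pairs listed on this slide can be reproduced by combining the occurrence lists of the two subpatterns. The sketch below is a simple stand-in written for illustration; it is not the RILMS algorithm of Rahman et al., and the function name and `min_gap` parameter are assumptions.

```python
from bisect import bisect_left


def pair_occurrences(left, left_len, right, min_gap=1):
    """Pair each occurrence of the left subpattern with every later
    occurrence of the right subpattern, leaving at least `min_gap`
    words for the gap X in between."""
    left = sorted(left)
    right = sorted(right)
    pairs = []
    for p in left:
        # First right-hand occurrence that starts after the gap.
        start = bisect_left(right, p + left_len + min_gap)
        pairs.extend((p, q) for q in right[start:])
    return pairs


if __name__ == "__main__":
    him = [2, 15, 10, 6]   # occurrences of "him", in suffix array order
    it = [0, 4, 8, 13]     # occurrences of "it"
    print(pair_occurrences(him, 1, it))
```

With the occurrence lists of him and it from the example text, this prints (2, 4) (2, 8) (2, 13) (6, 8) (6, 13) (10, 13), the same six matches of him X it.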

slide-59
SLIDE 59

Baseline Timing Result

221 seconds per sentence

compare: 0.009 seconds per sentence for contiguous phrases

slide-60
SLIDE 60

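Complexity Analysis

contiguous: $\sum_{w} (|w| + \log |T|)$

(values shown with the terms: 137 5 27)

discontiguous: $\sum_{\alpha = w_1 X \ldots X w_I} \sum_{i=1}^{I} (|w_i| + \log |T| + n_i)$

(values shown with the terms: 2825 3 5 27 82069)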

slide-62
SLIDE 62

Exploiting Redundancy

Input Pattern: it persuades him and it disheartens him

Query Patterns:

it X and | it X it | it X disheartens | it X him
persuades X it | persuades X disheartens | persuades X him
it persuades X it | it persuades X disheartens | it persuades X him
it X and it | it X it disheartens | it X disheartens him | it X and X him
persuades him X disheartens | persuades him X him | persuades X it disheartens | persuades X disheartens him
him and X him | him X disheartens him
it persuades him X disheartens | it persuades him X him | it persuades X it disheartens | it persuades X disheartens him

slide-64
SLIDE 64

Exploiting Redundancy

Query Pattern: it persuades X disheartens him

slide-65
SLIDE 65

Exploiting Redundancy

Query Pattern: it persuades X disheartens him

Maximal Prefix: it persuades X disheartens (Zhang & Vogel 2005)

slide-66
SLIDE 66

Exploiting Redundancy

Query Pattern: it persuades X disheartens him

Maximal Prefix: it persuades X disheartens

Maximal Suffix: persuades X disheartens him
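The redundancy being exploited is that a pattern can only occur in the text if both its maximal prefix and its maximal suffix occur. A minimal sketch of that check follows; the function names are hypothetical, and dropping a gap left dangling at either end is an assumption made here so that every pattern begins and ends with a word.

```python
GAP = "X"


def maximal_prefix(symbols):
    """Drop the last word; also drop a gap left dangling at the end
    (an assumption for this sketch)."""
    out = list(symbols[:-1])
    while out and out[-1] == GAP:
        out.pop()
    return out


def maximal_suffix(symbols):
    """Drop the first word; also drop a gap left dangling at the start."""
    out = list(symbols[1:])
    while out and out[0] == GAP:
        out.pop(0)
    return out


def may_occur(symbols, known_absent):
    """A pattern is worth looking up only if neither its maximal prefix
    nor its maximal suffix is already known to be absent from the text."""
    prefix = tuple(maximal_prefix(symbols))
    suffix = tuple(maximal_suffix(symbols))
    return prefix not in known_absent and suffix not in known_absent


if __name__ == "__main__":
    query = "it persuades X disheartens him".split()
    print(" ".join(maximal_prefix(query)))
    print(" ".join(maximal_suffix(query)))
```

Caching lookup results for shorter patterns therefore lets many longer patterns be skipped without touching the text.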

slide-67
SLIDE 67

Prefix Tree with Suffix Links

[Figure: prefix tree with suffix links; node labels include it, persuades, him, X]

slide-68
SLIDE 68

Timing Results (seconds/sentence)

  • Baseline: 221

slide-69
SLIDE 69

Timing Results (seconds/sentence)

  • Baseline: 221
  • Prefix Tree: 177

slide-70
SLIDE 70

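Complexity Analysis

contiguous: $\sum_{w} (|w| + \log |T|)$

(values shown with the terms: 137 5 27)

discontiguous: $\sum_{\alpha = w_1 X \ldots X w_I} \sum_{i=1}^{I} (|w_i| + \log |T| + n_i)$

(values shown with the terms: 2825 3 5 27 82069)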

slide-72
SLIDE 72

Empirical Analysis

[Plot: cumulative time (s) against computations (ranked by time)]

slide-73
SLIDE 73

Distribution of Patterns in Training Data

[Plot: frequency against pattern types (in descending order of frequency)]

slide-75
SLIDE 75

Analysis of Problem

  • The expensive computations involve at least one frequent subpattern. There are two cases.
  • A frequent pattern paired with an infrequent pattern
  • Two frequent patterns paired with each other
slide-76
SLIDE 76

Frequent × Infrequent Subpatterns

slide-80
SLIDE 80

Double Binary Search

Baeza-Yates, 04

slide-81
SLIDE 81

Double Binary Search

Baeza-Yates, 04 Queryset Q Dataset D

slide-87
SLIDE 87

Double Binary Search

Baeza-Yates, 04

Queryset Q, Dataset D

Upper bound complexity: |Q| log |D|
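The idea behind double binary search can be sketched for plain intersection of two sorted lists: binary-search the median of the smaller set Q in the larger set D, then recurse on the two halves on either side. This is a simplified illustration under that assumption (exact-match intersection); the function name is hypothetical, and the deck applies the idea to pairing subpattern occurrences rather than to exact matches.

```python
from bisect import bisect_left


def double_binary_intersect(q, d):
    """Intersect sorted list `q` (small) with sorted list `d` (large).
    Each element of q costs at most one binary search in a slice of d."""
    out = []

    def recurse(q_lo, q_hi, d_lo, d_hi):
        if q_lo >= q_hi or d_lo >= d_hi:
            return
        mid = (q_lo + q_hi) // 2
        x = q[mid]
        pos = bisect_left(d, x, d_lo, d_hi)
        # Elements of q below the median can only match d[d_lo:pos].
        recurse(q_lo, mid, d_lo, pos)
        hit = pos < d_hi and d[pos] == x
        if hit:
            out.append(x)
        # Elements of q above the median can only match the rest of d.
        recurse(mid + 1, q_hi, pos + (1 if hit else 0), d_hi)

    recurse(0, len(q), 0, len(d))
    return out


if __name__ == "__main__":
    print(double_binary_intersect([2, 6, 10, 15], [0, 2, 4, 8, 10, 13]))
```

Both inputs must already be sorted, which is what the next slides address.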

slide-88
SLIDE 88

Obtaining Sorted Sets

slide-89
SLIDE 89

Obtaining Sorted Sets

Sort via Stratified Tree (van Emde Boas et al. 1977)

slide-90
SLIDE 90

Obtaining Sorted Sets

Sort via Stratified Tree (van Emde Boas et al. 1977)

Problem: complexity increases to O(|Q| log |D| + (|Q| + |D|) log log |T|)

slide-91
SLIDE 91

Obtaining Sorted Sets

Sort via Stratified Tree (van Emde Boas et al. 1977)

Problem: complexity increases to O(|Q| log |D| + (|Q| + |D|) log log |T|)

Solution: cache sorted set in prefix tree

slide-92
SLIDE 92

Timing Results (seconds/sentence)

  • Baseline: 221
  • Prefix Tree: 177
  • + double binary

slide-93
SLIDE 93

Timing Results (seconds/sentence)

  • Baseline: 221
  • Prefix Tree: 177
  • + double binary: 174

slide-94
SLIDE 94

Obtaining Sorted Sets

Sort via Stratified Tree

Problem: sort complexity is still very high for very frequent patterns

slide-95
SLIDE 95

Obtaining Sorted Sets

Solution: precompute the inverted index for 1000 most frequent contiguous patterns
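One way to picture this precomputation is an inverted index that maps each of the most frequent contiguous patterns to its sorted list of text positions. The following is a minimal sketch; the function name, the `max_len` limit of two words, and the toy `top_k` in the usage example are assumptions for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(tokens, top_k=1000, max_len=2):
    """Map the `top_k` most frequent contiguous patterns (up to `max_len`
    words, an assumed limit) to their sorted occurrence positions."""
    positions = defaultdict(list)
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            # Positions are appended in increasing order, so lists stay sorted.
            positions[tuple(tokens[i:i + n])].append(i)
    counts = Counter({pattern: len(occ) for pattern, occ in positions.items()})
    return {pattern: positions[pattern] for pattern, _ in counts.most_common(top_k)}


if __name__ == "__main__":
    text = "it makes him and it mars him , it sets him on and it takes him off . #".split()
    for pattern, occ in build_inverted_index(text, top_k=2).items():
        print(" ".join(pattern), occ)
```

Because the lists are stored in text order, frequent patterns no longer need to be sorted at query time.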

slide-97
SLIDE 97

Timing Results (seconds/sentence)

  • Baseline: 221
  • Prefix Tree: 177
  • + double binary: 174
  • + inverted indices: 44

slide-98
SLIDE 98

Frequent × Frequent Subpatterns

slide-99
SLIDE 99

Frequent × Frequent Subpatterns

Problem: There is no clever algorithm to solve this problem

slide-100
SLIDE 100

Solution: Precomputation

Text:

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

slide-101
SLIDE 101

Solution: Precomputation

Text:

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Most Frequent Patterns: it (4), him (4)

Precomputed Pattern Matches: it X him, him X it, it X it, him X him

slide-102
SLIDE 102

Solution: Precomputation

Text:

it makes him and it mars him . it sets him on and it takes him off . #

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Most Frequent Patterns: it (4), him (4)

Precomputed Pattern Matches: it X him, him X it, it X it, him X him

(0, 2) (0, 4) (0, 6) (0, 8) (2, 4) (2, 6) (2, 8) (2, 10) (4, 6) (4, 8) (4, 10) (4, 13) (6, 8) (6, 10) (6, 13) (6, 15) (8, 10) (8, 13) (8, 15) (10, 13) (10, 15) (13, 15)
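The precomputed matches can be reproduced by one pass over the positions of the frequent words, pairing each with the frequent words that follow it inside a bounded window. The sketch below is illustrative only: the function name is hypothetical, and the maximum span of 10 words is an assumption inferred from the pairs listed on the slide, not a value the slide states.

```python
from collections import defaultdict


def precompute_collocations(tokens, frequent, max_span=10):
    """For every ordered pair of frequent words (a, b), collect the
    position pairs (p, q) matching 'a X b' within `max_span` words."""
    occurrences = [(i, t) for i, t in enumerate(tokens) if t in frequent]
    table = defaultdict(list)
    for index, (p, left) in enumerate(occurrences):
        for q, right in occurrences[index + 1:]:
            if q - p >= max_span:   # the span p..q would exceed max_span words
                break
            if q - p >= 2:          # the gap X must cover at least one word
                table[(left, right)].append((p, q))
    return table


if __name__ == "__main__":
    text = "it makes him and it mars him . it sets him on and it takes him off . #".split()
    table = precompute_collocations(text, {"it", "him"})
    for (left, right), pairs in table.items():
        print(left, "X", right, pairs)
```

Grouped by pattern, the 22 pairs on the slide fall under it X him, him X it, it X it and him X him.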

slide-104
SLIDE 104

Timing Results (seconds/sentence)

  • Baseline: 221
  • Prefix Tree: 177
  • + double binary: 174
  • + inverted indices: 44
  • + precomp: 1

slide-105
SLIDE 105

Analysis of Fixed Memory Usage

  • Source Text: |T|
  • Suffix Array: |T|
  • Alignments: |T|
  • Target Text: |T|
  • Total Cost: 4 |T|
  • For 27M words: about 700M
  • including indices for 1000 words: about 2.1 Gb
  • for 100 words: 1.1Gb, increases time to 1.6 secs/sent
slide-106
SLIDE 106

Longer Spans, Longer Phrases

[Two plots: BLEU (axis 15 to 35) against Maximum Span Length (1 to 15), and BLEU (axis 15 to 35) against Maximum Phrase Length (1 to 10)]

slide-107
SLIDE 107

The Tera-Scale Translation Model

  • Task: NIST Chinese-English 2005
  • Baseline Model: 30.7
  • Tera-Scale Model: 32.6
  • All modifications contribute to overall score
  • With better language model and number translation:
  • Baseline Model: 31.9
  • Tera-Scale Model: 34.5
slide-108
SLIDE 108

Open Questions

  • Can we improve speed?
  • Can we improve memory use? Compressed self-indexes?
  • Uses for arbitrarily large translation models?
  • Context-sensitive models (Chan et al. 2007, Carpuat & Wu 2007)

  • Factored models (Koehn et al. 2007)
  • Syntax-based model (DeNeefe et al. 2007)
  • What other algorithms can we use from bioinformatics?
slide-109
SLIDE 109

Thanks

Acknowledgements: David Chiang, Chris Dyer, Philip Resnik