Capitalization Cues Improve Dependency Grammar Induction

Valentin I. Spitkovsky
with Daniel Jurafsky (Stanford University) and Hiyan Alshawi (Google Inc.)

Spitkovsky et al. (Stanford & Google) Capitalization WILS (2012-06-07) 1 / 10




Problem: Grammar Induction is Hard

Major challenges:
  non-convex objectives (Gimpel and Smith, 2012)
  poor correlations between likelihood and accuracy (Pereira and Schabes, 1992; Elworthy, 1994; Merialdo, 1994; Liang and Klein, 2008; Spitkovsky et al., 2009–2011)
  ◮ e.g., optimizers run away from supervised MLE solutions (to the tune of 20 points of accuracy)
  flaws in evaluation (Schwartz et al., 2011)

Partial solutions:
  train on more / better data (Mareček and Žabokrtský, 2012)
  test many data sets / languages (fight noise with CLT)
  employ less ad-hoc initializers ("eat your own dog food")
  constrain the search space (structure is underdetermined)


Idea: Use Capitalization as Parsing Cues

Partial bracketing constraints (Pereira and Schabes, 1992) have come from:
  semantic annotations (Naseem and Barzilay, 2011)
  punctuation marks (Ponvert et al., 2010)
  web markup (Spitkovsky et al., 2010)
... all defined over raw text (no POS tags).


Example (very WSJ; no punctuation or other cues):

[NP Jay Stevens] of [NP Dean Witter] actually cut his per-share earnings estimate to [NP $9] from [NP $9.50] for [NP 1989] and to [NP $9.50] from [NP $10.35] in [NP 1990] because he decided sales would be even weaker than he had expected.


Example (less WSJ-ish):

[NP Jurors] in [NP U.S. District Court] in [NP Miami] cleared [NP Harold Hershhenson], a former executive vice president; [NP John Pagones], a former vice president; and [NP Stephen Vadas] and [NP Dean Ciporkin], who had been engineers with [NP Cordis].
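Most of the bracketed spans in the two examples can be recovered from raw text by a very simple heuristic: take maximal runs of uppercase-initial tokens. The sketch below is an illustration of that idea, not necessarily the authors' exact extractor; in particular, it ignores the numeral/currency brackets ([NP $9], [NP 1989]) and the ambiguity of a sentence-initial capital.

```python
def capitalized_spans(tokens):
    """Group maximal runs of uppercase-initial tokens into candidate
    brackets, returned as half-open (start, end) index pairs.
    Deliberately simple: numerals and sentence-initial caps are not
    treated specially here, unlike in the slides' examples."""
    spans, start = [], None
    for i, tok in enumerate(tokens):
        if tok[:1].isupper():
            if start is None:
                start = i  # open a new capitalized run
        elif start is not None:
            spans.append((start, i))  # close the run before a lowercase token
            start = None
    if start is not None:
        spans.append((start, len(tokens)))  # run extends to sentence end
    return spans

toks = "Jurors in U.S. District Court in Miami cleared Harold Hershhenson".split()
for s, e in capitalized_spans(toks):
    print(" ".join(toks[s:e]))
# prints: Jurors / U.S. District Court / Miami / Harold Hershhenson
```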


Analysis (English PTB):

Mostly noun phrases (96%):
  Apple II
  World War I
  Mayor William H. Hudnut III
  International Business Machines Corp.
  Alexandria, Va
Some proper adjectives (5%);
first-person pronoun, I (2%).

— Yields more accurate dependency parsing constraints than either markup or punctuation (for WSJ).


Experiments (CoNLL 2006/7):

Data:
  ◮ 14 languages with case information
  ◮ not Spanish or Basque (because of post-processing)
  ◮ not Japanese, Chinese or Arabic...

Model:
  ◮ DBM-1 (Spitkovsky et al., 2012)
  ◮ first dependency-and-boundary model (see EMNLP)

Training:
  ◮ vanilla EM
  ◮ controls: uniform Viterbi init (Cohen and Smith, 2010)
  ◮ capitalization: constrained sampling of initial parse trees
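The slides do not spell out what "constrained" means for a dependency tree, but one standard reading of a partial bracket over a span (an assumption here, not the paper's stated criterion) is that the span must hang off the rest of the tree at a single point: exactly one word inside the span takes its head outside it. A minimal check:

```python
def respects_span(heads, span):
    """Return True if the dependency tree treats tokens [start, end) as a
    unit: exactly one token inside the span attaches outside it (or to the
    root), so the span has a single external attachment point.
    heads[i] is the index of token i's parent; -1 marks the root."""
    start, end = span
    external = [i for i in range(start, end)
                if not (start <= heads[i] < end)]
    return len(external) == 1

# Toy sentence "Jay Stevens cut estimates" with bracket [NP Jay Stevens]:
good = [1, 2, -1, 2]  # Jay->Stevens, Stevens->cut, cut=root, estimates->cut
bad  = [2, 2, -1, 2]  # both Jay and Stevens attach outside the bracket
print(respects_span(good, (0, 2)))  # True: only "Stevens" leaves the span
print(respects_span(bad, (0, 2)))   # False: two external attachments
```

Under this reading, "constrained sampling of initial parse trees" would keep only sampled trees for which every capitalization-derived span passes such a check.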


Results:

2+ point increase in accuracy (on average, 42.8 → 45)
  ◮ over a state-of-the-art baseline
  ◮ with various different constraints
  ◮ helps in training and during inference
  ◮ and also in combination with punctuation

but most of the gain is from just two languages...
  ◮ Italian (+11) and Greek (+18)
  ◮ worst impact on English (-0.02), so much for inspiration...
  ◮ still, virtually no harm — even in the worst case!


Conclusion:

informative signal, but requires further investigation
  ◮ very preliminary results...
  ◮ cues may be more useful as features!

miscellaneous observations:
  ◮ transitions between scripts
    ⋆ e.g., for Arabic, CJK, numerals, etc.
  ◮ interaction with punctuation / "operator" precedence
    ⋆ e.g., Alexandria, Va
      • vs. Kawasaki Heavy Industries Ltd., Mitsubishi Heavy Industries Ltd. and ...
  ◮ properties of first (and last) words
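The "transitions between scripts" observation generalizes the capitalization cue: in text mixing writing systems, or letters and numerals, a change of script is itself a candidate boundary. A toy sketch with made-up coarse buckets (a real system would use Unicode's Script property, and these bucket names are this sketch's invention, not the talk's):

```python
def token_script(tok):
    """Assign a coarse, illustrative 'script' label to a token."""
    if all(c.isdigit() or c in ".,%" for c in tok):
        return "numeral"          # digits plus common number punctuation
    if all(ord(c) < 128 for c in tok):
        return "latin"            # plain ASCII word
    return "other"                # anything with non-ASCII characters

def script_boundaries(tokens):
    """Indices between adjacent tokens whose coarse scripts differ --
    candidate phrase boundaries, analogous to case transitions."""
    return [i + 1 for i in range(len(tokens) - 1)
            if token_script(tokens[i]) != token_script(tokens[i + 1])]

toks = "sales fell 9.5 % to 1,230 units in 1990".split()
print(script_boundaries(toks))  # [2, 4, 5, 6, 8]
```

Every boundary found here separates a numeral block (9.5 %, 1,230, 1990) from surrounding words, exactly the kind of transition the slide suggests exploiting for Arabic, CJK, and numerals.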

Thanks! Questions?