
Compositional Morphology for Word Representations and Language Modelling. Jan Botha, Phil Blunsom. ICML 2014, Beijing. Sections: Motivation, Proposed Method, Experiments.


  1. Compositional Morphology for Word Representations and Language Modelling. Jan Botha, Phil Blunsom. ICML 2014, Beijing.

  6. Motivating example
     What we see: "The king finally abdicated after years of unkingly conduct."
     Wait, what? Unkingly?
     unkingly /ʌnˈkɪŋli/: a word you have probably never seen, but still understand
     ⇒ compositional morphology in action
     What our models see (mostly): 10 2 95 529 11 88 21 50 74 239

  9. Motivating example 2
     Other languages display still more variation.
     Czech, conjugation: čistit (to clean), čistím, čistíš, čistí, čistíme, čistíte, čistil, čištěn, čisti, čistěte, čistěme
     Turkish, productive derivation:
       Avrupa (Europe)
       Avrupalı (of Europe)
       Avrupalılaş (become of Europe)
       Avrupalılaştır (to Europeanise)
       Avrupalılaştırama (be unable to Europeanise)
       Avrupalılaştıramadık (we were unable to Europeanise)
       ...
     ⇒ we should model morphemes!

  11. Representing words
      ◮ Discrete set? {a, aardvark, ..., account, accounted, accounting, ...}
      ◮ Vector space? [Figure: 2-D plot with axes x1 and x2, showing points for a, aardvark, account, accounted]

  12. Extract from Collobert & Weston embeddings [figure]

  17. Morpheme vectors
      Existing word vectors already capture some morphology.
      ◮ $\overrightarrow{\text{banks}} - \overrightarrow{\text{bank}} \approx \overrightarrow{\text{kings}} - \overrightarrow{\text{king}} \approx \overrightarrow{\text{queens}} - \overrightarrow{\text{queen}}$ (Mikolov et al. 2013)
      Logical extension:
      ◮ $\overrightarrow{\text{kings}} \approx \overrightarrow{\text{king}} + \overrightarrow{\text{-s}}$
      ◮ $\overrightarrow{\text{unkingly}} \approx \overrightarrow{\text{un-}} + \overrightarrow{\text{king}} + \overrightarrow{\text{-ly}}$
      How to ...
      ◮ obtain morpheme vectors
      ◮ compose morpheme vectors
      ◮ do it all within a language model usable in an MT decoder

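  21. Morphological composition as addition
      Literally, word = sum of its parts?
      Problems:
      ◮ bag of morphemes: $\overrightarrow{\text{hang}} + \overrightarrow{\text{over}} \neq \overrightarrow{\text{over}} + \overrightarrow{\text{hang}}$ (hangover vs. overhang; plain addition ignores morpheme order)
      ◮ non-compositionality: $\overrightarrow{\text{greenhouse}} \neq \overrightarrow{\text{green}} + \overrightarrow{\text{house}}$
      Pragmatic solution: include word identity as a component too:
      $\overrightarrow{\text{greenhouse}} \equiv \overrightarrow{\text{greenhouse}}_{\text{id}} + \overrightarrow{\text{green}}_{\text{stem}} + \overrightarrow{\text{house}}_{\text{stem}}$
      $\overrightarrow{\text{unkingly}} \equiv \overrightarrow{\text{unkingly}}_{\text{id}} + \overrightarrow{\text{un}}_{\text{pre}} + \overrightarrow{\text{king}}_{\text{stem}} + \overrightarrow{\text{ly}}_{\text{suf}}$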
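
A minimal sketch of the additive composition on this slide, assuming randomly initialised factor vectors and hand-written segmentations. The names `vec`, `compose`, the `"morpheme|role"` key format and the dimension are illustrative assumptions, not the authors' code. It shows that a pure sum of morphemes cannot tell hangover from overhang, while adding a word-identity vector can.

```python
import numpy as np

DIM = 8                          # illustrative embedding size (assumption)
rng = np.random.default_rng(0)
table = {}                       # factor key -> vector, e.g. "king|stem"


def vec(key):
    """Return the vector for a factor, creating a random one on first use."""
    if key not in table:
        table[key] = rng.normal(size=DIM)
    return table[key]


def compose(word, segmentation, use_identity=True):
    """Sum the morpheme vectors, optionally adding the word-identity vector."""
    keys = [f"{morpheme}|{role}" for morpheme, role in segmentation]
    if use_identity:
        keys.append(f"{word}|id")
    return np.sum([vec(k) for k in keys], axis=0)


# Hand-written segmentations (assumptions for illustration).
hangover = [("hang", "stem"), ("over", "stem")]
overhang = [("over", "stem"), ("hang", "stem")]

# Bag of morphemes: the two words collapse to the same vector.
same_without_id = np.allclose(
    compose("hangover", hangover, use_identity=False),
    compose("overhang", overhang, use_identity=False),
)
# With the identity component the two words are kept apart.
same_with_id = np.allclose(
    compose("hangover", hangover), compose("overhang", overhang)
)

unkingly = compose("unkingly", [("un", "pre"), ("king", "stem"), ("ly", "suf")])
```

In a trained model the identity vector of a rare word such as unkingly would receive few updates, so its representation would be dominated by the shared morpheme vectors.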

  22. Simplest vector-based probabilistic LM
      LBL (log-bilinear model) (Mnih & Hinton, 2007; Mnih & Teh, 2012)
      "colorless green ideas sleep furiously ."

  23. Add morpheme vectors inside LM
      LBL++
      "colorless green ideas sleep furiously ."
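
The slides name the model but the diagram did not survive, so here is a rough sketch of one log-bilinear scoring step with composed word vectors. It follows the standard LBL form from the cited Mnih & Hinton work (a predicted vector built from position-specific transforms of the context vectors, scored against each candidate word by a dot product plus bias). All parameters are random and untrained, and the segmentations, names and sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, CONTEXT = 4, 3              # a 4-gram model conditions on 3 words

vocab = ["colorless", "green", "ideas", "sleep", "furiously", "."]
# Hypothetical segmentations; unsegmented words act as their own stem.
segment = {
    "colorless": ["color", "less"],
    "ideas": ["idea", "s"],
    "furiously": ["furious", "ly"],
}


def factors(word):
    """Word-identity factor plus morpheme factors."""
    return [word + "|id"] + segment.get(word, [word])


all_factors = sorted({f for w in vocab for f in factors(w)})
Q = {f: rng.normal(size=DIM) for f in all_factors}   # context-side factor vectors
R = {f: rng.normal(size=DIM) for f in all_factors}   # target-side factor vectors
C = rng.normal(size=(CONTEXT, DIM, DIM))             # one matrix per context position
b = np.zeros(len(vocab))                             # per-word bias


def compose(word, table):
    """Additive composition of a word vector from its factor vectors."""
    return np.sum([table[f] for f in factors(word)], axis=0)


def next_word_probs(context):
    """P(w | context) for every w in vocab under the sketched model."""
    predicted = sum(C[j] @ compose(w, Q) for j, w in enumerate(context))
    scores = np.array([predicted @ compose(w, R) for w in vocab]) + b
    scores -= scores.max()                           # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()


probs = next_word_probs(["colorless", "green", "ideas"])
```

Note that the normalisation in `next_word_probs` touches every vocabulary word, which is what makes a plain softmax expensive for large vocabularies.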

  25. Computational efficiency
      Problem: each probability query requires normalisation over the vocabulary.
      ◮ $O(\text{vocab size})$
      ◮ rich morphology ⇒ large vocabulary
      Solution: decompose the model using word classes
      $P(\text{word} \mid \text{history}) = P(\text{class}(\text{word}) \mid \text{history}) \times P(\text{word} \mid \text{class}(\text{word}), \text{history})$
      ◮ use unsupervised Brown-clustering
      ◮ each LM query becomes $2 \times O(\sqrt{\text{vocab size}})$
      ⇒ fast enough for MT-decoding
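
The class decomposition above can be sketched as follows. This is an illustration under stated assumptions: classes are assigned by a simple modulo rule as a stand-in for Brown clustering, the vectors are random, and the names (`word_class`, `members`, `prob`) are hypothetical. With about √V classes of about √V words each, one query normalises over roughly 2√V items instead of V.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
V, DIM = 2500, 16
n_classes = math.isqrt(V)                      # 50 classes of 50 words each

# Stand-in for Brown clusters (assumption): assign classes round-robin.
word_class = np.arange(V) % n_classes
members = [np.flatnonzero(word_class == c) for c in range(n_classes)]

class_vecs = rng.normal(size=(n_classes, DIM))
word_vecs = rng.normal(size=(V, DIM))


def softmax(x):
    x = x - x.max()                            # numerical stability
    e = np.exp(x)
    return e / e.sum()


def prob(word, predicted):
    """P(word | history) = P(class | history) * P(word | class, history).

    Returns the probability and the number of items that were normalised over.
    """
    c = word_class[word]
    p_class = softmax(class_vecs @ predicted)[c]
    in_class = members[c]                      # sorted word ids of this class
    position = np.searchsorted(in_class, word)
    p_word = softmax(word_vecs[in_class] @ predicted)[position]
    return p_class * p_word, n_classes + len(in_class)


predicted = rng.normal(size=DIM)               # stands in for the LM's prediction vector
p, items_scored = prob(1234, predicted)
total = sum(prob(w, predicted)[0] for w in range(V))
```

Here `items_scored` is 100 for a vocabulary of 2500, and `total` confirms that the factored probabilities still sum to one over the vocabulary.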

  27. Evaluation overview
      Setup
      ◮ 4-gram models
      ◮ Czech, English, French, German, Spanish, Russian
      ◮ train on 20–50m tokens
      ◮ large vocabularies (exclude 5% of singletons)
      Three evaluation contexts:
      ◮ Perplexity on test data
      ◮ Word similarity rating
      ◮ Machine translation

  29. Perplexity improvements by language, CLBL → CLBL++
      [Bar chart: perplexity reduction in % (axis 0 to 6) for CS, DE, EN, ES, FR, RU. Bar labels, in the order extracted: 683 → 643, 422 → 404, 313 → 300, 281 → 273, 207 → 203, 232 → 227]
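
The percentage axis of this chart can be recomputed from the perplexity pairs on the slide. A small sketch, assuming the plotted quantity is the relative reduction (before − after) / before; the pairs are kept in the order they appear in the slide text, without assigning them to languages.

```python
# (CLBL perplexity, CLBL++ perplexity) pairs as listed on the slide.
pairs = [(683, 643), (422, 404), (313, 300), (281, 273), (207, 203), (232, 227)]

# Relative perplexity reduction in percent, rounded to one decimal.
reductions = [round(100 * (before - after) / before, 1) for before, after in pairs]

for (before, after), pct in zip(pairs, reductions):
    print(f"{before} -> {after}: {pct}% lower")
# reductions == [5.9, 4.3, 4.2, 2.8, 1.9, 2.2]
```

All six values fall inside the 0 to 6 % range of the chart's axis.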

  30. Perplexity improvements on German, CLBL → CLBL++ (break-down by token frequency)
      [Bar chart: y-axis improvement in % (0 to 20); x-axis bins of test token frequency: 0, <10^1, <10^2, <10^3, <10^4, <10^5, <10^6, <10^7]
