
Mikolov's Language Models: Distributed Representations of Sentences and Documents - PowerPoint PPT Presentation



  1. Mikolov's Language Models: Distributed Representations of Sentences and Documents. Recurrent Neural Language Model. Tomas Mikolov, Google Inc. May 16, 2014.

  2. Table of contents: 1. Motivation; 2. Introduction and Background; 3. Paragraph Embeddings; 4. Performance; 5. Linguistic Regularities in Continuous Space Word Representations.

  3. Motivation. Quoth Tomas Mikolov (http://www.fit.vutbr.cz/~imikolov/rnnlm/google.pdf): statistical language models assign probabilities to word sequences, so meaningful sentences should be more likely than ambiguous ones. Language modeling is an artificial intelligence problem.

  4. Classical N-gram Models. Figure: Text Modeling using Markov Chains, Claude Shannon (1948). The model maximizes $P(w_i \mid w_{i-1}, \ldots)$ (1), where each word $w_i$ is represented as a 1-of-N (one-hot) encoding.
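  As a concrete illustration of the n-gram idea, here is a minimal bigram model in Python (not from the slides; the toy corpus and add-alpha smoothing are illustrative assumptions):

      from collections import Counter, defaultdict

      def train_bigram(corpus, alpha=1.0):
          """Estimate P(w_i | w_{i-1}) from counts, with add-alpha smoothing."""
          context_counts = Counter()
          pair_counts = defaultdict(Counter)
          vocab = {"<s>", "</s>"}
          for sentence in corpus:
              tokens = ["<s>"] + sentence.split() + ["</s>"]
              vocab.update(tokens)
              for prev, cur in zip(tokens, tokens[1:]):
                  context_counts[prev] += 1
                  pair_counts[prev][cur] += 1
          V = len(vocab)
          def prob(cur, prev):
              return (pair_counts[prev][cur] + alpha) / (context_counts[prev] + alpha * V)
          return prob

      corpus = ["the cat sat", "the dog sat", "the cat ran"]
      p = train_bigram(corpus)
      print(p("cat", "the"))   # P(cat | the): relatively high in this toy corpus
      print(p("sat", "dog"))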

  5. Neural Representation of Words. Neural Language Model, Bengio et al., 2006. Figure: Word2Vec, Tomas Mikolov.

  6. Beyond Word Embeddings: Recursive Deep Tensor Models, Socher et al. Figure: Recursive Tree Structure, Richard Socher 2013.

  7. Beyond Word Embeddings: Recurrent Neural Network Language Model, Mikolov et al. Figure: Recurrent NN, Tomas Mikolov 2010.

  8. Beyond Word Embeddings: Character-Level Recognition. Figure: Text Understanding from Scratch, Zhang and LeCun 2015.

  9. Algorithm Overview. Figure: Paragraph Embedding Learning Model, Tomas Mikolov 2013.

  10. Algorithmic Overview, Part I: word embeddings. Given a sentence $w_1, w_2, w_3, \ldots, w_T$, maximize the average log probability
      $\frac{1}{T} \sum_{t=k}^{T-k} \log p(w_t \mid w_{t-k}, \ldots, w_{t+k})$   (2)
      where
      $p(w_t \mid w_{t-k}, \ldots, w_{t+k}) = \frac{e^{y_{w_t}}}{\sum_i e^{y_i}}$   (3)
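  A minimal numpy sketch (not from the slides) of the softmax in equation (3), which turns the unnormalized scores y into a probability distribution over the vocabulary:

      import numpy as np

      def softmax(y):
          """p(w_t | context) = exp(y[w_t]) / sum_i exp(y[i]), computed stably."""
          e = np.exp(y - y.max())   # subtract the max for numerical stability
          return e / e.sum()

      scores = np.array([2.0, 0.5, -1.0])   # one score per vocabulary word
      probs = softmax(scores)
      print(probs, probs.sum())             # probabilities summing to 1.0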

  11. Algorithmic Overview. Parameters for Step 1 are the softmax weights $U$ and $b$:
      $y = b + U h(w_{t-k}, \ldots, w_{t+k}; W)$   (4)
      where $h$ is constructed by concatenating or averaging the context word vectors drawn from $W$.
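  A sketch of equation (4), assuming $h$ averages the context word vectors (the paper allows concatenation or averaging); the dimensions and random initialization are illustrative:

      import numpy as np

      rng = np.random.default_rng(0)
      N, p = 1000, 50                  # vocabulary size, embedding dimension
      W = rng.normal(size=(p, N))      # word matrix: one p-dimensional column per word
      U = rng.normal(size=(N, p))      # softmax weight matrix
      b = np.zeros(N)                  # softmax bias

      def scores(context_ids):
          """y = b + U h(w_{t-k}, ..., w_{t+k}; W), with h = average of context columns."""
          h = W[:, context_ids].mean(axis=1)
          return b + U @ h

      y = scores([3, 17, 42, 8])       # indices of the words around position t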

  12. Algorithmic Overview, Part II: joint word and paragraph embeddings.
      $y = b + U h(w_{t-k}, \ldots, w_{t+k}; W, D)$   (5)
      with word matrix $W \in \mathbb{R}^{p \times N}$ and paragraph matrix $D \in \mathbb{R}^{p \times M}$, i.e. $p \times (M + N)$ embedding parameters in total.
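  Extending the sketch above to equation (5): each paragraph gets its own column in a matrix D, and its vector is combined with the context word vectors before the softmax (the Distributed Memory idea; shapes again illustrative):

      import numpy as np

      rng = np.random.default_rng(0)
      N, M, p = 1000, 200, 50          # vocabulary size, paragraph count, dimension
      W = rng.normal(size=(p, N))      # word vectors
      D = rng.normal(size=(p, M))      # paragraph vectors
      U = rng.normal(size=(N, p))
      b = np.zeros(N)

      def scores(paragraph_id, context_ids):
          """y = b + U h(w_{t-k}, ..., w_{t+k}; W, D): paragraph vector joins the average."""
          cols = np.column_stack((D[:, [paragraph_id]], W[:, context_ids]))
          h = cols.mean(axis=1)
          return b + U @ h

      y = scores(5, [3, 17, 42, 8])    # paragraph index plus context word indices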

  13. Algorithm Overview. Figure: Distributed Memory Model.

  14. Algorithm Overview. Figure: Distributed Bag of Words Model.

  15. Sentiment Analysis. Figure: Stanford Sentiment Treebank Dataset.

  16. Sentiment Analysis. Figure: IMDB Dataset.

  17. Model. Figure: Recurrent NN, Tomas Mikolov 2010.

  18. Components:
      input:  $x(t) = w(t) + s(t-1)$
      hidden: $s_j(t) = f\big(\sum_i x_i(t)\, u_{ji}\big)$
      output: $y_k(t) = g\big(\sum_j s_j(t)\, v_{kj}\big)$
      where $f$ is the sigmoid and $g$ is the softmax.
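  A sketch of one forward step of this recurrent network in numpy (illustrative sizes; the input equation is read here as concatenating w(t) with s(t-1), following the 2010 RNNLM formulation):

      import numpy as np

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def softmax(z):
          e = np.exp(z - z.max())
          return e / e.sum()

      def rnn_step(w_t, s_prev, U, V):
          """x(t) joins w(t) and s(t-1); s_j = f(sum_i x_i u_ji); y_k = g(sum_j s_j v_kj)."""
          x = np.concatenate([w_t, s_prev])   # input layer
          s = sigmoid(U @ x)                  # hidden (context) layer
          y = softmax(V @ s)                  # output distribution over the vocabulary
          return s, y

      vocab, hidden = 100, 30
      rng = np.random.default_rng(0)
      U = rng.normal(scale=0.1, size=(hidden, vocab + hidden))
      V = rng.normal(scale=0.1, size=(vocab, hidden))
      w = np.zeros(vocab); w[7] = 1.0         # 1-of-N encoding of the current word
      s, y = rnn_step(w, np.zeros(hidden), U, V)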

  19. Spatial Meaning: the Vector Offset Method for answering linguistic analogy questions ("a is to b as c is to ?"): compute $y = x_b - x_a + x_c$, then return
      $w^* = \arg\max_w \frac{x_w \cdot y}{\|x_w\|\,\|y\|}$
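  A small sketch of the vector offset method on toy embeddings (the vectors below are made up for illustration; real embeddings are learned from text):

      import numpy as np

      def analogy(emb, a, b, c):
          """Answer 'a is to b as c is to ?' by cosine similarity to x_b - x_a + x_c."""
          y = emb[b] - emb[a] + emb[c]
          best, best_sim = None, -np.inf
          for w, x in emb.items():
              if w in (a, b, c):              # exclude the question words themselves
                  continue
              sim = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
              if sim > best_sim:
                  best, best_sim = w, sim
          return best

      emb = {
          "king":  np.array([0.9, 0.8]),
          "queen": np.array([0.9, 0.2]),
          "man":   np.array([0.1, 0.8]),
          "woman": np.array([0.1, 0.2]),
      }
      print(analogy(emb, "man", "king", "woman"))  # expected: "queen"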

  20. Results.
