Discriminative Language Models (Prof. Sameer Singh, CS 295)


  1. Discriminative Language Models
     Prof. Sameer Singh
     CS 295: STATISTICAL NLP, WINTER 2017
     January 26, 2017
     Based on slides from Noah Smith, Richard Socher, and everyone else they copied from.

  2. Language Models
     Probability of a Sentence
     • Is a given sentence something you would expect to see?
     • Syntactically (grammar) and semantically (meaning)
     Probability of the Next Word
     • Predict what comes next for a given sequence of words.
     • Think of it as V‐way classification, where V is the vocabulary size.
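The "V‐way classification" view can be sketched as follows. This is a minimal illustration, not the lecture's own code: the toy vocabulary and the scores are made up, and a real model would compute the scores from the preceding words.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary (V = 4) and made-up scores for the context "I love".
vocab = ["I", "love", "food", "<eos>"]
scores = [0.1, 0.2, 2.0, 0.5]

# One probability per vocabulary word: a V-way classification over the next word.
probs = softmax(scores)
next_word_probs = dict(zip(vocab, probs))
prediction = max(next_word_probs, key=next_word_probs.get)
print(prediction, round(next_word_probs[prediction], 3))
```

The probability of a whole sentence then follows from the chain rule: multiply (or sum the logs of) the next-word probabilities at each position.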

  3. Outline: Discriminative Language Models, Feed‐forward Neural Networks, Recurrent Neural Networks, Upcoming

  4. Outline: Discriminative Language Models, Feed‐forward Neural Networks, Recurrent Neural Networks, Upcoming

  5. Logistic Regression Model

  6. N‐Grams as Logistic Regression

  7. Other features…

  8. Outline: Discriminative Language Models, Feed‐forward Neural Networks, Recurrent Neural Networks, Upcoming

  9. Logistic Regression with Embeddings

  10. Neural Networks

  11. Activation Functions: sigmoid, softmax, tanh, and many others (ReLUs, PReLUs, ELU, step, max, and so on)
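For reference, several of the activation functions named on this slide can be written in a few lines. This is a plain-Python sketch using their standard textbook definitions; the slide's own formulas and plots did not survive extraction, so the function names here are illustrative.

```python
import math

def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # avoids overflow for large negative x
    return e / (1.0 + e)

def tanh(x):
    """Hyperbolic tangent, maps any real number into (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

def step(x):
    """Hard threshold at zero."""
    return 1.0 if x > 0 else 0.0

def softmax(xs):
    """Maps a vector of scores to a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

for name, f in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu), ("step", step)]:
    print(name, [round(f(x), 3) for x in (-2.0, 0.0, 2.0)])
```

Unlike the others, softmax acts on a whole vector rather than on one number at a time, which is why it is typically used at the output layer.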

  12. Why do they work? (https://colah.github.io)

  13. Why do they work? [Figure with labels x1, x2, y, z]

  14. Simulated Example (https://github.com/clab/cnn/blob/master/examples/xor.cc)
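One common way to illustrate why a hidden layer helps is XOR, which no single linear threshold unit can compute. The sketch below is not the linked xor.cc example: the weights are picked by hand rather than learned, and the step activation and the names `h_or`, `h_and`, and `xor_net` are assumptions made for illustration.

```python
def step(x):
    """Hard threshold activation."""
    return 1.0 if x > 0 else 0.0

def xor_net(x1, x2):
    """Two hidden threshold units followed by one output threshold unit."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # "or, but not and" = XOR

for x1 in (0.0, 1.0):
    for x2 in (0.0, 1.0):
        print(int(x1), int(x2), "->", int(xor_net(x1, x2)))
```

The hidden units re-represent the inputs so that the output unit only has to solve a linearly separable problem.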

  15. Simple Feedforward NN LM: Bigram Model

  16. Simple Feedforward NN LM: N‐gram Model

  17. Deep Feedforward NN LM (Bengio et al. 2003)

  18. Outline: Discriminative Language Models, Feed‐forward Neural Networks, Recurrent Neural Networks, Upcoming

  19. Sequence View of Simple NNs

  20. Recurrent Neural Networks

  21. Example: “I love food” [Figure: RNN unrolled over the inputs I, love, food, with outputs love, food, <eos>]
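The "I love food" example can be sketched as a forward pass through a simple recurrent network. The slide's equations did not survive extraction, so this assumes the standard Elman‐style update (hidden state = tanh of a linear function of the input and the previous hidden state, output = softmax). The weights are random and untrained, and all names (`rnn_step`, `W_xh`, `W_hh`, `W_hy`) are illustrative.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(M, v):
    """Matrix-vector product on plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def rnn_step(W_xh, W_hh, W_hy, x, h_prev):
    """One step: returns the new hidden state and the next-word distribution."""
    pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, h_prev))]
    h = [math.tanh(p) for p in pre]
    return h, softmax(matvec(W_hy, h))

vocab = ["I", "love", "food", "<eos>"]
V, H = len(vocab), 3  # vocabulary size and (assumed) hidden size
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W_xh, W_hh, W_hy = rand_matrix(H, V), rand_matrix(H, H), rand_matrix(V, H)

def one_hot(word):
    return [1.0 if w == word else 0.0 for w in vocab]

inputs = ["I", "love", "food"]
targets = ["love", "food", "<eos>"]  # each target is the next word

h = [0.0] * H
log_prob = 0.0
distributions = []
for word, target in zip(inputs, targets):
    h, y = rnn_step(W_xh, W_hh, W_hy, one_hot(word), h)
    distributions.append(y)
    log_prob += math.log(y[vocab.index(target)])

print("log-probability of the target sequence:", log_prob)
```

Because the same weights are reused at every position and the hidden state carries over, the prediction at each step can depend on all earlier words, not just a fixed window.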

  22. Power of RNNs: Characters! (http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

  23. Char‐RNNs: Shakespeare!

  24. Char‐RNNs: Wikipedia!

  25. Char‐RNNs: Linux Code!

  26. Extension: Stacking

  27. Extension: Bidirectional RNNs

  28. Deep Bidirectional RNNs

  29. Extension: GRUs (Gated Recurrent Units)

  30. Extension: GRUs (Gated Recurrent Units)
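A single GRU step can be sketched with scalar input and hidden state. The slides' equations did not survive extraction, so this assumes one common formulation with an update gate `z` and a reset gate `r`; note that some papers swap the roles of `z` and `1 - z` in the final interpolation. The parameter names and values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, p):
    """One GRU step for a scalar input and scalar hidden state."""
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev)  # update gate
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev)  # reset gate
    # Candidate state: the reset gate controls how much of the past is used.
    h_tilde = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev))
    # Interpolate between the old state and the candidate state.
    return (1.0 - z) * h_prev + z * h_tilde

# Hypothetical, untrained parameter values chosen only for illustration.
params = {"w_z": 0.5, "u_z": -0.3, "w_r": 0.8, "u_r": 0.1, "w_h": 1.2, "u_h": 0.7}

h = 0.0
states = []
for x in [1.0, -0.5, 0.25]:
    h = gru_step(x, h, params)
    states.append(h)

print([round(s, 4) for s in states])
```

When the update gate is close to 0 the state is copied forward almost unchanged, which is what lets gated units retain information over longer spans than the plain recurrent update.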

  31. Estimating Parameters
     Beyond the scope of the course
     • Lots of tricks, heuristics, “domain knowledge”
     • Lots of engineering for efficiency, e.g. GPUs
     • New training algorithms being proposed every year
       • sometimes architecture‐specific
     • Lots of available tools you can use!
       • TensorFlow, Torch, Keras, MXNet, etc.

  32. Outline: Discriminative Language Models, Feed‐forward Neural Networks, Recurrent Neural Networks, Upcoming

  33. Homework 1 so far… [Figure panels: Public, Private]

  34. Ruslan Salakhutdinov
     Professor at Carnegie Mellon University; Director of Artificial Intelligence, Apple Inc.
     Learning Deep Unsupervised and Multimodal Models
     Location: DBH 6011
     Time: 11am ‐ 12pm
     Date: January 27, 2017
     Meeting with PhD students, will post on Piazza

  35. Upcoming…
     Homework
     • Homework 1 is due tonight: January 26, 2017
     • Write‐up, data, and code for Homework 2 are up
     • Homework 2 is due: February 9, 2017
     Project
     • Proposal is due: February 7, 2017 (~2 weeks)
     • Only 2 pages
