Modern NLP for Pre-Modern Practitioners Joel Grus #QConAI

  1. Modern NLP for Pre-Modern Practitioners Joel Grus #QConAI @joelgrus #2019

  2. "True self-control is waiting until the movie starts to eat your popcorn."

  3. the movie True until is waiting self-control starts to eat your popcorn.

  4. Natural Language Understanding is Hard

  5. But We're Getting Better At It* (* as measured by performance on tasks we're getting better at)

  6. As Measured by Performance on Tasks We're Getting Better at* (* tasks that would be easy if we were good at natural language understanding and that we therefore use to measure our progress toward natural language understanding)

  7. About Me

  8. Obligatory Plug for AllenNLP

  9. A Handful of Tasks That Would Be Easy if We Were Good at Natural Language Understanding

  10. Parsing

  11. Named-Entity Recognition

  12. Coreference Resolution

  13. Machine Translation

  14. Summarization: "Attend QCon.ai."

  15. Text Classification

  16. Machine Comprehension

  17. Machine Comprehension?

  18. Textual Entailment

  19. Winograd Schemas: "The conference organizer disinvited the speaker because he feared a boring talk." (he = conference organizer) "The conference organizer disinvited the speaker because he proposed a boring talk." (he = speaker)

  20. Language Modeling

  21. And many others!

  22. If you were good at natural language understanding, you'd also be pretty good at these tasks

  23. So if computers get good at each of these tasks, then...

  24. (I Am Being Unfair) Each of these tasks is valuable on its own merits. Likely they are getting us closer to actual natural language understanding.

  25. Pre-Modern NLP

  26. Lots of Linguistics

  27. Grammars: S -> NP VP; VP -> VBZ ADJP; NP -> JJ NN; ADJP -> JJ; JJ -> "Artificial"; NN -> "intelligence"; VBZ -> "is"; JJ -> "dangerous". Parse tree for "Artificial intelligence is dangerous": (S (NP (JJ Artificial) (NN intelligence)) (VP (VBZ is) (ADJP (JJ dangerous))))

  28. Hand-Crafted Features

  29. Rule-Based Systems

  30. Modern NLP

  31. Theme 1: Neural Nets and Low-Dimensional Representations

  32. Theme 2: Putting Things in

  33. Theme 3:

  34. Theme 4:

  35. Theme 5: Transfer Learning

  36. Word Vectors

  37. "Joel is attending an artificial intelligence conference." Input "artificial" (one-hot: 0 0 0 0 0 0 0 0 0 1 0 0 0 0 ... 0) -> embedding (.3 .6 .1 .2 2.3) -> prediction (.01 0 0 .9 0 0 0 0 0 .05 0 0 0 0 ... 0) -> "intelligence"

  38. Using Word Vectors ? ?

  39. Using Word Vectors N V

  40. Using Word Vectors J N The official department heads all quit .

  41. bites dog man

  42. Using Context for Sequence Labeling N V

  43. Using Context for Sequence Classification

  44. Recurrent Neural Networks

  45. LSTMs and GRUs

  46. Bidirectionality

  47. Generative Character-Level Modeling

  48. Convolutional Networks

  49. Sequence-to-Sequence Models

  50. Attention

  51. Large "Unsupervised" Language Models

  52. Contextual Embeddings

  53. Contextual Embeddings The Seahawks football today

  54. word2vec

  55. ELMo

  56. ELMo

  57. "NLP's ImageNet moment"

  58. Self-Attention

  59. RNN vs CNN vs Self-Attention

  60. The Transformer ("Attention Is All You Need")

  61. OpenAI GPT, or Transformer Decoder Language Model

  62. One Model to Rule Them All?

  63. The GLUE Benchmark

  64. BERT

  65. Task 1: Masked Language Modeling. "Joel is giving a [MASK] talk at a [MASK] in San Francisco". Candidates for the first [MASK]: interesting, exciting, derivative, pedestrian, snooze-worthy, ... Candidates for the second [MASK]: conference, meetup, rave, coffeehouse, WeWork, ...

  66. Task 2: Next Sentence Prediction. Two candidate pairs: "[CLS] Joel is giving a talk. [SEP] The audience is enthralled. [SEP]" and "[CLS] Joel is giving a talk. [SEP] The audience is falling asleep. [SEP]". Outputs shown: 99% is_next_sentence / 1% is_not_next_sentence for one pair, and 1% is_next_sentence / 99% is_not_next_sentence for the other.

  67. BERT for downstream tasks

  68. GPT-2

  69. 1.5 billion parameters

  70. Is GPT-2 Dangerous? (image label: PRETRAINED LANGUAGE MODEL)

  71. How Can You Use These In Your Work?

  72. Use Pretrained Word Vectors

  73. Better Still, Use Pretrained Contextual Embeddings

  74. Use Pretrained BERT to Build Great Classifiers

  75. Use GPT-2 (small) (if you dare) (image label: PRETRAINED LANGUAGE MODEL)

  76. In Conclusion (image caption: "I'm fine-tuning a transformer model!")
      ● NLP is cool
      ● Modern NLP is solving really hard problems (and is changing really really quickly)
      ● Lots of really smart people with lots of data and lots of compute power have trained models that you can just download and use
      ● So take advantage of their work!

  77. Thanks!
      ● I'll tweet out the slides: @joelgrus
          ○ read the speaker notes
          ○ they have lots of links
      ● I sometimes blog: joelgrus.com
      ● AI2: allenai.org
      ● AllenNLP: allennlp.org
      ● GPT-2 Explorer: gpt2.apps.allenai.org
      ● podcast: adversariallearning.com

  78. Appendix

  79. References http://ruder.io/a-review-of-the-recent-history-of-nlp/ https://ankit-ai.blogspot.com/2019/03/future-of-natural-language-processing.html https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html#openai-gpt
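
The toy grammar on the Grammars slide (slide 27) is small enough to run by hand, which gives a feel for what "pre-modern" rule-based parsing involves. The sketch below is a minimal top-down parser written for this outline, not code from the talk: the dictionary layout and the names `GRAMMAR`, `LEXICON`, `parse`, `expand`, and `parse_sentence` are all assumptions made for illustration. It only works because this particular grammar has no left-recursive rules.

```python
# Toy grammar from the Grammars slide. The data layout and function names
# are illustrative assumptions, not taken from the talk.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["JJ", "NN"]],
    "VP": [["VBZ", "ADJP"]],
    "ADJP": [["JJ"]],
}
LEXICON = {
    "JJ": {"Artificial", "dangerous"},
    "NN": {"intelligence"},
    "VBZ": {"is"},
}


def parse(symbol, tokens, start):
    """Yield (tree, end) for every way `symbol` can cover tokens[start:end]."""
    if symbol in LEXICON:
        # Part-of-speech symbol: it must match exactly one token.
        if start < len(tokens) and tokens[start] in LEXICON[symbol]:
            yield (symbol, tokens[start]), start + 1
        return
    for production in GRAMMAR.get(symbol, []):
        yield from expand(production, tokens, start, (symbol,))


def expand(production, tokens, start, partial):
    """Match the remaining symbols of one production, left to right."""
    if not production:
        yield partial, start
        return
    head, rest = production[0], production[1:]
    for subtree, mid in parse(head, tokens, start):
        yield from expand(rest, tokens, mid, partial + (subtree,))


def parse_sentence(sentence):
    """Return every parse tree of `sentence` rooted at S."""
    tokens = sentence.split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]


if __name__ == "__main__":
    for tree in parse_sentence("Artificial intelligence is dangerous"):
        print(tree)
```

Every word and every construction has to be written into the rules ahead of time, so a sentence like "intelligence is Artificial" simply gets no parse. That brittleness is one reason hand-built grammars sit on the "pre-modern" side of the talk.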
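
The Word Vectors slide (slide 37) shows a pipeline from a one-hot input word, through a short embedding, to a probability distribution over the vocabulary. The following sketch mirrors that shape in plain Python under several assumptions: the seven-word vocabulary is just the words of the slide's example sentence, the weights `W_in` and `W_out` are random rather than trained, and the function names are hypothetical. With untrained weights the predicted probabilities are meaningless; the point is only that multiplying a one-hot vector by the input matrix is the same as looking up one row of it.

```python
import math
import random

# Assumed toy vocabulary: the words of the example sentence on the slide.
VOCAB = ["Joel", "is", "attending", "an", "artificial", "intelligence", "conference"]
EMBEDDING_DIM = 5  # the slide shows a five-number embedding

rng = random.Random(0)
# Randomly initialised placeholders for parameters that would be learned.
W_in = [[rng.uniform(-1, 1) for _ in range(EMBEDDING_DIM)] for _ in VOCAB]
W_out = [[rng.uniform(-1, 1) for _ in VOCAB] for _ in range(EMBEDDING_DIM)]


def one_hot(word):
    """Vector of zeros with a single 1 at the word's vocabulary index."""
    return [1.0 if w == word else 0.0 for w in VOCAB]


def matvec(vector, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    columns = range(len(matrix[0]))
    return [sum(v * row[j] for v, row in zip(vector, matrix)) for j in columns]


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_next(word):
    """Return (embedding, probabilities over VOCAB) for an input word."""
    embedding = matvec(one_hot(word), W_in)  # same as the word's row of W_in
    probabilities = softmax(matvec(embedding, W_out))
    return embedding, probabilities


if __name__ == "__main__":
    embedding, probabilities = predict_next("artificial")
    print(embedding)
    print(dict(zip(VOCAB, probabilities)))
```

Training would adjust `W_in` and `W_out` so that the probability of the observed neighbouring word (here, "intelligence") goes up; the rows of `W_in` are then what gets reused as word vectors.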
