SLIDE 1

CS 4650/7650: Natural Language Processing

Introduction to NLP

Diyi Yang

Some slides borrowed from Yulia Tsvetkov at CMU and Noah Smith at UW

SLIDE 2

Welcome!

Website: https://www.cc.gatech.edu/classes/AY2021/cs7650_fall
Piazza: piazza.com/gatech/fall2020/cs7650cs4650
Staff Email List: cs4650-7650-f20-staff@googlegroups.com

SLIDE 3

Welcome!

SLIDE 4

TA Office Hours

SLIDE 5

Hybrid Mode

• Lectures Online
• Course Materials Online
• TA Office Hours Online
• Q&A with instructor in person (optional, TBD)

SLIDE 6

Grading

• Homework Assignments (55%)
• Take-home Midterm (15%)
• Project Survey (20%)
• Quiz (10%)

SLIDE 7

Late Policies

• Late Policy
  • 5 late days to use over the duration of the semester, for homework assignments only.
    There are no restrictions on how the late days can be used (e.g., all 5 can be used on one homework). Using late days will not affect your grade, but homework submitted late after all 5 late days have been used will receive no credit.
• No make-up exam
  • Unless under an emergency situation

SLIDE 8

Survey Paper (Project)

• Survey on an NLP topic
• 2-3 students per team
• 2-page survey proposal (2%)
• 8-page final survey report (12%)
• Incorporating feedback (6%)

SLIDE 9

Other Information

• Course Contacts:
  • Webpage: materials and announcements
  • Piazza: discussion forum
  • Homework questions: Piazza, TAs’ office hours
• Computing Resources:
  • Experiments can take hours, even with efficient computation
  • Recommendation: start assignments early

SLIDE 10

Introduction to NLP

SLIDE 11

Communication With Machines

~1950s-70s, ~1980s, today

SLIDE 12

SLIDE 13

Conversational Agents

Conversational agents contain:

  • Speech recognition
  • Language analysis
  • Dialogue processing
  • Information retrieval
  • Text to speech

SLIDE 14

SLIDE 15

SLIDE 16

Question Answering

• What does “divergent” mean?
• What year was Abraham Lincoln born?
• How many states were in the United States that year?
• How much Chinese silk was exported to England at the end of the 18th century?
• What do scientists think about the ethics of human cloning?

SLIDE 17

Machine Translation

17

slide-18
SLIDE 18

Natural Language Processing Applications

• Machine Translation
• Information Retrieval
• Question Answering
• Dialogue Systems
• Information Extraction
• Summarization
• Sentiment Analysis
• ...

Core Technologies

• Language modeling
• Part-of-speech tagging
• Syntactic parsing
• Named-entity recognition
• Word sense disambiguation
• Semantic role labeling
• ...

NLP lies at the intersection of computational linguistics and machine learning.

SLIDE 19

Levels of Linguistic Knowledge

SLIDE 20

Phonetics, Phonology

• Pronunciation Modeling

SLIDE 21

Words

• Language Modeling
• Tokenization
• Spelling correction

SLIDE 22

Morphology

• Morphological analysis
• Tokenization
• Lemmatization (toy sketch below)
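
A toy illustration of lemmatization (a sketch only, not the course's tooling; the suffix rules and irregular-form table are made up for this example). Real lemmatizers use a lexicon such as WordNet plus part-of-speech information:

    # Toy lemmatizer: a few irregular forms plus naive suffix stripping.
    IRREGULAR = {"geese": "goose", "ran": "run", "better": "good"}

    def lemmatize(word):
        word = word.lower()
        if word in IRREGULAR:
            return IRREGULAR[word]
        for suffix, repl in [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]:
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)] + repl
        return word

    print([lemmatize(w) for w in ["studies", "walked", "geese", "cats"]])
    # -> ['study', 'walk', 'goose', 'cat']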

SLIDE 23

Part of Speech

• Part-of-speech tagging (baseline sketch below)
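
A common baseline (a sketch only, not one of the models covered later): assign each word its most frequent tag in annotated training data, and a default tag to unseen words. The tiny training set here is invented for illustration:

    from collections import Counter, defaultdict

    # Invented (word, tag) training pairs.
    train = [("the", "DET"), ("chair", "NOUN"), ("chair", "VERB"), ("chair", "NOUN"),
             ("can", "AUX"), ("can", "AUX"), ("can", "NOUN")]

    counts = defaultdict(Counter)
    for word, tag in train:
        counts[word][tag] += 1

    def most_frequent_tag(word, default="NOUN"):
        # Unseen words fall back to the default tag.
        return counts[word].most_common(1)[0][0] if word in counts else default

    print([most_frequent_tag(w) for w in ["the", "chair", "can", "telescope"]])
    # -> ['DET', 'NOUN', 'AUX', 'NOUN']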

SLIDE 24

Syntax

• Syntactic parsing

SLIDE 25

Semantics

• Named entity recognition
• Word sense disambiguation
• Semantic role labeling

SLIDE 26

Discourse

SLIDE 27

“Why Do We Care About This?”

SLIDE 28

Where Are We Now?

SLIDE 29

Where Are We Now?

SLIDE 30

Where Are We Now?

SLIDE 31

SLIDE 32

Why is NLP Hard?

1. Ambiguity
2. Scale
3. Sparsity
4. Variation
5. Expressivity
6. Unmodeled Variables
7. Unknown representations

SLIDE 33

Why is NLP Hard?

1. Ambiguity
2. Scale
3. Sparsity
4. Variation
5. Expressivity
6. Unmodeled Variables
7. Unknown representations

SLIDE 34

Ambiguity

• Ambiguity at multiple levels
  • Word senses: bank (finance or river?)
  • Part of speech: chair (noun or verb?)
  • Syntactic structure: I can see a man with a telescope (two parses in the sketch below)
  • Multiple: I made her duck
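
To make the attachment ambiguity concrete, here is a minimal sketch (assuming NLTK is installed; the grammar is a toy one written for this example) in which a chart parser returns two trees: one where the PP "with a telescope" modifies the man, and one where it modifies the seeing:

    import nltk  # assumes: pip install nltk

    grammar = nltk.CFG.fromstring("""
      S  -> NP VP
      NP -> Det N | Det N PP | 'I'
      VP -> V NP | VP PP
      PP -> P NP
      Det -> 'a'
      N  -> 'man' | 'telescope'
      V  -> 'saw'
      P  -> 'with'
    """)

    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("I saw a man with a telescope".split()):
        print(tree)  # two parses: noun attachment vs. verb attachment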

SLIDE 35

SLIDE 36

Ambiguity and Scale

SLIDE 37

The Challenges of “Words”

• Segmenting text into words (see the sketch below)
• Morphological variation
• Words with multiple meanings: bank, mean
• Domain-specific meanings: latex
• Multiword expressions: make a decision, take out, make up
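
Even the first step, segmentation, involves design choices. A minimal sketch (the sentence and regex are illustrative, not a course-provided tokenizer):

    import re

    text = "Dr. Smith didn't make a decision in the U.S."

    # Naive whitespace splitting keeps punctuation attached to words:
    print(text.split())
    # ['Dr.', 'Smith', "didn't", 'make', 'a', 'decision', 'in', 'the', 'U.S.']

    # A simple regex tokenizer separates punctuation, but now it also splits
    # the abbreviation "U.S." and the clitic in "didn't":
    print(re.findall(r"\w+|[^\w\s]", text))
    # ['Dr', '.', 'Smith', 'didn', "'", 't', 'make', 'a', 'decision', 'in', 'the', 'U', '.', 'S', '.']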

SLIDE 38

Part of Speech Tagging

SLIDE 39

Part of Speech Tagging

SLIDE 40

Part of Speech Tagging

SLIDE 41

Syntax

SLIDE 42

Morphology + Syntax

A ship-shipping ship, shipping shipping-ships

SLIDE 43

Semantics

• Every fifteen minutes a woman in this country gives birth. Our job is to find this woman, and stop her! – Groucho Marx

SLIDE 44

Semantics

• Every fifteen minutes a woman in this country gives birth. Our job is to find this woman, and stop her! – Groucho Marx

SLIDE 45

Syntax + Semantics

• We saw the woman with the telescope wrapped in paper.
  • Who has the telescope?
  • Who or what is wrapped in paper?
  • An event of perception, or an assault?

SLIDE 46

Syntax + Semantics

• We saw the woman with the telescope wrapped in paper.
  • Who has the telescope?
  • Who or what is wrapped in paper?
  • An event of perception, or an assault?

SLIDE 47

Corpora

• A corpus is a collection of text
  • Often annotated in some way
  • Sometimes just lots of text
• Examples (loading one is sketched below)
  • Penn Treebank: 1M words of parsed WSJ
  • Canadian Hansards: 10M+ words of French/English sentences
  • Yelp reviews
  • The Web!
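
A hedged example of poking at an annotated corpus (assumes NLTK is installed; NLTK ships only a small sample of the parsed WSJ data, and the full Penn Treebank is licensed separately):

    import nltk
    nltk.download("treebank")          # small sample of the parsed WSJ data
    from nltk.corpus import treebank

    print(len(treebank.words()))       # number of word tokens in the sample
    print(treebank.tagged_sents()[0])  # first sentence as (word, POS) pairs
    print(treebank.parsed_sents()[0])  # the same sentence as a parse tree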

Rosetta Stone

SLIDE 48

Statistical NLP

• Like most other parts of AI, NLP is dominated by statistical methods
  • Typically more robust than rule-based methods
  • Relevant statistics/probabilities are learned from data (tiny example below)
  • Normally requires lots of data about any particular phenomenon
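
The simplest case of "learned from data" is a relative-frequency (maximum likelihood) estimate of word probabilities; a minimal sketch over an invented toy corpus:

    from collections import Counter

    corpus = "the cat sat on the mat the cat slept".split()
    counts = Counter(corpus)
    total = sum(counts.values())

    # Maximum likelihood estimate: P(w) = count(w) / total tokens
    p = {w: c / total for w, c in counts.items()}
    print(p["the"], p["cat"], p.get("dog", 0.0))
    # 3/9, 2/9, and 0.0 -- unseen words get zero probability, which motivates smoothing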

SLIDE 49

Why is NLP Hard?

1. Ambiguity
2. Scale
3. Sparsity
4. Variation
5. Expressivity
6. Unmodeled Variables
7. Unknown representations

SLIDE 50

Sparsity

• Sparse data due to Zipf’s Law
• Example: the frequency of different words in a large text corpus

SLIDE 51

Sparsity

• Order words by frequency. What is the frequency of the nth-ranked word? (See the sketch below.)
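
Zipf's law says the frequency of the r-th most frequent word is roughly proportional to 1/r, i.e., f(r) ≈ C/r, so rank times frequency stays roughly constant. A quick empirical check (the file name "corpus.txt" is a placeholder for any large plain-text file):

    from collections import Counter

    with open("corpus.txt", encoding="utf-8") as f:
        counts = Counter(f.read().lower().split())

    # Zipf predicts rank * frequency is roughly constant across ranks.
    for rank, (word, freq) in enumerate(counts.most_common(20), start=1):
        print(rank, word, freq, rank * freq)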

SLIDE 52

Sparsity

• Regardless of how large our corpus is, there will be a lot of infrequent words
• This means we need to find clever ways to estimate probabilities for things we have rarely or never seen (one classical trick is sketched below)
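
One classical (if crude) option is add-one (Laplace) smoothing, shown here as a sketch that continues the toy corpus from the Statistical NLP slide; it reserves a little probability mass for unseen words:

    from collections import Counter

    corpus = "the cat sat on the mat the cat slept".split()
    counts = Counter(corpus)
    total = sum(counts.values())
    vocab_size = len(counts) + 1       # one extra slot for unknown words

    def p_laplace(word):
        # Add-one smoothing: every word, seen or unseen, gets count(w) + 1.
        return (counts[word] + 1) / (total + vocab_size)

    print(p_laplace("the"), p_laplace("dog"))  # unseen "dog" now gets a small, nonzero probability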

SLIDE 53

Why is NLP Hard?

1. Ambiguity
2. Scale
3. Sparsity
4. Variation
5. Expressivity
6. Unmodeled Variables
7. Unknown representations

SLIDE 54

Variation

• Suppose we train a part-of-speech tagger or a parser on the Wall Street Journal
• What will happen if we try to use this tagger/parser for social media? (A toy workaround is sketched below.)
  • “ikr smh he asked fir yo last name so he can add u on fb lololol”
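
One common (partial) workaround is lexical normalization before tagging. The lookup table below is hand-written just for this one tweet, and the glosses are my best reading of it, purely illustrative:

    # Hand-written normalization table (illustrative only).
    NORMALIZE = {"ikr": "I know, right", "smh": "shaking my head", "fir": "for",
                 "yo": "your", "u": "you", "fb": "Facebook", "lololol": "(laughter)"}

    tweet = "ikr smh he asked fir yo last name so he can add u on fb lololol"
    print(" ".join(NORMALIZE.get(tok, tok) for tok in tweet.split()))
    # -> I know, right shaking my head he asked for your last name so he can add you on Facebook (laughter)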

SLIDE 55

Variation

SLIDE 56

Why is NLP Hard?

1. Ambiguity
2. Scale
3. Sparsity
4. Variation
5. Expressivity
6. Unmodeled Variables
7. Unknown representations

SLIDE 57

Expressivity

• Not only can one form have different meanings (ambiguity), but the same meaning can be expressed with different forms (a word-overlap sketch follows):
  • She gave the book to Tom vs. She gave Tom the book
  • Some kids popped by vs. A few children visited
  • Is that window still open? vs. Please close the window
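
This is why surface-level matching is unreliable. A small sketch (Jaccard overlap on word sets, written just for this illustration): paraphrases can share almost no words, while the first pair above shares nearly all of them:

    def jaccard(a, b):
        # Word-overlap similarity between two sentences.
        sa, sb = set(a.lower().split()), set(b.lower().split())
        return len(sa & sb) / len(sa | sb)

    print(jaccard("Some kids popped by", "A few children visited"))      # 0.0, yet same meaning
    print(jaccard("She gave the book to Tom", "She gave Tom the book"))  # ~0.83, same meaning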

SLIDE 58

Unmodeled Variables

World knowledge

I dropped the glass on the floor and it broke
I dropped the hammer on the glass and it broke

SLIDE 59

Unmodeled Representation

Very difficult to capture, since we don’t even know how to represent the knowledge a human has/needs:

• What is the “meaning” of a word or sentence?
• How to model context?
• Other general knowledge?

SLIDE 60

Desiderata for NLP Models

• Sensitivity to a wide range of phenomena and constraints in human language
• Generality across languages, modalities, genres, styles
• Strong formal guarantees (e.g., convergence, statistical efficiency, consistency)
• High accuracy when judged against expert annotations or test data
• Ethical

SLIDE 61

Symbolic and Probabilistic NLP

SLIDE 62

Probabilistic and Connectionist NLP

SLIDE 63

NLP vs. Machine Learning

• To be successful, a machine learner needs bias/assumptions; for NLP, that might be linguistic theory/representations.
• Those representations are not directly observable.
• Symbolic, probabilistic, and connectionist ML have all seen NLP as a source of inspiring applications.

SLIDE 64

NLP vs. Linguistics

• NLP must contend with NL data as found in the world
• NLP ≈ computational linguistics
• Linguistics has begun to use tools originating in NLP!

SLIDE 65

Fields with Connections to NLP

• Machine learning
• Deep Learning
• Linguistics (including psycho-, socio-, descriptive, and theoretical)
• Cognitive science
• Information theory
• Data science
• Political science
• Psychology
• Economics
• Education

SLIDE 66

Today’s Applications

• Conversational agents
• Information extraction and question answering
• Machine translation
• Opinion and sentiment analysis
• Social media analysis
• Visual understanding
• Essay evaluation
• Mining legal, medical, or scholarly literature

SLIDE 67

Factors Changing NLP Landscape

1. Increases in computing power
2. The rise of the web, then the social web
3. Advances in machine learning
4. Advances in understanding of language in social context

SLIDE 68

What is this Class?

• Linguistic Issues
  • What is the range of language phenomena?
  • What are the knowledge sources that let us disambiguate?
  • What representations are appropriate?
  • How do you know what to model and what not to model?
• Statistical Modeling Methods
  • Increasingly complex model structures
  • Learning and parameter estimation
  • Efficient inference: dynamic programming, search
  • Deep neural networks for NLP: LSTM, CNN, Seq2seq

SLIDE 69

Outline of Topics

• Words and Sequences
  • Text classification
  • Probabilistic language models
  • Vector semantics and word embeddings
  • Sequence labeling: POS tagging, NER
• Parsers
• Applications
  • Machine translation, question answering, dialogue systems
  • Text generation, summarization

SLIDE 70

Readings

• Books:
  • Primary text: Jurafsky and Martin, Speech and Language Processing, 2nd or 3rd Edition
    • https://web.stanford.edu/~jurafsky/slp3/
  • Also: Eisenstein, Natural Language Processing
    • https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf

SLIDE 71

What’s Next?

• Text Classification