Natural Language Processing Fall 2018 Frank Ferraro Natural - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CMSC 473/673 Natural Language Processing Fall 2018

slide-2
SLIDE 2

Frank Ferraro

ITE 358
ferraro@umbc.edu
Monday: 2:15-3
Tuesday: 11:00-11:30
by appointment

Natural language processing
Semantics
Vision & language processing
Learning with low-to-no supervision

slide-3
SLIDE 3

Caroline Kery

Location TBD
ckery1@umbc.edu
Tuesday: 2-3:30pm
Thursday: 1-2:30pm
by appointment

Semantic parsing
Active learning
Data visualization
Analysis of educational data

slide-4
SLIDE 4

December 2016

slide-5
SLIDE 5

August 2018

slide-6
SLIDE 6

Potential Applications

  • ASR (automatic speech recognition)
  • Machine translation
  • Natural language generation
  • Document labeling/classification
  • Document summarization
  • Corpus exploration
  • Relation/information extraction
  • Entity identification

slide-7
SLIDE 7

Automatic speech recognition

slide-8
SLIDE 8

SPORTS

Document classification

slide-9
SLIDE 9

Machine translation

slide-10
SLIDE 10

https://cdn.arstechnica.net/wp-content/uploads/2015/11/Screen-Shot-2015-11-02-at-9.11.40-PM-640x543.png

Natural language generation

slide-11
SLIDE 11

Document summarization

slide-12
SLIDE 12

Corpus exploration

slide-13
SLIDE 13

Pat and Chandler agreed on a plan. He said Pat would try the same tactic again.

Relation extraction

slide-14
SLIDE 14

Pat and Chandler agreed on a plan. He said Pat would try the same tactic again.

Entity identification

slide-15
SLIDE 15

Pat and Chandler agreed on a plan. He said Pat would try the same tactic again.

Is “he” the same person as “Chandler”?

Entity identification

slide-16
SLIDE 16

Course Goals

Be introduced to some of the core problems and solutions of NLP (big picture)

slide-17
SLIDE 17
slide-18
SLIDE 18

Course Goals

  • Be introduced to some of the core problems and solutions of NLP (big picture)
  • Learn different ways that success and progress can be measured in NLP

slide-19
SLIDE 19

Natural Language Processing tensorflow

slide-20
SLIDE 20

Course Goals

  • Be introduced to some of the core problems and solutions of NLP (big picture)
  • Learn different ways that success and progress can be measured in NLP
  • Relate to statistics, machine learning, and linguistics
  • Implement NLP programs

slide-21
SLIDE 21

Course Goals

  • Be introduced to some of the core problems and solutions of NLP (big picture)
  • Learn different ways that success and progress can be measured in NLP
  • Relate to statistics, machine learning, and linguistics
  • Implement NLP programs
  • Read and analyze research papers
  • Practice your (written) communication skills

slide-22
SLIDE 22

http://www.qwantz.com/index.php?comic=170

slide-23
SLIDE 23

http://www.qwantz.com/index.php?comic=170

slide-24
SLIDE 24

Natural Language Processing ≈ Computational Linguistics

slide-25
SLIDE 25

Natural Language Processing ≈ Computational Linguistics

science focus: computational bio, computational chemistry, computational X

slide-26
SLIDE 26

Natural Language Processing ≈ Computational Linguistics

science focus: computational bio, computational chemistry, computational X
engineering focus: build a system to translate, create a QA system

slide-27
SLIDE 27

Natural Language Processing ≈ Computational Linguistics

Machine learning

slide-28
SLIDE 28

Natural Language Processing ≈ Computational Linguistics

Machine learning, Information Theory

slide-29
SLIDE 29

Natural Language Processing ≈ Computational Linguistics

Machine learning, Information Theory, Data Science

slide-30
SLIDE 30

Natural Language Processing ≈ Computational Linguistics

Machine learning, Information Theory, Data Science, Systems Engineering

slide-31
SLIDE 31

Natural Language Processing ≈ Computational Linguistics

Machine learning, Information Theory, Data Science, Systems Engineering, Logic, Theory of Computation

slide-32
SLIDE 32

Natural Language Processing ≈ Computational Linguistics

Machine learning, Information Theory, Data Science, Systems Engineering, Logic, Theory of Computation

Linguistics

slide-33
SLIDE 33

Natural Language Processing ≈ Computational Linguistics

Machine learning, Information Theory, Data Science, Systems Engineering, Logic, Theory of Computation

Linguistics, Cognitive Science, Psychology

slide-34
SLIDE 34

Natural Language Processing ≈ Computational Linguistics

Machine learning, Information Theory, Data Science, Systems Engineering, Logic, Theory of Computation
Linguistics, Cognitive Science, Psychology
Political Science, Digital Humanities, Education

slide-35
SLIDE 35

Natural Language Processing ≈ Computational Linguistics

science focus: computational bio, computational chemistry, computational X
engineering focus: build a system to translate, create a QA system

these views can co-exist peacefully

slide-36
SLIDE 36

What Are Words?

  • Linguists don’t agree
  • (Human) Language-dependent
  • White-space separation is sometimes okay (for written English longform)
  • Social media? Spoken vs. written? Other languages?
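To make the white-space point concrete, here is a minimal sketch comparing two naive tokenizers. The example sentence, the function names, and the regular expression are illustrative assumptions, not something the course prescribes; neither output is "the" right answer.

```python
import re

def whitespace_tokenize(text):
    # Split on runs of whitespace only; punctuation stays glued to words.
    return text.split()

def regex_tokenize(text):
    # Runs of word characters, or any single non-space symbol.
    # This separates punctuation but also breaks up contractions.
    return re.findall(r"\w+|[^\w\s]", text)

text = "The film became a hit, didn't it?"
print(whitespace_tokenize(text))
print(regex_tokenize(text))
```

White-space splitting leaves "hit," and "it?" as single units, while the regex version splits "didn't" into three pieces. Which behavior is preferable depends on the language, the genre, and the downstream task.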

slide-37
SLIDE 37

What Are Words? Tokens vs. Types

The film got a great opening and the film went on to become a hit .

Type: an element of the vocabulary. Token: an instance of that type in running text. How many of each?

slide-38
SLIDE 38

Terminology: Tokens vs. Types

The film got a great opening and the film went on to become a hit .

Tokens

  • The
  • film
  • got
  • a
  • great
  • opening
  • and
  • the
  • film
  • went
  • on
  • to
  • become
  • a
  • hit
  • .

Types

  • The
  • film
  • got
  • a
  • great
  • opening
  • and
  • the
  • went
  • on
  • to
  • become
  • hit
  • .
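
The counts for the example sentence can be checked with a short sketch. It assumes the sentence is already tokenized by white space (as written on the slide) and treats "The" and "the" as different types unless lowercased; the variable names are illustrative.

```python
sentence = "The film got a great opening and the film went on to become a hit ."

# Tokens: every instance in running text, in order.
tokens = sentence.split()

# Types: the distinct vocabulary elements among those tokens.
types = set(tokens)

# Assumption: case-folding merges "The" and "the" into a single type.
lowercased_types = {token.lower() for token in tokens}

print(len(tokens), len(types), len(lowercased_types))
```

Whether "The" and "the" count as one type or two is a modeling decision, which is part of why "what is a word?" has no single answer.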
slide-40
SLIDE 40

http://www.qwantz.com/index.php?comic=170

slide-41
SLIDE 41

Adapted from Jason Eisner, Noah Smith

slide-42
SLIDE 42
orthography

Adapted from Jason Eisner, Noah Smith

slide-43
SLIDE 43
orthography

morphology

Adapted from Jason Eisner, Noah Smith

slide-44
SLIDE 44
orthography

morphology

Adapted from Jason Eisner, Noah Smith

lexemes

slide-45
SLIDE 45
orthography

morphology

Adapted from Jason Eisner, Noah Smith

lexemes syntax

slide-46
SLIDE 46
orthography

morphology

Adapted from Jason Eisner, Noah Smith

lexemes syntax semantics

slide-47
SLIDE 47
orthography

morphology

Adapted from Jason Eisner, Noah Smith

lexemes syntax semantics pragmatics

slide-48
SLIDE 48
orthography

morphology

Adapted from Jason Eisner, Noah Smith

lexemes syntax semantics pragmatics discourse

slide-49
SLIDE 49

Adapted from Jason Eisner, Noah Smith

NLP + Latent Modeling

explain what you see/annotate with things “of importance” you don’t

orthography

morphology lexemes syntax semantics pragmatics discourse

observed text
slide-50
SLIDE 50
orthography

morphology lexemes syntax semantics pragmatics discourse

slide-51
SLIDE 51
orthography

morphology lexemes syntax semantics pragmatics discourse

VISION AUDIO

prosody intonation color

slide-52
SLIDE 52

Language is Productive

slide-53
SLIDE 53
slide-54
SLIDE 54

Watergate

slide-55
SLIDE 55

Troopergate Watergate  Bridgegate Deflategate

slide-56
SLIDE 56

Language is Ambiguous

slide-57
SLIDE 57

Ambiguity

Kids Make Nutritious Snacks

slide-58
SLIDE 58

Ambiguity

Kids Make Nutritious Snacks
Kids Prepare Nutritious Snacks
Kids Are Nutritious Snacks

sense ambiguity

slide-59
SLIDE 59

Ambiguity

British Left Waffles on Falkland Islands

slide-60
SLIDE 60

Ambiguity

British Left Waffles on Falkland Islands British Left Waffles on Falkland Islands British Left Waffles on Falkland Islands

lexical ambiguity

slide-61
SLIDE 61

Part of Speech Tagging

British Left Waffles on Falkland Islands

Reading 1: British (Adjective) Left (Noun) Waffles (Verb)
Reading 2: British (Noun) Left (Verb) Waffles (Noun)

lexical ambiguity
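
One way to see why tagging is hard: if each word can take several tags, the number of candidate tag sequences multiplies. The sketch below enumerates them for the first three words of the headline. The toy lexicon is an assumption built only from the tags shown on the slide, and `candidate_taggings` is a hypothetical helper, not a real tagger.

```python
from itertools import product

# Assumed toy lexicon: only the tags that appear on the slide.
LEXICON = {
    "British": ["Adjective", "Noun"],
    "Left": ["Noun", "Verb"],
    "Waffles": ["Verb", "Noun"],
}

def candidate_taggings(words, lexicon):
    # Every combination of one allowed tag per word.
    return list(product(*(lexicon[word] for word in words)))

words = ["British", "Left", "Waffles"]
taggings = candidate_taggings(words, LEXICON)
for tags in taggings:
    print(list(zip(words, tags)))
```

A tagger has to pick among these candidates using context; brute-force enumeration like this grows exponentially with sentence length, so it only serves as an illustration.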

slide-62
SLIDE 62

Parts of Speech

  • Classes of words that behave like one another in “similar” contexts
  • Pronunciation (stress) can differ: object (noun: OB-ject) vs. object (verb: ob-JECT)
  • It can help improve the inputs to other systems (text-to-speech, syntactic parsing)

slide-63
SLIDE 63

Ambiguity

Pat saw Chris with the telescope on the hill. I ate the meal with friends.

slide-64
SLIDE 64

Ambiguity

Pat saw Chris with the telescope on the hill. I ate the meal with friends.

syntactic ambiguity

slide-65
SLIDE 65

Language Can Be Surprising

slide-66
SLIDE 66

Garden Path Sentences

slide-67
SLIDE 67

Garden Path Sentences The

slide-68
SLIDE 68

Garden Path Sentences The old

slide-69
SLIDE 69

Garden Path Sentences The old man

slide-70
SLIDE 70

Garden Path Sentences The old man the

slide-71
SLIDE 71

Garden Path Sentences The old man the boat

slide-72
SLIDE 72

Garden Path Sentences The old man the boat .

slide-73
SLIDE 73

Garden Path Sentences The old man the boat .

slide-74
SLIDE 74

Garden Path Sentences

The complex houses married and single soldiers and their families.

slide-75
SLIDE 75

Garden Path Sentences

The complex houses married and single soldiers and their families.

slide-76
SLIDE 76

Garden Path Sentences

The rat the cat the dog chased killed ate the malt.

slide-77
SLIDE 77

Garden Path Sentences

The rat that the cat the dog chased killed ate the malt.

slide-78
SLIDE 78

Garden Path Sentences

The rat that the cat that the dog chased killed ate the malt.

slide-82
SLIDE 82

Garden Path Sentences

[The rat [the cat [the dog chased] killed] ate the malt].

Language can have recursive patterns
Syntactic parsing can help identify those
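
The bracketing above is itself a recursive data structure. As a rough sketch (assuming well-balanced square brackets and white-space-separated words; `parse_brackets` and `depth` are hypothetical helper names), it can be read into nested lists and its embedding depth measured:

```python
def parse_brackets(text):
    """Read a '[...]'-bracketed string into nested lists of words."""
    stack = [[]]  # stack[0] collects the top-level items
    for token in text.replace("[", " [ ").replace("]", " ] ").split():
        if token == "[":
            stack.append([])          # open a new nested clause
        elif token == "]":
            clause = stack.pop()      # close it and attach to its parent
            stack[-1].append(clause)
        else:
            stack[-1].append(token)
    return stack[0]

def depth(node):
    """Nesting depth: a word has depth 0, a bracketed clause adds 1."""
    if not isinstance(node, list):
        return 0
    return 1 + max((depth(child) for child in node), default=0)

tree = parse_brackets("[The rat [the cat [the dog chased] killed] ate the malt]")[0]
print(tree)
print(depth(tree))
```

A real syntactic parser has to discover the brackets from the raw words; this sketch only shows what the recovered structure looks like once they are known.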

slide-83
SLIDE 83

Syntactic Parsing

I ate the meal with friends

[Figure: parse tree with node labels S, NP, VP, PP]

Syntactic parsing: perform a “meaningful” structural analysis according to grammatical rules

slide-84
SLIDE 84

Syntactic Parsing Can Help Disambiguate

I ate the meal with friends

[Figure: parse tree with node labels S, NP, VP, PP]

slide-85
SLIDE 85

Syntactic Parsing Can Help Disambiguate

I ate the meal with friends

[Figure: two parse trees with node labels S, NP, VP, PP]
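
As a rough illustration of how a parser exposes this ambiguity, the sketch below counts the parses of the sentence with a CKY-style chart. The grammar and lexicon are toy assumptions written for this one sentence (they are not taken from the slides beyond the S, NP, VP, PP labels), and `count_parses` is a hypothetical helper.

```python
from collections import defaultdict

# Assumed toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

# Assumed toy lexicon: word -> possible labels.
LEXICON = {
    "I": ["NP"], "ate": ["V"], "the": ["Det"],
    "meal": ["N"], "with": ["P"], "friends": ["NP"],
}

def count_parses(words, root="S"):
    n = len(words)
    chart = defaultdict(int)  # (start, end, label) -> number of parses
    for i, word in enumerate(words):
        for label in LEXICON[word]:
            chart[i, i + 1, label] += 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[start, end, parent] += (
                        chart[start, split, left] * chart[split, end, right]
                    )
    return chart[0, n, root]

print(count_parses("I ate the meal with friends".split()))
```

Under this toy grammar the sentence gets two parses, one per attachment of "with friends". The grammar alone cannot say which one is intended; that takes a scoring model or more context.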

slide-86
SLIDE 86

Clearly Show Ambiguity… But Not Necessarily All Ambiguity

I ate the meal with friends

[Figure: parse tree with node labels S, NP, VP, PP]

I ate the meal with gusto
I ate the meal with a fork

slide-87
SLIDE 87

Discourse Processing

John stopped at the donut store.

Courtesy Jason Eisner

slide-88
SLIDE 88

Discourse Processing

John stopped at the donut store.

Courtesy Jason Eisner

slide-89
SLIDE 89

Discourse Processing

John stopped at the donut store before work.

Courtesy Jason Eisner

slide-90
SLIDE 90

Discourse Processing

John stopped at the donut store on his way home.

Courtesy Jason Eisner

slide-91
SLIDE 91

Discourse Processing

John stopped at the donut shop. John stopped at the trucker shop. John stopped at the mom & pop shop. John stopped at the red shop.

Courtesy Jason Eisner

slide-92
SLIDE 92

Discourse Processing through Coreference

I spread the cloth on the table to protect it. I spread the cloth on the table to display it.

Courtesy Jason Eisner

slide-93
SLIDE 93

I spread the cloth on the table to protect it. I spread the cloth on the table to display it.

Courtesy Jason Eisner

Discourse Processing through Coreference

slide-94
SLIDE 94

I spread the cloth on the table to protect it. I spread the cloth on the table to display it.

Courtesy Jason Eisner

Discourse Processing through Coreference

slide-95
SLIDE 95

http://www.qwantz.com/index.php?comic=170

slide-96
SLIDE 96

Three people have been fatally shot, and five people, including a mayor, were seriously wounded as a result of a Shining Path attack today.

slide-97
SLIDE 97

Three people have been fatally shot, and five people, including a mayor, were seriously wounded as a result of a Shining Path attack today.

score( )

slide-98
SLIDE 98

Three people have been fatally shot, and five people, including a mayor, were seriously wounded as a result of a Shining Path attack today.

pθ( )

slide-99
SLIDE 99

pθ(X)

probabilistic model

objective

F(θ)
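
To make pθ(X) and F(θ) concrete, here is a minimal sketch that assumes a unigram model (each token drawn independently from one distribution θ over the vocabulary) and uses the log-likelihood of the data as the objective. The unigram choice, the toy data, and the function names are illustrative assumptions; the slides do not commit to a particular model here.

```python
import math
from collections import Counter

def train_unigram(tokens):
    # Relative-frequency estimate of theta: one probability per word type.
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def log_likelihood(tokens, theta):
    # F(theta) = log p_theta(X) = sum of log theta[word] over the tokens.
    return sum(math.log(theta[word]) for word in tokens)

data = "a b a".split()
theta_mle = train_unigram(data)
theta_uniform = {"a": 0.5, "b": 0.5}

print(log_likelihood(data, theta_mle))
print(log_likelihood(data, theta_uniform))
```

On this toy data the relative-frequency θ scores higher than the uniform θ, which is the sense in which an objective F(θ) lets different parameter settings be compared.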

slide-100
SLIDE 100

Gradient Ascent

θ2 θ1

slide-103
SLIDE 103

Gradient Ascent

“gradient of F with respect to θ”

θ2 θ1

slide-104
SLIDE 104

Gradient Ascent

“gradient of F with respect to θ”

gradient: a vector of derivatives, each with respect to θk while holding all other variables constant

θ2 θ1
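
A minimal sketch of gradient ascent on a two-parameter objective. The objective F(θ) = -(θ1 - 1)² - (θ2 + 2)² is an assumed toy function chosen because its maximum, at (1, -2), is known in advance; the step size and number of steps are likewise arbitrary illustrative choices.

```python
def F(theta):
    # Assumed toy objective, maximized at theta = (1, -2).
    t1, t2 = theta
    return -(t1 - 1.0) ** 2 - (t2 + 2.0) ** 2

def grad_F(theta):
    # Vector of partial derivatives: (dF/d theta1, dF/d theta2).
    t1, t2 = theta
    return (-2.0 * (t1 - 1.0), -2.0 * (t2 + 2.0))

def gradient_ascent(grad, theta, step_size=0.1, num_steps=200):
    # Repeatedly move a small step in the direction of the gradient.
    for _ in range(num_steps):
        g = grad(theta)
        theta = tuple(t + step_size * g_k for t, g_k in zip(theta, g))
    return theta

theta_hat = gradient_ascent(grad_F, (0.0, 0.0))
print(theta_hat, F(theta_hat))
```

For real models the gradient is rarely written by hand like this, and the step size usually needs tuning: too large and the updates overshoot, too small and progress is slow.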

slide-105
SLIDE 105

http://www.qwantz.com/index.php?comic=170

slide-106
SLIDE 106

http://universaldependencies.org/

part-of-speech & syntax for > 120 languages

slide-107
SLIDE 107

From Syntax to Shallow Semantics

  • http://corenlp.run/ (constituency & dependency)
  • https://github.com/hltcoe/predpatt
  • http://openie.allenai.org/
  • http://www.cs.rochester.edu/research/knext/browse/ (constituency trees)
  • http://rtw.ml.cmu.edu/rtw/

Angeli et al. (2015)

“Open Information Extraction” a sampling of efforts

slide-108
SLIDE 108

Semantic Projection

slide-109
SLIDE 109

Administrivia

slide-110
SLIDE 110

Grading

Component          473    673
Five Assignments   45%    30%
Midterm            10%    10%
Graduate Paper     -      30%
Course Project     45%    30%

slide-111
SLIDE 111

Final Grades

473:

≥     Letter
90    A
80    B
70    C
65    D
      F

673:

≥     Letter
90    A-
80    B-
70    C-
65    D
      F

slide-112
SLIDE 112

https://www.csee.umbc.edu/courses/undergraduate/473/f18

slide-113
SLIDE 113

Online Discussions

https://piazza.com/umbc/fall2018/cmsc473673

slide-114
SLIDE 114

Important Dates

slide-115
SLIDE 115

Late Policy

  • Everyone has a budget of 10 late days
  • If you have them left: assignments turned in after the deadline will be graded and recorded, no questions asked
  • If you don’t have any left: still turn assignments in. They could count in your favor in borderline cases

slide-116
SLIDE 116

Late Policy

  • Everyone has a budget of 10 late days
  • Use them as needed throughout the course
  • They’re meant for personal reasons and emergencies
  • Do not procrastinate

slide-117
SLIDE 117

Late Policy

  • Everyone has a budget of 10 late days
  • Contact me privately if an extended absence will occur