Natural Language Processing Lecture 27: Conclusion
Levels of Linguistic Knowledge
- spoken: phonetics, phonology
- written: orthography
- from “shallower” to “deeper”: morphology, syntax, semantics, pragmatics, discourse
uygarlaştıramadıklarımızdanmışsınızcasına
“(behaving) as if you are among those whom we could not civilize”
- uygar: “civilized”
- +laş: “become”
- +tır: “cause to”
- +ama: “not able”
- +dık: past participle
- +lar: plural
- +ımız: first person plural possessive (“our”)
- +dan: ablative case (“from/among”)
- +mış: past
- +sınız: second person plural (“y’all”)
- +casına: finite verb → adverb (“as if”)
Finite-State Automaton
- Q: a finite set of states
- q0 ∈ Q: a special start state
- F ⊆ Q: a set of final states
- Σ: a finite alphabet
- Transitions: qi → qj, labeled with a string s ∈ Σ*
- Encodes a set of strings that can be recognized by following paths from q0 to some state in F.
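The definition above can be sketched as a small recognizer. This is a minimal illustration, not the lecture's own code: it assumes a deterministic automaton whose transitions are labeled with single symbols rather than arbitrary strings, and the state names and the "baa+!" language are hypothetical choices made for the example.

```python
def accepts(string, start, finals, transitions):
    """Return True if following transitions from `start` ends in a final state."""
    state = start
    for symbol in string:
        key = (state, symbol)
        if key not in transitions:
            # No outgoing arc for this symbol: the string is rejected.
            return False
        state = transitions[key]
    return state in finals


# Hypothetical automaton: accepts "b", at least two "a"s, then "!".
START = "q0"
FINALS = {"q4"}
TRANSITIONS = {
    ("q0", "b"): "q1",
    ("q1", "a"): "q2",
    ("q2", "a"): "q3",
    ("q3", "a"): "q3",
    ("q3", "!"): "q4",
}

if __name__ == "__main__":
    for s in ["baa!", "baaaa!", "ba!", "baa"]:
        print(s, accepts(s, START, FINALS, TRANSITIONS))
```

A dictionary keyed by (state, symbol) keeps the sketch short; a nondeterministic automaton would instead track a set of current states.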
Levels of Linguistic Knowledge
ambiguity
Noisy Channel
source → y → channel → x; decode x to recover y
- y: what you want
- x: what you see
Examples:
- x: Cats meow often; y: NN VB RB
- x: 你好吗?; y: How are you?
- Okay, Google
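Decoding in a noisy channel model is commonly framed as picking the y that maximizes P(y) · P(x | y). The sketch below illustrates that rule on a toy spelling example; the candidate words and every probability in it are hypothetical numbers chosen for illustration, not values from the lecture.

```python
import math


def decode(x, prior, channel):
    """Return the y maximizing log P(y) + log P(x | y), or None if no y fits."""
    best_y, best_score = None, -math.inf
    for y, p_y in prior.items():
        p_x_given_y = channel.get((x, y), 0.0)
        if p_y > 0 and p_x_given_y > 0:
            score = math.log(p_y) + math.log(p_x_given_y)
            if score > best_score:
                best_y, best_score = y, score
    return best_y


# Hypothetical source model P(y) and channel model P(x | y).
PRIOR = {"the": 0.6, "thee": 0.1, "then": 0.3}
CHANNEL = {
    ("teh", "the"): 0.05,
    ("teh", "thee"): 0.01,
    ("teh", "then"): 0.02,
}

if __name__ == "__main__":
    print(decode("teh", PRIOR, CHANNEL))
```

Working in log space avoids underflow once x and y are whole sequences rather than single words.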
Starting and Stopping
Unigram model:
...
Bigram model:
...
Trigram model:
...
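One common way to handle starting and stopping is to pad each sentence with start symbols and a stop symbol before counting n-grams. The sketch below assumes that convention; the symbols `<s>` and `</s>`, the helper names, and the three-sentence corpus are all hypothetical, and the estimates are plain relative frequencies.

```python
from collections import Counter


def pad(tokens, n):
    """Add n-1 start symbols and one stop symbol around a sentence."""
    return ["<s>"] * (n - 1) + list(tokens) + ["</s>"]


def bigram_mle(corpus):
    """Estimate P(word | prev) by relative frequency over padded sentences."""
    context_counts, bigram_counts = Counter(), Counter()
    for sentence in corpus:
        padded = pad(sentence, 2)
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return {bg: c / context_counts[bg[0]] for bg, c in bigram_counts.items()}


# Hypothetical toy corpus.
CORPUS = [["cats", "meow"], ["cats", "purr"], ["dogs", "bark"]]
PROBS = bigram_mle(CORPUS)

if __name__ == "__main__":
    print(pad(["cats", "meow"], 3))
    print(PROBS[("<s>", "cats")], PROBS[("meow", "</s>")])
```

With this padding a unigram model still predicts the stop symbol, a bigram model conditions the first word on one start symbol, and a trigram model conditions it on two.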
Language Modeling Questions
- Why do we use context?
- What does smoothing do, and why is it necessary?
- What do we use to evaluate language models?
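As one possible illustration of the smoothing and evaluation questions, the sketch below applies add-one (Laplace) smoothing to a bigram model and scores held-out text with perplexity, a commonly used evaluation measure. The corpus, function names, and padding symbols are hypothetical, and add-one is only one of several smoothing methods.

```python
import math
from collections import Counter


def train_add_one(corpus):
    """Return prob(word, prev) for an add-one smoothed bigram model."""
    bigrams, contexts, vocab = Counter(), Counter(), {"</s>"}
    for sentence in corpus:
        padded = ["<s>"] + list(sentence) + ["</s>"]
        vocab.update(sentence)
        for prev, word in zip(padded, padded[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    v = len(vocab)

    def prob(word, prev):
        # Unseen bigrams get a small nonzero probability instead of zero.
        return (bigrams[(prev, word)] + 1) / (contexts[prev] + v)

    return prob


def perplexity(prob, sentences):
    """Compute 2 ** (average negative log2 probability per predicted token)."""
    log_sum, n = 0.0, 0
    for sentence in sentences:
        padded = ["<s>"] + list(sentence) + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            log_sum += math.log2(prob(word, prev))
            n += 1
    return 2 ** (-log_sum / n)


# Hypothetical toy corpus.
CORPUS = [["cats", "meow"], ["cats", "purr"], ["dogs", "bark"]]
PROB = train_add_one(CORPUS)

if __name__ == "__main__":
    print(PROB("bark", "cats"))
    print(perplexity(PROB, [["cats", "bark"]]))
```

Without smoothing, the unseen bigram "cats bark" would have probability zero and the perplexity of the held-out sentence would be undefined. The sketch assumes test words come from the training vocabulary.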
Tagging
Broad POS categories
- open classes: nouns, verbs, adjectives, adverbs
- closed classes: prepositions, determiners, pronouns, conjunctions, auxiliary verbs, particles, numerals
Syntax
Parsing
- CKY vs. Earley’s Algorithm
– Both dynamic programming
– CNF vs. general forms
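CKY Algorithm: Chart
(each row lists the cells for spans starting at that word, shortest span first)
book:    Noun, Verb | - | VP, S | - | S
this:    Det | NP | - | NP
flight:  Noun | - | -
through: Prep | PP
Houston: PNoun, NP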
CKY Equations
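The CKY recurrence can be sketched as a recognizer that fills each span's cell from every split point of that span. The grammar and lexicon below are assumptions reverse-engineered from the chart for "book this flight through Houston"; they are not the lecture's grammar, and the function name `cky` is hypothetical.

```python
from itertools import product

# Assumed lexicon and CNF rules, chosen to be consistent with the chart.
LEXICON = {
    "book": {"Noun", "Verb"},
    "this": {"Det"},
    "flight": {"Noun"},
    "through": {"Prep"},
    "Houston": {"PNoun", "NP"},
}
BINARY = {
    ("Verb", "NP"): {"VP", "S"},
    ("Det", "Noun"): {"NP"},
    ("NP", "PP"): {"NP"},
    ("Prep", "NP"): {"PP"},
}


def cky(words):
    """Return a chart mapping each span (i, j) to its set of nonterminals."""
    n = len(words)
    # Width-1 spans come straight from the lexicon.
    chart = {(i, i + 1): set(LEXICON.get(w, ())) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            # Try every split point k and every pair of child labels.
            for k in range(i + 1, j):
                for left, right in product(chart[(i, k)], chart[(k, j)]):
                    cell |= BINARY.get((left, right), set())
            chart[(i, j)] = cell
    return chart


if __name__ == "__main__":
    chart = cky("book this flight through Houston".split())
    print("S" in chart[(0, 5)])
```

Because each cell depends only on shorter spans, filling the chart by increasing width guarantees the needed cells are already computed.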
Semantics
Where’s the beef?
Sentences from the Brown Corpus. Extracted from the concordancer in The Compleat Lexical Tutor, http://www.lextutor.ca/
chicken
Synsets for dog (n)
- S: (n) dog, domestic dog, Canis familiaris (a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds) "the dog barked all night"
- S: (n) frump, dog (a dull unattractive unpleasant girl or woman) "she got a reputation as a frump"; "she's a real dog"
- S: (n) dog (informal term for a man) "you lucky dog"
- S: (n) cad, bounder, blackguard, dog, hound, heel (someone who is morally reprehensible) "you dirty dog"
- S: (n) frank, frankfurter, hotdog, hot dog, dog, wiener, wienerwurst, weenie (a smooth-textured sausage of minced beef or pork usually smoked; often served on a bread roll)
- S: (n) pawl, detent, click, dog (a hinged catch that fits into a notch of a ratchet to move a wheel forward or prevent it from moving backward)
- S: (n) andiron, firedog, dog, dog-iron (metal supports for logs in a fireplace) "the andirons were too hot to touch"
Entity Linking
Mary picked up the ball. She threw it to me.
Semantic Roles
PropBank is a set of verb-sense-specific “frames” with informal descriptions for their arguments. Consider the word “Agree”
- ARG0: agreer
- ARG1: proposition
- ARG2: other entity agreeing
[The group]ARG0 agreed [it wouldn’t make an offer]ARG1.
Usually [John]ARG0 agrees [with Mary on everything]ARG2.
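A frame of this kind can be represented as a small data structure that maps role labels to descriptions, with each annotated sentence mapping the same labels to text spans. The classes `Frame` and `Annotation` below are hypothetical names for illustration and are not PropBank's own file format or API; the role descriptions and the labeled spans come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """A verb sense with informal descriptions of its numbered arguments."""
    predicate: str
    roles: dict  # role label -> informal description


@dataclass
class Annotation:
    """One sentence's arguments, keyed by the frame's role labels."""
    frame: Frame
    args: dict  # role label -> text span

    def describe(self):
        """Pair each labeled span with the frame's description of that role."""
        return [
            (label, self.frame.roles[label], span)
            for label, span in self.args.items()
        ]


AGREE = Frame(
    "agree",
    {"ARG0": "agreer", "ARG1": "proposition", "ARG2": "other entity agreeing"},
)
EXAMPLE = Annotation(
    AGREE,
    {"ARG0": "The group", "ARG1": "it wouldn't make an offer"},
)

if __name__ == "__main__":
    for label, description, span in EXAMPLE.describe():
        print(label, description, span, sep=" | ")
```

Keeping the descriptions on the frame rather than on each annotation reflects that the role meanings are specific to the verb sense, not to the sentence.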
“Fall (move downward)” in PropBank
- arg1: logical subject, patient, thing falling
- arg2: extent, amount fallen
- arg3: starting point
- arg4: ending point
- argM-loc: medium
Sales fell to $251.2 million from $278.8 million.
The average junk bond fell by 4.2%.
The meteor fell through the atmosphere, crashing into Cambridge.
MRL #1: First-Order Logic
Function: DressCode(ThePorch)
Predicates: Serves(UnionGrill, AmericanFood), Restaurant(UnionGrill)
Have(Speaker, FiveDollars) ∧ ¬Have(Speaker, LotOfTime)
∀x Person(x) ⇒ Have(x, FiveDollars)
∃x,y Person(x) ∧ Restaurant(y) ∧ ¬HasVisited(x,y)
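Quantified formulas like these can be checked against a small finite model, with ∀ mapping to `all` and ∃ to `any`. The sketch below is only an illustration: the domain, the individual "Mary", the facts in `HAVE` and `HAS_VISITED`, and the treatment of ThePorch as a restaurant are all assumptions made up for the example.

```python
# Hypothetical finite model.
PEOPLE = {"Speaker", "Mary"}
RESTAURANTS = {"UnionGrill", "ThePorch"}
DOMAIN = PEOPLE | RESTAURANTS
HAVE = {("Speaker", "FiveDollars"), ("Mary", "FiveDollars")}
HAS_VISITED = {
    ("Mary", "UnionGrill"),
    ("Mary", "ThePorch"),
    ("Speaker", "UnionGrill"),
}


def person(x):
    return x in PEOPLE


def restaurant(x):
    return x in RESTAURANTS


def have(x, y):
    return (x, y) in HAVE


def has_visited(x, y):
    return (x, y) in HAS_VISITED


# ∀x Person(x) ⇒ Have(x, FiveDollars); "p ⇒ q" is evaluated as "not p or q".
EVERYONE_HAS_FIVE = all(
    (not person(x)) or have(x, "FiveDollars") for x in DOMAIN
)

# ∃x,y Person(x) ∧ Restaurant(y) ∧ ¬HasVisited(x,y)
SOMEONE_HAS_NOT_VISITED = any(
    person(x) and restaurant(y) and not has_visited(x, y)
    for x in DOMAIN
    for y in DOMAIN
)

if __name__ == "__main__":
    print(EVERYONE_HAS_FIVE, SOMEONE_HAS_NOT_VISITED)
```

In this model the existential formula holds because the Speaker has not visited ThePorch.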
First-Order Logic: Advantages
- Flexible
- Well-understood
- Widely used
EM
- We often have unlabeled or incomplete data
- EM is an algorithm for learning without labels, e.g., “classification” without classes
- Pick random centroids
- Iterate the following:
– Use centroids to label the data (E-step)
– Compute centroids using the labeled data (M-step)
- Keep doing this until labels don’t change
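The loop above is essentially k-means clustering, which can be read as a hard-assignment variant of EM. The sketch below assumes one-dimensional points and takes the initial centroids as an argument so the run is reproducible; the function name `kmeans`, the data, and the starting centroids are hypothetical. Random initialization could be done with `random.sample(points, k)`.

```python
def kmeans(points, initial_centroids, max_iters=100):
    """Cluster 1-D points; return (labels, centroids) once labels stop changing."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iters):
        # E-step: label each point with the index of its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # labels did not change, so the loop has converged
        labels = new_labels
        # M-step: recompute each centroid as the mean of its labeled points.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Hypothetical data and starting centroids.
POINTS = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
LABELS, CENTROIDS = kmeans(POINTS, [1.0, 2.0])

if __name__ == "__main__":
    print(LABELS, CENTROIDS)
```

A centroid that attracts no points is left where it is in this sketch; other implementations re-seed it instead.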
NLP Uses
- Answer questions using the Web
- Translate documents from one language to another
- Do library research; summarize
- Manage messages intelligently
- Help make informed decisions
- Follow directions given by any user
- Fix your spelling or grammar
- Grade exams
- Write poems or novels
- Listen and give advice
- Estimate public opinion
- Read everything and make predictions
- Interactively help people learn
- Help disabled people
- Help refugees/disaster victims
- Document or reinvigorate indigenous languages
More NLP ...
- Language Technologies Minor
– 4 LT courses plus LT project
- 5th year Masters in Language Technologies
More NLP Courses
- 11-492/692 Speech Processing
– Fall: Alan W Black
– Practical Systems for Speech
- 11-711 Algorithms and NLP
– Fall: Yulia Tsvetkov, Robert Frederking
– Research oriented
- 11-727 Computational Semantics
– Spring: Ed Hovy, Teruko Mitamura
- 11-747 Neural Networks for NLP
– Spring: Graham Neubig
- 11-830 Computational Ethics for NLP
– Spring: Yulia Tsvetkov, Alan W Black
- 11-777 Advanced Multimodal ML
– Fall: Louis-Philippe Morency
– Visual, Gesture, Speech
- Most Neural Net Classes