SI425 : NLP
Set 10 Syntax and Parsing
Fall 2020 : Chambers
Syntax
Grammar, or syntax: the kind of implicit knowledge of your native language that you had mastered by the time you were 3 years old.
Not the kind of stuff you were later
2
3
“I saw the man on the hill with a telescope.”
4
5
Linguists like to argue
Transformational syntax, X-bar theory, principles and parameters, government and binding, GPSG, HPSG, LFG, relational grammar, minimalism... and on and on.
6
Why should you care?
7
8
interjection, pronoun, conjunction, etc.
number, nature, and universality of these
9
N    noun         chair, bandwidth, pacing
V    verb         study, debate, munch
ADJ  adjective    purple, tall, ridiculous
ADV  adverb       unfortunately, slowly
P    preposition
PRO  pronoun      I, me, mine
DET  determiner   the, a, that, those
10
marker to each word in a collection.
word   tag
the    DET
koala  N
put    V
the    DET
keys   N
…      P
the    DET
table  N
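A tagged sentence can be stored as a list of (word, tag) pairs. The sketch below is a minimal lookup tagger, not a method from the slides. The `LEXICON` dictionary, the `tag` function, and the noun fallback for unknown words are all assumptions made for illustration.

```python
# Hypothetical toy lexicon: one tag per word, using the tag names from the list above.
LEXICON = {"the": "DET", "koala": "N", "put": "V", "keys": "N", "table": "N"}


def tag(words, lexicon=LEXICON, default="N"):
    """Return (word, tag) pairs; unknown words get the assumed `default` tag."""
    return [(w, lexicon.get(w.lower(), default)) for w in words]


tagged = tag("the koala put the keys".split())
for word, pos in tagged:
    print(f"{word}\t{pos}")
```

A one-tag-per-word lookup like this cannot handle words that take different tags in different contexts.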
11
He will refuse to lead. There is lead in the refuse.
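Both "refuse" and "lead" can be a noun or a verb, so a tagger has to choose among several candidate tag sequences. The sketch below enumerates those candidates for the two sentences above. The lexicon is hypothetical, and the single tags given to the other words ("will", "to", "there") are a deliberate simplification to fit the small tag list.

```python
from itertools import product

# Hypothetical lexicon: each word maps to the set of tags it may take.
LEXICON = {
    "he": {"PRO"}, "will": {"V"}, "refuse": {"N", "V"}, "to": {"P"},
    "lead": {"N", "V"}, "there": {"ADV"}, "is": {"V"}, "in": {"P"},
    "the": {"DET"},
}


def tag_sequences(sentence):
    """Return every way of tagging the sentence that the lexicon allows."""
    words = sentence.lower().rstrip(".").split()
    options = [sorted(LEXICON[w]) for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]


for s in ["He will refuse to lead.", "There is lead in the refuse."]:
    print(s, "->", len(tag_sequences(s)), "candidate taggings")
```

Only one candidate per sentence is the intended reading. Choosing it requires context, not just a dictionary.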
12
play a role in grammar)
with respect to new items
13
Examples:
14
15
Almost all NLPers use these.
16
17
POS tag for a particular instance of a word. This can change the entire parse tree.
These examples from Dekang Lin
18
Label each word with its Part of Speech tag!
(look back 2 slides at the POS tag list for help)
19
Noun Phrase “the big blue ball”
20
to behave in similar ways
21
constituent members?
(follows or precedes)?
22
English...
can all precede verbs.
23
Try some constituency tests!
1. Is this a verb phrase or noun phrase? Why?
2. Is this a verb phrase or noun phrase? Why?
3. Can this be used as an adjective? Why?
24
up with the right set of constituents and the rules that govern how they combine...
correspond to a modern linguistic theory of grammar).
25
So…we’ll make CFG rules for all valid noun phrases.
26
27
number of terminals and non-terminals on the right.
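A context-free rule can be written down as a pair: one non-terminal on the left and a list of symbols on the right. The sketch below derives "the big blue ball" by always rewriting the leftmost non-terminal. The rule set and the helper `derive_leftmost` are illustrative assumptions. Only the rule NP → Det Nominal is taken from these slides.

```python
# Toy NP rules as (left-hand side, right-hand side) pairs.
RULES = [
    ("NP", ["Det", "Nominal"]),       # 0
    ("Nominal", ["Adj", "Nominal"]),  # 1
    ("Nominal", ["Noun"]),            # 2
    ("Det", ["the"]),                 # 3
    ("Adj", ["big"]),                 # 4
    ("Adj", ["blue"]),                # 5
    ("Noun", ["ball"]),               # 6
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def derive_leftmost(start, rule_indices):
    """Apply the given rules, in order, to the leftmost non-terminal."""
    form = [start]
    for idx in rule_indices:
        lhs, rhs = RULES[idx]
        pos = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[pos] != lhs:
            raise ValueError(f"rule {idx} does not rewrite {form[pos]}")
        form[pos:pos + 1] = rhs
    return form


phrase = derive_leftmost("NP", [0, 3, 1, 4, 1, 5, 2, 6])
print(" ".join(phrase))
```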
28
29
30
either analysis or synthesis engines
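Read as a synthesis engine, a grammar generates strings. The sketch below enumerates every sentence of a small grammar. The grammar is a made-up toy and is kept non-recursive on purpose so that the enumeration terminates.

```python
from itertools import product

# Hypothetical toy grammar: each non-terminal maps to its possible right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["koala"], ["table"]],
    "V": [["saw"]],
}


def generate(symbol):
    """Yield every word sequence derivable from `symbol` (non-recursive grammars only)."""
    if symbol not in GRAMMAR:  # a terminal: it is its own expansion
        yield [symbol]
        return
    for rhs in GRAMMAR[symbol]:
        for parts in product(*(list(generate(s)) for s in rhs)):
            yield [word for part in parts for word in part]


sentences = [" ".join(words) for words in generate("S")]
print(len(sentences), "sentences, e.g.:", sentences[0])
```

The same rules can be run in the other direction, as an analysis engine that checks whether a given string is derivable.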
31
that accounts for that string
string
the string
32
grammar and returning parse tree(s) for that string
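A string can have more than one parse tree, as with "I saw the man on the hill with a telescope." The sketch below counts the trees a toy grammar assigns to that sentence. It uses a CKY-style chart, an algorithm these slides do not name, and the binary rules and lexicon are assumptions chosen for illustration. In particular, the pronoun "I" is treated directly as an NP.

```python
from collections import defaultdict

# Hypothetical binary rules: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"], "a": ["Det"],
    "man": ["N"], "hill": ["N"], "telescope": ["N"],
    "on": ["P"], "with": ["P"],
}


def count_parses(words, root="S"):
    """Count the distinct parse trees rooted in `root` that cover all the words."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, label) -> number of trees
    for i, word in enumerate(words):
        for label in LEXICON[word.lower()]:
            chart[i, i + 1, label] += 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    chart[start, end, parent] += (
                        chart[start, mid, left] * chart[mid, end, right]
                    )
    return chart[0, n, root]


sentence = "I saw the man on the hill with a telescope".split()
print(count_parses(sentence), "parse trees")
```

Each extra prepositional phrase adds attachment choices, which is why the count grows so quickly with sentence length.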
33
S → NP VP
S → VP
S → Aux NP VP
S → WH-NP Aux NP VP
34
35
NP → Det Nominal
inside this one rule.
36
37
38
modifiers of the head.
39
among various constituents that take part in a rule or set
in NPs have to agree in their number.
This flight Those flights *This flights *Those flight
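The number agreement in these examples can be checked by storing a number feature on each determiner and noun. The sketch below is one illustrative way to do it. The word lists and the function `is_valid_np` are assumptions, and it only handles two-word noun phrases.

```python
# Hypothetical lexicons recording the number (singular/plural) of each word.
DET = {"this": "sg", "that": "sg", "these": "pl", "those": "pl"}
NOUN = {"flight": "sg", "flights": "pl"}


def is_valid_np(phrase):
    """Accept a 'Det Noun' phrase only if both words agree in number."""
    det, noun = phrase.lower().split()
    return det in DET and noun in NOUN and DET[det] == NOUN[noun]


for np in ["This flight", "Those flights", "This flights", "Those flight"]:
    print(("  " if is_valid_np(np) else "* ") + np)
```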
40
following constituents which we’ll call arguments.
41
rules.
according to the sets of VP rules that they participate in.
transitive/intransitive.
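One way to picture subcategorization is a table listing, for each verb, the argument sequences it allows. The sketch below uses three textbook-style verbs. The entries in `SUBCAT` and the function `licenses` are illustrative assumptions, not data from the slides.

```python
# Hypothetical subcategorization frames: verb -> allowed argument sequences.
SUBCAT = {
    "sneeze": [()],                       # intransitive: no arguments
    "find": [("NP",)],                    # transitive: one NP argument
    "give": [("NP", "NP"), ("NP", "PP")],  # two frames
}


def licenses(verb, args):
    """Return True if the verb allows exactly this sequence of arguments."""
    return tuple(args) in SUBCAT.get(verb, [])


print(licenses("find", ["NP"]))    # the verb takes its object
print(licenses("sneeze", ["NP"]))  # an intransitive verb with an object
```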
42
43
parameters
number, position and types.
44
formally express these facts
45
arguments that don’t go together
a valid NP
46
agreement.
all the verb/VP classes.
47
constraints explodes the number of rules in our grammar.
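The growth is multiplicative: every feature that must agree copies each rule once per feature value. The sketch below counts the copies. It assumes, as a simplification, that every symbol in a rule carries every feature, and the rules and feature values are made up for illustration.

```python
from itertools import product

# Hypothetical base rules before any agreement features are added.
BASE_RULES = [("S", ["NP", "VP"]), ("NP", ["Det", "Nominal"])]


def specialize(rules, features):
    """Make one copy of each rule per combination of feature values."""
    combos = list(product(*features.values()))
    specialized = []
    for lhs, rhs in rules:
        for combo in combos:
            suffix = "_" + "_".join(combo)
            specialized.append((lhs + suffix, [sym + suffix for sym in rhs]))
    return specialized


with_number = specialize(BASE_RULES, {"number": ["sg", "pl"]})
with_person = specialize(BASE_RULES, {"number": ["sg", "pl"], "person": ["1", "2", "3"]})
print(len(BASE_RULES), len(with_number), len(with_person))
```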
48
English.
staying within the CFG framework.
us out of the CFG framework (beyond its formal power)
49
50
implicitly capturing these nasty constraints with probabilities.
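With probabilities attached to rules, a parse tree can be scored as the product of the probabilities of the rules it uses. The sketch below shows that computation. Every probability in it is invented for illustration, and the tree encoding as nested tuples is an assumption.

```python
from math import isclose

# Hypothetical rule probabilities; rules sharing a left-hand side sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.7,
    ("NP", ("NP", "PP")): 0.3,
    ("VP", ("V", "NP")): 0.8,
    ("VP", ("VP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0,
}
LEX_PROB = {
    ("Det", "the"): 0.5, ("Det", "a"): 0.5,
    ("N", "koala"): 0.5, ("N", "table"): 0.5,
    ("V", "saw"): 1.0,
}


def tree_prob(tree):
    """Multiply the probabilities of every rule used in the tree."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return LEX_PROB[(label, children[0])]
    prob = RULE_PROB[(label, tuple(child[0] for child in children))]
    for child in children:
        prob *= tree_prob(child)
    return prob


TREE = ("S",
        ("NP", ("Det", "the"), ("N", "koala")),
        ("VP", ("V", "saw"), ("NP", ("Det", "the"), ("N", "table"))))
print(tree_prob(TREE))
```

When a sentence has several trees, the scores give a way to rank them.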
51
been paired with a parse tree.
necessary.
tagset, and a grammar and instructions for how to deal with particular grammatical constructions.
52
The best-known part is the Wall Street Journal section of the Penn Treebank.
1M words from the 1987-1989 Wall Street Journal.
53
you’ll have a grammar with decent coverage.
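A grammar can be read off a treebank by collecting the rule used at every node of every tree. The sketch below does this for two hand-written bracketed trees. These are not real Penn Treebank data, and they reuse the simple tags from the list above instead of the treebank's own tagset.

```python
from collections import Counter

# Two made-up bracketed trees standing in for a treebank.
TREEBANK = [
    "(S (NP (DET the) (N koala)) (VP (V put) (NP (DET the) (N keys))))",
    "(S (NP (PRO I)) (VP (V saw) (NP (DET the) (N man))))",
]


def parse_brackets(text):
    """Turn a bracketed string into nested (label, children) pairs."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):  # tokens[pos] is "("
        label, pos = tokens[pos + 1], pos + 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = read(pos)
            else:
                child, pos = tokens[pos], pos + 1
            children.append(child)
        return (label, children), pos + 1

    return read(0)[0]


def rules(tree):
    """Yield the (lhs, rhs) rule used at each node of the tree."""
    label, children = tree
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


counts = Counter(r for t in TREEBANK for r in rules(parse_brackets(t)))
for (lhs, rhs), n in counts.most_common(3):
    print(n, lhs, "->", " ".join(rhs))
```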
54
they tend to avoid recursion.
Among them...
55
56
given language
57