Part-of-Speech Tagging
Berlin Chen 2003
References:
- 1. Speech and Language Processing, chapter 8
- 2. Foundations of Statistical Natural Language Processing, chapter 10
Review: Tagging (part-of-speech tagging)
The/AT representative/NN put/VBD chairs/NNS on/IN the/AT table/NN
The/DT grand/JJ jury/NN commented/VBD on/IN a/DT number/NN of/IN other/JJ topics/NNS ./.
Many words are ambiguous. (However, the probabilities of the tags associated with a word are not equal, so many ambiguous tokens are easy to disambiguate.)
Book/VB that/DT flight/NN ./.
Does/VBZ that/DT flight/NN serve/VB dinner/NN ?/.
An example of the ENGTWOL tagger (PCP2 = past participle):

Pavlov had shown that salivation …
Pavlov      PAVLOV N NOM SG PROPER
had         HAVE V PAST VFIN SVO
            HAVE PCP2 SVO
shown       SHOW PCP2 SVOO SVO SV
that        ADV
            PRON DEM SG
            DET CENTRAL DEM SG
            CS
salivation  N NOM SG

A set of 1,100 constraints can then be applied to the input sentence to rule out impossible readings.
Example constraint (adverbial "that"): keep only the ADV reading of "that" when the next word is an adjective (A), adverb (ADV), or quantifier/numeral (NUM) and no preceding verb licenses an adjective complement (e.g. "it isn't that odd").
N-gram HMM tagger: choose the most likely tag for each word given the previous tag,

\hat{t}_i = \arg\max_j P(t^j \mid t_{i-1})\, P(w_i \mid t^j)

(tag sequence probability × word/lexical likelihood)
Choosing the best tag for word i given the previous tag:

t_i = \arg\max_j P(t^j \mid t_{i-1}, w_i) = \arg\max_j P(t^j \mid t_{i-1})\, P(w_i \mid t^j)

tag sequence probability × word/lexical likelihood
By Bayes' rule,

P(t^j \mid t_{i-1}, w_i) = \frac{P(w_i \mid t^j, t_{i-1})\, P(t^j \mid t_{i-1})\, P(t_{i-1})}{P(w_i \mid t_{i-1})\, P(t_{i-1})}

The denominator is the same for all tags t^j, and the probability of a word is assumed to depend only on its own tag, P(w_i \mid t^j, t_{i-1}) \approx P(w_i \mid t^j); hence

\arg\max_j P(t^j \mid t_{i-1}, w_i) = \arg\max_j P(w_i \mid t^j)\, P(t^j \mid t_{i-1})
Pretend that the previous word has already been tagged. For "to/TO race/??", compare:
P(VB|TO) P(race|VB) = 0.34 × 0.00003 ≈ 0.0000102
P(NN|TO) P(race|NN) = 0.021 × 0.00041 ≈ 0.0000086
so race is tagged VB.
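Assuming these four figures are the textbook estimates P(VB|TO) = 0.34, P(race|VB) = 0.00003, P(NN|TO) = 0.021, P(race|NN) = 0.00041, the bigram comparison can be sketched as:

```python
# Hedged sketch: compare the two bigram-HMM scores for "race" after "to/TO".
# The four probabilities are the textbook estimates, taken as assumptions here.
p_tag_given_prev = {"VB": 0.34, "NN": 0.021}        # P(tag | TO)
p_word_given_tag = {"VB": 0.00003, "NN": 0.00041}   # P("race" | tag)

scores = {t: p_tag_given_prev[t] * p_word_given_tag[t] for t in ("VB", "NN")}
best = max(scores, key=scores.get)
print(scores)   # VB ≈ 1.02e-05, NN ≈ 8.61e-06
print(best)     # VB: "race" is tagged as a verb
```

The word likelihoods alone would favor NN; it is the transition probability P(tag|TO) that tips the decision to VB.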
For a whole sentence, the best tag sequence is

\hat{T} = \arg\max_{t_1, t_2, \dots, t_n} P(t_1^n \mid w_1^n) = \arg\max_{t_1, t_2, \dots, t_n} \frac{P(w_1^n \mid t_1^n)\, P(t_1^n)}{P(w_1^n)} = \arg\max_{t_1, t_2, \dots, t_n} P(w_1^n \mid t_1^n)\, P(t_1^n)

since the probability of the word sequence, P(w_1^n), is the same for all tag sequences.
[Figure: Viterbi trellis for the bigram tagger: tag states t^1 … t^j … t^J at each word position w_1, w_2, …, w_{n-1}, w_n; each cell holds the best path score \pi_i(j), computed by a MAX over the previous column.]
Initialization: \pi_1(j) = \pi(t^j)\, P(w_1 \mid t^j), \quad 1 \le j \le J

Recursion: \pi_{i+1}(j) = \max_{1 \le k \le J} \left[ \pi_i(k)\, P(t^j \mid t^k) \right] P(w_{i+1} \mid t^j), \quad 1 \le i \le n-1, \; 1 \le j \le J
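The Viterbi recursion can be sketched as a small bigram decoder (a minimal sketch; the toy probabilities for the to/race example are assumptions, and unseen events are backed off to a tiny floor):

```python
from math import log

FLOOR = 1e-12  # back-off for unseen events, just to keep log() defined

def viterbi(words, tags, pi, trans, emit):
    """Bigram Viterbi: pi[t] = P(t at position 1), trans[(prev, t)] = P(t | prev),
    emit[(w, t)] = P(w | t). Returns the best tag sequence."""
    # column i maps tag -> (best log score ending in that tag, best previous tag)
    cols = [{t: (log(pi.get(t, FLOOR)) + log(emit.get((words[0], t), FLOOR)), None)
             for t in tags}]
    for w in words[1:]:
        prev_col, col = cols[-1], {}
        for t in tags:
            best_prev = max(tags,
                            key=lambda p: prev_col[p][0] + log(trans.get((p, t), FLOOR)))
            score = prev_col[best_prev][0] + log(trans.get((best_prev, t), FLOOR))
            col[t] = (score + log(emit.get((w, t), FLOOR)), best_prev)
        cols.append(col)
    # backtrace from the best final state
    tag = max(cols[-1], key=lambda t: cols[-1][t][0])
    path = [tag]
    for col in reversed(cols[1:]):
        tag = col[tag][1]
        path.append(tag)
    return list(reversed(path))

# Toy run on "to race" with the textbook-style numbers (assumed):
tags = ["TO", "VB", "NN"]
pi = {"TO": 1.0}
trans = {("TO", "VB"): 0.34, ("TO", "NN"): 0.021}
emit = {("to", "TO"): 1.0, ("race", "VB"): 0.00003, ("race", "NN"): 0.00041}
print(viterbi(["to", "race"], tags, pi, trans, emit))  # ['TO', 'VB']
```

Log probabilities are used so that long products do not underflow; the backtrace recovers the argmax path stored in each cell.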
Trigram tagger:

\hat{T} = \arg\max_{t_1, t_2, \dots, t_n} \left[ P(t_1)\, P(t_2 \mid t_1) \prod_{i=3}^{n} P(t_i \mid t_{i-2}, t_{i-1}) \right] \prod_{i=1}^{n} P(w_i \mid t_i)

The trigram estimates are sparse, so smoothing is needed! For example, linear interpolation:

P(t_i \mid t_{i-2}, t_{i-1}) = \lambda_3 \hat{P}(t_i \mid t_{i-2}, t_{i-1}) + \lambda_2 \hat{P}(t_i \mid t_{i-1}) + \lambda_1 \hat{P}(t_i)
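The interpolation above can be sketched with maximum-likelihood estimates from tag counts (all counts and weights below are assumed toy values, not corpus figures):

```python
from collections import Counter

# Assumed toy tag counts; in practice these come from a tagged training corpus.
uni = Counter({"NN": 60, "VB": 30, "DT": 10})
bi = Counter({("DT", "NN"): 9, ("TO", "VB"): 5, ("TO", "NN"): 1})
tri = Counter({("VBN", "TO", "VB"): 4, ("VBN", "TO", "NN"): 1})

def interp_prob(t, t2, t1, lambdas=(0.6, 0.3, 0.1)):
    """P(t | t2, t1) = l3*ML trigram + l2*ML bigram + l1*ML unigram,
    where t2 = t_{i-2} and t1 = t_{i-1}."""
    l3, l2, l1 = lambdas
    p_tri = tri[(t2, t1, t)] / max(sum(c for k, c in tri.items() if k[:2] == (t2, t1)), 1)
    p_bi = bi[(t1, t)] / max(sum(c for k, c in bi.items() if k[0] == t1), 1)
    p_uni = uni[t] / sum(uni.values())
    return l3 * p_tri + l2 * p_bi + l1 * p_uni

print(interp_prob("VB", "VBN", "TO"))  # 0.6*0.8 + 0.3*(5/6) + 0.1*0.3 = 0.76
```

Even when the trigram (t2, t1, t) was never seen, the bigram and unigram terms keep the probability non-zero.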
[Figure: trellis for the trigram tagger: each word position w_1 … w_n now holds J copies of the tag states, one per previous-tag history t^1 … t^J; the MAX is taken over (tag, history) pairs.]
So race will initially be coded as NN (label every word with its most likely tag): P(NN|race) = 0.98, P(VB|race) = 0.02.
Referring to the correct tag information for each word, we find that the tag of race in "1" is wrong. Learn/pick the most suitable transformation rule (by examining every possible transformation): change NN to VB when the previous tag is TO.
Rewrite rule: expected/VBN to/TO race/NN → expected/VBN to/TO race/VB
Rules learned by Brill's original tagger; Brill's templates each begin with "Change tag a to tag b when …".
(MD = modal verb (should, can, …); VBN = verb, past participle; VBZ = verb, 3sg present)
The GET_BEST_INSTANCE procedure in the example algorithm instantiates the template "change tag from X to Y if the previous tag is Z" for all combinations of X, Y, and Z, and returns the best-scoring instance of each transformation.
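A minimal sketch of that search for the single template "change X to Y if the previous tag is Z" (toy data; the rule score is errors fixed minus correct tags broken, a common scoring choice assumed here):

```python
from collections import Counter

def get_best_instance(current, gold):
    """Score every (from_tag, to_tag, prev_tag) instance of the template and
    return the one with the best net gain (fixed minus broken)."""
    gain, loss = Counter(), Counter()
    candidates = set()
    for i in range(1, len(current)):
        a, g, z = current[i], gold[i], current[i - 1]
        if a != g:                      # applying (a -> g | prev=z) fixes this site
            candidates.add((a, g, z))
            gain[(a, g, z)] += 1
    for i in range(1, len(current)):
        a, g, z = current[i], gold[i], current[i - 1]
        if a == g:                      # a matching rule here would break a correct tag
            for (ca, cb, cz) in candidates:
                if ca == a and cz == z:
                    loss[(ca, cb, cz)] += 1
    return max(candidates, key=lambda r: gain[r] - loss[r]) if candidates else None

# "expected to race": initial most-likely tags vs. the gold tags (toy data)
current = ["VBN", "TO", "NN"]
gold = ["VBN", "TO", "VB"]
print(get_best_instance(current, gold))  # ('NN', 'VB', 'TO')
```

The full learner would apply the winning rule, re-tag the corpus, and repeat until no rule yields a positive net gain.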
Tokenization issues: would/MD n't/RB; Children/NNS 's/POS; multiword expressions such as "in terms of" (in/II31 terms/II32 of/II33).
Unknown words are most likely to be nouns or verbs. The lexical likelihood P(w_i \mid t_i) of an unknown word can be estimated from word-shape features, e.g.:
- the first N letters of the word (prefix)
- the last N letters of the word (suffix)
United, States, of, America → "United States of America"
secondary, education → "secondary education"
book publishing vs. publishing of books
Class-based N-gram model: P(c_n \mid c_{n-N+1}, \dots, c_{n-1}), with relative-frequency estimates

P(w_i \mid c_j) = \frac{C(w_i, c_j)}{C(c_j)}, \qquad P(c_j \mid c_k) = \frac{C(c_k\, c_j)}{\sum_l C(c_k\, c_l)}

Constraints: a word may belong to more than one category.
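A sketch of the class-based score P(w_i | c_j) P(c_j | c_k) with relative-frequency estimates (all counts and the class names "Na"/"Nb" are assumed toy values):

```python
def class_bigram_score(w, c, prev_c, word_in_class, class_count, class_bigram):
    """P(w | c) * P(c | prev_c), both as relative frequencies.
    word_in_class[(w, c)] = C(w tagged c); class_count[c] = C(c);
    class_bigram[(prev, c)] = C(prev c)."""
    p_w_given_c = word_in_class[(w, c)] / class_count[c]
    follows = sum(n for (p, _), n in class_bigram.items() if p == prev_c)
    p_c_given_prev = class_bigram[(prev_c, c)] / follows
    return p_w_given_c * p_c_given_prev

word_in_class = {("院長", "Nb"): 8}        # hypothetical word/class counts
class_count = {"Nb": 40, "Na": 100}
class_bigram = {("Na", "Nb"): 10, ("Na", "Na"): 30}
print(class_bigram_score("院長", "Nb", "Na", word_in_class, class_count, class_bigram))
# (8/40) * (10/40) = 0.05
```

Because probabilities are shared across all words in a class, the model needs far fewer parameters than a word-level N-gram.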
[Example lattice for Chinese phonetic-to-character conversion: the syllable sequence ㄒㄧㄥ ㄓㄥ ㄩㄢ ㄩㄢ ㄓㄤ ㄐㄩㄝ ㄉㄧㄥ ㄈㄟ ㄏㄜ ㄙ produces competing character candidates (興/行, 鄭/政, 院/園, 漲/長, 決/覺/掘, 定/訂, 廢/非/費, 和/核/合, 賜/四) and word candidates (行政院, 政院, 興政, 行政, 院長, 園長, 決定, 確定, 訂費, 合適, …), from which the best word sequence is selected.]