Part-of-Speech Tagging
Informatics 2A: Lecture 16
Shay Cohen
School of Informatics, University of Edinburgh
29 October 2015
Last class
We discussed the POS tag lexicon.
When do words belong to the same class? Three criteria.
What
1. Have you read ‘The Wind in the Willows’? (noun)
2. The clock has stopped. Please wind it up. (verb)
3. The students tried to protest. (verb)
4. The students’ protest was successful. (noun)
1. 14 tokens, 6 types
2. 14 tokens, 7 types
3. 14 tokens, 8 types
4. None of the above.
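The distinction between tokens (running words) and types (distinct words) can be checked mechanically. The sketch below is only illustrative: the example sentence is made up (it is not the sentence from the quiz), and splitting on whitespace with case-folding is an assumed tokenisation.

```python
def count_tokens_and_types(text: str) -> tuple[int, int]:
    """Return (number of tokens, number of types) for a whitespace-tokenised text."""
    # Assumption: tokens are whitespace-separated and compared case-insensitively.
    tokens = text.lower().split()
    return len(tokens), len(set(tokens))


# Hypothetical example sentence, chosen only to show repeated words.
sentence = "the cat sat on the mat and the dog sat too"
n_tokens, n_types = count_tokens_and_types(sentence)
print(f"{n_tokens} tokens, {n_types} types")
```

Repeated words such as "the" and "sat" add to the token count but contribute only one type each.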
1. $|T|$ tag sequences
2. $n$ tag sequences
3. $|T| \times n$ tag sequences
4. $|T|^n$ tag sequences
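To get a feel for how quickly the space of candidate tag sequences grows, one can enumerate it by brute force for a tiny case. This is a sketch under assumptions: the three-tag set and the sentence length are hypothetical, and every word is allowed every tag.

```python
from itertools import product


def all_tag_sequences(tags, n):
    """Enumerate every way of assigning one tag from `tags` to each of n words."""
    return list(product(tags, repeat=n))


tags = ["N", "V", "D"]  # hypothetical tagset T
n = 4                   # hypothetical sentence length
sequences = all_tag_sequences(tags, n)

# Each of the n positions independently takes one of |T| tags.
print(len(sequences), len(tags) ** n)
```

Because the count is exponential in the sentence length, exhaustive enumeration is not practical for realistic tagsets, which is the motivation for a dynamic-programming approach.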
$$\max_{T'} \; \mathrm{Table}(T', i) \times p(T \mid T') \times p(w_{i+1} \mid T)$$
1. Create probability matrix, with one column for each
2. We proceed by filling cells, column by column.
3. The entry in column i, row j will be the probability of the
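The column-by-column table filling can be sketched as follows. This is a minimal illustration, not the lecture's own code: it assumes the recurrence takes a maximum over previous tags T′ (the Viterbi algorithm), that there is one column per word and one row per tag, and that the function name `viterbi` and the toy probabilities are invented for the example.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return (best tag sequence, its probability) for a non-empty word list.

    start_p[t]      : probability that the first tag is t
    trans_p[tp][t]  : p(T | T'), probability of tag t following tag tp
    emit_p[t][w]    : p(w | T), probability of word w given tag t
    """
    # table[i][t]: probability of the best tag sequence for words[0..i] ending in t.
    table = [{t: start_p[t] * emit_p[t].get(words[0], 0.0) for t in tags}]
    back = [{}]  # back[i][t]: the previous tag on that best sequence

    for i in range(1, len(words)):
        column, pointers = {}, {}
        for t in tags:
            # max over previous tags T' of Table(T', i-1) * p(T | T')
            prev, score = max(
                ((tp, table[i - 1][tp] * trans_p[tp][t]) for tp in tags),
                key=lambda pair: pair[1],
            )
            column[t] = score * emit_p[t].get(words[i], 0.0)
            pointers[t] = prev
        table.append(column)
        back.append(pointers)

    # Follow back-pointers from the best tag in the final column.
    last = max(tags, key=lambda t: table[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path, table[-1][last]


# Toy model with made-up probabilities (assumptions, not estimated from data).
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"students": 0.5, "protest": 0.1}, "V": {"protest": 0.3}}

best_path, best_prob = viterbi(["students", "protest"], tags, start_p, trans_p, emit_p)
print(best_path, best_prob)
```

Each cell needs only the previous column, so the work is proportional to the number of words times the square of the number of tags, rather than growing exponentially with sentence length.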
1. Assign each token all its possible tags.
2. Apply rules that eliminate all tags for a token that are
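The two steps above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the lexicon, the tag names, and the two toy elimination rules are hypothetical, and the sketch assumes that rules discard tags based on the neighbouring words, which the text above does not spell out.

```python
# Hypothetical lexicon listing every tag each word can take.
LEXICON = {
    "the": {"DET"},
    "students": {"N"},
    "tried": {"V"},
    "to": {"TO"},
    "protest": {"N", "V"},
}


def assign_all_tags(tokens):
    """Step 1: give each token the full set of tags the lexicon allows."""
    return [set(LEXICON.get(tok, {"UNK"})) for tok in tokens]


def rule_no_noun_after_to(tokens, candidates, i):
    """Toy rule: directly after 'to', drop the noun reading if another remains."""
    if i > 0 and tokens[i - 1] == "to" and len(candidates[i]) > 1:
        candidates[i].discard("N")


def rule_no_verb_after_det(tokens, candidates, i):
    """Toy rule: directly after a determiner, drop the verb reading if another remains."""
    if i > 0 and candidates[i - 1] == {"DET"} and len(candidates[i]) > 1:
        candidates[i].discard("V")


def constrain(tokens, rules):
    """Step 2: apply each elimination rule at every position."""
    candidates = assign_all_tags(tokens)
    for rule in rules:
        for i in range(len(tokens)):
            rule(tokens, candidates, i)
    return candidates


RULES = [rule_no_noun_after_to, rule_no_verb_after_det]
print(constrain("the students tried to protest".split(), RULES))
print(constrain("the protest".split(), RULES))
```

The rules never remove a token's last remaining tag, so every token keeps at least one reading even when no rule is decisive.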