Towards a Computational Semantic Analyzer for Urdu
Annette Hautli Miriam Butt
Department of Linguistics, University of Konstanz
9th Workshop on Asian Linguistic Resources, IJCNLP ’11
◮ Treebank-based PCFG parser (Abbas, 2002)
◮ Urdu dependency parser trained with MaltParser (Ali and Hussain,
◮ Urdu ParGram grammar based on LFG (Butt and King 2004,
◮ Emille corpus (Baker et al., 2004)
◮ “Experiences in Building Urdu Wordnet” (Adeeba and Hussain, 2011)
◮ Urdu WordNet based on Hindi WordNet (Ahmed and Hautli, 2009)
◮ Automatic collection of Urdu multiwords (Hautli and Sulger, 2011)
◮ Development of a lexical resource for Urdu verbs
CS 1:
  [ROOT [Sadj [S
    [KP [NP [PRON us]] [K nE]]
    [KP [NP [N t3ul AbEb]] [K mEN]]
    [KP [NP [N sEb]]]
    [VCmain [V kHAyA]]]]]

"us nE t3ul AbEb mEN sEb kHAyA"
  PRED         'kHA<[1:vuh], [26:sEb]>'
  SUBJ         1 [ PRED 'vuh'
                   NTYPE [ NSYN pronoun ]
                   CASE erg, NUM sg, PERS 3 ]
  OBJ          26 [ PRED 'sEb'
                    NTYPE [ NSEM [ COMMON count ], NSYN common ]
                    CASE nom, GEND masc, NUM sg, PERS 3 ]
  ADJUNCT      7 [ PRED 't3ul AbEb'
                   NTYPE [ NSEM [ PROPER [ PROPER-TYPE location ] ], NSYN proper ]
                   SEM-PROP [ SPECIFIC + ]
                   CASE loc, NUM sg, PERS 3 ]
  LEX-SEM      [ AGENTIVE + ]
  TNS-ASP      [ ASPECT perf, MOOD indicative ]
  CLAUSE-TYPE decl, PASSIVE -, VTYPE main
58
◮ SUBJ(%1,%2) ==> subj(%1,%2).
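The rule above rewrites a fact SUBJ(%1,%2) as subj(%1,%2), with %1 and %2 standing for whatever arguments the matched fact carries. A minimal Python sketch of this kind of rewriting follows. It is not the XLE transfer system itself: the tuple encoding of facts, the function name apply_rule and the sample facts are assumptions made for illustration only.

```python
# Facts are (predicate, arg1, arg2) tuples. This encoding is an assumption,
# not the format used by the xfr component.

def apply_rule(facts, lhs, rhs):
    """Rewrite every fact whose predicate is `lhs` to predicate `rhs`,
    carrying both arguments over unchanged (the %1 and %2 of the rule)."""
    return [(rhs, a, b) if pred == lhs else (pred, a, b)
            for pred, a, b in facts]

# Hypothetical input, loosely modelled on the f-structure for
# "us nE t3ul AbEb mEN sEb kHAyA".
facts = [("SUBJ", "kHA", "vuh"), ("OBJ", "kHA", "sEb")]

# Equivalent in spirit to: SUBJ(%1,%2) ==> subj(%1,%2).
rewritten = apply_rule(facts, "SUBJ", "subj")
print(rewritten)
```

Facts that do not match the left-hand side pass through untouched, so rules of this shape can be applied one after another.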
◮ Assignment of thematic roles to the grammatical functions
  ◮ kHA ‘to eat’: subj → Agent
◮ VerbNet information is stored in a database which can be accessed by
◮ The xfr rules replace the grammatical functions with the thematic
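The role assignment described above can be sketched in Python as a lookup followed by a replacement. This is a sketch under stated assumptions, not the actual xfr rules or database interface: the dictionary VERB_ROLES is a hypothetical stand-in for the VerbNet-style database, and only the entry kHA: subj → Agent comes from the slides. Grammatical functions without a known role are deliberately left unchanged.

```python
# Hypothetical stand-in for the VerbNet-style database.
# Only the kHA entry (subj -> Agent) is taken from the slides.
VERB_ROLES = {"kHA": {"subj": "Agent"}}

def assign_roles(facts, verb_roles=VERB_ROLES):
    """Replace a grammatical function with a thematic role when the
    verb's entry lists one; otherwise keep the fact as it is."""
    out = []
    for func, verb, arg in facts:
        role = verb_roles.get(verb, {}).get(func)
        out.append((role if role else func, verb, arg))
    return out

# Hypothetical facts in an assumed (function, verb, argument) encoding.
facts = [("subj", "kHA", "vuh"), ("obj", "kHA", "sEb")]
roles = assign_roles(facts)
print(roles)
```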
CS 1:
  [ROOT [Sadj [S
    [KP [NP [PRON vuh]]]
    [KP [NP [N t3ul AbEb]] [K mEN]]
    [VCmain [V kHA] [Vmod pAyA]]]]]

"vuh t3ul AbEb mEN sEb kHA pAyA"
  PRED         'pA<[56:kHA]>[1:vuh]'
  SUBJ         1 [ PRED 'vuh'
                   NTYPE [ NSYN pronoun ]
                   CASE nom, GEND masc, NUM sg, PERS 3 ]
  XCOMP        56 [ PRED 'kHA<[1:vuh], [24:sEb]>'
                    SUBJ [1:vuh]
                    OBJ  24 [ PRED 'sEb'
                              NTYPE [ NSEM [ COMMON count ], NSYN common ]
                              CASE nom, GEND masc, NUM sg, PERS 3 ]
                    LEX-SEM [ AGENTIVE + ] ]
  ADJUNCT      5 [ PRED 't3ul AbEb'
                   NTYPE [ NSEM [ PROPER [ PROPER-TYPE location ] ], NSYN proper ]
                   SEM-PROP [ SPECIFIC + ]
                   CASE loc, NUM sg, PERS 3 ]
  LEX-SEM
  TNS-ASP      [ ASPECT perf, MOOD indicative ]
  CLAUSE-TYPE decl, VTYPE main
74
◮ Hindi TreeBank could provide some semantic information
◮ Lexical information from WordNet
◮ Verb frames from a verb resource
◮ spatial expressions
◮ modality constructions