Constituency-based Hyponymy Extraction
COMP 762
Chianyu Liu, 260576898
Hyponym and Hypernym
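- Describes a “type-of” relationship: the hyponym is the more specific term, the hypernym the more general one
- E.g. “dog” is a hyponym of “animal”, and “animal” is a hypernym of “dog”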
- Meronym: describes a part-whole relationship
- Homonym: a word that has two unrelated meanings
- Polyseme: a word with two related meanings
Constituency-based parse tree
- Ordered, rooted tree that represents the syntactic structure of a text
- Breaks a sentence, S, into sub-phrases (e.g. NP, VP, PP)
  ○ Nodes are phrases
  ○ Leaves are words
  ○ Edges are unlabeled
- Vs. dependency-based parse tree: nodes are words, and labeled edges link each word to its head
- http://nlp.stanford.edu:8080/parser/index.jsp
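A constituency parse is commonly written as a bracketed string. The sketch below reads such a string into nested tuples and lists its words and phrase labels. The sentence and its bracketing are hand-written for illustration, not output from the parser linked above, and the function names (`parse_brackets`, `leaves`, `phrase_labels`) are hypothetical.

```python
def parse_brackets(text):
    """Parse a Penn-Treebank-style bracketed string into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()


def leaves(tree):
    """Return the words of the sentence in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


def phrase_labels(tree):
    """Return labels of nodes that dominate other nodes (skips part-of-speech tags)."""
    label, children = tree
    labels = [label] if any(not isinstance(c, str) for c in children) else []
    for child in children:
        if not isinstance(child, str):
            labels.extend(phrase_labels(child))
    return labels


# Hand-written bracketing (an assumption for illustration).
TREE = parse_brackets("(S (NP (DT The) (NN dog)) (VP (VBZ barks)))")
print(leaves(TREE))
print(phrase_labels(TREE))
```

The order of children is preserved, which is what makes the tree "ordered": reading the leaves left to right recovers the original sentence.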
Tregex
- A utility for identifying patterns in trees
- Like regular expressions for strings
- Uses symbols to denote relations between nodes
  ○ A < B: A is the parent of B
  ○ A << B: A is an ancestor of B
  ○ A $ B: A and B are siblings
Tregex Example
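The parent, ancestor, and sibling relations can be sketched as plain Python predicates on a small hand-built tree. This is not Tregex itself, which matches patterns against node labels and returns the matching subtrees; the `Node` class and the function names are hypothetical and only illustrate what each symbol tests.

```python
class Node:
    """A tree node that remembers its parent so relations are easy to check."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def is_parent(a, b):
    """A < B: A is the parent of B."""
    return b.parent is a


def is_ancestor(a, b):
    """A << B: A is an ancestor of B (parent, grandparent, and so on)."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def are_siblings(a, b):
    """A $ B: A and B are distinct nodes sharing the same parent."""
    return a is not b and a.parent is not None and a.parent is b.parent


# Hand-built tree for (S (NP DT NN) (VP VBZ)); an assumption for illustration.
dt, nn, vbz = Node("DT"), Node("NN"), Node("VBZ")
np_node = Node("NP", [dt, nn])
vp_node = Node("VP", [vbz])
s_node = Node("S", [np_node, vp_node])

print(is_parent(np_node, nn))        # NP < NN
print(is_ancestor(s_node, nn))       # S << NN
print(are_siblings(np_node, vp_node))  # NP $ VP
```

Note that every parent is also an ancestor, so `A < B` implies `A << B`, but not the other way around.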
Pattern Matching
- Hyponymy pattern categories defined in the paper (Evans et al., 2017)
| Pattern | Description | Example |
|---------|-------------|---------|
| HKO | Hypernym -> Keywords -> Hyponym | …, such as … |
| OKH | Hyponym -> Keywords -> Hypernym | … are considered as … |
| HO | Hypernym -> Hyponym | Section header |
| KHO | Keywords -> Hypernym -> Hyponym | Following types of … |
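To make the HKO category concrete, here is a rough sketch that matches the "such as" keyword with an ordinary regular expression on raw text. This is a swapped-in simplification: the paper works on constituency parse trees, whereas this sketch has no parser and simply assumes the hypernym is the one or two words directly before the keyword. The names `HKO` and `extract_hko` and the sample sentence are hypothetical.

```python
import re

# Assumption: the hypernym is the one or two words right before "such as".
# A constituency parser would instead supply the full noun phrase.
HKO = re.compile(r"(?P<hypernym>\w+(?: \w+)?),? such as (?P<hyponyms>[^.;]+)")

# Splits a list like "a, b, and c" or "a and b" into its items.
LIST_SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_hko(sentence):
    """Return (hyponym, hypernym) pairs for a 'hypernym, such as hyponyms' sentence."""
    match = HKO.search(sentence)
    if match is None:
        return []
    hypernym = match.group("hypernym").strip()
    items = LIST_SEPARATOR.split(match.group("hyponyms"))
    return [(item.strip(), hypernym) for item in items if item.strip()]


pairs = extract_hko(
    "We collect contact information, such as email address, "
    "phone number, and postal address."
)
for hyponym, hypernym in pairs:
    print(f"{hyponym} is a type of {hypernym}")
```

The fixed two-word window is the weak point of this sketch: a longer hypernym would be cut short, which is the kind of boundary problem that phrase nodes in a parse tree resolve.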
WordNet
- A large lexical database of English
- Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets)
- Synsets are interlinked by means of conceptual-semantic and lexical relations
- The most frequently encoded relation among synsets is the
super-subordinate relation (i.e. hypernym and hyponym)
- http://wordnetweb.princeton.edu/perl/webwn
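The super-subordinate relation forms a chain that can be followed upward from a specific term to more general ones. The sketch below uses a tiny hand-written dictionary as a stand-in for WordNet; the entries, the single-parent simplification, and the function names are assumptions for illustration and do not reproduce actual WordNet data or its API.

```python
# Toy stand-in for hypernym links (hand-written, not real WordNet data).
# Assumption: each word has at most one hypernym, which keeps the walk simple.
HYPERNYM_OF = {
    "poodle": "dog",
    "dog": "canine",
    "canine": "mammal",
    "mammal": "animal",
}


def hypernym_chain(word):
    """Follow hypernym links upward and return every more general term."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain


def is_hyponym_of(word, candidate):
    """True if `candidate` appears anywhere above `word` in the chain."""
    return candidate in hypernym_chain(word)


print(hypernym_chain("dog"))
print(is_hyponym_of("poodle", "animal"))
```

Because the relation is transitive, a term is a hyponym of every term above it in the chain, not only of its immediate hypernym.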
Ontology
- A specification of a conceptualization
- Describes the representation and relationships that can exist for entities
(objects, properties, etc.) in a particular domain
- An ontology is a standard way to represent knowledge, and enables that knowledge to be
shared and reused
References
- G. Andrew, “The Wonderful World of Tregex,” in Stanford: Natural Language Processing.
- M. C. Evans, J. Bhatia, S. Wadkar, and T. D. Breaux, “An Evaluation of Constituency-Based Hyponymy Extraction from Privacy
Policies,” 2017 IEEE 25th International Requirements Engineering Conference (RE), 2017.
- Princeton University, “What is WordNet?,” 17-Mar-2015. [Online]. Available: https://wordnet.princeton.edu/.
- S. Gole, “Part of speech tagging using OpenNLP,” Sager Gole's Blog, 18-Jun-2015.