“SEMANTICS”
Matt Post IntroHLT class 23 October 2019
Semantic Roles
– Syntax describes the grammatical relationships between words and phrases
– But there are many different ways to express a particular meaning
“scaffolding for meaning”
who did what to whom and when and where and how?
A linguistic hierarchy: pragmatics → semantics → syntax → morphology
– answer the question “who did what to whom etc.”
– store the answer in a machine-usable way

– specifying some representation for meaning
– specifying a representation for word relationships
– mapping the words to these representations
What does “cat” refer to?
– a specific cat?
– all cats?
– the Platonic ideal of a cat?
– the concept of a cat? (“cat” → CAT)
Much of today’s lecture is borrowed from Philipp Koehn: http://www.inf.ed.ac.uk/teaching/courses/emnlp/
– She pays 3% interest on the loan.
– He showed a lot of interest in the painting.
– Microsoft purchased a controlling interest in Google.
– It is in the national interest to invade the Bahamas.
– I only have your best interest in mind.
– Playing chess is one of my interests.
– Business interests lobbied for the legislation.
– What is the relationship among these words?
– {organization, team, group, association, conglomeration, institution, establishment, consortium, federation, agency, coalition, alliance, league, club, confederacy, syndicate, society, corporation}
– organisation?
– {member, part, piece}
– many related meanings (polysemy)
– different meanings (homonymy)
– same / similar meanings (synonymy)
– opposite or contrary meaning (antonymy)
– IS-A(animal, cat)
– HAS-PART(cat, paw)
– IS-PART-OF(paw, cat)
– IS-MEMBER-OF(professor, faculty)
– HAS-MEMBER(faculty, professor)
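Relations like these are just labeled pairs, so they can be stored and queried as triples. A minimal sketch in Python (the `RELATIONS` set and the `related` helper are invented for illustration; the relation names and entities are the slide's own examples):

```python
# The slide's lexical relations stored as (relation, arg1, arg2) triples.
RELATIONS = {
    ("IS-A", "animal", "cat"),
    ("HAS-PART", "cat", "paw"),
    ("IS-PART-OF", "paw", "cat"),
    ("IS-MEMBER-OF", "professor", "faculty"),
    ("HAS-MEMBER", "faculty", "professor"),
}

def related(relation, arg1):
    """All arg2 such that relation(arg1, arg2) is asserted."""
    return {a2 for (r, a1, a2) in RELATIONS if r == relation and a1 == arg1}

print(related("HAS-PART", "cat"))   # {'paw'}
```

A real resource like WordNet stores exactly these kinds of relations (hypernymy, meronymy, membership) at scale.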
solving that is reminiscent of earlier symbolic approaches to AI
Different languages carve up the senses differently. German translations of “interest”:
– Zins: financial charge paid for a loan (Wordnet sense 4)
– Anteil: stake in a company (Wordnet sense 6)
– Interesse: all other senses
Other examples:
– English security, safety, confidence
– French fleuve, rivière
Word sense disambiguation: can we detect the particular sense of a word?
– Map to Wordnet or a foreign word sense
– Map to a real-life instance of the sense

Applications:
– search
– machine translation
WSD as a supervised learning problem
– She pays 3% interest/INTEREST-MONEY on the loan.
– He showed a lot of interest/INTEREST-CURIOSITY in the painting.

– given a corpus tagged with senses
– define features that indicate one sense over another
– learn a model that predicts the correct sense given the features

– Naive Bayes, related to HMM
– Transformation-based learning
– Maximum entropy learning
Philipp Koehn EMNLP Lecture 11 11 February 2008
Simple features
– plant life
– manufacturing plant
– assembly plant
– plant closure
– plant species

– animal
– equipment
– employee
– automatic
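Context-word features like these can be extracted from a window around the target word. A minimal sketch (the function name and the `context=` feature format are invented for illustration):

```python
def context_features(tokens, position, window=2):
    """Bag-of-words features from a +/- `window` context around the target word."""
    lo = max(0, position - window)
    context = tokens[lo:position] + tokens[position + 1:position + 1 + window]
    return {f"context={w.lower()}" for w in context}

tokens = "the manufacturing plant closed last year".split()
print(sorted(context_features(tokens, 2)))
# ['context=closed', 'context=last', 'context=manufacturing', 'context=the']
```

Here “manufacturing” in the context window is exactly the kind of collocation that indicates the PLANT-FACTORY sense.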
More features
Training data for supervised WSD
– Senseval: bi-annual competition on WSD; provides annotated corpora in many languages
– Pseudo-words: create an artificial corpus by conflating words; example: replace all occurrences of banana and door with banana-door
– Parallel corpora: translated texts aligned at the sentence level; the translation indicates the sense
Naive Bayes
Pick the sense S that is most probable given the features F:

    argmax_S p(S | F) = argmax_S p(F | S) p(S)   (1)

Assuming the features are conditionally independent given the sense:

    p(F | S) = ∏_{f_i ∈ F} p(f_i | S)   (2)

Estimate p(S) and p(f_i | S) by maximum likelihood estimation.
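Equations (1) and (2) translate almost line-for-line into code. A toy sketch (the training examples are invented; add-one smoothing is an assumption added here so unseen features do not zero out a sense, since the slide itself only calls for maximum likelihood estimation):

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """examples: list of (feature_set, sense). Collect the counts needed for MLE."""
    sense_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for feats, sense in examples:
        sense_counts[sense] += 1
        for f in feats:
            feat_counts[sense][f] += 1
            vocab.add(f)
    return sense_counts, feat_counts, vocab

def predict_nb(model, feats):
    """argmax_S log p(S) + sum_i log p(f_i | S), with add-one smoothing."""
    sense_counts, feat_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_lp = None, float("-inf")
    for sense, n in sense_counts.items():
        lp = math.log(n / total)                            # log p(S)
        denom = sum(feat_counts[sense].values()) + len(vocab)
        for f in feats:
            lp += math.log((feat_counts[sense][f] + 1) / denom)  # log p(f_i | S)
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

train = [({"manufacturing", "closure"}, "PLANT-FACTORY"),
         ({"species", "life"}, "PLANT-LIVING"),
         ({"assembly", "closure"}, "PLANT-FACTORY"),
         ({"animal", "life"}, "PLANT-LIVING")]
model = train_nb(train)
print(predict_nb(model, {"closure"}))   # PLANT-FACTORY
```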
Decision list
– two senses per word
– rules of the form: collocation → sense
– example: manufacturing plant → PLANT-FACTORY
– rules are ordered, most reliable rules first
– when classifying a test example, step through the list, make a decision on the first rule that applies
Each rule is scored by the log-odds of the two senses given the collocation:

    log( p(sense_A | collocation_i) / p(sense_B | collocation_i) )   (3)

Smoothing is important.
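A sketch of building and applying such a list (the data and function names are invented; add-alpha smoothing of the counts in the log-odds of equation (3) reflects the slide's note that smoothing is important):

```python
import math
from collections import Counter

def build_decision_list(examples, alpha=1.0):
    """examples: list of (collocation, sense) pairs for a word with two senses.
    Rules are ordered by the magnitude of their smoothed log-odds (eq. 3)."""
    counts = Counter(examples)
    collocations = {c for c, _ in counts}
    senses = sorted({s for _, s in counts})
    assert len(senses) == 2
    a, b = senses
    rules = []
    for c in collocations:
        score = math.log((counts[(c, a)] + alpha) / (counts[(c, b)] + alpha))
        rules.append((abs(score), c, a if score > 0 else b))
    rules.sort(reverse=True)                 # most reliable rules first
    return [(c, sense) for _, c, sense in rules]

def classify(rules, collocations, default):
    """Step through the list; decide on the first rule that applies."""
    for c, sense in rules:
        if c in collocations:
            return sense
    return default

data = [("manufacturing plant", "PLANT-FACTORY")] * 5 + \
       [("plant life", "PLANT-LIVING")] * 4 + \
       [("plant closure", "PLANT-FACTORY")] * 2
rules = build_decision_list(data)
print(classify(rules, {"manufacturing plant"}, "PLANT-FACTORY"))   # PLANT-FACTORY
```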
Bootstrapping
– a short decision list
– words from dictionary definition
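The bootstrapping loop can be sketched roughly as follows. This is a heavily simplified, Yarowsky-style sketch, not the slide's algorithm: the propagation rule (adopt every collocation of a context whose seed label is unambiguous) and all data are assumptions for illustration:

```python
def bootstrap(corpus, seed_rules, iterations=10):
    """corpus: list of sets of collocations (unlabeled contexts).
    seed_rules: dict collocation -> sense (the small seed set).
    Repeatedly label contexts with the current rules and adopt the
    other collocations of unambiguously labeled contexts as new rules."""
    rules = dict(seed_rules)
    for _ in range(iterations):
        new_rules = {}
        for context in corpus:
            hits = {rules[c] for c in context if c in rules}
            if len(hits) == 1:               # context labeled unambiguously
                sense = hits.pop()
                for c in context:
                    if c not in rules:
                        new_rules[c] = sense
        if not new_rules:                    # converged
            break
        rules.update(new_rules)
    return rules

corpus = [{"manufacturing", "closure"}, {"closure", "assembly"},
          {"species", "life"}, {"life", "animal"}]
rules = bootstrap(corpus, {"manufacturing": "FACTORY", "species": "LIVING"})
print(rules["assembly"])   # FACTORY
```

The seed rules play the role of the short decision list or dictionary-definition words above; each pass labels more contexts, which in turn yield more rules.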
– Information Extraction (Oct. 28)
– Information Retrieval (Oct. 30)
– Distributional Semantics (Nov. 4)
to the core question of identifying word relationships?
– Input: I saw the bird with the telescope.
– Output: [I]AGENT saw [the bird]THEME with [the telescope]INSTR
– Spans annotated with roles
– Binary decision: is this span a role?
– Categorical decision: what role is it?
– span of words
– correct label
– host of features
path(ate→He) = VB↑VP↑S↓NP
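The path feature walks from the predicate up to the lowest common ancestor and back down to the argument's constituent. A toy sketch over a hand-built tree for "He ate the apple" (the tuple tree encoding and helper names are invented for illustration):

```python
# A toy constituency tree as (label, children) pairs; leaves are words.
TREE = ("S",
        [("NP", [("PRP", ["He"])]),
         ("VP", [("VB", ["ate"]),
                 ("NP", [("DT", ["the"]), ("NN", ["apple"])])])])

def path_to(tree, word, path=()):
    """Sequence of node labels from the root down to `word`'s POS node."""
    label, children = tree
    for child in children:
        if isinstance(child, str):
            if child == word:
                return path + (label,)
        else:
            found = path_to(child, word, path + (label,))
            if found:
                return found
    return None

def syntactic_path(tree, predicate, argument_head):
    """Up from the predicate's POS node to the lowest common ancestor,
    then down to the argument's constituent (the head's POS tag is dropped
    so the path ends at the phrase label)."""
    p1 = path_to(tree, predicate)       # e.g. ('S', 'VP', 'VB')
    p2 = path_to(tree, argument_head)   # e.g. ('S', 'NP', 'PRP')
    i = 0
    while i < min(len(p1), len(p2)) and p1[i] == p2[i]:
        i += 1
    up = list(reversed(p1[i - 1:]))     # common ancestor included once
    down = list(p2[i:-1])
    return "↑".join(up) + ("↓" + "↓".join(down) if down else "")

print(syntactic_path(TREE, "ate", "He"))   # VB↑VP↑S↓NP
```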
– There is nothing you can teach a man like Mr. Collard: positive or negative?
– You could do worse than to buy the Cinetech 12.9 Camera!
– Chris gave Pat a pat on the
meaning
– Extensive statistical and machine learning techniques were used to solve them
– These dominated research for the past few decades, until ~2015
can be operationalized
need for many of these