Compositional and Distributional Models of Meaning for Natural Language
Stephen Clark
Natural Language and Information Processing Research Group
University of Cambridge Computer Laboratory
Oxford, October 2010
Natural Language Processing (NLP)
- The branch of AI concerned with the automatic analysis,
generation and understanding of natural language text
- Paradigm shift in NLP in the early 1990s
- move from knowledge-heavy to data-driven approaches
- We now have usable language technology, e.g. Google translate
Stephen Clark Models of Meaning for Natural Language Oxford, October 2010
Two Success Stories
- Practical natural language parsing
- robust, efficient, accurate parsers based on ML from corpora
- Distributional lexical semantics
- word meanings based on data-driven linguistics and lots of text
Today’s Talk
- Syntactic parsing (leading to compositional semantics)
- Distributional lexical semantics
- Combining the two approaches
- theoretical advances in semantics leading to better LT
- Talk will be from a practical language technology perspective
- but will introduce a fundamental theoretical problem relevant to
practice (and this workshop)
- and will serve as an introduction to some of the linguistics talks
Phrase Structure
[Phrase-structure tree for: "The proposed changes also would allow executives to report exercises of options early and often"]
Dependency Structure
John hit the ball with the bat
SUBJ(hit, John)  DOBJ(hit, ball)  DET(ball, the)  PREP(hit, with)  POBJ(with, bat)  DET(bat, the)
Logical Form
From 1953 to 1955, 9.8 billion Kent cigarettes with the filters were sold, the company said.

[Discourse Representation Structure: a company x1 is the agent of a saying event x2 whose theme is the proposition x3; x3 contains a selling event x6 whose patient is 9.8 billion Kent cigarettes x4 with filters x5, holding from 1953 (x7) to 1955 (x8)]
Why Build these Structures?
- We want to know the meaning of the sentence
- Structured representations allow us to access the semantics
- (Arguably) useful for a variety of NLP applications, e.g. Machine
Translation, Question Answering
Why is Parsing Difficult?
- Obtaining a wide-coverage grammar which can handle arbitrary
real text is challenging
- Natural language is surprisingly ambiguous
Syntactic Ambiguity
[Two parse trees for "John saw the man with the telescope": the PP with the telescope attached to the noun phrase the man vs. to the verb phrase saw the man]
Ambiguity: the problem is worse than you think
[Two parse trees for "John ate the pizza with a fork": the PP attached to the pizza vs. to ate]
Ambiguity: the problem is worse than you think
[Two parse trees for "John ate the pizza with the anchovies": the PP attached to the pizza vs. to ate]
Grammars for Natural Language Parsing
- Standard approach is to use a Context Free Grammar
S → NP VP
VP → V NP | V NP PP
PP → P NP
NP → DT N
DT → the | a
N → cat | dog
V → chased | jumped
P → over
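As a concrete illustration (not part of the talk), the toy grammar can be run with a small CKY recogniser; binarising VP → V NP PP via an intermediate VP → VP PP rule is an assumption of this sketch.

```python
from itertools import product

# Toy CFG from the slide. CKY needs binary rules, so VP -> V NP PP is
# binarised via an intermediate VP -> VP PP rule (an assumption here).
LEXICON = {
    "the": "DT", "a": "DT",
    "cat": "N", "dog": "N",
    "chased": "V", "jumped": "V",
    "over": "P",
}
BINARY_RULES = {
    ("NP", "VP"): "S",
    ("V", "NP"): "VP",
    ("VP", "PP"): "VP",
    ("P", "NP"): "PP",
    ("DT", "N"): "NP",
}

def cky_recognise(words):
    """Return the set of categories spanning the whole sentence."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1].add(LEXICON[w])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for l, r in product(chart[i][k], chart[k][j]):
                    if (l, r) in BINARY_RULES:
                        chart[i][j].add(BINARY_RULES[(l, r)])
    return chart[0][n]

print(cky_recognise("the dog chased the cat over the dog".split()))  # {'S'}
```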
Combinatory Categorial Grammar (CCG)
- CCG (Steedman) is a type-driven lexicalised grammar
- An elementary syntactic structure – for CCG a lexical category –
is assigned to each word in a sentence
- walked: S\NP – "give me an NP to my left and I return a sentence"
- A small number of rules define how categories can combine
ccg Lexical Categories
- Atomic categories: S, N, NP, PP, . . . (not many more)
- Complex categories are built recursively from atomic categories
and slashes, which indicate the directions of arguments
- Example complex categories for verbs
- intransitive verb: S\NP walked
- transitive verb: (S\NP)/NP respected
- ditransitive verb: ((S\NP)/NP)/NP gave
A Simple ccg Derivation
interleukin-10    inhibits     production
      NP         (S\NP)/NP         NP
                 --------------------- >
                        S\NP
-------------------------------------- <
                    S

(> forward application, < backward application)
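A minimal sketch (my own, not from the slides) of the two application rules, with categories encoded as nested tuples:

```python
# Categories as nested tuples: an atom is a string, a complex category
# is (result, slash, argument).  The encoding is illustrative only.
S, NP = "S", "NP"
TV = ((S, "\\", NP), "/", NP)  # inhibits: (S\NP)/NP

def forward_apply(left, right):
    """X/Y  Y  =>  X   (forward application, >)"""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None

def backward_apply(left, right):
    """Y  X\\Y  =>  X   (backward application, <)"""
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

vp = forward_apply(TV, NP)   # inhibits + production  =>  S\NP
s = backward_apply(NP, vp)   # interleukin-10 + VP    =>  S
print(vp, s)                 # ('S', '\\', 'NP') S
```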
A Simple ccg Derivation with Semantics
interleukin-10           inhibits                  production
  NP : inter′    (S\NP)/NP : λx.λy.inhibit′(x, y)   NP : prod′
                 --------------------------------------------- >
                       S\NP : λy.inhibit′(prod′, y)
-------------------------------------------------------------- <
                    S : inhibit′(prod′, inter′)

(> forward application, < backward application)
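The semantic side of the derivation can be mimicked directly with curried Python functions; the constant names follow the slide, the encoding is my own illustration.

```python
# Lambda-calculus semantics mirrored with curried Python functions.
inter, prod = "inter'", "prod'"
inhibit = lambda x: lambda y: ("inhibit'", x, y)   # \x.\y.inhibit'(x, y)

vp_sem = inhibit(prod)   # forward application:  \y.inhibit'(prod', y)
s_sem = vp_sem(inter)    # backward application: inhibit'(prod', inter')
print(s_sem)             # ("inhibit'", "prod'", "inter'")
```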
Classical Categorial Grammar
- ‘Classical’ Categorial Grammar only has application rules
- Classical Categorial Grammar is context free
interleukin-10    inhibits     production
      NP         (S\NP)/NP         NP
                 --------------------- >
                        S\NP
-------------------------------------- <
                    S
Classical Categorial Grammar
interleukin-10    inhibits     production
      NP             V             NP
                 ---------------------
                        VP
--------------------------------------
                    S
A More Interesting ccg Derivation
The      company      which         Microsoft     bought
NP/N       N      (NP\NP)/(S/NP)       NP        (S\NP)/NP
----------- >                      ----------- >T
    NP                              S/(S\NP)
                                   ----------------------- >B
                                           S/NP
                  ---------------------------------------- >
                                  NP\NP
---------------------------------------------------------- <
                       NP

(>T type-raising, >B forward composition)
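Type-raising and forward composition, the two extra combinators used above, can be sketched in the same nested-tuple encoding (again my own illustration, not the parser's representation):

```python
# Atom = string, complex category = (result, slash, argument).
S, NP = "S", "NP"

def type_raise(x, t=S):
    """X  =>  T/(T\\X)   (forward type-raising, >T)"""
    return (t, "/", (t, "\\", x))

def forward_compose(left, right):
    """X/Y  Y/Z  =>  X/Z   (forward composition, >B)"""
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None

tv = ((S, "\\", NP), "/", NP)      # bought: (S\NP)/NP
subj = type_raise(NP)              # Microsoft: NP  =>T  S/(S\NP)
print(forward_compose(subj, tv))   # ('S', '/', 'NP')  i.e. S/NP
```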
A Treebank of ccg Derivations
[CCG derivation for "Marks persuades Brooks to merge": Marks, Brooks := NP, persuades := ((S[dcl]\NP)/(S[to]\NP))/NP, to := (S[to]\NP)/(S[b]\NP), merge := S[b]\NP]
- 40k sentences of newspaper text annotated with ccg derivations
- Treebanks like this take years to build
Inducing a Grammar
- For a lexicalised grammar, the grammar essentially is the lexicon
– plus a small number of manually defined combinatory rules
- Lexicon can be read off the leaves of the trees
Inducing a Grammar
- ≈ 1,200 lexical category types in the ccg treebank
- this set has very high coverage on unseen newspaper data
- In addition to the grammar, the treebank provides training data
for the statistical models
Parsing
- Stage 1
- Assign lexical categories to words in the sentence
- Use a finite-state tagger to assign the categories
– based on standard tagging techniques
- Stage 2
- Combine the categories using the combinatory rules
- Can use standard bottom-up chart-parsing algorithm
- Stage 3
- Find the highest scoring derivation according to some model
– e.g. generative model, CRF, perceptron
- Viterbi algorithm finds this efficiently
- solves the ambiguity problem
Practical Application
- Parser applied successfully to newspapers, biomedical research
papers, Wikipedia and questions
- Surprisingly fast because of statistical pruning and highly
optimised C++ code
- parsed the whole of Wikipedia in a morning using 90 CPUs
- yet output is linguistically expressive
- Accuracy: 83% on labelled grammatical relations
Summary of Part I
- We have a practical parser that can build (fairly sophisticated)
compositional semantic representations of naturally occurring text
- including first-order logical representations (Bos)
- Can tell us that there is a chasing event, and the dog is doing the
chasing, and the cat is being chased . . .
- But can’t tell us that cats and dogs are similar in some way (both
pets/animals)
Lexical Semantics
- Lexical semantics: the study of the meanings of words
- Some lexical relations:
- Synonymy: two words are synonymous if they have the same (or
very similar) meaning
- e.g. car/automobile
- Hyperonymy: w1 is a hypernym of w2 if w2 “is-a-kind-of” w1
- e.g. animal is a hypernym of cat
- cat is a hyponym of animal
- Antonymy: antonyms are words with opposite meanings
- e.g. slow/fast, hot/cold
- Meronymy: part-whole relation
- e.g. tire is a part of a car
Uses of Lexical Relations
- Query expansion for document retrieval
- add synonymous terms to query
- Problems with creating lexical semantic resources by hand
- expensive and time consuming
- difficult to keep up to date
- minimal coverage
- Can these resources be created automatically?
Distributional and Semantic Similarity
- You shall know a word by the company it keeps. (Firth, 1957)
- Distributional hypothesis: the meaning of a word can be
represented by the distribution of words appearing in its contexts
Distributional and Semantic Similarity
- dog and cat are related semantically:
- dog and cat both co-occur with big, small, furry, eat, sleep
– because dogs and cats can be big, small, furry, they eat and sleep
- ship and boat have similar meanings:
- ship and boat appear as the direct object of the verbs sail, clean,
bought, and are modified by the adjectives large, clean, expensive
– because ships and boats can be sailed, cleaned, bought, and can be
large, clean, expensive
- Infer lexical relations automatically from large text collections and
distributional similarity
Window Methods
- In window methods the context is a fixed-size window of words
either side of the headword
- For each headword a vector is created where each component
corresponds to a word in the vocabulary
- Value of each component is the (weighted) frequency with which the
corresponding vocabulary word appears in the context of the headword
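A minimal sketch of the counting step, assuming a toy tokenised corpus (my own illustration):

```python
from collections import Counter, defaultdict

def window_vectors(tokens, k=2):
    """For each headword, count the words occurring within k tokens
    either side of it (the context window)."""
    vecs = defaultdict(Counter)
    for i, w in enumerate(tokens):
        for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
            if j != i:
                vecs[w][tokens[j]] += 1
    return vecs

toks = "the cat can sleep the dog can eat".split()
v = window_vectors(toks, k=2)
print(v["cat"])  # Counter({'the': 1, 'can': 1, 'sleep': 1})
```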
Vector Space for Window Method
[Plot: vectors for dog and cat in a space with dimensions eat, the, sleep]
Weighting based on Informativeness
[Plot: the same vectors after weighting; the uninformative dimension the is down-weighted relative to eat and sleep]
- E.g. divide frequency by the number of headword contexts in
which context word appears
- Two words w1 and w2 are similar if the same informative words
tend to appear in the contexts of w1 and w2
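One concrete reading of this weighting (an assumption on my part; other schemes such as mutual information are also common) divides each raw count by the number of distinct headwords the context word co-occurs with:

```python
from collections import Counter

# Toy raw co-occurrence counts (illustrative numbers only)
counts = {
    "dog": Counter({"the": 50, "eat": 6, "sleep": 4}),
    "cat": Counter({"the": 48, "eat": 5, "sleep": 5}),
    "car": Counter({"the": 52, "drive": 7}),
}

def weight(counts):
    """Divide each count by the number of distinct headwords the
    context word occurs with, so frequent-everywhere words like
    'the' are down-weighted."""
    df = Counter()                      # headword "document frequency"
    for vec in counts.values():
        for ctx in vec:
            df[ctx] += 1
    return {head: {ctx: f / df[ctx] for ctx, f in vec.items()}
            for head, vec in counts.items()}

w = weight(counts)
print(w["dog"])  # 'the' is divided by 3, 'eat' and 'sleep' by 2
```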
Creating a Thesaurus
- Calculate vector for each headword using some corpus
- For a given headword, rank other headword vectors using
similarity measure such as cosine
- For a given headword, return top-N ranked words as synonyms
- Curran used a 2 billion word corpus
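The ranking step can be sketched with sparse dict vectors and cosine similarity (toy numbers of my own, not Curran's data):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def top_n(head, vectors, n=3):
    """Rank all other headwords by cosine against head's vector."""
    ranked = sorted(((cosine(vectors[head], vec), other)
                     for other, vec in vectors.items() if other != head),
                    reverse=True)
    return [other for _, other in ranked[:n]]

vectors = {  # toy weighted vectors, illustrative values only
    "dog": {"eat": 3.0, "sleep": 2.0, "bark": 1.0},
    "cat": {"eat": 2.5, "sleep": 2.5, "purr": 1.0},
    "car": {"drive": 3.5, "park": 2.0},
}
print(top_n("dog", vectors))  # ['cat', 'car']
```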
What Relations are Acquired?
- Words related to company include: subsidiary, unit, firm, industry,
business, bank, giant, maker, manufacturer
- Some are hyponyms, e.g. subsidiary is-a-kind-of company
- Others are related semantically, but not clear how, e.g. company
and industry
- for query expansion adding industry for company may be helpful
Example Output
- introduction: launch, implementation, advent, addition,
adoption, arrival, absence, inclusion, creation, departure, availability, elimination, emergence, use, acceptance, abolition, array, passage, completion, announcement, . . .
Example Output
- evaluation: assessment, examination, appraisal, review, audit,
analysis, consultation, monitoring, testing, verification, inquiry, inspection, measurement, supervision, certification, checkup, . . .
Example Output
- context: perspective, significance, framework, implication,
regard, aspect, dimension, interpretation, meaning, nature, importance, consideration, focus, beginning, scope, continuation, relevance, emphasis, backdrop, subject, . . .
Example Output
- similarity: resemblance, parallel, contrast, flaw, discrepancy,
difference, affinity, aspect, correlation, variation, contradiction, distinction, divergence, commonality, disparity, characteristic, shortcoming, significance, clue, hallmark, . . .
Example Output
- methods: technique, procedure, means, approach, strategy, tool,
concept, practice, formula, tactic, technology, mechanism, form, alternative, standard, way, guideline, methodology, model, process, . . .
Example Output
- results: consequence, outcome, effect, finding, evidence,
response, possibility, kind, impact, datum, reason, extent, report, example, series, aspect, account, amount, degree, basis, . . .
Extending the Vector Space Model beyond Words
- man reads newspaper is similar to boy browses magazine
- Current models take no account of the syntactic relations
between words
- dog bites man is different to man bites dog
A Compositional Distributional Semantics?
- Combine the strengths of the distributional and compositional
approaches to semantics
- Vector spaces at the word level capture lexical semantic similarity
- Some operator on these vector spaces captures syntactic relations
- Inner product between sentence vectors determines sentence
similarity
A Compositional Distributional Semantics?
[Diagram: dependency structures for man reads newspaper and woman browses magazine, with subj and obj arcs linking each verb to its subject and object]
- Is there a natural operator to use?
Tensor Product
- Tensor products have been used in connectionist Cognitive
Science to model predicate argument bindings (Smolensky)
- Also to model the “pet fish” problem (Aerts and Gabora)
- guppy as a pet fish emerges out of the tensor product combination
of the individual concepts
- compared to how quantum entities combine
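A small pure-Python sketch (my own illustration, with toy vectors) of why the tensor product is a candidate operator: unlike vector addition, it is sensitive to which word fills which role.

```python
def tensor(*vecs):
    """Flattened tensor product: every product of one component
    from each vector, in order."""
    out = [1.0]
    for v in vecs:
        out = [x * c for x in out for c in v]
    return out

def add(*vecs):
    """Component-wise sum (order-insensitive)."""
    return [sum(cs) for cs in zip(*vecs)]

# Toy 2-d word vectors, illustrative values only
dog, man, bite = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]

# Addition cannot distinguish "dog bites man" from "man bites dog" ...
print(add(dog, bite, man) == add(man, bite, dog))        # True
# ... but the tensor product binds each word to its position/role
print(tensor(dog, bite, man) == tensor(man, bite, dog))  # False
```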
Conclusion
- Fundamental new problem in the semantics of natural language
- semantics of similarity
- Solution would be of theoretical and practical interest
[Diagram: the meaning of "John does not like Mary" computed by combining the meaning vectors of the words according to a pregroup grammar]
Acknowledgements
- Thanks to Stephen Pulman for putting me onto this problem
- This is ongoing work in collaboration with Daoud Clarke, Bob
Coecke, Ed Grefenstette, Peter Hines, Stephen Pulman, Mehrnoosh Sadrzadeh
- Parser is freely available from my web page, and was developed
with James Curran, Julia Hockenmaier and Mark Steedman