Entailment above the word level in distributional semantics - PowerPoint PPT Presentation



SLIDE 1

Entailment above the word level in distributional semantics

Marco Baroni (University of Trento), Raffaella Bernardi (University of Trento), Ngoc-Quynh Do (EM LCT, Free University of Bozen-Bolzano), Chung-chieh Shan (Cornell University, University of Tsukuba). EACL, 25 April 2012.

SLIDE 5

2/17

Summary

Entailment among composite phrases rather than nouns. (Cheap training data!)

Entailment among logical words rather than content words. (Part of Recognizing Textual Entailment?)

Different entailment relations at different semantic types. (Prediction from formal semantics.)

train: AN ⊨ N (big cat ⊨ cat)  →  test: N ⊨ N (dog ⊨ animal)
train: QN ⊨ QN (many dogs ⊨ some dogs)  →  test: QN ⊨ QN (all cats ⊨ several cats)
train: AN ⊨ N  →  test: QN ⊨ QN: × (does not transfer)

SLIDE 7

3/17

Approaches to semantics

“In order to say what a meaning is, we may first ask what a meaning does, and then find something that does that.” —David Lewis

Truth, entailment

Every person cried.

  • Every professor cried.

A person cried.

  • A professor cried.

Formal semantics:

Every person cried.   ∀x. Px → Cx
every person          λg. ∀x. Px → gx
every                 λf. λg. ∀x. fx → gx

(P = person, C = cried)

SLIDE 8

3/17

Approaches to semantics

“In order to say what a meaning is, we may first ask what a meaning does, and then find something that does that.” —David Lewis

Concepts, similarity

[Concept space: ambulance, battleship, bookstore, arranged by similarity]

Distributional semantics: co-occurrence counts with contexts (abandon, abdominal, ability, academic, accept, …):

ambulance    27  10  50  17  130  …
battleship   35  32   1  25  …
bookstore     5   6  33  13  …

SLIDE 11

5/17

Distributional semantics for entailment among words

For each word w, rank contexts c by descending Pr(c|w)/Pr(c), keeping those with Pr(c|w)/Pr(c) > 1 ("pointwise mutual information").

parent: argcountn, arglistn, arglistj, phanen, specityn, qdiscn, carthyn, parents-to-ben, non-residentj, step-parentn, tcn, ballonsn, elizan, symptonsn, adoptivej, stepparentn, nonresidentj, home-schooln, scabridn, petiolulen, …

person: anglian, first-mentionedj, unascertainedj, enurev, deposit-takingj, bonisn, iconclassj, cotswoldsn, aforesaidn, haverv, foresaidj, ghan, sub-paragraphsn, enactedj, geestj, non-medicinalj, sub-paragraphn, intimationn, arrestmentn, incumbrancen, …

professor: williamn, extraordinariusn, ordinariusn, francisn, reidn, emeritusn, emeritusj, derwentn, regiusn, laurencen, edwardn, carisoprodoln, adjunctj, winstonn, privatdozentj, edwardj, xanaxn, tenurev, cialisn, florencen, …
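As a sketch, the PMI-based context ranking on this slide can be computed from raw co-occurrence counts. The `cooc` table below is a hypothetical toy example, not the authors' corpus data.

```python
from collections import Counter

def rank_contexts(cooc, word):
    """Rank contexts c of `word` by descending Pr(c|w)/Pr(c),
    keeping only contexts where the ratio exceeds 1 (positive PMI)."""
    total = sum(sum(ctxs.values()) for ctxs in cooc.values())
    ctx_freq = Counter()
    for ctxs in cooc.values():
        ctx_freq.update(ctxs)
    w_total = sum(cooc[word].values())
    ratios = {}
    for c, n in cooc[word].items():
        pr_c_given_w = n / w_total          # Pr(c | w)
        pr_c = ctx_freq[c] / total          # Pr(c)
        r = pr_c_given_w / pr_c
        if r > 1:                           # keep only informative contexts
            ratios[c] = r
    return sorted(ratios, key=ratios.get, reverse=True)

# Toy counts: "parent" co-occurs with "adoptive" more than chance predicts,
# while the function word "the" is filtered out by the ratio > 1 cutoff.
cooc = {
    "parent": {"adoptive": 8, "the": 10},
    "person": {"tall": 3, "the": 12},
}
print(rank_contexts(cooc, "parent"))  # → ['adoptive']
```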

SLIDE 14

6/17

Distributional semantics for entailment among words

[Plot: context overlap with word 2 (0–3000) against context rank of word 1 (0–5000), for the pairs parent-person, professor-person, person-parent, person-professor, parent-professor, professor-parent; the "perfect ⊆" diagonal marks total overlap.]

Better: skew divergence (Lee), balAPinc (Kotlerman et al.), …
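balAPinc combines an average-precision-style inclusion score (APinc) with LIN similarity; the sketch below implements only a simplified APinc component, under the assumption that each word is represented by its PMI-ranked context list. It is an illustration of the idea, not the authors' implementation.

```python
def apinc(u_ranked, v_ranked):
    """Simplified APinc (after Kotlerman et al.): do the top contexts of u
    also appear, with high rank, among the contexts of v?
    `u_ranked`, `v_ranked` are context lists sorted by descending PMI."""
    v_rank = {c: i + 1 for i, c in enumerate(v_ranked)}
    hits, score = 0, 0.0
    for r, c in enumerate(u_ranked, start=1):
        if c in v_rank:
            hits += 1
            precision_at_r = hits / r                       # P(r)
            relevance = 1 - v_rank[c] / (len(v_ranked) + 1)  # rel(c) in v
            score += precision_at_r * relevance
    return score / len(u_ranked) if u_ranked else 0.0

# Inclusion is asymmetric: a hyponym's contexts sit inside the hypernym's,
# not the other way around (toy context lists, not real corpus data).
dog = ["bark", "leash", "pet"]
animal = ["pet", "wild", "bark", "zoo", "leash"]
print(apinc(dog, animal), apinc(animal, dog))  # higher in the dog→animal direction
```

The full balAPinc score balances this directional measure against symmetric LIN similarity (roughly, their geometric mean), which damps spuriously high inclusion scores for rare words.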

SLIDE 18

7/17

Above the word level

Phrases have corpus distributions too! But N ≈ AN ≈ QN

Phrase                  Syntactic category   Semantic type
N    cat                N                    e → t
AN   white cat          N                    e → t
AAN  big white cat      N                    e → t
QN   every cat          QP                   (e → t) → t
QAN  every big cat      QP                   (e → t) → t
*AQN big every cat      (ungrammatical)
*QQN some every cat     (ungrammatical)

SLIDE 21

8/17

Our questions

Entailment among composite phrases rather than nouns?

Entailment among logical words rather than content words?

Different entailment relations at different semantic types?

train: AN ⊨ N (big cat ⊨ cat)  →  test: N ⊨ N (dog ⊨ animal)
train: QN ⊨ QN (many dogs ⊨ some dogs)  →  test: QN ⊨ QN (all cats ⊨ several cats)
train: AN ⊨ N  →  test: QN ⊨ QN: ×

SLIDE 24

9/17

Our semantic space

Corpus: BNC, WackyPedia, ukWaC: 2.8G lemmatized, POS-tagged tokens (TreeTagger, Schmid).

Targets: the most frequent A, N, V (27K) plus AN and QN phrases; 48K targets in all (AN, QN, A, Q, N).

Contexts: words and phrases occurring in the same sentence.

Pipeline: co-occurrence counts #(c, w), PMI weighting log Pr(c|w)/Pr(c), then SVD to 300 dimensions (Ũ Σ).

Methods compared: frequency baseline, cosine baseline, balAPinc, SVM.
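The count → PMI → SVD pipeline can be sketched in a few lines of NumPy. The counts below are toy numbers loosely based on the earlier ambulance/battleship/bookstore example, and the clipping of negative PMI values to 0 is an assumption of this sketch, not a detail stated on the slide.

```python
import numpy as np

# Toy count matrix #(c, w): rows = target words/phrases, columns = contexts.
counts = np.array([
    [27.0, 10.0, 50.0, 17.0],   # ambulance
    [35.0, 32.0,  1.0, 25.0],   # battleship
    [ 5.0,  6.0, 33.0, 13.0],   # bookstore
])

# PMI weighting: log( Pr(c|w) / Pr(c) ), negative values clipped to 0.
pr_c_given_w = counts / counts.sum(axis=1, keepdims=True)
pr_c = counts.sum(axis=0) / counts.sum()
pmi = np.maximum(np.log(pr_c_given_w / pr_c), 0.0)

# SVD reduction: keep the top k latent dimensions; each row becomes U_k * S_k
# (the "Ũ Σ" of the slide, truncated).
k = 2
U, S, Vt = np.linalg.svd(pmi, full_matrices=False)
vectors = U[:, :k] * S[:k]
print(vectors.shape)  # → (3, 2)
```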

SLIDE 31

10/17

Our entailment classifiers

Vectors: PMI-weighted, log Pr(c|w)/Pr(c), reduced by SVD (Ũ Σ).

balAPinc (Kotlerman et al.): 0 ≤ balAPinc ≤ 1; classify a pair as entailing if the score exceeds a threshold.

SVM with cubic kernel on the SVD vectors:

  • outperformed naïve Bayes, kNN

Train → test configurations:

AN ⊨ N   →  N ⊨ N
QN ⊨ QN  →  QN ⊨ QN
AN ⊨ N   →  QN ⊨ QN
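A cubic-kernel SVM over phrase-pair vectors can be sketched as follows. The slide specifies only the cubic kernel; the pair representation (concatenating the two phrase vectors) and all other settings here are assumptions of this toy sketch, with synthetic data standing in for the SVD vectors.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Each phrase is a dense vector (300-dim in the talk; 20-dim here); a candidate
# pair (p1, p2) is represented by concatenating the two vectors.
dim, n = 20, 200
base = rng.normal(size=(n, dim))
pos = np.hstack([base, base + 0.1 * rng.normal(size=(n, dim))])  # entailing: similar halves
neg = np.hstack([base, rng.normal(size=(n, dim))])               # non-entailing: unrelated halves
X = np.vstack([pos, neg])
y = np.array([1] * n + [0] * n)

# Cubic (degree-3 polynomial) kernel, as on the slide; other hyperparameters
# are library defaults, not the authors' tuned configuration.
clf = SVC(kernel="poly", degree=3).fit(X, y)
print(clf.score(X, y))
```

A polynomial kernel of degree 3 over the concatenation gives the classifier access to interaction terms between the two phrase vectors, which a linear kernel on the concatenation would not provide.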
SLIDE 41

11/17

Our data sets

N ⊨ N (type e → t): WordNet hypernym pairs, e.g. pope ⊨ leader (via spiritual_leader), cat ⊨ carnivore (via feline), … (1385); negatives by inverting and resampling, e.g. leader ⊭ pope, cat ⊭ leader, … (1385); an expanded set (6402).

AN ⊨ N (type e → t): the most frequent adjectives, e.g. big, former, … (256, from 300 candidates), combined with BLESS nouns, e.g. apple, shirt, … (200): big apple ⊨ apple, big shirt ⊨ shirt, … (1246); negatives by resampling, e.g. big apple ⊭ shirt, big shirt ⊭ apple, … (1244).

QN ⊨ QN (type (e → t) → t): the most frequent quantifiers (all, both, each, either, every, few, many, most, much, no, several, some) yield 13 entailing quantifier pairs (all ⊨ some, many ⊨ several, …) and 17 non-entailing ones (some ⊭ every, both ⊭ many, …). With nouns: all cat ⊨ some cat, many cat ⊨ several cat, … (7537); some cat ⊭ every cat, both cat ⊭ many cat, … (8455); plus mismatched-noun negatives, e.g. all cat ⊭ every leader, both cat ⊭ many leader, …
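The QN ⊨ QN pairs above can be generated mechanically by crossing quantifier-pair lists with a noun list. The lists below are abbreviated stand-ins (the full lists on the slide have 13 entailing and 17 non-entailing pairs), and the corpus-frequency filtering the authors would apply is not shown.

```python
from itertools import product

# Abbreviated quantifier-pair lists in the spirit of the slide.
entailing = [("all", "some"), ("many", "several"), ("each", "some")]
non_entailing = [("some", "every"), ("both", "many"), ("several", "all")]
nouns = ["cat", "dog", "leader"]

# Positive QN pairs: the same noun under an entailing quantifier pair.
pos = [(f"{q1} {n}", f"{q2} {n}") for (q1, q2), n in product(entailing, nouns)]
# Negative QN pairs: the same noun under a non-entailing quantifier pair.
neg = [(f"{q1} {n}", f"{q2} {n}") for (q1, q2), n in product(non_entailing, nouns)]

print(pos[0], neg[0])  # → ('all cat', 'some cat') ('some cat', 'every cat')
```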

SLIDE 42

11/17

Our data sets

N ⊨ N (e → t), AN ⊨ N (e → t), QN ⊨ QN ((e → t) → t): each split into train and test.

SLIDE 43

12/17

Results at noun type

Method               P     R     F     Accuracy (95% C.I.)
SVM upper            88.6  88.6  88.5  88.6 (87.3–89.7)
balAPinc AN⊨N        65.2  87.5  74.7  70.4 (68.7–72.1)
balAPinc upper       64.4  90.0  75.1  70.1 (68.4–71.8)
SVM AN⊨N             69.3  69.3  69.3  69.3 (67.6–71.0)
cos(N1, N2)          57.7  57.6  57.5  57.6 (55.8–59.5)
fq(N1) < fq(N2)      52.1  52.1  51.8  53.3 (51.4–55.2)
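Accuracies here are reported with 95% confidence intervals. The slide does not say which interval the authors used; the sketch below shows the textbook normal-approximation (Wald) binomial interval for illustration, with hypothetical counts.

```python
import math

def binomial_ci(correct, total, z=1.96):
    """Normal-approximation 95% confidence interval for an accuracy
    estimated from `correct` successes out of `total` trials."""
    p = correct / total
    half = z * math.sqrt(p * (1 - p) / total)
    return p - half, p + half

# Hypothetical counts giving an accuracy of about 88.8%.
lo, hi = binomial_ci(correct=2460, total=2770)
print(f"{lo:.3f}-{hi:.3f}")  # → 0.876-0.900
```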

SLIDE 46

13/17

Holding out QN data

Quantifiers: all, both, each, either, every, few, many, most, much, no, several, some.

Two held-out evaluations:

  • pair-out: hold out one quantifier pair at a time
  • quantifier-out: hold out all pairs involving one quantifier at a time

SLIDE 47

14/17

Results at quantifier type

Method                  P     R     F     Accuracy (95% C.I.)
SVM pair-out            76.7  77.0  76.8  78.1 (77.5–78.8)
SVM quantifier-out      70.1  65.3  68.0  71.0 (70.3–71.7)
SVM-Q pair-out          67.9  69.8  68.9  70.2 (69.5–70.9)
SVM-Q quantifier-out    53.3  52.9  53.1  56.0 (55.2–56.8)
cos(QN1, QN2)           52.9  52.3  52.3  53.1 (52.3–53.9)
balAPinc AN⊨N           46.7   5.6  10.0  52.5 (51.7–53.3)
SVM AN⊨N                 2.8  42.9   5.2  52.4 (51.7–53.2)
fq(QN1) < fq(QN2)       51.0  47.4  49.1  50.2 (49.4–51.0)
balAPinc upper          47.1  100   64.1  47.2 (46.4–47.9)

SLIDE 48

15/17

Holding out each quantifier

Quantifier   ⊨ instances  correct   ⊭ instances  correct   Accuracy
each                 656      649          656      637       98%
every                460      402         1322     1293       95%
much                 248      216            0        0       87%
all                 2949     2011         2641     2494       81%
several             1731     1302         1509     1267       79%
many                3341     2349         4163     3443       77%
few                    0        0          461      311       67%
most                 928      549          832      511       60%
some                4062     1780         3145     2190       55%
no                     0        0          714      380       53%
both                 636      589         1404      303       44%
either                63        2           63       41       34%
Total              15074     9849        16910    12870       71%

SLIDE 51

16/17

Our questions answered

Entailment among composite phrases rather than nouns? Yes. (Cheap training data!) ✒ Practical import

Entailment among logical words rather than content words? Yes. (Part of Recognizing Textual Entailment?) ✒ Practical import

Different entailment relations at different semantic types? Yes. (Prediction from formal semantics.)

train: AN ⊨ N (big cat ⊨ cat)  →  test: N ⊨ N (dog ⊨ animal)
train: QN ⊨ QN (many dogs ⊨ some dogs)  →  test: QN ⊨ QN (all cats ⊨ several cats)
train: AN ⊨ N  →  test: QN ⊨ QN: ×

Ongoing work:

  • How does the SVM work?
  • Missing experiments?
  • How to compose semantic vectors?

SLIDE 52

17/17

Holding out each quantifier pair

Entailing quantifier pairs:

Quantifier pair    Instances  Correct
all ⊨ some         1054       1044 (99%)
all ⊨ several       557        550 (99%)
each ⊨ some         656        647 (99%)
all ⊨ many          873        772 (88%)
much ⊨ some         248        217 (88%)
every ⊨ many        460        400 (87%)
many ⊨ some         951        822 (86%)
all ⊨ most          465        393 (85%)
several ⊨ some      580        439 (76%)
both ⊨ some         573        322 (56%)
many ⊨ several      594        113 (19%)
most ⊨ many         463         84 (18%)
both ⊨ either        63          1 (2%)

Non-entailing quantifier pairs:

Quantifier pair    Instances  Correct
some ⊭ every        484        481 (99%)
several ⊭ all       557        553 (99%)
several ⊭ every     378        375 (99%)
some ⊭ all         1054       1043 (99%)
many ⊭ every        460        452 (98%)
some ⊭ each         656        640 (98%)
few ⊭ all           157        153 (97%)
many ⊭ all          873        843 (97%)
both ⊭ most         369        347 (94%)
several ⊭ few       143        134 (94%)
both ⊭ many         541        397 (73%)
many ⊭ most         463        300 (65%)
either ⊭ both        63         39 (62%)
many ⊭ no           714        369 (52%)
some ⊭ many         951        468 (49%)
few ⊭ many          161         33 (20%)
both ⊭ several      431         63 (15%)