Integrating Logical Representations with Probabilistic Information using Markov Logic
Dan Garrette, Katrin Erk, and Raymond Mooney The University of Texas at Austin
Overview
- Some phenomena are best modeled through logic, others statistically
- Aim: a unified framework for both
- We present first steps towards this goal
- Basic framework: Markov Logic
- Technical solutions for specific phenomena
Represent the meaning of language:

- Logical models
- Probabilistic models
- Standard first-order logic concepts
- Implicativity / factivity
- Presuppose the truth or falsity of their complement
- Influenced by the polarity of the environment
“Ed knows Mary left.”
➡ Mary left
“Ed refused to lock the door.”
➡ Ed did not lock the door
“Ed did not forget to ensure that Dave failed.”
➡ Dave failed
“Ed hopes that Dave failed.”
➡ ??
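The pattern behind these examples can be sketched as a lookup of implicative/factive signatures. This is a toy illustration; the signature table values are assumptions chosen to match the examples above, not taken from the authors' system:

```python
# Toy implicativity/factivity signatures (assumed values for illustration):
# for each verb, the polarity of its environment ("pos"/"neg") maps to the
# entailed polarity of its complement, or None when nothing is entailed.
SIGNATURES = {
    "know":      {"pos": "pos", "neg": "pos"},  # factive: complement true either way
    "refuse":    {"pos": "neg", "neg": None},
    "forget_to": {"pos": "neg", "neg": "pos"},
    "manage":    {"pos": "pos", "neg": "neg"},
    "hope":      {"pos": None,  "neg": None},   # no entailment in either context
}

def complement_polarity(verb, env_polarity):
    """Entailed polarity of the verb's complement in the given environment."""
    return SIGNATURES[verb][env_polarity]

# "Ed did not forget to ensure that Dave failed": "forget to" sits under
# negation, so the complement is entailed positive (Dave failed).
assert complement_polarity("forget_to", "neg") == "pos"
# "Ed hopes that Dave failed": no entailment either way.
assert complement_polarity("hope", "pos") is None
```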
Word similarity:

- Synonyms
- Hypernyms / hyponyms
“The wine left a stain.”
➡ paraphrase: “result in”
“He left the children with the nurse.”
➡ paraphrase: “entrust”
“The bat flew out of the cave.”
➡ hypernym: “animal”
“The player picked up the bat.”
➡ hypernym: “stick”
“John does not own a vehicle”
➡ John does not own a car
“John owns a car”
➡ John owns a vehicle
(Taxonomy figure: “vehicle” is a hypernym of “boat”, “car”, and “truck”)
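The vehicle/car entailments above flip direction under negation: a hyponym entails its hypernym in positive contexts, and the reverse holds in negative ones. A minimal sketch with a hand-built toy taxonomy (`HYPERNYM` and `entails` are illustrative names, not from the source):

```python
# Toy taxonomy from the slide: car, boat, truck are kinds of vehicle.
HYPERNYM = {"car": "vehicle", "boat": "vehicle", "truck": "vehicle"}

def entails(word_p, neg_p, word_h, neg_h):
    """Does 'owns word_p' (negated if neg_p) entail 'owns word_h' (negated if neg_h)?
    Sketch: only handles a single hypernym edge, for illustration."""
    if neg_p != neg_h:
        return False
    if not neg_p:
        # Positive context: hyponym entails hypernym ("owns a car" -> "owns a vehicle").
        return word_p == word_h or HYPERNYM.get(word_p) == word_h
    else:
        # Negative context: direction flips ("no vehicle" -> "no car").
        return word_p == word_h or HYPERNYM.get(word_h) == word_p

assert entails("car", False, "vehicle", False)   # "John owns a car" -> "John owns a vehicle"
assert entails("vehicle", True, "car", True)     # "John does not own a vehicle" -> "... not own a car"
```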
- A unified semantic representation that incorporates logic and probabilities, with interaction between the two
- Ability to reason with this representation
Markov Logic:

- “Softened” first-order logic: weighted formulas
- Judge the likelihood of an inference
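In Markov Logic, a possible world's probability is proportional to the exponentiated sum of the weights of its satisfied ground formulas. A minimal sketch over a toy two-atom domain (not the authors' system; the atoms and the 1.5 weight are invented for illustration):

```python
import math
from itertools import product

# A world's unnormalized score is exp(sum_i w_i * n_i(world)), where n_i
# counts the true groundings of weighted formula i.

def world_score(world, weighted_formulas):
    return math.exp(sum(w * f(world) for w, f in weighted_formulas))

def probability(world, all_worlds, weighted_formulas):
    z = sum(world_score(v, weighted_formulas) for v in all_worlds)
    return world_score(world, weighted_formulas) / z

# Toy domain with one individual and two ground atoms.
atoms = ["smokes", "cancer"]
worlds = [dict(zip(atoms, vals)) for vals in product([False, True], repeat=2)]

# Soft rule "smokes => cancer" with weight 1.5: a world violating it becomes
# less likely, but not impossible (unlike hard first-order logic).
rules = [(1.5, lambda w: 1 if (not w["smokes"] or w["cancer"]) else 0)]

# Probability of the one world that violates the rule.
p = probability({"smokes": True, "cancer": False}, worlds, rules)
```

Because the rule is soft, the violating world keeps a small but nonzero probability; raising the weight toward infinity recovers the hard logical constraint.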
How can we tell if our semantic representation is correct?

- Need a way to measure comprehension
- Textual Entailment: determine whether a hypothesis can be inferred from a premise
premise: iTunes software has seen strong sales in Europe.
hypothesis: Strong sales for iTunes in Europe.
➡ Yes

premise: Oracle had fought to keep the forms from being released.
hypothesis: Oracle released a confidential document.
➡ No
- Requires deep understanding of text
- Allows us to construct test data that targets our specific phenomena
- Generate rules linking all possible paraphrases
- Unable to distinguish between good and bad paraphrases
“The player picked up the bat.”
- Able to judge similarity
- Unable to properly handle logical phenomena
- Handle logical phenomena discretely
- Handle probabilistic phenomena with weighted formulas
- Do both simultaneously, allowing them to influence each other
Semanticists have traditionally represented meaning with formal logic We use Boxer (Bos et al., 2004) to generate Discourse Representation Structures (Kamp and Reyle, 1993)
“John did not manage to leave”

  [x0 | named(x0, john, per),
    ¬[e1 l2 | manage(e1), event(e1), agent(e1, x0), theme(e1, l2), proposition(l2),
      l2: [e3 | leave(e3), event(e3), agent(e3, x0)]]]
- Boxes have existentially quantified variables
- ...and atomic formulas
- ...and logical operators
- Box structure shows scope
- Labels allow reference to entire boxes

Why use first-order logic?

- Powerful, flexible representation
- Straightforward inference procedure
Why not?

- Unable to handle uncertainty
- Natural language is not discrete
Distributional models:

- Describe word meaning by its context
- Representation is a continuous function
(Figure: “leave” in context — “The wine left a stain” ~ “result in”; “He left the children with the nurse” ~ “entrust”)
Why use distributional models?

- Can predict word-in-context similarity
- Can be learned in an unsupervised fashion
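Word-in-context similarity of this kind is typically computed over context vectors. A minimal sketch with toy counts (the vectors and values are invented for illustration, not drawn from the authors' model):

```python
import math

# Words as sparse context-count vectors, compared by cosine similarity.
def cosine(u, v):
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

# Toy contexts for "leave" as in "The wine left a stain" vs. a candidate
# paraphrase "result in"; counts are made up for illustration.
leave_in_context = {"stain": 3, "mark": 2}
result_in = {"stain": 2, "mark": 3}
sim = cosine(leave_in_context, result_in)
```

A high cosine here would support “result in” as the in-context paraphrase of “leave”, while an unrelated sense like “entrust” would share few context features and score low.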
Why not?

- Incomplete representation of semantics
- No concept of negation, quantification, etc.
- Flatten DRS into a first-order representation
- Add weighted word-similarity constraints
“John did not manage to leave”

Flattened into first-order logic:

∃x0.(ne_per_john(x0) &
  ¬∃e1 l2.(manage(e1) & event(e1) & agent(e1, x0) & theme(e1, l2) & proposition(l2) &
    ∃e3.(leave(e3) & event(e3) & agent(e3, x0))))
“John did not manage to leave”

∃x0.(ne_per_john(x0) &
  ¬∃e1 l2.(manage(e1) & event(e1) & agent(e1, x0) & theme(e1, l2) & proposition(l2) &
    ∃e3.(leave(e3) & event(e3) & agent(e3, x0))))

- DRT allows the theme proposition to be labeled as “l2”
- The conversion loses track of what “l2” labels
“John forgot to leave”

∃x0 e1 l2.(ne_per_john(x0) & forget(e1) & event(e1) & agent(e1, x0) &
  theme(e1, l2) & proposition(l2) &
  ∃e3.(leave(e3) & event(e3) & agent(e3, x0)))

“John left”

∃x0 e3.(ne_per_john(x0) & leave(e3) & event(e3) & agent(e3, x0))
“John left”

∃x0 e3.(ne_per_john(x0) & leave(e3) & event(e3) & agent(e3, x0))

“John forgot to leave”

∃x0 e1 l2 e3.(ne_per_john(x0) & forget(e1) & event(e1) & agent(e1, x0) &
  theme(e1, l2) & proposition(l2) & leave(e3) & event(e3) & agent(e3, x0))
“John did not manage to leave”, with labels reified as terms:

l0: named(l0, ne_per_john, x0)
    not(l0, l1)
l1: pred(l1, manage, e1)
    event(l1, e1)
    rel(l1, agent, e1, x0)
    rel(l1, theme, e1, l2)
    prop(l1, l2)
l2: pred(l2, leave, e3)
    event(l2, e3)
    rel(l2, agent, e3, x0)

true(l0)

- The label “l2” is maintained
With “connectives” as predicates, rules are needed to capture their relationships:

∀p c.[(true(p) ∧ not(p, c)) → false(c)]
∀p c.[(false(p) ∧ not(p, c)) → true(c)]
Calculate truth values of nested propositions. For example, “forget to” is downward entailing in positive contexts:

∀l1 l2 e.[(pred(l1, “forget”, e) ∧ true(l1) ∧ rel(l1, “theme”, e, l2)) → false(l2)]
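The propagation rules above can be sketched as a small fixed-point computation over the reified representation. This is an illustrative toy implementation, not the authors' MLN encoding; the fact encoding and function name are assumptions:

```python
# Facts follow the slide's reified form: not(p, c), pred(l, word, e),
# rel(l, role, e, c), plus initially known true/false labels.

def propagate(facts):
    """Apply the truth-propagation rules until no label's value changes."""
    tr, fl = set(facts["true"]), set(facts["false"])
    changed = True
    while changed:
        changed = False
        for p, c in facts["not"]:
            # forall p c. [true(p) & not(p, c)] -> false(c)
            if p in tr and c not in fl:
                fl.add(c); changed = True
            # forall p c. [false(p) & not(p, c)] -> true(c)
            if p in fl and c not in tr:
                tr.add(c); changed = True
        for l, word, e in facts["pred"]:
            # "forget" is downward entailing in positive contexts:
            # pred(l, "forget", e) & true(l) & rel(l, "theme", e, c) -> false(c)
            if word == "forget" and l in tr:
                for l2, role, e2, c in facts["rel"]:
                    if l2 == l and role == "theme" and e2 == e and c not in fl:
                        fl.add(c); changed = True
    return tr, fl

# "John did not forget to leave" style structure: top box l0 is true and
# negates the "forget" box l1, whose theme proposition is l2.
facts = {
    "true": {"l0"}, "false": set(),
    "not": [("l0", "l1")],
    "pred": [("l1", "forget", "e1")],
    "rel": [("l1", "theme", "e1", "l2")],
}
true_labels, false_labels = propagate(facts)
```

Here l1 is propagated to false via the negation rule, and since the “forget” rule only fires in positive contexts, the complement l2 is not falsified.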
“A stadium craze is sweeping the country”

Candidate paraphrases for “sweep” (from its synsets): brush, move, sail, broom, wipe, embroil, tangle, drag, involve, traverse, span, cover, extend, clean, win, continue, swing, wield, handle, manage
“A stadium craze is sweeping the country”

rank  paraphrase       P = 1/(rank+1)   W = log(P/(1-P))
  1   continue              0.50              0.00
  2   move                  0.33             -0.69
  3   win                   0.25             -1.10
  4   cover                 0.20             -1.39
  5   clean                 0.17             -1.61
  6   handle                0.14             -1.79
  7   embroil               0.13             -1.95
  8   wipe                  0.11             -2.08
  9   brush                 0.10             -2.20
 10   traverse              0.09             -2.30
 11   sail, span, ...       0.08             -2.40

Penalties increase with rank.
“A stadium craze is sweeping the country”

Inject a rule for every possible paraphrase; the MLN decides which to use:

∀l x.[pred(l, “sweep”, x) ↔ pred(l, “cover”, x)]
∀l x.[pred(l, “sweep”, x) ↔ pred(l, “brush”, x)]
...
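Combining the rank-based weights with per-paraphrase rules might look like the following sketch (the rule-string syntax and the helper name are illustrative, not the exact MLN input format used by the authors):

```python
import math

# Generate one weighted biconditional rule per ranked paraphrase candidate.
def paraphrase_rules(word, ranked_paraphrases):
    rules = []
    for rank, para in enumerate(ranked_paraphrases, start=1):
        p = 1.0 / (rank + 1)          # matches the table: 0.50, 0.33, 0.25, ...
        w = math.log(p / (1 - p))     # log-odds weight; 0.0 at rank 1, then negative
        rules.append((w, f'pred(l, "{word}", x) <=> pred(l, "{para}", x)'))
    return rules

rules = paraphrase_rules("sweep", ["continue", "move", "win", "cover"])
```

Every candidate gets a rule, but lower-ranked paraphrases carry increasingly negative weights, so the MLN pays a growing penalty for using them.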
Evaluation:

- Executed over 100 hand-written examples
- Examples are hand-written instead of drawn from RTE data, to target specific phenomena
- The examples discussed in this talk are handled correctly by the system
p: South Korea fails to honor U.S. patents
hgood: South Korea does not observe U.S. patents
hbad*: South Korea does not reward U.S. patents

- “fail to” is negatively entailing in positive environments
- In context, “observe” is a better paraphrase than “reward”
Conclusions:

- Presented a unified logical/statistical framework for semantics, based on Markov Logic
- Allows interaction between logic and probabilities
- Technical solutions for specific phenomena
Future work:

- Large-scale evaluation
- Address a larger number of phenomena