The big question: How do we infer and reason about meanings of sentences?
Conceptual importance: Discovering the process of cognition and intelligence.
Applications: Automating language-related tasks, such as document search.

The big challenge:
Meaning of a sentence $\neq$ collection of meanings of its words.
John likes Mary $\neq$ $\{\text{John}, \text{likes}, \text{Mary}\}$
[Diagram: a sentence of type $S$ is not just its words word 1, word 2, ..., word n of types $A, B, \dots, Z$.]
Meaning of a sentence $=$ a function of the meanings of its words.
[Diagram: word 1, word 2, ..., word n of types $A, B, \dots, Z$ combine into a sentence of type $S$; sentence = process depending on grammatical structure.]

Two complementary approaches to meaning
1- The logical or symbolic model
Meaning of sentence $=$ a truth function of its words; words $= \emptyset$.
2- The vector space or distributional model
Words $=$ vectors built from context; function $= \emptyset$.
[Diagram: word 1, word 2, ..., word n of types $A, B, \dots, Z$.]

Logical vs Vector Space Models
(I) Logical Models
Pros:
- Compositional
- Model-theoretic semantics (Montague)
- Automated inferences
Cons:
- Qualitative (true-false)
- Not very suitable for real world text
- Says very little about lexical semantics
- Forgets some of the syntactic structure
(II) Vector Space Model
Pros:
- Quantitative
- All about lexical semantics
Cons:
- Non-compositional

A formalism with the best of the two: Compositional & Distributional
$\overrightarrow{\text{Meaning of a sentence}}$ $=$ a function of the vectors of its words.
[Diagram: word 1, word 2, ..., word n of types $A, B, \dots, Z$ combine into a sentence of type $S$; sentence = process depending on grammatical structure.]

Compositional Distributional Models of Meaning Clark, Coecke, Grefenstette, Pulman, Sadrzadeh Computing and Computer Laboratories Oxford and Cambridge

Aim: Understanding this model.
Theoretical Preliminaries
0- Some Category Theory
1- Pregroup Grammars
2- Vector Space Models
3- Pregroups and Vector Spaces Categorically
5- Combining the two: Categorical Semantics for Compositional Distributional Models
Ed - Concrete: Implementation, Evaluation, Experiments.

Some Category Theory
A category has
- Objects: $A, B, C$
- Morphisms: $f, g, h$, e.g. $A \xrightarrow{f} B \xrightarrow{g} C$
- The morphisms must compose: if $f : A \to B$ and $g : B \to C$, then $\exists h : A \to C$ such that $h = f ; g$.
- Each object has an identity morphism: $1_A : A \to A$, $1_B : B \to B$. This is the unit of composition, i.e. for $f : A \to B$ we have $1_A ; f = f ; 1_B = f$.

Example
Objects | Morphisms
systems | processes
sets | relations
sets | functions
formulas | proofs
grammatical types | grammatical reductions
vector spaces | linear maps

Sets and Relations
Objects: sets, e.g. $A = \{x, y\}$, $B = \{z, w\}$, $C = \{s, t\}$
Morphisms: relations. A relation $f : A \to B$ is defined by $f \subseteq \{(a, b) \mid a \in A, b \in B\}$.
For instance, $f : A \to B$ given by $f = \{(x, z), (x, w), (y, z)\}$ and $g : B \to C$ given by $g = \{(z, s), (w, s)\}$.

Sets and Relations
Composition: composing relations. For $f : A \to B$ and $g : B \to C$, $\exists h : A \to C$ such that $h = f ; g$.
In general, $f ; g = \{(a, c) \mid \exists b,\ (a, b) \in f \ \&\ (b, c) \in g\}$.
For instance, in our example, $f ; g = ?$
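The composition rule can be tried out on the running example with a minimal Python sketch. Relations are modelled as sets of pairs; the function name `compose` and the string encoding of the elements are illustrative assumptions, not part of the slides.

```python
def compose(f, g):
    """Relational composition f ; g: first f, then g."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}

# The relations f : A -> B and g : B -> C from the example slide.
f = {("x", "z"), ("x", "w"), ("y", "z")}
g = {("z", "s"), ("w", "s")}

h = compose(f, g)
print(sorted(h))  # [('x', 's'), ('y', 's')]
```

Both x and y reach s through some intermediate element of B, while t is never reached because g does not relate anything to it.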

Sets and Relations
Identity: diagonal relation $1_A = \{(a, a) \mid a \in A\}$.
For our example, $1_A = \{(x, x), (y, y)\}$ and $1_B = \{(z, z), (w, w)\}$.
These must satisfy $1_A ; f = f ; 1_B = f$.
For instance, compute $1_A ; f = \{(x, x), (y, y)\} ; \{(x, z), (x, w), (y, z)\}$ and verify that it is $= f$.
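The verification asked for above can also be done mechanically. This is a small sketch under the same assumptions as before (relations as Python sets of pairs; `compose` and `identity` are hypothetical helper names).

```python
def compose(f, g):
    """Relational composition f ; g: first f, then g."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}

def identity(objects):
    """Diagonal relation on a set."""
    return {(a, a) for a in objects}

A = {"x", "y"}
B = {"z", "w"}
f = {("x", "z"), ("x", "w"), ("y", "z")}

left_unit = compose(identity(A), f)   # 1_A ; f
right_unit = compose(f, identity(B))  # f ; 1_B
print(left_unit == f and right_unit == f)  # True
```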

Monoidal Category
A category with a binary operation called tensor and denoted by $\otimes$.
This operator acts on two objects and returns their composite $A \otimes B$.
It also acts on morphisms and turns them parallel: if $f : A \to Q$ and $g : B \to W$, then $f \otimes g : A \otimes B \to Q \otimes W$.
The tensor has a unit $I$, that is $A \otimes I = I \otimes A = A$.

Sets and Relations
There is more than one $\otimes$ here, but for our purposes, given two sets $A, B$, we take their tensor product to be the cartesian product $A \otimes B = \{(a, b) \mid a \in A, b \in B\}$.
For our previous example we have $A \otimes B = \{(x, z), (x, w), (y, z), (y, w)\}$.
The unit is the singleton set $I = \{\ast\}$: $A \otimes I = A \times I = \{(a, \ast) \mid a \in A\} \cong \{a \mid a \in A\} = A$.
Tensor on morphisms is the cartesian product of relations.
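A short Python sketch of this tensor, again with relations as sets of pairs. The names `tensor_objects` and `tensor_relations`, and the use of the string "*" for the single element of the unit, are assumptions made for illustration.

```python
from itertools import product

def tensor_objects(a, b):
    """A (x) B as the cartesian product of two sets."""
    return set(product(a, b))

def tensor_relations(f, g):
    """f (x) g relates (a, b) to (q, w) iff f relates a to q and g relates b to w."""
    return {((a, b), (q, w)) for (a, q) in f for (b, w) in g}

A = {"x", "y"}
B = {"z", "w"}
I = {"*"}  # singleton unit

AB = tensor_objects(A, B)
print(sorted(AB))  # [('x', 'w'), ('x', 'z'), ('y', 'w'), ('y', 'z')]

# A (x) I is isomorphic to A: dropping the second component recovers A.
AI = tensor_objects(A, I)
print({a for (a, _) in AI} == A)  # True

# Tensor of the example relations f : A -> B and g : B -> C.
f = {("x", "z"), ("x", "w"), ("y", "z")}
g = {("z", "s"), ("w", "s")}
fg = tensor_relations(f, g)  # a relation from A (x) B to B (x) C
```

Note that `A (x) I` is only isomorphic to `A`, not literally equal to it: the pairs `(a, "*")` have to be projected back to `a`.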

Diagrammatic Calculus
The objects and morphisms of a monoidal category are usually depicted as follows.
[Diagrams, labelled: $1_A$; $f$; $g ; f$; $1_A \otimes 1_B$; $f \otimes 1_C$; $f \otimes g$; $(f \otimes g) ; h$.]

Diagrammatic Calculus
The elements within the objects (e.g. elements of a set) can be depicted using the unit $I$ as follows: $\psi : I \to A$, $\pi : A \to I$, $\pi \circ \psi : I \to I$.
[Diagrams: $\psi$, $\pi$, and their composite $\pi \circ \psi$.]
For instance, the morphism $I \to A$ can be the element $x$ of $A = \{x, y\}$: $x : I \to \{x, y\}$.

Compact Category
A monoidal category where each object $A$ has a left adjoint $A^l$ and a right adjoint $A^r$. This means that for each object $A$, we have 4 morphisms in the category:
$\epsilon^l : A^l \otimes A \to I$, $\epsilon^r : A \otimes A^r \to I$, $\eta^l : I \to A \otimes A^l$, $\eta^r : I \to A^r \otimes A$
[Diagrams of the four morphisms $\epsilon^l$, $\epsilon^r$, $\eta^l$, $\eta^r$.]

Compact Category
These morphisms should satisfy:
$(\eta^l \otimes 1_A) ; (1_A \otimes \epsilon^l) = 1_A$
$(1_A \otimes \eta^r) ; (\epsilon^r \otimes 1_A) = 1_A$
$(1_{A^l} \otimes \eta^l) ; (\epsilon^l \otimes 1_{A^l}) = 1_{A^l}$
$(\eta^r \otimes 1_{A^r}) ; (1_{A^r} \otimes \epsilon^r) = 1_{A^r}$
[Diagrams: the four equations above, drawn on the wires $A$, $A$, $A^l$ and $A^r$.]

Pregroups
$(P, \leq, \bullet, I, (-)^l, (-)^r)$
$\forall p \in P,\ \exists p^r \in P,\ \exists p^l \in P$ such that $p^l \bullet p \leq I \leq p \bullet p^l$ and $p \bullet p^r \leq I \leq p^r \bullet p$.
Adjoints are unique and anti-tone: $p \leq q \implies q^l \leq p^l,\ q^r \leq p^r$
Unit is self adjoint: $I^l = I^r = I$
So is multiplication: $(p \bullet q)^l = q^l \bullet p^l$, $(p \bullet q)^r = q^r \bullet p^r$
Same adjoints do not cancel out: $(p^r)^r \neq p \neq (p^l)^l$
But opposite adjoints do: $(p^l)^r = p = (p^r)^l$

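Example of a Proof: adjoints are unique.
Suppose $p$ has another left adjoint, call it $x$. This means $x \bullet p \leq I \leq p \bullet x$.
Now we have $x = x \bullet I \leq x \bullet (p \bullet p^l) = (x \bullet p) \bullet p^l \leq I \bullet p^l = p^l$. Hence $x \leq p^l$.
Similarly $p^l = p^l \bullet I \leq p^l \bullet (p \bullet x) = (p^l \bullet p) \bullet x \leq I \bullet x = x$. Hence $p^l \leq x$.
Since $\leq$ is a partial order, the two inequalities give $x = p^l$.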

Example of a Proof: $\bullet$ is self-dual
We want to show the following (also for the right adjoint): $(p \bullet q)^l = q^l \bullet p^l$.
Compute $(q^l \bullet p^l) \bullet (p \bullet q) = q^l \bullet (p^l \bullet p) \bullet q \leq q^l \bullet I \bullet q = q^l \bullet q \leq I$.
Also $(p \bullet q) \bullet (q^l \bullet p^l) = p \bullet (q \bullet q^l) \bullet p^l \geq p \bullet I \bullet p^l = p \bullet p^l \geq I$.
Hence we have $(q^l \bullet p^l) \bullet (p \bullet q) \leq I \leq (p \bullet q) \bullet (q^l \bullet p^l)$.
So $q^l \bullet p^l$ is the left adjoint to $p \bullet q$, but so is $(p \bullet q)^l$. Since adjoints are unique, we get $q^l \bullet p^l = (p \bullet q)^l$.

Examples of a Pregroup
(0) A pregroup in which $p^l = p^r = p^{-1}$ is a (po)-group.
(1) The set of all unbounded monotone functions on integers: $f : \mathbb{Z} \to \mathbb{Z}$ with $m \leq n \implies f(m) \leq f(n)$ and $m \to \infty \implies f(m) \to \infty$.
The order is defined pointwise: $f \leq g$ iff $f(n) \leq g(n)\ \forall n \in \mathbb{Z}$.
The $\bullet$ is function composition and its unit is the identity: $(f \bullet g)(n) = f(g(n))$ and $I(n) = n$.
Adjoints are defined canonically ($\bigvee$ is max, $\bigwedge$ is min):
$f^r(x) = \bigvee\{y \in \mathbb{Z} \mid f(y) \leq x\}$
$f^l(x) = \bigwedge\{y \in \mathbb{Z} \mid x \leq f(y)\}$

Example
1) Take $f(x) = 2x$. Define adjoints as follows:
$f^r(x) = \bigvee\{y \in \mathbb{Z} \mid 2y \leq x\}$ and $f^l(x) = \bigwedge\{y \in \mathbb{Z} \mid x \leq 2y\}$,
so $f^r(x) = \lfloor x/2 \rfloor$ and $f^l(x) = \lfloor (x+1)/2 \rfloor$, where $\lfloor x \rfloor$ is the biggest integer less than or equal to $x$.
2) Restrict to $\mathbb{N}$ and a nice example is $\pi(x) =$ the $x$'th prime, with $\pi^r(x) = ?$ For instance, $\pi(5) = 11$ and $\pi^r(5) = 3$.
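The closed forms above can be checked by brute force. The sketch below evaluates the max/min definitions of the adjoints over a finite window of candidates, which is an assumption made so that the search terminates (the definitions themselves range over all integers). The helper names `right_adjoint`, `left_adjoint`, `double` and `nth_prime` are illustrative.

```python
def right_adjoint(f, x, candidates):
    """f^r(x) = max{ y | f(y) <= x }, searched over a finite candidate range."""
    return max(y for y in candidates if f(y) <= x)

def left_adjoint(f, x, candidates):
    """f^l(x) = min{ y | x <= f(y) }, searched over a finite candidate range."""
    return min(y for y in candidates if x <= f(y))

def double(y):
    return 2 * y

# 1) f(x) = 2x: compare the definitions with the closed forms.
WINDOW = range(-100, 101)
for x in range(-20, 21):
    assert right_adjoint(double, x, WINDOW) == x // 2        # floor(x / 2)
    assert left_adjoint(double, x, WINDOW) == (x + 1) // 2   # floor((x + 1) / 2)

# 2) pi(x) = the x'th prime, 1-indexed, by trial division.
def nth_prime(n):
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate

print(nth_prime(5))                               # 11
print(right_adjoint(nth_prime, 5, range(1, 30)))  # 3
```

Since $\pi^r(x)$ is the largest index $y$ whose prime does not exceed $x$, it coincides with the number of primes less than or equal to $x$; for $x = 5$ these are 2, 3 and 5.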

Application to Linguistics
Let $\Sigma$ be the set of words of a natural language and $B$ their types.
Def. A pregroup dictionary for $\Sigma$ based on $B$ is a binary relation $D \subseteq \Sigma \times T(B)$, where $T(B)$ is the free pregroup generated over the partial order $B$.
Def. A pregroup grammar is a pair $G = \langle D, s \rangle$ of a pregroup dictionary and a distinguished element $s \in B$.
Def. A string of words $w_1 \dots w_n$ of $\Sigma$ is a grammatical sentence if and only if $t_1 \bullet \cdots \bullet t_n \leq s$ for $(w_i, t_i)$ an element in $D$.

Example
A simple dictionary has basic types $B = \{\pi, o, w, s, q, \bar{q}, j, \sigma\}$:
- $\pi, o, w$ stand for subject, direct object, indirect object,
- $s, j$ stand for statement, infinitive of a verb,
- $q, \bar{q}$ stand for yes-no and wh questions,
- $\sigma$ is an index type.
Partial order: $\pi \leq n$, $o \leq n$.
Dictionary:
John: $\pi$
Mary: $o$
likes: $\pi^r s o^l$
does: $\pi^r s j^l \sigma$
not: $\sigma^r j j^l \sigma$
like: $\sigma^r j o^l$

Examples
Compose the types of the constituents (writing $1$ for the unit $I$).
John likes Mary → statement
$\pi\ (\pi^r s o^l)\ o \leq s$
Compute: $\pi \pi^r s o^l o \leq 1 s o^l o \leq 1 s 1 = s$
John does not like Mary → statement
$\pi\ (\pi^r s j^l \sigma)\ (\sigma^r j j^l \sigma)\ (\sigma^r j o^l)\ o \leq s$
Compute: $\pi \pi^r s j^l \sigma \sigma^r j j^l \sigma \sigma^r j o^l o \leq 1 s j^l 1 j j^l 1 j 1 = s j^l j j^l j \leq s 1 1 = s$
Can you think of a simpler way to compute the above?
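The two reductions can be replayed with a small Python sketch. It is only an illustration under several assumptions: each word gets a single type (a dictionary is a relation in general), "like" is given the type used in the reduction above, basic types must match exactly (the partial order on basic types is ignored), and adjacent pairs are cancelled greedily with a stack, which is enough for these examples but is not a complete parser for pregroup grammars. The names `LEXICON`, `reduce_types` and `is_sentence` are hypothetical.

```python
# A simple type is (base, z): z = -1 for a left adjoint, 0 for the plain
# type, +1 for a right adjoint.
LEXICON = {
    "John":  [("pi", 0)],
    "Mary":  [("o", 0)],
    "likes": [("pi", 1), ("s", 0), ("o", -1)],
    "does":  [("pi", 1), ("s", 0), ("j", -1), ("sigma", 0)],
    "not":   [("sigma", 1), ("j", 0), ("j", -1), ("sigma", 0)],
    "like":  [("sigma", 1), ("j", 0), ("o", -1)],
}

def reduce_types(words):
    """Greedily cancel adjacent pairs p^l p and p p^r; return what is left."""
    stack = []
    for word in words:
        for base, z in LEXICON[word]:
            # (base, z - 1) followed by (base, z) covers both contractions.
            if stack and stack[-1] == (base, z - 1):
                stack.pop()
            else:
                stack.append((base, z))
    return stack

def is_sentence(words):
    """True if the string of types reduces to the statement type s."""
    return reduce_types(words) == [("s", 0)]

print(is_sentence("John likes Mary".split()))          # True
print(is_sentence("John does not like Mary".split()))  # True
print(is_sentence("likes John Mary".split()))          # False
```

Scanning left to right and cancelling as soon as a matching neighbour appears is one possible answer to the question of a simpler way to compute the reductions: no intermediate string with explicit units has to be written out.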

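Depicting the Reduction
Each reduction corresponds to a diagram.
[Diagrams: reduction diagrams for "John likes Mary" with types $\pi$, $\pi^r s o^l$, $o$, and for "John does not like Mary" with types $\pi$, $\pi^r s j^l \sigma$, $\sigma^r j j^l \sigma$, $\sigma^r j o^l$, $o$.]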
