
The big question: How do we infer and reason about meanings of sentences?

  1. The big question: How do we infer and reason about meanings of sentences? Conceptual importance: Discovering the process of cognition and intelligence. Applications: Automating language-related tasks, such as document search.

  2. The big challenge: Meaning of a sentence ≠ collection of meanings of its words: "John likes Mary" ≠ {John, likes, Mary}. Meaning of a sentence = a function of the meanings of its words.
     [Diagram: word 1, word 2, ..., word n, with types A, B, ..., Z, are combined into a sentence of type S by a process depending on grammatical structure.]

  3. Two complementary approaches to meaning
     1- The logical or symbolic model: meaning of a sentence = a truth function of its words; words = ∅.
     2- The vector space or distributional model: words = vectors built from context; function = ∅.
     [Diagram: word 1, word 2, ..., word n, with types A, B, ..., Z.]

  4. Logical vs Vector Space Models
     (I) Logical Models
         Pros: Compositional, Model-theoretic semantics (Montague), Automated inferences.
         Cons: Qualitative (true-false), Not very suitable for real world text, Says very little about lexical semantics, Forgets some of the syntactic structure.
     (II) Vector Space Model
         Pros: Quantitative, All about lexical semantics.
         Cons: Non-compositional.

  5. A formalism with the best of the two: Compositional & Distributional. Meaning of a sentence = a function of the vectors of its words.
     [Diagram: word 1, word 2, ..., word n, with types A, B, ..., Z, are combined into a sentence of type S by a process depending on grammatical structure.]

  6. Compositional Distributional Models of Meaning. Clark, Coecke, Grefenstette, Pulman, Sadrzadeh. Computing and Computer Laboratories, Oxford and Cambridge.

  7. Aim: Understanding this model.
     Theoretical Preliminaries
       0- Some Category Theory
       1- Pregroup Grammars
       2- Vector Space Models
       3- Pregroups and Vector Spaces Categorically
       5- Combining the two: Categorical Semantics for Compositional Distributional Models
     Ed - Concrete: Implementation, Evaluation, Experiments.

  8. Some Category Theory. A category has
     - Objects: A, B, C
     - Morphisms: f : A → B, g : B → C
     - The morphisms must compose: if f : A → B and g : B → C, then ∃ h : A → C such that h = f ; g.
     - Each object has an identity morphism: 1_A : A → A, 1_B : B → B. This is the unit of composition, i.e. for f : A → B we have 1_A ; f = f ; 1_B = f.

  9. Example
     Objects              Morphisms
     systems              processes
     sets                 relations
     sets                 functions
     formulas             proofs
     grammatical types    grammatical reductions
     vector spaces        linear maps

  10. Sets and Relations
     Objects: sets, e.g. A = {x, y}, B = {z, w}, C = {s, t}.
     Morphisms: relations. f : A → B is defined by f ⊆ {(a, b) | a ∈ A, b ∈ B}.
     For instance f : A → B given by f = {(x, z), (x, w), (y, z)} and g : B → C given by g = {(z, s), (w, s)}.

  11. Sets and Relations. Composition: composing relations. If f : A → B and g : B → C, then ∃ h : A → C such that h = f ; g. In general
     f ; g = {(a, c) | ∃ b, (a, b) ∈ f & (b, c) ∈ g}
     For instance in our example f ; g = ?
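
The question on this slide can be checked mechanically. The following is a minimal sketch, assuming relations are encoded as Python sets of pairs; the helper name `compose` is illustrative and not part of the slides.

```python
def compose(f, g):
    """Relational composition f ; g (first f, then g).

    (a, c) is in the result exactly when some b has (a, b) in f and (b, c) in g.
    """
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}


# The relations from the running example.
f = {("x", "z"), ("x", "w"), ("y", "z")}   # f : A -> B
g = {("z", "s"), ("w", "s")}               # g : B -> C

h = compose(f, g)
print(sorted(h))  # [('x', 's'), ('y', 's')]
```

Both x and y reach s (x via either z or w, y via z), and nothing reaches t, so f ; g = {(x, s), (y, s)}.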

  12. Sets and Relations. Identity: diagonal relation 1_A = {(a, a) | a ∈ A}. For our example 1_A = {(x, x), (y, y)} and 1_B = {(z, z), (w, w)}. These must satisfy 1_A ; f = f ; 1_B = f. For instance compute 1_A ; f = {(x, x), (y, y)} ; {(x, z), (x, w), (y, z)} and verify that it is = f.
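
The verification suggested on this slide can be run directly. This is a small sketch under the same assumed encoding (relations as sets of pairs); `compose` and `identity` are illustrative helper names.

```python
def compose(f, g):
    """Relational composition f ; g (first f, then g)."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}


def identity(obj):
    """Diagonal relation 1_A = {(a, a) | a in A}."""
    return {(a, a) for a in obj}


A = {"x", "y"}
B = {"z", "w"}
f = {("x", "z"), ("x", "w"), ("y", "z")}   # f : A -> B

left = compose(identity(A), f)    # 1_A ; f
right = compose(f, identity(B))   # f ; 1_B
print(left == f, right == f)      # True True
```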

  13. Monoidal Category. A category with a binary operation called tensor and denoted by ⊗. This operator acts on two objects and returns their composite A ⊗ B. It also acts on morphisms and turns them parallel: if f : A → Q and g : B → W, then f ⊗ g : A ⊗ B → Q ⊗ W. The tensor has a unit I, that is A ⊗ I = I ⊗ A = A.

  14. Sets and Relations. There is more than one ⊗ here, but for our purposes, given two sets A, B, we take their tensor product to be the cartesian product A ⊗ B = {(a, b) | a ∈ A, b ∈ B}. For our previous example we have A ⊗ B = {(x, z), (x, w), (y, z), (y, w)}. The unit is the singleton set I = {∗}:
     A ⊗ I = A × I = {(a, ∗) | a ∈ A} ≅ {a | a ∈ A} = A
     Tensor on morphisms is cartesian product of relations.
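
The tensor on sets and relations can be sketched in a few lines. This assumes the same encoding as before (sets of pairs), uses the string "*" as a stand-in for the single element ∗ of I, and the function names are illustrative only.

```python
from itertools import product


def tensor_objects(A, B):
    """A ⊗ B as the cartesian product of two sets."""
    return set(product(A, B))


def tensor_relations(f, g):
    """f ⊗ g relates (a, b) to (q, w) exactly when a f q and b g w."""
    return {((a, b), (q, w)) for (a, q) in f for (b, w) in g}


A = {"x", "y"}
B = {"z", "w"}
I = {"*"}                                   # unit: a singleton set
f = {("x", "z"), ("x", "w"), ("y", "z")}    # f : A -> B
g = {("z", "s"), ("w", "s")}                # g : B -> C

AB = tensor_objects(A, B)                   # four pairs, as on the slide
AI = tensor_objects(A, I)                   # {(a, '*') | a in A}
unitor = {a for (a, _) in AI}               # forget the '*' component
fg = tensor_relations(f, g)                 # f ⊗ g : A ⊗ B -> B ⊗ C
print(len(AB), unitor == A, len(fg))
```

Dropping the `"*"` component is the bijection behind A ⊗ I ≅ A; the equality on the slide holds up to this isomorphism.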

  15. Diagrammatic Calculus. The objects and morphisms of a monoidal category are usually depicted as follows:
     [Diagrams of 1_A, f, g ; f, 1_A ⊗ 1_B, f ⊗ 1_C, f ⊗ g, and (f ⊗ g) ; h.]

  16. Diagrammatic Calculus. The elements within the objects (e.g. elements of a set) can be depicted using the unit I as follows: ψ : I → A, π : A → I, π ◦ ψ : I → I.
     [Diagrams of ψ, π, and π ◦ ψ.]
     For instance the morphism I → A can be the element x of A = {x, y}: x : I → {x, y}.

  17. Compact Category. A monoidal category where each object A has a left adjoint A^l and a right adjoint A^r. This means that for each object A, we have 4 morphisms in the category:
     ε^l : A^l ⊗ A → I        ε^r : A ⊗ A^r → I
     η^l : I → A ⊗ A^l        η^r : I → A^r ⊗ A
     Diagrammatically, these morphisms are depicted by: [diagrams of the four morphisms ε^l, ε^r, η^l, η^r.]

  18. Compact Category. These morphisms should satisfy:
     (η^l ⊗ 1_A) ; (1_A ⊗ ε^l) = 1_A
     (1_A ⊗ η^r) ; (ε^r ⊗ 1_A) = 1_A
     (1_{A^l} ⊗ η^l) ; (ε^l ⊗ 1_{A^l}) = 1_{A^l}
     (η^r ⊗ 1_{A^r}) ; (1_{A^r} ⊗ ε^r) = 1_{A^r}
     Diagrammatically, these are depicted by: [diagrams of the four equations, one each for A, A, A^l and A^r.]

  19. Pregroups (P, ≤, •, I, (−)^l, (−)^r)
     ∀ p ∈ P, ∃ p^r ∈ P, ∃ p^l ∈ P:   p^l • p ≤ I ≤ p • p^l   and   p • p^r ≤ I ≤ p^r • p
     Adjoints are unique and anti-tone: p ≤ q ⇒ q^l ≤ p^l, q^r ≤ p^r
     Unit is self adjoint: I^l = I^r = I
     So is multiplication: (p • q)^l = q^l • p^l, (p • q)^r = q^r • p^r
     Same adjoints do not cancel out: (p^r)^r ≠ p ≠ (p^l)^l
     But opposite adjoints do: (p^l)^r = p = (p^r)^l

  20. Example of a Proof: adjoints are unique. Suppose p has another left adjoint, call it x. This means x • p ≤ I ≤ p • x.
     Now we have x = x • I ≤ x • (p • p^l) = (x • p) • p^l ≤ I • p^l = p^l. Hence x ≤ p^l.
     Similarly p^l = p^l • I ≤ p^l • (p • x) = (p^l • p) • x ≤ I • x = x. Hence p^l ≤ x.
     Therefore x = p^l.

  21. Example of a Proof: • is self-dual. We want to show the following (also for the right adjoint): (p • q)^l = q^l • p^l.
     Compute (q^l • p^l) • (p • q) = q^l • (p^l • p) • q ≤ q^l • I • q = q^l • q ≤ I.
     Also (p • q) • (q^l • p^l) = p • (q • q^l) • p^l ≥ p • I • p^l = p • p^l ≥ I.
     Hence we have (q^l • p^l) • (p • q) ≤ I ≤ (p • q) • (q^l • p^l).
     So q^l • p^l is the left adjoint to p • q, but so is (p • q)^l. Since adjoints are unique, we get q^l • p^l = (p • q)^l.

  22. Examples of a Pregroup
     (0) A pregroup in which p^l = p^r = p^{-1} is a (po)-group.
     (1) The set of all unbounded monotone functions on integers, f : Z → Z with m ≤ n ⇒ f(m) ≤ f(n) and m → ∞ ⇒ f(m) → ∞.
         The order is defined pointwise: f ≤ g iff f(n) ≤ g(n) ∀ n ∈ Z.
         The • is function composition and its unit is the identity: (f • g)(n) = f(g(n)) and I(n) = n.
         Adjoints are defined canonically (∨ is max, ∧ is min):
         f^r(x) = ∨{y ∈ Z | f(y) ≤ x}        f^l(x) = ∧{y ∈ Z | x ≤ f(y)}

  23. Example
     1) Take f(x) = 2x. Define adjoints as follows:
        f^r(x) = ∨{y ∈ Z | 2y ≤ x} = ⌊x/2⌋   and   f^l(x) = ∧{y ∈ Z | x ≤ 2y} = ⌊(x + 1)/2⌋
        where ⌊x⌋ is the biggest integer less than or equal to x.
     2) Restrict to N and a nice example is π(x) = the x'th prime, π^r(x) = ? For instance π(5) = 11 and π^r(5) = 3.
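
The closed forms above can be cross-checked against the max/min definitions by brute force. This is only a sketch: the search is restricted to a finite window `[lo, hi]`, which is an assumption that the true max or min lies inside it (it does for the inputs used here), and all function names are illustrative.

```python
def right_adjoint(f, x, lo, hi):
    """f^r(x) = max{ y | f(y) <= x }, searched over the window [lo, hi]."""
    return max(y for y in range(lo, hi + 1) if f(y) <= x)


def left_adjoint(f, x, lo, hi):
    """f^l(x) = min{ y | x <= f(y) }, searched over the window [lo, hi]."""
    return min(y for y in range(lo, hi + 1) if x <= f(y))


def double(x):
    return 2 * x


def nth_prime(n):
    """Return the n-th prime, counting from nth_prime(1) == 2."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate


xs = range(-50, 51)
# Closed forms from the slide: f^r(x) = floor(x/2), f^l(x) = floor((x+1)/2).
formulas_ok = all(
    right_adjoint(double, x, -100, 100) == x // 2
    and left_adjoint(double, x, -100, 100) == (x + 1) // 2
    for x in xs
)
# Pregroup inequalities, pointwise: f^l.f <= id <= f.f^l and f.f^r <= id <= f^r.f
laws_ok = all(
    (2 * x + 1) // 2 <= x <= 2 * ((x + 1) // 2)
    and 2 * (x // 2) <= x <= (2 * x) // 2
    for x in xs
)
print(formulas_ok, laws_ok)
print(nth_prime(5), right_adjoint(nth_prime, 5, 1, 20))  # 11 3
```

On N, π^r(x) = max{y | π(y) ≤ x} counts the primes up to x, which is why π^r(5) = 3.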

  24. Application to Linguistics. Let Σ be the set of words of a natural language and B their types.
     Def. A pregroup dictionary for Σ based on B is a binary relation D ⊆ Σ × T(B), where T(B) is the free pregroup generated over the partial order B.
     Def. A pregroup grammar is a pair G = ⟨D, s⟩ of a pregroup dictionary and a distinguished element s ∈ B.
     Def. A string of words w_1 ... w_n of Σ is a grammatical sentence if and only if t_1 • ··· • t_n ≤ s for (w_i, t_i) an element in D.

  25. Example. A simple dictionary has basic types B = {π, o, w, s, q, q̄, j, σ}:
     π, o, w stand for subject, direct object, indirect object;
     s, j stand for statement, infinitive of a verb;
     q, q̄ stand for yes-no and wh questions;
     σ is an index type.
     Partial order: π ≤ n, o ≤ n.
     Dictionary:
       John:  π
       Mary:  o
       likes: π^r s o^l
       does:  π^r s j^l σ
       not:   σ^r j j^l σ
       like:  σ^r j o^l

  26. Examples. Compose the types of the constituents.
     John likes Mary → statement: π (π^r s o^l) o ≤ s
       Compute: π π^r s o^l o ≤ 1 s o^l o ≤ 1 s 1 = s
     John does not like Mary → statement: π (π^r s j^l σ) (σ^r j j^l σ) (σ^r j o^l) o ≤ s
       Compute: π π^r s j^l σ σ^r j j^l σ σ^r j o^l o ≤ 1 s j^l 1 j j^l 1 j 1 = s j^l j j^l j ≤ s 1 1 = s
     Can you think of a simpler way to compute the above?
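
The two reductions above can be reproduced with a small program. This is a hedged sketch, not a general pregroup parser: each simple type is encoded as a pair `(base, z)` with z = -1 for a left adjoint, 0 for a basic type and +1 for a right adjoint; the type of "like" is taken from the worked reduction (σ^r j o^l); the partial order on basic types is ignored; and contractions are applied greedily left to right, which suffices for these examples but can miss reductions in ambiguous cases. `LEXICON`, `reduce_types` and `is_sentence` are illustrative names.

```python
# Simple types as (base, z): z = -1 is a left adjoint, 0 basic, +1 a right adjoint.
LEXICON = {
    "John":  [("π", 0)],
    "Mary":  [("o", 0)],
    "likes": [("π", 1), ("s", 0), ("o", -1)],            # π^r s o^l
    "does":  [("π", 1), ("s", 0), ("j", -1), ("σ", 0)],  # π^r s j^l σ
    "not":   [("σ", 1), ("j", 0), ("j", -1), ("σ", 0)],  # σ^r j j^l σ
    "like":  [("σ", 1), ("j", 0), ("o", -1)],            # σ^r j o^l
}


def reduce_types(types):
    """Greedily contract adjacent pairs x^(z-1) x^(z) -> 1, scanning left to right.

    This covers both p^l p <= 1 and p p^r <= 1.
    """
    stack = []
    for base, z in types:
        if stack and stack[-1] == (base, z - 1):
            stack.pop()              # contraction: the adjacent pair cancels
        else:
            stack.append((base, z))
    return stack


def is_sentence(words):
    """True if the concatenated word types reduce to the single type s."""
    types = [t for w in words for t in LEXICON[w]]
    return reduce_types(types) == [("s", 0)]


print(is_sentence("John likes Mary".split()))           # True
print(is_sentence("John does not like Mary".split()))   # True
print(is_sentence("John likes".split()))                # False
```

The stack-based scan is one answer to the closing question on the slide: each incoming type either cancels against the top of the stack or is pushed, so the whole string is reduced in a single pass.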

  27. Depicting the Reduction. Each reduction corresponds to a diagram.
     [Diagram: John likes Mary, with types π, π^r s o^l, o.]
     [Diagram: John does not like Mary, with types π, π^r s j^l σ, σ^r j j^l σ, σ^r j o^l, o.]
