Interacting Conceptual Spaces
Martha Lewis (with: Yaared Al-Mehairi, Joe Bolt, Bob Coecke, Fabrizio Genovese, Dan Marsden, Robin Piedeleu)
Quantum Group, Department of Computer Science, University of Oxford
Dagstuhl Seminar 17192, May 7-12
How can we compose concepts?
How can distributed/neural/conceptual representations be combined with symbolic calculations?
- General strategy
- Representational choices
- Symbolic choices
A general programme for compositional models
1. (a) Choose a compositional structure, such as a pregroup or combinatory categorial grammar.
   (b) Interpret this structure as a category, the grammar category.
2. (a) Choose or craft appropriate meaning or concept spaces, such as vector spaces, density matrices, or conceptual spaces.
   (b) Organize these spaces into a category, the semantics category, with the same abstract structure as the grammar category.
3. Interpret the compositional structure of the grammar category in the semantics category via a functor preserving the necessary structure.
4. Bingo! This functor maps type reductions in the grammar category onto algorithms for composing meanings in the semantics category.
Compact closed categories
Diagrammatic notation:
[Diagrams: a state Clowns ∈ N drawn as a triangle with one output wire N; a morphism f : N → M as a box with input wire N and output wire M; the composite g ∘ f as stacked boxes; tell ∈ N ⊗ S ⊗ N as a state with three output wires; the reduction of 'Clowns tell jokes' as cups (−1n and n−1) connecting the noun wires to the verb's N wires, leaving a single S wire.]
... originally a calculus for quantum theory!
[Diagram: the quantum teleportation protocol between Aleks and Bob, with input state ψ and correction unitaries U_i.]
Coecke, Bob, and Aleks Kissinger. Picturing quantum processes: A first course in quantum theory and diagrammatic reasoning. Cambridge University Press, 2017.
Pregroup grammar (Lambek, 1999)
A pregroup P is a partially ordered monoid in which each element p ∈ P has left and right 'inverses' p−1 and −1p, satisfying

  p−1 · p ≤ 1 ≤ p · p−1   and   p · −1p ≤ 1 ≤ −1p · p

The pregroup grammar uses atomic types n (noun) and s (sentence). Other parts of speech are formed by concatenating atomic types and their inverses, e.g.

  transitive verb = −1n s n−1,   adjective = n n−1
Pregroup grammar
If a string of types reduces to the type s, the sentence is judged grammatical. For 'Clowns tell jokes' → n (−1n s n−1) n:

  n (−1n s n−1) n ≤ 1 · s n−1 n ≤ 1 · s · 1 ≤ s

This reduction can also be expressed in the graphical calculus: [Diagram (1): cups connecting each noun wire to the matching −1n and n−1 wires of the verb, leaving a single s wire.]
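The reduction above can be sketched as a toy type-checker. The encoding of the 'inverses' as ±1 flags and all function names here are my own invention, not part of the talk:

```python
# Toy pregroup type-reduction checker (illustrative only).
# A type is a list of (base, adjoint) pairs: adjoint -1 encodes −1n,
# +1 encodes n−1, and 0 a plain atomic type.

def reduces_to_s(word_types):
    """Repeatedly cancel adjacent pairs p · −1p ≤ 1 and p−1 · p ≤ 1."""
    seq = [t for word in word_types for t in word]
    changed = True
    while changed:
        changed = False
        for i in range(len(seq) - 1):
            (b1, a1), (b2, a2) = seq[i], seq[i + 1]
            if b1 == b2 and ((a1 == 0 and a2 == -1) or (a1 == +1 and a2 == 0)):
                del seq[i:i + 2]   # the pair reduces to the unit
                changed = True
                break
    return seq == [("s", 0)]

# 'Clowns tell jokes':  n  (−1n s n−1)  n  →  s
clowns = [("n", 0)]
tell   = [("n", -1), ("s", 0), ("n", +1)]
jokes  = [("n", 0)]
print(reduces_to_s([clowns, tell, jokes]))  # True
print(reduces_to_s([jokes, clowns]))        # False: no reduction to s
```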
Semantic choices
Vector spaces Density matrices Conceptual spaces
Word meaning is determined by context
[Concordance: corpus contexts with the target word masked, e.g. '...a monstrous crime against the [?] race...', '...the [?] liver synthesizes only enough...', '...sympathy for the problems of [?] beings caught up in the...']
Word meaning is determined by context
[The same concordance lines with the target word revealed: 'human'.]
Word meaning is determined by context
... but that doesn't work for whole sentences. We want the sentence meaning to be a function of the word meanings:

  s = f(w1, w2, ..., wn)
Vector space models of meaning
Words are represented by their co-occurrence counts with context words:

            cuddly  smelly  scaly  teeth  cute
  iguana      1       10      15     7     2

[Plot: 'Wilbur' and 'iguana' as vectors in a space with axes scaly, cuddly, smelly.]

Similarity is given by the cosine of the angle between vectors:

  sim(v, w) = cos(θ_{v,w}) = ⟨v, w⟩ / (‖v‖ ‖w‖)
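A minimal sketch of cosine similarity over co-occurrence vectors; the counts for 'iguana' follow the slide, while the 'Wilbur' vector is invented for illustration:

```python
import numpy as np

# Toy co-occurrence vectors over the context features
# (cuddly, smelly, scaly, teeth, cute).
iguana = np.array([1.0, 10.0, 15.0, 7.0, 2.0])
wilbur = np.array([8.0, 2.0, 0.0, 3.0, 9.0])   # hypothetical counts

def cos_sim(v, w):
    """sim(v, w) = <v, w> / (||v|| ||w||)"""
    return float(v @ w / (np.linalg.norm(v) * np.linalg.norm(w)))

print(cos_sim(iguana, iguana))  # 1.0: a vector is maximally similar to itself
print(cos_sim(iguana, wilbur))  # somewhere in [0, 1] for count vectors
```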
Vector space models of meaning
Atomic types n, s map to vector spaces N, S. Concatenation of types maps to the tensor product ⊗. Reductions are mapped to tensor contraction.

'Clowns tell jokes': n (−1n s n−1) n → [Diagram: the vectors Clowns ∈ N, tell ∈ N ⊗ S ⊗ N, jokes ∈ N, with cups contracting the noun wires and leaving a vector in S.]
Sentences can be directly compared within sentence space S
Sentence meanings are mapped into one space
[Diagram: 'Clowns tell jokes' and 'Sad clowns tell funny jokes' both reduce to vectors in the same sentence space S, where they can be compared.]
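The reduction-as-tensor-contraction can be sketched with numpy.einsum; the dimensions and random entries are placeholders, not trained representations:

```python
import numpy as np

rng = np.random.default_rng(0)
dN, dS = 4, 2                       # toy dimensions for N and S

clowns = rng.random(dN)             # noun vectors in N
jokes  = rng.random(dN)
tell   = rng.random((dN, dS, dN))   # transitive verb as a tensor in N ⊗ S ⊗ N

# The pregroup reduction n (−1n s n−1) n ≤ s becomes contraction:
# the verb's left N wire meets the subject, its right N wire the object,
# leaving a vector in the sentence space S.
sentence = np.einsum("i,isj,j->s", clowns, tell, jokes)
print(sentence.shape)  # (2,): a vector in S, comparable with other sentences
```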
Effective!
Kartsaklis, D., & Sadrzadeh, M. (2013). Prior Disambiguation of Word Tensors for Constructing Sentence Vectors. In EMNLP (pp. 1590-1601).
Grefenstette, E., & Sadrzadeh, M. (2011). Experimenting with transitive verbs in a DisCoCat. In Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics (pp. 62-66). Association for Computational Linguistics.
Grefenstette, E., & Sadrzadeh, M. (2011). Experimental support for a categorical compositional distributional model of meaning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 1394-1404). Association for Computational Linguistics.
Density matrices
We view words as quantum states, combined to form mixed states for ambiguous words:

  I sat on the river [bank]
  I paid some money into the [bank]

The diagrammatic calculus is the same!
Piedeleu, R., Kartsaklis, D., Coecke, B., & Sadrzadeh, M. (2015). Open system categorical quantum semantics in natural language processing. arXiv preprint arXiv:1502.00831.
Positive operators for hyponymy
pet = [built up over animation frames as a mixture of projectors onto exemplars, e.g. dog, cat, rabbit, ...; images omitted]
Graded hyponymy
Positive operators A, B have the Löwner ordering

  A ⊑ B ⟺ B − A is positive.

We say that A is a hyponym of B if A ⊑ B. We say that A is a k-hyponym of B, for k in the range (0, 1], and write A ⊑_k B, if B − kA is positive. We are interested in the maximum such k.

Theorem. For positive self-adjoint matrices A, B such that supp(A) ⊆ supp(B), the maximum k such that B − kA ≥ 0 is given by 1/λ, where λ is the maximum eigenvalue of B⁺A (B⁺ the Moore-Penrose pseudoinverse of B).
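The theorem suggests a direct computation. A sketch assuming numpy, on a toy diagonal example where the answer can be read off by hand:

```python
import numpy as np

def max_hyponymy_k(A, B):
    """Largest k with B - kA positive (assumes supp(A) ⊆ supp(B)):
    k = 1 / lambda_max(B^+ A), B^+ the Moore-Penrose pseudoinverse."""
    lam = max(abs(np.linalg.eigvals(np.linalg.pinv(B) @ A)))
    return 1.0 / lam

# dog as a pure state; pet as a mixture of dog and cat (toy 2-d example).
dog = np.array([[1.0, 0.0], [0.0, 0.0]])
cat = np.array([[0.0, 0.0], [0.0, 1.0]])
pet = 0.6 * dog + 0.4 * cat

k = max_hyponymy_k(dog, pet)
print(round(k, 3))  # 0.6: dog is a 0.6-hyponym of pet here

# Check: pet - k*dog is still positive at the maximum k.
print(np.all(np.linalg.eigvalsh(pet - k * dog) >= -1e-9))  # True
```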
k-Hyponymy interacts well with compositionality
We would like our notion of hyponymy to work at the sentence level. Since sentences are represented as positive operators, we can compare them directly. If sentences have similar structure, we can also give a lower bound on the hyponymy strength between the sentences in terms of the hyponymy strengths between their words.

Example. Suppose dog ⊑_k pet and park ⊑_l field. Then 'My dog runs in the park' is a hyponym of 'My pet runs in the field' to a degree bounded below in terms of k and l.
Conceptual spaces
Conceptual spaces (Gärdenfors, 2014) can provide a more cognitively realistic semantics.

  noun ∈ COLOUR ⊗ SHAPE ⊗ · · ·

[Image, from Moulton et al. (2015)]
Food and drink - the noun space

We define a property to be a convex subset of a domain. Some nouns, as products of convex regions, one per domain:

  banana = [region] × [region] × [region]   (1, 0.2, 0.5)
  apple  = [region] × [region] × [region]   (1, 0.5, 0.8)
  beer   = [region] × [region] × [region]   (1, 0, 0.01)

[Images of the regions omitted; the numbers are coordinate values from the slides.]
Food and drink - the sentence space
A simple example: events are either positive or negative, and surprising or unsurprising. The sentence space consists of pairs: the first element states whether the sentence is positive (1) or negative (0), the second whether it is surprising (1) or unsurprising (0). Sentence meanings are convex subsets of this space: singletons, or larger subsets such as negative = {(0, 1), (0, 0)}.
- Adjectives are relations from nouns to nouns.
- Transitive verbs are relations from pairs of nouns to sentences.
Concepts in interaction
We form sentences and apply the grammatical reductions.

Sweet bananas are good:

  bananas taste sweet
  = (ε_N × 1_S × ε_N)(bananas × taste × sweet)
  = (ε_N × 1_S)(banana × (green banana × {(1, 1)} ∪ yellow banana × {(1, 0)}))
  = {(1, 1), (1, 0)}
  = positive

Sweet beer is not so good:

  beer tastes sweet
  = (ε_N × 1_S × ε_N)(beer × taste × sweet)
  = {(0, 1)}
  = negative and surprising
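The worked example can be sketched with plain Python sets standing in for convex subsets; the exemplar names follow the slide, but the relation encoding is my own simplification:

```python
# Sentence-space points are (positive?, surprising?) pairs.
# Nouns are sets of exemplars; 'taste sweet' is a relation sending
# each exemplar to a subset of the sentence space.
banana = {"green_banana", "yellow_banana"}
beer   = {"lager"}

taste_sweet = {
    "green_banana":  {(1, 1)},   # sweet green banana: good, surprising
    "yellow_banana": {(1, 0)},   # sweet yellow banana: good, unsurprising
    "lager":         {(0, 1)},   # sweet beer: bad, surprising
}

def compose(noun):
    """Apply the relation to each exemplar and take the union
    (the cap ε_N of the diagrammatic calculus, in set-valued form)."""
    return set().union(*(taste_sweet[x] for x in noun))

print(compose(banana))  # {(1, 1), (1, 0)} -> positive
print(compose(beer))    # {(0, 1)} -> negative and surprising
```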
Bolt, J., Coecke, B., Genovese, F., Lewis, M., Marsden, D., & Piedeleu, R. (2017). Interacting Conceptual Spaces I: Grammatical Composition of Concepts. arXiv preprint arXiv:1703.08314.
Neural meaning spaces
[Excerpt from p. 195 of Smolensky (1990), "Distributed representation of symbolic structure": the filler/role product occurs at the junction of two connections; the activities entering a triangular junction from the filler and role units are multiplied together and sent along the third line to the binding unit. Representing complex structures requires superimposing multiple filler/role bindings, either sequentially (one binding at a time, with binding units accumulating activity over time) or in parallel (each sigma-pi binding unit is given N sites, one per filler/role pair, so N bindings can be formed at once). Fig. 10: An extension of the network of Fig. 8 that can perform two variable bindings in parallel.]
Smolensky (1990)
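Smolensky-style tensor product binding and unbinding can be sketched in numpy; the dimensions, random vectors, and the use of pseudo-inverse roles for exact unbinding are illustrative choices, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(1)
d_f, d_r = 8, 8   # filler and role dimensions (toy)

# Two fillers and two roles as random vectors.
f1, f2 = rng.standard_normal(d_f), rng.standard_normal(d_f)
r1, r2 = rng.standard_normal(d_r), rng.standard_normal(d_r)

# Bind each filler to its role by the outer (tensor) product,
# then superimpose the bindings, as in Smolensky (1990).
structure = np.outer(f1, r1) + np.outer(f2, r2)

# Unbind with dual role vectors (columns of the pseudo-inverse);
# exact whenever the role vectors are linearly independent.
R = np.stack([r1, r2])          # roles as rows
U = np.linalg.pinv(R)           # columns are the dual roles
f1_hat = structure @ U[:, 0]
print(np.allclose(f1_hat, f1))  # True
```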
Categorical compositional cognition
Sentences are represented by

  ⊕_i f_i ⊗ r_i ∈ ⊕_i V ⊗ R^(i)

Sentences can be processed by matrix multiplication. We map the direct sum to a tensor product, ⊕_i f_i → ⊗_i f_i:

[Diagram: the fillers f_i in a direct-sum space W, combined via ε_N caps, mapped into a tensor-product space.]
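The distinction between the direct sum (concatenation) and the tensor product, which the map above mediates, can be illustrated in numpy with toy vectors:

```python
import numpy as np

f1 = np.array([1.0, 2.0])
f2 = np.array([3.0, 4.0])

# Direct sum: concatenation; dimensions add (2 + 2 = 4).
direct_sum = np.concatenate([f1, f2])

# Tensor product: dimensions multiply (2 * 2 = 4 here,
# but they grow exponentially with more factors).
tensor = np.kron(f1, f2)

print(direct_sum)  # [1. 2. 3. 4.]
print(tensor)      # [3. 4. 6. 8.]
```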
Al-Mehairi, Y., Coecke, B., & Lewis, M. (2016). Compositional Distributional Cognition. arXiv preprint arXiv:1608.03785.
Syntactic choices
Pregroup grammars are just one choice! Coecke, B., Grefenstette, E., & Sadrzadeh, M. (2013). Lambek vs. Lambek: Functorial vector space semantics and string diagrams for Lambek calculus. Annals of pure and applied logic, 164(11), 1079-1100.
Summary and Further Work
I have laid out a general programme for applying grammatical, or other compositional, structures to words or concepts. I have described three different semantic models we might use within this programme: vector spaces, density matrices, and conceptual spaces.

Work in progress: modelling conjunction and disjunction within this framework, entailment in downward-monotone contexts, experiments in conceptual spaces, other compositional structures...

Any questions?
Thanks to AFOSR, FQXi, EPSRC
Bolt, J., Coecke, B., Genovese, F., Lewis, M., Marsden, D., and Piedeleu, R. (2016). Interacting conceptual spaces. In Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science.
Gärdenfors, P. (2004). Conceptual spaces: The geometry of thought. The MIT Press.
Gärdenfors, P. (2014). The geometry of meaning: Semantics based on conceptual spaces. MIT Press.
Lambek, J. (1999). Type grammar revisited. In Logical aspects of computational linguistics, pages 1-27. Springer.
Moulton, D., Goriely, A., and Chirat, R. (2015). The morpho-mechanical basis of ammonite form. Journal of Theoretical Biology, 364:220-230.
Piedeleu, R., Kartsaklis, D., Coecke, B., and Sadrzadeh, M. (2015). Open system categorical quantum semantics in natural language processing. In Moss, L. S. and Sobocinski, P., editors, 6th Conference on Algebra and Coalgebra in Computer Science, CALCO 2015, volume 35 of LIPIcs, pages 270-289. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik.
Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence, 46(1-2):159-216.