Residuated lattices in syntactic description
Alexander Clark

Introduction Syntactic concept lattice Distributional lattice grammars Categorial grammars Synthesis Residuated lattices in syntactic description Alexander Clark Department of Computer Science Royal Holloway, University of London June 2011


SLIDE 1

Residuated lattices in syntactic description

Alexander Clark

Department of Computer Science Royal Holloway, University of London

June 2011 QMUL

SLIDE 2

Most important problem of linguistics

Chomsky’s questions

1. What constitutes knowledge of a language?
2. How is this knowledge acquired by its speakers?

SLIDE 4

Tension

Chomsky, 1986
To achieve descriptive adequacy it often seems necessary to enrich the system of available devices, whereas to solve our case of Plato’s problem we must restrict the system of available devices so that only a few languages or just one are determined by the given data. It is the tension between these two tasks that makes the field an interesting one, in my view.

Boeckx and Piattelli-Palmarini (2005)
"the primary contribution of P&P, in the present connection, was to divorce questions of learning entirely from the question of the “format for grammar”"

SLIDE 5

Distributional lattice grammars

A recently developed grammatical formalism that tries to resolve this tension:
  • efficient learnability
  • cubic parsing algorithm
  • slightly context sensitive
  • based on a residuated lattice (the syntactic concept lattice)

SLIDE 6

Empiricist models

Slogan
The structure of the representation should be based on the structure of the language, not something arbitrarily imposed on it from outside.

Congruence based approaches:
  • DFAs based on the Myhill-Nerode congruence (Angluin, 1982, 1987)
  • CFGs based on the syntactic congruence (Clark and Eyraud, 2007; Clark, 2010)
  • MCFGs based on congruence of tuples (Yoshinaka, 2009)

Lattice based approaches based on the syntactic concept lattice.

SLIDE 7

Recall (or otherwise)

Monoid
⟨M, ◦, 1⟩: ◦ is associative and 1 ◦ u = u = u ◦ 1
Example: strings, u ◦ v = uv, 1 is the empty string

Bounded lattice
⟨M, ∧, ∨, ⊤, ⊥⟩
Example: powerset lattice 2^X, ∨ = ∪, ∧ = ∩, ⊥ = ∅

Lattice ordered monoid
M is a lattice and a monoid such that X ≤ Y and P ≤ Q imply X ◦ P ≤ Y ◦ Q

SLIDE 8

Slightly stronger condition

Residuation operations
X ◦ Y ≤ Z iff X ≤ Z/Y iff Y ≤ X\Z
Z/Y = max{X | X ◦ Y ≤ Z}

Example
Set of all subsets of a monoid: X ◦ Y = {xy | x ∈ X, y ∈ Y}
Specifically, if the monoid is Σ∗ we have the lattice of all languages over Σ, with 1 = {λ}, ⊤ = Σ∗, ⊥ = ∅
Denote this by 2^Σ∗

SLIDE 9

Residuated lattices

Appear twice
  • Algebraic underpinning for DLGs
  • Models for substructural logics and the Lambek calculus

Questions in this talk
  • Is this a coincidence?
  • What is the relationship?
  • How does this relate to the proof theory/model theory argument?

SLIDE 11

Rules of inference

Correspondence
Inferences in the associative Lambek calculus AL are theorems about residuated lattices: the Lambek calculus is sound w.r.t. residuated lattices.

Lambek calculus          Residuated lattices
x(yz) → (xy)z            (X ◦ Y) ◦ Z = X ◦ (Y ◦ Z)
(x/y)y → x               (X/Y) ◦ Y ≤ X
x → y/(x\y)              X ≤ Y/(X\Y)
(x/y)(y/z) → x/z         (X/Y) ◦ (Y/Z) ≤ X/Z
?                        X ≤ X ◦ (Y/Y)
                         X ∧ Y ≤ X
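
As a short worked check, the three inequalities in the residuated-lattice column can be derived from the residuation law X ◦ Y ≤ Z iff X ≤ Z/Y iff Y ≤ X\Z, together with monotonicity and associativity of ◦. This derivation is an illustrative sketch and is not taken from the slides.

```latex
\begin{align*}
X/Y \le X/Y
  &\iff (X/Y)\circ Y \le X \\
X\backslash Y \le X\backslash Y
  &\iff X\circ(X\backslash Y) \le Y
   \iff X \le Y/(X\backslash Y) \\
(X/Y)\circ(Y/Z)\circ Z \le (X/Y)\circ Y \le X
  &\implies (X/Y)\circ(Y/Z) \le X/Z
\end{align*}
```

The last line uses (Y/Z) ◦ Z ≤ Y, which is the first line with the variables renamed.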

SLIDE 12

Distributions

The substring context relation

Context (or environment)
A context is just a pair of strings (l, r) ∈ Σ∗ × Σ∗.
(l, r) ⊙ u = lur
(l, r) ⊙ (x, y) = (lx, yr)

Distribution of a string in a language
(l, r) ∼_L u iff lur ∈ L
C_L(u) = {(l, r) | lur ∈ L} = {f | f ⊙ u ∈ L}
(λ, λ) ∈ C_L(u) iff u ∈ L
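
The distribution of a string can be computed directly when the language is a finite set. The sketch below is illustrative only: the function name `contexts` and the finite fragment of (ab)∗ are assumptions made for the example, not part of the talk.

```python
def contexts(language, u):
    """Return every context (l, r) such that l + u + r is in the finite set `language`."""
    found = set()
    for w in language:
        start = w.find(u)
        while start != -1:
            # Each occurrence of u inside w contributes one context.
            found.add((w[:start], w[start + len(u):]))
            start = w.find(u, start + 1)
    return found


# A finite fragment of (ab)*, chosen only for illustration.
fragment = {"", "ab", "abab"}
print(contexts(fragment, "ab"))
print(contexts(fragment, "ba"))  # {('a', 'b')}
```

The empty context ("", "") is in the result exactly when u itself belongs to the language, matching the last line of the slide.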

SLIDE 13

Syntactic Concept Lattice

Galois connection of the substring context relation

S is a set of strings, and C is a set of contexts.

Polar maps
S′ = {(l, r) : ∀w ∈ S, lwr ∈ L}
C′ = {w : ∀(l, r) ∈ C, lwr ∈ L}
L = {(λ, λ)}′

Closure operator
S′′ ⊇ S
If S′′ = S then S is a closed set of strings
Always true that S′′′ = S′
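
The polar maps can be tried out on finite sets of strings and contexts, given a membership test for L. This is a minimal sketch under stated assumptions: L = (ab)∗, and the finite sets `K` and `F` as well as all function names are hypothetical choices for illustration.

```python
def in_ab_star(w):
    """Membership test for L = (ab)*."""
    return len(w) % 2 == 0 and w == "ab" * (len(w) // 2)


def polar_of_strings(S, F, member):
    """S' restricted to F: the contexts in F that accept every string in S."""
    return {(l, r) for (l, r) in F if all(member(l + w + r) for w in S)}


def polar_of_contexts(C, K, member):
    """C' restricted to K: the strings in K accepted by every context in C."""
    return {w for w in K if all(member(l + w + r) for (l, r) in C)}


# Hypothetical finite sets of strings and contexts.
K = {"", "a", "b", "ab", "ba", "abab"}
F = {("", ""), ("a", "b"), ("", "b"), ("a", "")}

S = {"ba"}
S1 = polar_of_strings(S, F, in_ab_star)     # S'
S2 = polar_of_contexts(S1, K, in_ab_star)   # S''
S3 = polar_of_strings(S2, F, in_ab_star)    # S'''
print(S1, S2, S3 == S1)
```

Here S′′ adds the empty string to {ba}, since λ also occurs in the context (a, b), and S′′′ coincides with S′ as the slide states.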

SLIDE 14

Concepts

Formal Concept Analysis

Concept
A syntactic concept is an ordered pair ⟨S, C⟩ where C′ = S and S′ = C.
Alternatively: maximal sets such that C ⊙ S ⊆ L.
Defined equally by closed sets of strings.

SLIDE 15

Basic properties

Partial order
⟨S1, C1⟩ ≤ ⟨S2, C2⟩ iff S1 ⊆ S2 iff C1 ⊇ C2

Lattice
The set of concepts of a language forms a complete lattice
⟨Sx, Cx⟩ ∧ ⟨Sy, Cy⟩ = ⟨Sx ∩ Sy, (Sx ∩ Sy)′⟩
Finite iff L is regular

Typical concepts
C(w) = ⟨{w}′′, {w}′⟩

Language
L = ⟨L, {(λ, λ)}′′⟩ = C(L) = C((λ, λ))

Top
⊤ = ⟨Σ∗, ∅⟩

Bottom
⊥ = ⟨∅, Σ∗ × Σ∗⟩

Unit
1 = C(λ)

SLIDE 16

L = (ab)∗

[Lattice diagram; its nodes are listed below.]
⊥ = ⟨∅, Σ∗ × Σ∗⟩
⟨[a], [λ, b]⟩
⟨[b], [a, λ]⟩
L = ⟨[ab] ∪ [λ], [λ, λ]⟩
⟨[ba] ∪ [λ], [a, b]⟩
1 = ⟨[λ], [a, b] ∪ [λ, λ]⟩
⊤ = ⟨Σ∗, ∅⟩

SLIDE 17

Concatenation

Definition
⟨Sx, Cx⟩ ◦ ⟨Sy, Cy⟩ = ⟨(SxSy)′′, (SxSy)′⟩ = C(SxSy)
The smallest concept that contains the concatenation of the sets Sx and Sy.

Observation
w = a1 . . . an is in L iff C(a1) ◦ · · · ◦ C(an) ≤ L

SLIDE 18

Difference

L = {ab, c}

Free RL 2^Σ∗
{a} ◦ {b} = {ab}

Syntactic Concept Lattice
{a} ◦ {b} = {ab, c}
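
The difference between the two products can be reproduced on a small scale. The sketch below assumes finite sets `K` and `F` (hypothetical choices that happen to be large enough for this language) and computes the closure S′′ relative to them; the names are illustrative only.

```python
LANG = {"ab", "c"}

# Hypothetical finite sets of strings and contexts for this example.
K = {"a", "b", "c", "ab"}
F = {("", ""), ("a", ""), ("", "b")}


def closure(strings):
    """Return strings'' computed relative to K and F."""
    shared = {(l, r) for (l, r) in F if all(l + w + r in LANG for w in strings)}
    return {w for w in K if all(l + w + r in LANG for (l, r) in shared)}


# Product in the lattice of all languages: plain concatenation of sets.
free_product = {x + y for x in {"a"} for y in {"b"}}

# Product in the syntactic concept lattice: close the concatenation.
concept_product = closure(free_product)

print(free_product)     # {'ab'}
print(concept_product)
```

The closure adds c because ab and c share the only context of ab in F, namely (λ, λ).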

SLIDE 20

Residuated lattice

This is a complete residuated lattice, written B(L).
Concatenation is a monoid: associative and with unit C({λ}).
Suppose X = ⟨Sx, Cx⟩ and Y = ⟨Sy, Cy⟩ are concepts.

X/Y = C(Cx ⊙ (λ, Sy))
Y \ X = C(Cx ⊙ (Sy, λ))

Residuation
  • Suppose all elements of X occur in a context (l, r);
  • Suppose u is in X/Y and v is in Y
  • So uv occurs in (l, r): luvr ∈ L
  • u must occur in context (l, vr) which is (l, r) ⊙ (λ, v)
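
The two residual formulas can be evaluated on a finite sample. The sketch below assumes the Dyck language over {a, b} (a as the opening bracket) and a small hypothetical string set `K`; `over` and `under` are illustrative names, and the results are only the parts of the residuals that fall inside `K`.

```python
def is_dyck(w):
    """Membership test for the Dyck language, with 'a' opening and 'b' closing."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0


K = {"", "a", "b", "ab"}  # hypothetical finite set of strings


def strings_of(contexts):
    """Strings of K that occur in every one of the given contexts."""
    return {w for w in K if all(is_dyck(l + w + r) for (l, r) in contexts)}


def over(Cx, Sy):
    """Strings (within K) of X/Y, using the contexts Cx ⊙ (λ, Sy)."""
    return strings_of({(l, v + r) for (l, r) in Cx for v in Sy})


def under(Sy, Cx):
    """Strings (within K) of Y \\ X, using the contexts Cx ⊙ (Sy, λ)."""
    return strings_of({(l + v, r) for (l, r) in Cx for v in Sy})


L_contexts = {("", "")}
print(over(L_contexts, {"b"}))   # {'a'}
print(under({"a"}, L_contexts))  # {'b'}
```

So within this sample L/B contains a and A\L contains b, and concatenating a string of L/B with b gives a string of L, as the residuation argument requires.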

SLIDE 21

Rulon Wells

Immediate Constituents, Language 1947
It is easy to define a focus-class embracing a large variety of sequence classes but characterized by only a few environments; it is also easy to define one characterized by a great many environments in which all its members occur but on the other hand poor in the number of diverse sequence-classes that it embraces. What is difficult, but far more important than either of the easy tasks, is to define focus-classes rich both in the number of environments characterizing them and at the same time in the diversity of sequence classes that they embrace.

  • Concepts high up in the lattice have a few contexts, but lots of strings
  • Concepts low down have a larger number of contexts, but only a few strings.

SLIDE 22

[Lattice diagram; node labels: me/him/us/them/(7), (2), her/(1), it/(1), you/(1), (0), (16), my/his/our/their/(5), (7), (4), he/she/(3), I/(1), we/they/(3)]

SLIDE 23

Finite Representation

Distributional Lattice Grammars
Fix three finite sets:
  • Finite set of strings K
  • Finite set of contexts F
  • Finite subset L ∩ (F ⊙ KK)
The lattice of this finite relation: B(K, L, F)

SLIDE 24

Dyck language

λ, ab, abab, aabb, abaabb . . .

Strings: λ, a, b, ab
Contexts: (λ, λ), (a, λ), (λ, b)

[Lattice diagram; its nodes are listed below.]
⊤
L = ⟨{λ, ab}, {(λ, λ)}⟩
A = ⟨{a}, {(λ, b)}⟩
B = ⟨{b}, {(a, λ)}⟩
⊥
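
The five concepts on this slide can be recovered by brute force. The sketch below assumes that the strings λ, a, b, ab play the role of K and the three contexts play the role of F; it enumerates every subset of F, takes its polar, and collects the resulting concepts. All names are illustrative.

```python
from itertools import combinations


def is_dyck(w):
    """Membership test for the Dyck language, with 'a' opening and 'b' closing."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0


K = ["", "a", "b", "ab"]              # assumed string set
F = [("", ""), ("a", ""), ("", "b")]  # assumed context set


def strings_of(contexts):
    return frozenset(w for w in K if all(is_dyck(l + w + r) for (l, r) in contexts))


def contexts_of(strings):
    return frozenset((l, r) for (l, r) in F if all(is_dyck(l + w + r) for w in strings))


def all_concepts():
    """Every concept has the form (C', C'') for some subset C of F."""
    found = set()
    for n in range(len(F) + 1):
        for C in combinations(F, n):
            S = strings_of(C)
            found.add((S, contexts_of(S)))
    return found


lattice = all_concepts()
for S, C in sorted(lattice, key=lambda concept: len(concept[0])):
    print(sorted(S), sorted(C))
```

Enumerating subsets of F is exponential in |F|, which is acceptable for three contexts but is not how a practical implementation would proceed.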

SLIDE 25

Goal

Predict which concept a string is in:
  • Define function φ : Σ∗ → B(K, L, F)
  • A string w is in the language if φ(w) has the context (λ, λ).
  • We want φ(w) = ⟨S, C⟩ to mean that C_L(w) ∩ F = C.

SLIDE 26

Representation

Definition
φ : Σ∗ → B(K, L, F).
  • φ(λ) = C(λ)
  • for all a ∈ Σ (i.e. for all w, |w| = 1), φ(a) = C(a)
  • for all w with |w| > 1, φ(w) = ⋀_{u,v ∈ Σ+ : uv = w} φ(u) ◦ φ(v)

Language
Define L(B(K, L, F)) = {w : φ(w) ≤ C({(λ, λ)})}
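
The recursive definition of φ can be sketched as a memoised function. The example below is an assumption-laden illustration rather than the author's implementation: it uses the Dyck language with K = {λ, a, b, ab} and F = {(λ, λ), (a, λ), (λ, b)}, represents each concept by its closed set of strings, and calls `is_dyck` as a stand-in for looking up the finite table L ∩ (F ⊙ KK).

```python
from functools import lru_cache


def is_dyck(w):
    """Stand-in membership oracle: Dyck language with 'a' opening, 'b' closing."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0


K = frozenset({"", "a", "b", "ab"})
F = frozenset({("", ""), ("a", ""), ("", "b")})


def ctx(strings):
    """Contexts of F shared by all the given strings."""
    return frozenset((l, r) for (l, r) in F if all(is_dyck(l + w + r) for w in strings))


def close(strings):
    """Closed subset of K generated by the given strings."""
    shared = ctx(strings)
    return frozenset(w for w in K if all(is_dyck(l + w + r) for (l, r) in shared))


def concat(Sx, Sy):
    """Concept product: close the pairwise concatenations."""
    return close({u + v for u in Sx for v in Sy})


@lru_cache(maxsize=None)
def phi(w):
    if len(w) <= 1:
        return close({w})
    result = K  # top element; the meet is intersection of closed string sets
    for i in range(1, len(w)):
        result = result & concat(phi(w[:i]), phi(w[i:]))
    return result


def accepts(w):
    """w is accepted when phi(w) has the context (λ, λ)."""
    return ("", "") in ctx(phi(w))


print(accepts("aabb"), accepts("abab"), accepts("abba"))  # True True False
```

With memoisation there are quadratically many substrings and linearly many splits per substring, which is consistent with the cubic parsing bound mentioned earlier in the talk when the lattice operations are treated as constant-time.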

SLIDE 27

Solution to Chomsky’s tension

Expressivity
  • Includes all regular languages
  • Some but not all context free languages
  • Some non context free languages

Learnability
Given membership queries: polynomial update time learnability for these languages

SLIDE 28

Lambek grammars

Types
  • We have a finite or countable set of primitive types with a distinguished element S
  • /, \, ·
  • Tp is the infinite set of types

Universal inference system

Lexicon
Lex ⊂ Σ × Tp
All of the language variation is in the lexicon.

SLIDE 30

Models for the Lambek Calculus

Buszkowski, 1978/1982, Pentus, 1995

Free models
  • Models are just for the calculus not for the language
  • Canonical model is just the lattice of all languages over Σ
  • Other types of model possible

A little strange
  • Lecomte: “the interpretation of a category should be the set of words and expressions of this category, shouldn’t it?”
  • If a has type T then a should be in the interpretation of T
  • We want a model for Lambek grammar not a model for Lambek calculus.
  • We don’t want the calculus to be complete. (There are things that are true about French but not about English)

SLIDE 31

Lambek, 1958

“We shall assign type n to all expressions which can occur in any context in which all proper names can occur.”
  • Let N be the set of proper names (not necessarily closed)
  • N′ is then the set of contexts that all proper names can occur in.
  • N′′ is the set of all strings which can occur in every context in N′
  • n is assigned to N′′ which is a closed set.

“if we write = instead of ⇄ the deductive system studied here becomes a partially ordered system which resembles a residuated lattice.”

SLIDE 33

Intended interpretation

  • L, the language, is a concept in B(L)
  • If Y is a concept and X is any set of strings then Y/X is a concept
  • Every (product-free) type then corresponds to an element of B(L).

Lambek grammar should be thought of as an equational theory of the syntactic concept lattice.

SLIDE 34

Unification

Let’s try to unify these two approaches.
  • Proof theoretic: Lambek calculus
  • Algebraic/Model theoretic

Fixed language L
  • Lambek grammar G for L as a theory
  • View B(L) as a model for G

SLIDE 35

Move the language into the calculus

  • For each letter a define a “symbol” A.
  • Replace (a, T) ∈ Lex by the inequality A ≤ T.
  • Define E_LEX = {A ≤ T | (a, T) ∈ Lex}
  • Then w = a1 . . . an ∈ L iff A1 · · · An ≤ L follows from E_LEX

Replacing equational theory with quasi-equational theory

SLIDE 36

Concepts and Contexts in DLGs

Each context can be defined by a term

Translation from a DLG
Context (l, r) ∈ F                          Term l\L/r
Concept C = {(l1, r1) . . . (lk, rk)}′      C = (l1\L/r1) ∧ . . . ∧ (lk\L/rk)

A pure formalism: only have types for L and elements of Σ

SLIDE 37

Partial lattice to whole lattice

Problem: we only have the partial finite lattice B(K, L, F) but we are interested in B(L)

Canonical map
f∗ : B(K, L, F) → B(L), f∗(⟨S, C⟩) = ⟨S′′, S′⟩

Lemma
If X ≤ Y then f∗(X) ≤ f∗(Y)
If K is sufficiently large:
f∗(X ∧ Y) = f∗(X) ∧ f∗(Y).
f∗(X ◦ Y) ≥ f∗(X) ◦ f∗(Y).

SLIDE 38

DLG

Equations
A ≤ X
X ◦ Y ≤ Z
X ∧ Y ≤ Z
Let E_DLG be the set of equations from a DLG.

Language definition
w ∈ L iff φ_G(w) ≤ L
w ∈ L iff w ≤ L follows from E_DLG

Mapping
⟨1, ◦, ∨⟩-homomorphism from 2^Σ∗ → B(L) defined by X ↦ ⟨X′′, X′⟩

SLIDE 39

Dyck language

λ, ab, abab, aabb, abaabb . . .

L = ⟨{λ, ab}, {(λ, λ)}⟩
A = ⟨{a}, {(λ, b)}⟩
B = ⟨{b}, {(a, λ)}⟩
⊤
⊥

Concatenation table (row ◦ column):

◦   ⊤   L   A   B   ⊥
⊤   ⊤   ⊤   ⊤   ⊤   ⊥
L   ⊤   L   A   B   ⊥
A   ⊤   A   ⊤   L   ⊥
B   ⊤   B   ⊤   ⊤   ⊥
⊥   ⊥   ⊥   ⊥   ⊥   ⊥

SLIDE 40

Example

Dyck language
(λ, λ)   (a, λ)   (λ, b)
L        A\L      L/B

Equations
A ≤ L/B, B ≤ A\L
L ◦ L ≤ L, L ◦ B ≤ B, B ◦ B ≤ ⊤ . . .

Proof that aabb ∈ L
A ◦ A ◦ B ◦ B ≤ A ◦ L ◦ B ≤ A ◦ B ≤ L

SLIDE 41

Church-Rosser systems

Presentation
  • Finite presentation of a monoid
  • Set of generators p, q and an equation p ◦ q = 1
  • Gives you the bicyclic monoid: the syntactic monoid of the Dyck language

Semi-Thue or reduction system
Exactly the same as the string rewriting rule pq → λ, which gives a Church-Rosser language.

This is similar but with residuated lattices rather than monoids.
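
The rewriting rule pq → λ is easy to run by hand or in code. The sketch below is illustrative (the function name `reduce_word` is an assumption): it repeatedly deletes occurrences of pq until none remain, and a word belongs to the Dyck language over {p, q} exactly when it reduces to the empty string.

```python
def reduce_word(w, lhs="pq"):
    """Apply the rule lhs -> empty string until no occurrence of lhs remains."""
    while lhs in w:
        w = w.replace(lhs, "")
    return w


print(repr(reduce_word("ppqqpq")))  # ''
print(repr(reduce_word("qppq")))    # 'qp'
```

Irreducible words have the form q…qp…p, which are the normal forms of the elements of the bicyclic monoid.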

SLIDE 42

Semantics

Inconsistent types
Ambiguous words will get inconsistent types:
Example: “rose” noun, verb, adjective
If X ≤ N and X ≤ V then X ≤ N ∧ V

SLIDE 43

Conclusion

  • Languages have an objectively defined natural algebraic structure that is a residuated lattice: the syntactic concept lattice.
  • If you are using the theory of the calculus of residuals to describe a language, then you should describe this structure.
  • No constraint in Lambek grammar that makes the symbols correspond to concepts.
  • Move from the equational theory to the quasi-equational theory.