Distributional Learning of Context-Free Grammars. Alexander Clark




slide-1
SLIDE 1

Distributional Learning of Context-Free Grammars.

Alexander Clark

Department of Philosophy King’s College London alexander.clark@kcl.ac.uk

14 November 2018 UCL

slide-2
SLIDE 2

Outline

Introduction Weak Learning Strong Learning An Algebraic Theory of CFGs

slide-3
SLIDE 3

Outline

Introduction Weak Learning Strong Learning An Algebraic Theory of CFGs

slide-4
SLIDE 4

Machine learning

Standard machine learning problem

We learn a function f : X → Y from a sequence of input-output pairs (x₁, y₁), ..., (xₙ, yₙ).

Convergence

As n → ∞ we want our hypothesis f̂ to tend to f. Ideally we want f̂ = f.

slide-5
SLIDE 5

Vector spaces

Two standard assumptions

  1. Assume the sets have some algebraic structure:
     ◮ X is ℝⁿ
     ◮ Y is ℝ

  2. Assume f satisfies some smoothness assumptions:
     ◮ f is linear
     ◮ or satisfies some Lipschitz condition: |f(xᵢ) − f(xⱼ)| ≤ c|xᵢ − xⱼ|

slide-6
SLIDE 6

◮ The input examples are strings. ◮ No output (unsupervised learning!) ◮ Our representations are context-free grammars.

slide-7
SLIDE 7

Context-Free Grammars

Context-Free Grammar

G = ⟨Σ, V, S, P⟩
L(G, A) = {w ∈ Σ∗ | A ⇒∗_G w}

Example

Σ = {a, b}, V = {S}
P = {S → ab, S → aSb, S → ε}
L(G, S) = {aⁿbⁿ | n ≥ 0}
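The example grammar can be checked mechanically. The following is a minimal sketch, not part of the original slides: the names `PRODUCTIONS` and `generate` are illustrative, and the empty Python string stands for ε. It enumerates, by breadth-first rewriting of the leftmost nonterminal, every terminal string of bounded length that S derives. The pruning is enough for this grammar because each sentential form contains at most one S.

```python
from collections import deque

# Productions of the example grammar; "" stands for the empty string ε.
PRODUCTIONS = {"S": ["ab", "aSb", ""]}

def generate(start="S", max_len=6):
    """Return all terminal strings of length <= max_len derivable from start."""
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal in the sentential form, if any.
        pos = next((i for i, c in enumerate(form) if c in PRODUCTIONS), None)
        if pos is None:
            results.add(form)  # no nonterminal left: a word of the language
            continue
        for rhs in PRODUCTIONS[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Prune forms that already contain too many terminals.
            terminals = sum(1 for c in new if c not in PRODUCTIONS)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(results, key=lambda w: (len(w), w))

print(generate(max_len=6))  # ['', 'ab', 'aabb', 'aaabbb']
```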

slide-8
SLIDE 8

Least fixed point semantics

[Ginsburg and Rice(1962)]

Interpret this as a set of equations in P(Σ∗):
S = (a ◦ b) ∨ (a ◦ S ◦ b) ∨ ε

slide-9
SLIDE 9

Least fixed point semantics

[Ginsburg and Rice(1962)]

Interpret this as a set of equations in P(Σ∗):
S = (a ◦ b) ∨ (a ◦ S ◦ b) ∨ ε

◮ Ξ is the set of functions V → P(Σ∗)
◮ Φ_G : Ξ → Ξ

Φ_G(ξ)[S] = (a ◦ b) ∨ (a ◦ ξ(S) ◦ b) ∨ ε

Least fixed point

ξ_G = ⋁_n Φ_G^n(ξ_⊥) = {S → L(G, S)}
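The fixed-point construction can be imitated on a finite fragment of P(Σ∗). This is a sketch under one stated assumption: all sets are truncated to strings of length at most `max_len`, so that the iteration terminates. The function names `phi` and `least_fixed_point` are illustrative. Starting from ξ_⊥ (S mapped to the empty set), Φ_G is applied until nothing changes. Because Φ_G is monotone the iterates form an ascending chain, so the last iterate is also their join.

```python
def phi(xi, max_len):
    """One application of Φ_G for S = (a ∘ b) ∨ (a ∘ S ∘ b) ∨ ε,
    truncated to strings of length <= max_len."""
    s = xi["S"]
    new = {"ab", ""} | {"a" + w + "b" for w in s}
    return {"S": {w for w in new if len(w) <= max_len}}

def least_fixed_point(max_len=6):
    xi = {"S": set()}  # ξ_⊥: the nonterminal S is mapped to the empty set
    while True:
        nxt = phi(xi, max_len)
        if nxt == xi:  # Φ_G(ξ) = ξ: fixed point reached
            return xi
        xi = nxt

print(sorted(least_fixed_point(6)["S"], key=len))  # ['', 'ab', 'aabb', 'aaabbb']
```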

slide-10
SLIDE 10

What Algebra?

Monoid: ⟨S, ◦, 1⟩

Σ∗

slide-11
SLIDE 11

What Algebra?

Monoid: ⟨S, ◦, 1⟩

Σ∗

Complete Idempotent Semiring: ⟨S, ◦, 1, ∨, ⊥⟩

P(Σ∗)

slide-12
SLIDE 12

Outline

Introduction Weak Learning Strong Learning An Algebraic Theory of CFGs

slide-13
SLIDE 13

Running example

Propositional logic

Alphabet

rain, snow, hot, cold, danger    A1, A2, . . .
and, or, implies, iff            ∧, ∨, →, ↔
not                              ¬
open, close                      (, )

slide-14
SLIDE 14

Running example

Propositional logic

Alphabet

rain, snow, hot, cold, danger    A1, A2, . . .
and, or, implies, iff            ∧, ∨, →, ↔
not                              ¬
open, close                      (, )

◮ rain ◮ open snow implies cold close ◮ open snow implies open not hot close close

slide-15
SLIDE 15

Distributional Learning

[Harris(1964)]

◮ Look at the dog ◮ Look at the cat

slide-16
SLIDE 16

Distributional Learning

[Harris(1964)]

◮ Look at the dog ◮ Look at the cat ◮ That cat is crazy

slide-17
SLIDE 17

Distributional Learning

[Harris(1964)]

◮ Look at the dog ◮ Look at the cat ◮ That cat is crazy ◮ That dog is crazy

slide-18
SLIDE 18

English counterexample

◮ I can swim ◮ I may swim ◮ I want a can of beer

slide-19
SLIDE 19

English counterexample

◮ I can swim ◮ I may swim ◮ I want a can of beer ◮ *I want a may of beer

slide-20
SLIDE 20

English counterexample

◮ She is Italian ◮ She is a philosopher ◮ She is an Italian philosopher

slide-21
SLIDE 21

English counterexample

◮ She is Italian ◮ She is a philosopher ◮ She is an Italian philosopher ◮ *She is an a philosopher philosopher

slide-22
SLIDE 22

Logic example

Propositional logic is substitutable:

◮ open rain and cold close ◮ open rain implies cold close ◮ open snow implies open not hot close

slide-23
SLIDE 23

Logic example

Propositional logic is substitutable:

◮ open rain and cold close ◮ open rain implies cold close ◮ open snow implies open not hot close ◮ open snow and open not hot close

slide-24
SLIDE 24

Formally

The Syntactic Congruence: a monoid congruence

Two nonempty strings u, v are congruent (u ≡_L v) if for all l, r ∈ Σ∗:
lur ∈ L ⇔ lvr ∈ L
We write [u] for the congruence class of u.

Definition

L is substitutable if lur ∈ L, lvr ∈ L ⇒ u ≡_L v
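On a finite sample the congruence itself cannot be observed, but shared contexts can. The sketch below is illustrative only (the function names are assumptions, and strings are treated as tuples of words). It records every context (l, r) in which each substring occurs. Under the substitutability assumption, two substrings that share a single context must be congruent.

```python
from collections import defaultdict

def substrings_with_contexts(sample):
    """Map each nonempty substring (a tuple of words) to the set of
    contexts (l, r) in which it occurs in the sample."""
    contexts = defaultdict(set)
    for sentence in sample:
        words = tuple(sentence.split())
        for i in range(len(words)):
            for j in range(i + 1, len(words) + 1):
                contexts[words[i:j]].add((words[:i], words[j:]))
    return contexts

def share_context(u, v, contexts):
    """True if u and v were observed in at least one common context."""
    return bool(contexts[u] & contexts[v])

ctx = substrings_with_contexts(["look at the dog", "look at the cat"])
print(share_context(("dog",), ("cat",), ctx))  # True
```

If the language is substitutable, the shared context "look at the □" licenses treating "dog" and "cat" as congruent. The English counterexamples show why this inference is unsafe in general.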

slide-25
SLIDE 25

Example

Input data D ⊆ L

◮ hot ◮ cold ◮ open hot or cold close ◮ open not hot close ◮ open hot and cold close ◮ open hot implies cold close ◮ open hot iff cold close ◮ danger ◮ rain ◮ snow

slide-26
SLIDE 26

One production for each example

◮ S → hot ◮ S → cold ◮ S → open hot or cold close ◮ S → open not hot close ◮ S → open hot and cold close ◮ S → open hot implies cold close ◮ S → open hot iff cold close ◮ S → danger ◮ S → rain ◮ S → snow

slide-27
SLIDE 27

A trivial grammar

Input data D

D = {w1, w2, . . . , wn} are nonempty strings.

Starting grammar

S → w1, S → w2, . . . , S → wn L(G) = D

slide-28
SLIDE 28

A trivial grammar

Input data D

D = {w1, w2, . . . , wn} are nonempty strings.

Starting grammar

S → w1, S → w2, . . . , S → wn L(G) = D

Binarise this every way

One nonterminal [[w]] for every substring w.

◮ [[a]] → a ◮ S → {w}, w ∈ D ◮ [[w]] → [[u]][[v]] when w = u · v

L(G, [[w]]) = {w}
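A small sketch of the "binarise every way" construction, with assumptions stated plainly: strings are tuples of words, a nonterminal [[w]] is represented by the tuple w itself, the start productions are taken to be S → [[w]] (as in the tree on the following slide), and the function name `binarised_grammar` is illustrative. Each production is a pair (left-hand side, right-hand side tuple).

```python
def binarised_grammar(sample):
    """Productions of the trivial grammar binarised in every possible way.
    A tuple of words stands for the nonterminal [[w]]; a plain string on
    a right-hand side is a terminal."""
    productions = set()
    for sentence in sample:
        words = tuple(sentence.split())
        productions.add(("S", (words,)))  # S → [[w]] for w in D
        for i in range(len(words)):
            for j in range(i + 1, len(words) + 1):
                w = words[i:j]
                if len(w) == 1:
                    productions.add((w, (w[0],)))  # [[a]] → a
                for k in range(1, len(w)):
                    productions.add((w, (w[:k], w[k:])))  # [[w]] → [[u]][[v]]
    return productions

g = binarised_grammar(["open not hot close"])
print(len(g))  # 15: 1 start, 4 lexical and 10 binary productions
```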

slide-29
SLIDE 29

[Figure: derivation tree for "open not hot close" in the binarised grammar]

S
  [[open not hot close]]
    [[open not]]
      [[open]]  open
      [[not]]  not
    [[hot close]]
      [[hot]]  hot
      [[close]]  close
slide-30
SLIDE 30

Nonterminal for each substring

[Figure: one node for every substring of the input data, from single words such as "open", "not" and "hot" up to whole sentences such as "open hot implies cold close".]

slide-31
SLIDE 31

Nonterminal for each cluster

[Figure: the substrings of the input data grouped into clusters, with one nonterminal per cluster.]

slide-32
SLIDE 32

Productions

Observation

If w = u · v then [w] ⊇ [u] · [v]

slide-33
SLIDE 33

Productions

Observation

If w = u · v then [w] ⊇ [u] · [v]

Add production

[[w]] → [[u]][[v]]

slide-34
SLIDE 34

Productions

Observation

If w = u · v then [w] ⊇ [u] · [v]

Add production

[[w]] → [[u]][[v]]

Consequence

If L is substitutable, then L(G, [[w]]) ⊆ [w] L(G) ⊆ L
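The construction can be sketched end to end. This is a simplified illustration in the spirit of the slides, not the exact algorithm of Clark and Eyraud (2007): substrings that share a context are merged with a union-find structure, productions are written over the resulting clusters, and a small CYK recogniser checks membership. All function names are assumptions.

```python
from collections import defaultdict

def learn(sample):
    """Cluster substrings that share a context; build productions over clusters."""
    sents = [tuple(s.split()) for s in sample]
    by_context, subs = defaultdict(set), set()
    for ws in sents:
        for i in range(len(ws)):
            for j in range(i + 1, len(ws) + 1):
                subs.add(ws[i:j])
                by_context[(ws[:i], ws[j:])].add(ws[i:j])
    parent = {u: u for u in subs}

    def find(u):  # union-find lookup with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for group in by_context.values():  # same context => same cluster
        first, *rest = sorted(group)
        for v in rest:
            parent[find(v)] = find(first)
    productions = set()
    for u in subs:
        if len(u) == 1:
            productions.add((find(u), (u[0],)))  # [[a]] → a
        for k in range(1, len(u)):
            productions.add((find(u), (find(u[:k]), find(u[k:]))))
    start = {find(ws) for ws in sents}  # all sentences share the empty context
    return start, productions

def generates(start, productions, sentence):
    """CYK recognition for the lexical and binary productions built above."""
    ws = tuple(sentence.split())
    n = len(ws)
    chart = defaultdict(set)
    for i, a in enumerate(ws):
        chart[i, i + 1] = {lhs for lhs, rhs in productions if rhs == (a,)}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for lhs, rhs in productions:
                    if len(rhs) == 2 and rhs[0] in chart[i, k] and rhs[1] in chart[k, j]:
                        chart[i, j].add(lhs)
    return bool(start & chart[0, n])

start, prods = learn(["look at the dog", "look at the cat", "that cat is crazy"])
print(generates(start, prods, "that dog is crazy"))  # True
```

The unseen sentence "that dog is crazy" is accepted because "dog" and "cat" share the context "look at the □" and therefore end up in the same cluster.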

slide-35
SLIDE 35

Theorem [Clark and Eyraud(2007)]

◮ If the language is a substitutable context-free language, then the hypothesis grammar will converge to a correct grammar.
◮ Efficient; provably correct

slide-36
SLIDE 36

Theorem [Clark and Eyraud(2007)]

◮ If the language is a substitutable context-free language, then the hypothesis grammar will converge to a correct grammar.
◮ Efficient; provably correct

But the grammar may be different for each input data set!

slide-37
SLIDE 37
open not hot close

[Figure: five different parse trees for "open not hot close", using the nonterminals NT0, NT2, NT5, NT8, NT9, NT10, NT11, NT13 and NT15.]
slide-38
SLIDE 38

Larger data set: 92 nonterminals, 435 Productions

[Figure: the substrings of the larger data set, one node per substring, for example "open rain implies snow close" and "open open hot and cold close and open rain implies snow close close".]
slide-39
SLIDE 39
open open hot and cold close and open rain implies snow close close

327204 parses

[Figure: one parse tree of this sentence, built from nonterminals such as NT25, NT53 and NT55.]
slide-40
SLIDE 40

Outline

Introduction Weak Learning Strong Learning An Algebraic Theory of CFGs

slide-41
SLIDE 41

Strong Learning

Target class of grammars

𝒢 is some set of context-free grammars.
Pick some grammar G∗ ∈ 𝒢.

Weak learning

We receive examples w₁, ..., wₙ, ...
We produce a series of hypotheses G₁, ..., Gₙ, ...
We want Gₙ to converge to some grammar Ĝ such that L(Ĝ) = L(G∗).

slide-42
SLIDE 42

Strong Learning

Target class of grammars

𝒢 is some set of context-free grammars.
Pick some grammar G∗ ∈ 𝒢.

Strong learning

We receive examples w₁, ..., wₙ, ...
We produce a series of hypotheses G₁, ..., Gₙ, ...
We want Gₙ to converge to some grammar Ĝ such that Ĝ ≡ G∗.

slide-43
SLIDE 43

Inaccurate clusters

[Figure: the clusters of substrings computed from the input data, as on slide 31.]

slide-44
SLIDE 44

Correct congruence classes

{hot iff, hot and, hot implies, hot or, not}
{open hot or cold, open hot iff cold, open hot and cold, open not hot, open hot implies cold}
{open hot iff cold close, open not hot close, cold, open hot or cold close, hot, open hot implies cold close, danger, rain, snow, open hot and cold close}
{hot iff cold close, not hot close, hot implies cold close, hot and cold close, hot or cold close}
{or cold close, iff cold close, and cold close, implies cold close}
{open not, open hot or, open hot iff, open hot implies, open hot and}
{and cold, or cold, implies cold, iff cold}
{hot implies cold, hot iff cold, not hot, hot or cold, hot and cold}
{cold close, hot close}
{close}
{and, implies, iff, or}
{open}
{open hot}
slide-45
SLIDE 45

Myhill-Nerode Theorem

A language has a finite number of congruence classes if and only if it is regular.

slide-46
SLIDE 46

Myhill-Nerode Theorem

A language has a finite number of congruence classes if and only if it is regular. We need some principled way of picking a finite collection of “good” congruence classes.

slide-47
SLIDE 47

{hot iff, hot and, hot implies, hot or, not}
{open not, open hot or, open hot iff, open hot implies, open hot and}
{open}
slide-48
SLIDE 48

Definition

Definition

A congruence class X is composite if there are two congruence classes Y, Z such that X = YZ (and neither Y nor Z is the class [λ]).

Definition

A congruence class X is prime if it is not composite.

The whole is greater than the sum of the parts

slide-49
SLIDE 49

[Figure: the correct congruence classes of the running example, as on slide 44.]
slide-50
SLIDE 50
[Figure: groups of substrings for a larger data set that includes sentences such as "open rain and snow close" and "open open rain or cold close implies danger close".]
slide-51
SLIDE 51

[Figure: groups of substrings for a data set that includes sentences such as "open cold and hot close" and "open open hot and cold close and open rain implies snow close close".]

slide-52
SLIDE 52

Restriction

◮ We only consider substitutable languages which have a finite number of primes.
◮ We define nonterminals only for these primes.

Label   Examples
P       rain, cold, open rain and cold close
O       open
C       close
B       and, or, . . .
N       not, hot or, cold and . . .

slide-53
SLIDE 53

Fundamental theorem of substitutable languages

Every congruence class Q can be uniquely represented as a sequence of primes such that Q = P1 . . . Pn

slide-54
SLIDE 54

Fundamental theorem of substitutable languages

Every congruence class Q can be uniquely represented as a sequence of primes such that Q = P1 . . . Pn

Intuition

If X = YZ, and we have a rule P → QXR then we can change it to P → QYZR
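The rewriting step in this intuition is easy to make concrete. In the sketch below, the table `FACTORISATION` and the function `expand` are hypothetical names; the table lists a few composite classes of the running example together with their prime sequences (labels P, O, C, B, N as in the table of primes). Every composite symbol on a right-hand side is replaced by its primes.

```python
# Hypothetical table: composite class label -> its sequence of primes.
FACTORISATION = {
    "ON": ["O", "N"],
    "PC": ["P", "C"],
    "NPC": ["N", "P", "C"],
}

def expand(rhs, factorisation):
    """Replace every composite class in a right-hand side by its primes;
    prime classes are left unchanged."""
    out = []
    for symbol in rhs:
        out.extend(factorisation.get(symbol, [symbol]))
    return out

# Two different ways of cutting the same right-hand side into classes
# end up as the same sequence of primes.
print(expand(["ON", "PC"], FACTORISATION))  # ['O', 'N', 'P', 'C']
print(expand(["O", "NPC"], FACTORISATION))  # ['O', 'N', 'P', 'C']
```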

slide-55
SLIDE 55

[Figure: the congruence classes of slide 44, with the five prime classes labelled N, P, C, B and O.]
slide-56
SLIDE 56

Label   Congruence class
N       hot iff, hot and, hot implies, hot or, not
ONP     open hot or cold, open hot iff cold, open hot and cold, open not hot, open hot implies cold
P       open hot iff cold close, open not hot close, cold, open hot or cold close, hot, open hot implies cold close, danger, rain, snow, open hot and cold close
NPC     hot iff cold close, not hot close, hot implies cold close, hot and cold close, hot or cold close
BPC     or cold close, iff cold close, and cold close, implies cold close
ON      open not, open hot or, open hot iff, open hot implies, open hot and
BP      and cold, or cold, implies cold, iff cold
NP      hot implies cold, hot iff cold, not hot, hot or cold, hot and cold
PC      cold close, hot close
C       close
B       and, implies, iff, or
O       open
OP      open hot
slide-57
SLIDE 57

Productions

We need non-binary rules.

Correct productions

P0 → P1 . . . Pk where P0 ⊇ P1 . . . Pk

Infinite number of correct productions

◮ N → PB ◮ P → ONPC ◮ P → OPBPC ◮ P → ONONPCC ◮ . . .

slide-58
SLIDE 58

[Figure: tree for the production P → O P B P C: root P with children O, P, B, P, C.]

slide-59
SLIDE 59

[Figure: tree with root P and children O, N, P, C, where N in turn has children P, B.]

slide-60
SLIDE 60

Productions

Valid productions

◮ Correct productions where the right hand side does not contain the right hand side of a valid production.
◮ If there are n primes then there are at most n² valid productions.

Examples

◮ N → PB ◮ P → ONPC
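One way to read this definition operationally is sketched below. It is an interpretation, not the procedure from the cited paper: candidates are processed from the shortest right-hand side to the longest, and "contain" is read as containing a strictly shorter contiguous sub-sequence. The function names are illustrative, and each prime is written as a single letter.

```python
def contains(seq, sub):
    """True if sub occurs as a contiguous sub-sequence of seq."""
    n, m = len(seq), len(sub)
    return any(seq[i:i + m] == sub for i in range(n - m + 1))

def valid_productions(correct):
    """Keep the correct productions whose right-hand side does not contain
    the (shorter) right-hand side of an already accepted production."""
    valid = []
    for lhs, rhs in sorted(correct, key=lambda p: len(p[1])):
        blocked = any(len(v_rhs) < len(rhs) and contains(rhs, v_rhs)
                      for _, v_rhs in valid)
        if not blocked:
            valid.append((lhs, rhs))
    return valid

# The correct productions listed on the earlier slide.
correct = [("P", tuple("OPBPC")), ("N", tuple("PB")),
           ("P", tuple("ONPC")), ("P", tuple("ONONPCC"))]
for lhs, rhs in valid_productions(correct):
    print(lhs, "→", "".join(rhs))
# N → PB
# P → ONPC
```

P → OPBPC is rejected because its right-hand side contains PB, and P → ONONPCC because it contains ONPC, which leaves exactly the two examples on the slide.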

slide-61
SLIDE 61

A Strong Learning Result

Class of grammars

Gsc is the class of canonical grammars for all substitutable languages with a finite number of primes.

Theorem [Clark(2014)]

There is an algorithm which learns Gsc

◮ From positive examples ◮ Identification in the limit ◮ Strongly: converges structurally ◮ Using polynomial time and data

slide-62
SLIDE 62

Running example

(verbatim output from implementation)

S
  NT0
    NT4  open
    NT1
      NT0  cold
      NT3  and
    NT0  hot
    NT2  close
slide-63
SLIDE 63
open open hot and cold close and open rain implies snow close close

1 parse

S
  NT0
    NT4  open
    NT1
      NT0
        NT4  open
        NT1
          NT0  hot
          NT3  and
        NT0  cold
        NT2  close
      NT3  and
    NT0
      NT4  open
      NT1
        NT0  rain
        NT3  implies
      NT0  snow
      NT2  close
    NT2  close
slide-64
SLIDE 64

Outline

Introduction Weak Learning Strong Learning An Algebraic Theory of CFGs

slide-65
SLIDE 65

Contexts

A context is a string with a hole: l□r

Derivation contexts

The derivation contexts of CFGs are just string contexts: S ⇒∗_G lNr

slide-66
SLIDE 66

Definition

Filling the hole

l□r ⊙ u = lur

A factorisation of a language L

C is a set of contexts; S is a set of strings
C ⊙ S ⊆ L
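A minimal sketch of these two definitions, with contexts represented as pairs (l, r) of strings. The names `fill`, `is_factorisation` and `in_anbn` are illustrative, and the language {aⁿbⁿ | n ≥ 0} from the earlier example is used as L.

```python
def fill(context, u):
    """(l, r) ⊙ u = l u r: put the string u into the hole of the context."""
    l, r = context
    return l + u + r

def is_factorisation(contexts, strings, in_language):
    """True if C ⊙ S ⊆ L, i.e. every context in C filled with every
    string in S gives a string of the language."""
    return all(in_language(fill(c, u)) for c in contexts for u in strings)

def in_anbn(w):
    """Membership test for L = {a^n b^n | n >= 0}."""
    half = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * half + "b" * half

C = {("", ""), ("a", "b")}
S = {"ab", "aabb"}
print(is_factorisation(C, S, in_anbn))                # True
print(is_factorisation(C | {("a", "")}, S, in_anbn))  # False: "aab" is not in L
```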

slide-67
SLIDE 67

Context free grammars

Contexts and yields

L(G, N) = {w ∈ Σ∗ | N ⇒∗ w}
C(G, N) = {l□r | S ⇒∗ lNr}

Nonterminals in a context-free grammar

C(G, N) ⊙ L(G, N) ⊆ L

slide-68
SLIDE 68

Context free grammars

Contexts and yields

L(G, N) = {w ∈ Σ∗ | N ⇒∗ w}
C(G, N) = {l□r | S ⇒∗ lNr}

Nonterminals in a context-free grammar

C(G, N) ⊙ L(G, N) ⊆ L

We can reverse this process and go from a collection of decompositions back to a CFG.

slide-69
SLIDE 69

Polar maps

If S is a set of strings: S⊲ = {l□r | ∀u ∈ S, lur ∈ L}   (1)
If C is a set of contexts: C⊳ = {u ∈ Σ∗ | ∀l□r ∈ C, lur ∈ L}   (2)

slide-70
SLIDE 70

Polar maps

If S is a set of strings: S⊲ = {l□r | ∀u ∈ S, lur ∈ L}   (1)
If C is a set of contexts: C⊳ = {u ∈ Σ∗ | ∀l□r ∈ C, lur ∈ L}   (2)

·⊲ and ·⊳ form a Galois connection between sets of strings and sets of contexts.
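The polar maps can be explored on a finite universe. The sketch below makes one loud assumption: strings and the two halves of each context are restricted to length at most `max_len`, so the results only approximate the true polars. The names `make_polars`, `right_polar` and `left_polar` are illustrative. Applying one polar after the other gives the closure of a set of strings.

```python
from itertools import product

def make_polars(in_language, alphabet="ab", max_len=4):
    """Return the two polar maps, restricted to strings and context halves
    of length <= max_len (a finite approximation)."""
    strings = [""] + ["".join(p) for n in range(1, max_len + 1)
                      for p in product(alphabet, repeat=n)]
    contexts = [(l, r) for l in strings for r in strings]

    def right_polar(S):  # S⊲: contexts that accept every string in S
        return {(l, r) for (l, r) in contexts
                if all(in_language(l + u + r) for u in S)}

    def left_polar(C):   # C⊳: strings accepted by every context in C
        return {u for u in strings
                if all(in_language(l + u + r) for (l, r) in C)}

    return right_polar, left_polar

def in_ab_star(w):
    """Membership test for L = (ab)*."""
    return len(w) % 2 == 0 and w == "ab" * (len(w) // 2)

right_polar, left_polar = make_polars(in_ab_star)
print(sorted(left_polar(right_polar({"ab"}))))  # ['', 'ab', 'abab']
print(sorted(left_polar(right_polar({"a"}))))   # ['a', 'aba']
```

Within the length bound, the closure of {ab} is L itself and the closure of {a} is a(ba)∗.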
slide-71
SLIDE 71

Closed sets of strings

◮ ·⊲⊳ is a closure operator on the sets of strings; ◮ X ⊲ = Y ⊲ is a CIS-congruence; ◮ L is always closed.

slide-72
SLIDE 72

Closed sets of strings

◮ ·⊲⊳ is a closure operator on the sets of strings; ◮ X ⊲ = Y ⊲ is a CIS-congruence; ◮ L is always closed.

The syntactic concept lattice

The set of all closed sets of strings form a complete idempotent semiring: B(L). (A generalisation of the syntactic monoid; the collection of maximal decompositions into strings and contexts.)

slide-73
SLIDE 73

L is regular iff B(L) is finite

L = (ab)∗

[Figure: Hasse diagram of B(L) with the elements Σ∗, L, (ba)∗, a(ba)∗, b(ab)∗, {ε} and ∅.]

slide-74
SLIDE 74

Recognising a language

Definition

We say that a CIS B recognizes L if there is a surjective morphism h from P(Σ∗) → B such that h∗(h(L)) = L, where h∗ is the residual of h.

slide-75
SLIDE 75

Recognising a language

Definition

We say that a CIS B recognizes L if there is a surjective morphism h from P(Σ∗) → B such that h∗(h(L)) = L, where h∗ is the residual of h. Given a CIS B and a homomorphism h : P(Σ∗) → B, we can define a new grammar φh(G) by merging nonterminals M, N if h(L(G, M)) = h(L(G, N))

slide-76
SLIDE 76

Theorem

Let G be a CFG over Σ and h a homomorphism h : P(Σ∗) → B. Then

◮ φh(G) defines the same language as G iff ◮ B recognizes L through h

slide-77
SLIDE 77

Theorem

Let G be a CFG over Σ and h a homomorphism h : P(Σ∗) → B. Then

◮ φh(G) defines the same language as G iff ◮ B recognizes L through h

Uniqueness

There is a unique ’smallest’ CIS that recognizes L: which is B(L).

slide-78
SLIDE 78

The universal morphism

[Figure: commutative diagram relating G, P(Σ∗), 1, B(L) and B via the maps φ : N → L(G, N), ζ_L, h_L and h.]

slide-79
SLIDE 79

The universal cfg-morphism

[Figure: commutative diagram relating G, φ(G), φ_1(G), φ_L(G) and φ_M(G) via the maps φ : N → L(G, N), ζ_L, h_L and h.]

slide-80
SLIDE 80

[Clark(2013)]

Mergeable nonterminals

If L(G, M)⊲⊳ = L(G, N)⊲⊳ then we can merge M and N without increasing the language defined by G.

slide-81
SLIDE 81

[Clark(2013)]

Mergeable nonterminals

If L(G, M)⊲⊳ = L(G, N)⊲⊳ then we can merge M and N without increasing the language defined by G.

Minimal grammars correspond to maximal factorisations

A grammar without mergeable nonterminals will have nonterminals that correspond to closed sets of strings.

slide-82
SLIDE 82

Conclusion

◮ We can learn context-free grammars weakly by decomposing strings into contexts and substrings.
◮ Minimal grammars will correspond to maximal decompositions.
◮ We can learn grammars strongly by identifying structure in some canonical algebras associated with the languages:
  ◮ the syntactic monoid
  ◮ the syntactic concept lattice.
◮ The same approach applies to Multiple Context-Free Grammars, a mildly context-sensitive grammar formalism.

slide-83
SLIDE 83

Bibliography

Alexander Clark. The syntactic concept lattice: Another algebraic theory of the context-free languages? Journal of Logic and Computation, 2013. doi: 10.1093/logcom/ext037.

Alexander Clark. Learning trees from strings: A strong learning algorithm for some context-free grammars. Journal of Machine Learning Research, 14:3537–3559, 2014. URL http://jmlr.org/papers/v14/clark13a.html.

Alexander Clark and Rémi Eyraud. Polynomial identification in the limit of substitutable context-free languages. Journal of Machine Learning Research, 8:1725–1745, August 2007.

S. Ginsburg and H.G. Rice. Two families of languages related to ALGOL. Journal of the ACM (JACM), 9(3):350–371, 1962.

Zellig Harris. Distributional structure. In J. A. Fodor and J. J. Katz, editors, The structure of language: Readings in the philosophy of language, pages 33–49. Prentice-Hall, 1964.