
Categorial Grammar

Çağrı Çöltekin

c.coltekin@rug.nl

November 18, 2008

1 / 28


Overview

◮ A review of CFGs and Chomsky hierarchy
◮ Categorial Grammar
◮ Categorial Grammar and semantics
◮ Learning Categorial Grammars

2 / 28


Grammars

◮ A grammar is a set of rules governing the use of a given natural language.

◮ Formal grammars are precise descriptions of a given formal language. They are commonly used to describe components of natural language grammar, such as syntax.

◮ The grammar of a language recognizes and generates all and only the strings (sentences, phrases) that belong to that language.

3 / 28


Context-free grammars

Formally, a Context-free grammar (CFG) is specified by a tuple (V , S, Σ, R), where:

◮ V is a finite set of non-terminal symbols.
◮ S ∈ V is the start symbol (sentence).
◮ Σ is a finite set of terminal symbols.
◮ R is a set of rules of the form X → y, where X is a single symbol from V and y is a (possibly empty) string of terminal and non-terminal symbols.

4 / 28
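The tuple above can be written down directly as data. The sketch below (representation is mine, not from the slides) encodes the example grammar that appears on the next slide and checks that every rule has the required CFG shape.

```python
# A minimal sketch of the CFG tuple (V, S, Sigma, R), using the example
# grammar from the next slide. Representation is illustrative only.
V = {"S", "NP", "VP", "V", "DET", "ADJ", "N"}    # non-terminals
S = "S"                                          # start symbol
Sigma = {"she", "read", "a", "nice", "book"}     # terminals
R = {                                            # rules X -> y
    "S":   [("NP", "VP")],
    "VP":  [("V", "NP")],
    "NP":  [("DET", "N"), ("she",)],
    "N":   [("ADJ", "N"), ("book",)],
    "V":   [("read",)],
    "DET": [("a",)],
    "ADJ": [("nice",)],
}

# Check the rule form: every left-hand side is a single non-terminal, and
# every right-hand-side symbol is either a terminal or a non-terminal.
assert all(lhs in V for lhs in R)
assert all(sym in V | Sigma
           for rhss in R.values() for rhs in rhss for sym in rhs)
```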


CFGs for natural language syntax

Example: derivation of the sentence ‘She read a nice book’.

The grammar:

S → NP VP    VP → V NP    NP → DET N    N → ADJ N
NP → she    V → read    DET → a    ADJ → nice    N → book

The derivation (rule applied at each step):

S ⇒ NP VP               (S → NP VP)
  ⇒ she VP              (NP → she)
  ⇒ she V NP            (VP → V NP)
  ⇒ she read NP         (V → read)
  ⇒ she read DET N      (NP → DET N)
  ⇒ she read a N        (DET → a)
  ⇒ she read a ADJ N    (N → ADJ N)
  ⇒ she read a nice N   (ADJ → nice)
  ⇒ she read a nice book (N → book)

The corresponding tree: [S [NP She] [VP [V read] [NP [DET a] [N [ADJ nice] [N book]]]]]

5 / 28


Chomsky hierarchy of (formal) languages

Grammar                      Language                 Automaton
Unrestricted (type-0)        Recursively enumerable   Turing machine
Context-sensitive (type-1)   Context-sensitive        Linear-bounded
Context-free (type-2)        Context-free             Push-down
Regular (type-3)             Regular                  Finite state

◮ Each language in the hierarchy is a proper subset of the ones higher in the hierarchy.

◮ We try to find the most restrictive grammar that is adequate to describe the language.

◮ The syntax of natural languages is known to be (slightly) more complex than context-free; such languages are generally referred to as mildly context-sensitive.

6 / 28


Categorial grammars: overview

◮ CG has a long history: the basic ideas date back to 1935.
◮ CG is ‘radically’ lexicalized: all language-specific information resides in the lexicon.
◮ The generative power of CG is equivalent to that of CFG.
◮ CG assumes a strong relation between syntax and semantics.

7 / 28


Categorial grammars: categories

Formally a CG is specified by the tuple (A, S, Σ), where:

◮ A is a set of atomic (or basic) categories.
◮ S ∈ A is the start symbol.
◮ Σ is a lexicon containing lexical items of the form word := category, where category can be any valid CG category.
◮ The set of valid CG categories, C, is built recursively:
  ◮ every basic category in A is in C;
  ◮ if X, Y ∈ C, then (X\Y) ∈ C and (X/Y) ∈ C.

8 / 28

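The recursive definition above translates directly into a membership test. This is a sketch (not from the slides); atomic categories are strings and complex categories are (X, slash, Y) tuples.

```python
# A sketch of the recursive category definition: atomic categories are
# strings drawn from A; complex categories are (X, slash, Y) tuples
# built from smaller categories. Representation is mine.
A = {"S", "N", "NP"}   # the basic categories used later in the deck

def is_category(c):
    if isinstance(c, str):
        return c in A
    x, slash, y = c
    return slash in ("/", "\\") and is_category(x) and is_category(y)

print(is_category((("S", "\\", "NP"), "/", "NP")))   # True (transitive verb)
print(is_category("VP"))                             # False (not atomic here)
```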

Categorial grammars: rules

CG has two operations (combinators):

◮ Forward Application:

X/Y Y ⇒ X (>)

◮ Backward Application:

Y X\Y ⇒ X (<)

9 / 28
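The two application rules are easy to state as functions. The sketch below (representation is mine, not from the slides) uses strings for atomic categories and (result, slash, argument) tuples for complex ones.

```python
# A sketch of the two CG combinators. Atomic categories are strings;
# a complex category is a (result, slash, argument) tuple.
def forward(x, y):
    """X/Y  Y  =>  X   (>)"""
    if isinstance(x, tuple) and x[1] == "/" and x[2] == y:
        return x[0]
    return None

def backward(y, x):
    """Y  X\\Y  =>  X   (<)"""
    if isinstance(x, tuple) and x[1] == "\\" and x[2] == y:
        return x[0]
    return None

print(forward(("NP", "/", "N"), "N"))       # NP
print(backward("NP", ("S", "\\", "NP")))    # S
```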


CG for natural languages

Basic categories are S, N and NP. Example lexicon:

she := NP
read := (S\NP)/NP
a := NP/N
nice := N/N
book := N

An example derivation:

she   read        a      nice   book
NP    (S\NP)/NP   NP/N   N/N    N
nice book ⇒ N               (>)
a nice book ⇒ NP            (>)
read a nice book ⇒ S\NP     (>)
she read a nice book ⇒ S    (<)

10 / 28
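The derivation above can be replayed mechanically. This sketch (not from the slides) combines adjacent categories left to right with the two application rules; the greedy strategy happens to find the derivation for this sentence, though a real parser would need to search over all bracketings.

```python
# A sketch replaying the derivation above with a naive leftmost-reduce
# strategy. Atomic categories are strings; complex categories are
# (result, slash, argument) tuples. Representation is mine.
LEX = {
    "she":  "NP",
    "read": (("S", "\\", "NP"), "/", "NP"),
    "a":    ("NP", "/", "N"),
    "nice": ("N", "/", "N"),
    "book": "N",
}

def combine(a, b):
    if isinstance(a, tuple) and a[1] == "/" and a[2] == b:    # X/Y Y => X  (>)
        return a[0]
    if isinstance(b, tuple) and b[1] == "\\" and b[2] == a:   # Y X\Y => X  (<)
        return b[0]
    return None

def parse(words):
    cats = [LEX[w] for w in words]
    while len(cats) > 1:
        for i in range(len(cats) - 1):
            c = combine(cats[i], cats[i + 1])
            if c is not None:
                cats[i:i + 2] = [c]   # reduce the pair to one category
                break
        else:
            return None               # no rule applies anywhere
    return cats[0]

print(parse("she read a nice book".split()))   # S
```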


CG lexical categories: more examples

Conventional name    CG category                     Example
Proper nouns         NP                              Mary
Common nouns         N                               book
Determiners          NP/N                            the
Adjectives           N/N                             green
Intransitive verbs   S\NP                            sleep
Transitive verbs     (S\NP)/NP                       read
Ditransitive verbs   ((S\NP)/NP)/NP                  give
Adverbs              (S\NP)\(S\NP)                   well
Prepositions         (N\N)/NP, ((S\NP)\(S\NP))/NP    with

11 / 28


CG lexical categories: more derivations

she   saw         the    boy   with       a      book
NP    (S\NP)/NP   NP/N   N     (N\N)/NP   NP/N   N
a book ⇒ NP                          (>)
with a book ⇒ N\N                    (>)
boy with a book ⇒ N                  (<)
the boy with a book ⇒ NP             (>)
saw the boy with a book ⇒ S\NP       (>)
she saw the boy with a book ⇒ S      (<)

12 / 28


CG lexical categories: more derivations (2)

she   saw         the    boy   with                  a      telescope
NP    (S\NP)/NP   NP/N   N     ((S\NP)\(S\NP))/NP    NP/N   N
the boy ⇒ NP                                  (>)
a telescope ⇒ NP                              (>)
saw the boy ⇒ S\NP                            (>)
with a telescope ⇒ (S\NP)\(S\NP)              (>)
saw the boy with a telescope ⇒ S\NP           (<)
she saw the boy with a telescope ⇒ S          (<)

13 / 28


CG and semantics

◮ We extend categories to include semantic types.
◮ The function application rules become:

Forward application:  X/Y : f   Y : a  ⇒  X : fa   (>)
Backward application: Y : a   X\Y : f  ⇒  X : fa   (<)

◮ The example lexicon extended with semantic types:

she := NP : she′
read := (S\NP)/NP : λxλy.read′xy
a := NP/N : λx.a′x
nice := N/N : λx.nice′x
book := N : book′

14 / 28


Yet another example derivation

◮ Lexicon:

walk := S\NP : λx.walk′x
kitties := NP : cats′
milk := NP : milk′
eat := (S\NP)/NP : λxλy.eat′xy

◮ An example derivation:

kitties     eat                        milk
NP : cats′  (S\NP)/NP : λxλy.eat′xy    NP : milk′
eat milk ⇒ S\NP : λy.eat′milk′y            (>)
kitties eat milk ⇒ S : eat′milk′cats′      (<)

15 / 28
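The syntax–semantics pairing above can be simulated with curried functions. This is a sketch (representation and names are mine): Python closures stand in for the lambda terms, and semantic constants are plain strings.

```python
# A sketch pairing each category with a meaning term. Curried Python
# functions stand in for the slide's lambda terms; constants are strings.
eat = lambda x: lambda y: ("eat", x, y)   # λxλy.eat'xy
lexicon = {
    "kitties": ("NP", "cats"),
    "milk":    ("NP", "milk"),
    "eat":     ((("S", "\\", "NP"), "/", "NP"), eat),
}

# Forward application (>): X/Y : f   Y : a  =>  X : f(a)
vp_cat, f = lexicon["eat"]
_, obj = lexicon["milk"]
vp = (vp_cat[0], f(obj))        # S\NP : λy.eat'(milk')(y)

# Backward application (<): Y : a   X\Y : f  =>  X : f(a)
_, subj = lexicon["kitties"]
s = (vp[0][0], vp[1](subj))     # S : eat'(milk')(cats')
print(s)                        # ('S', ('eat', 'milk', 'cats'))
```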


CFG vs. CG: the lexicon and rules

CFG rules:

S → NP VP    VP → V NP    NP → DET N    N → ADJ N
NP → she    V → read    DET → a    ADJ → nice    N → book

CG lexicon and rules:

she := NP    read := (S\NP)/NP    a := NP/N    nice := N/N    book := N

X/Y Y ⇒ X (>)
Y X\Y ⇒ X (<)

16 / 28


CFG vs. CG: derivation

CFG derivation tree: [S [NP She] [VP [V read] [NP [DET a] [N [ADJ nice] [N book]]]]]

CG derivation:

she   read        a      nice   book
NP    (S\NP)/NP   NP/N   N/N    N
nice book ⇒ N               (>)
a nice book ⇒ NP            (>)
read a nice book ⇒ S\NP     (>)
she read a nice book ⇒ S    (<)

17 / 28


Beyond context free power

CG has extensions that provide the expressive capacity to cover non-context-free phenomena in natural languages. Combinatory Categorial Grammar (CCG), a popular extension of CG, adds a few more rules.

Function composition rules:

Forward:        X/Y : f   Y/Z : g  ⇒  X/Z : λx.f(gx)   (>B)
Backward:       Y\Z : g   X\Y : f  ⇒  X\Z : λx.f(gx)   (<B)
Forward cross:  X/Y : f   Y\Z : g  ⇒  X\Z : λx.f(gx)   (>B×)
Backward cross: Y/Z : g   X\Y : f  ⇒  X/Z : λx.f(gx)   (<B×)

Type raising rules:

Forward:  X : a  ⇒  T/(T\X) : λf.fa   (>T)
Backward: X : a  ⇒  T\(T/X) : λf.fa   (<T)

18 / 28
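At the semantic level, the composition rules are ordinary function composition λx.f(gx). The sketch below (toy meanings, names are mine) shows why this matters for the cross-serial example later in the deck: two verbs can combine into one constituent before any argument arrives.

```python
# A sketch of forward composition (>B) at the semantic level: the two
# meanings are functions and the result is their composition λx.f(g x).
# 'want'/'let' are hypothetical stand-ins for Dutch 'wil'/'laten'.
def compose(f, g):
    return lambda x: f(g(x))

want = lambda p: ("want'", p)   # toy meaning for 'wil'
let = lambda p: ("let'", p)     # toy meaning for 'laten'

# CCG composes 'wil laten' into a single constituent:
wil_laten = compose(want, let)
print(wil_laten("lezen'"))      # ("want'", ("let'", "lezen'"))
```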


Why learning with categorial grammars?

◮ Highly lexicalized
◮ Based on sound mathematical formalisms
◮ Transparency between syntax and semantics
◮ Encouraging formal results from learning theory
◮ Extensions (e.g. CCG) are possible for wider coverage of human languages

19 / 28


Learning CG

◮ Assume the combinators (operations) are given.
◮ Input is (somewhat noisy) valid sentences:

kitties eat milk
penguin eats cookies

◮ Output is a lexicalized grammar:

milk := NP : milk′
cookies := NP : cookies′
penguin := NP : penguin′
kitties := NP : cats′
eat := (S\NP)/NP : λxλy.eat′xy

20 / 28


Learning CG: generating hypotheses

Assuming the input ‘kitties eat milk’ and only two possible lexical categories, NP and (S\NP)/NP:

milk := NP 0.8       milk := (S\NP)/NP 0.2
kitties := NP 0.7    kitties := (S\NP)/NP 0.2
eat := NP 0.3        eat := (S\NP)/NP 0.6

This is overly simplified; hypothesis generation is considerably more complicated in practice.

21 / 28
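The size of the hypothesis space is easy to enumerate for a toy case. The sketch below (a minimal illustration, not from the slides) lists every joint category assignment for K = 2 categories and an N = 3 word sentence.

```python
from itertools import product

# With K candidate categories and an N-word sentence, every word gets
# one of K categories, giving K**N joint assignments to consider.
cats = ["NP", "(S\\NP)/NP"]
words = ["kitties", "eat", "milk"]
assignments = list(product(cats, repeat=len(words)))
print(len(assignments))   # 8 = 2**3
```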


Learning CG: problems

Assuming we have K categories and input of length N:

◮ The number of lexical hypotheses to generate is N × K.
◮ This amounts to K^N possible lexical category assignments for every input sentence.
◮ To validate (parse) the input, we need to consider

C_N = (2N)! / ((N+1)! N!)

different bracketings (the Catalan number).
◮ K can be infinite!

22 / 28
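The Catalan growth above is worth seeing numerically; even short sentences generate many bracketings. A minimal sketch of the formula:

```python
from math import factorial

# The slide's C_N = (2N)! / ((N+1)! N!), the number of distinct binary
# bracketings to consider when validating an input.
def catalan(n):
    return factorial(2 * n) // (factorial(n + 1) * factorial(n))

print([catalan(n) for n in range(1, 9)])
# [1, 2, 5, 14, 42, 132, 429, 1430]
```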


Learning CG: some possible solutions

We do not have any labeled data, but we have certain cues/constraints that may help:

◮ Some hypotheses are impossible.
◮ Lexical items consistently occurring in the same context are likely to have the same category.
◮ Sentences have to parse to S.
◮ When learning with semantics, the semantic output has to ‘make sense’.
◮ Certain category structures are likely to occur in natural languages.
◮ Certain languages tend to use certain category structures.
◮ We expect a tendency towards unambiguous lexical items.
◮ We expect the lexicon to be compact.

23 / 28


Learning CG: a short example from learning morphology

Input: a-dam-lar : plural(man)
The lexicon already contains: adam := N : man

1. Generate all lexical hypotheses:

a := N : man        a := Nplu/N : λx.plural(x)
adam := N : man     adam := Nplu/N : λx.plural(x)
damlar := N : man   damlar := Nplu\N : λx.plural(x)
lar := N : man      lar := Nplu\N : λx.plural(x)

2. Parse the input:

(1) adam (N : man)  +  lar (Nplu\N : λx.plural(x))    ⇒ Nplu : plural(man)  (<)
(2) adam (Nplu/N : λx.plural(x))  +  lar (N : man)    ⇒ Nplu : plural(man)  (>)
(3) a (N : man)  +  damlar (Nplu\N : λx.plural(x))    ⇒ Nplu : plural(man)  (<)
(4) a (Nplu/N : λx.plural(x))  +  damlar (N : man)    ⇒ Nplu : plural(man)  (>)

3. Parse (1) scores highest, since it is supported by the lexicon:
3.1 The item ‘lar := Nplu\N : λx.plural(x)’ is inserted into the lexicon.
3.2 The weight of the item ‘adam := N : man’ is increased.

24 / 28


Example: cross-serial dependencies

dat Jan1 Marie2 het boek3 wil1 laten2 lezen3
that Jan Marie the book wants let read
‘that Jan wants to let Marie read the book’

26 / 28


Example: center embedding

◮ Center-embedding:

The book that the child likes.
The book that the child that the man knows likes.
The book that the child that the man that the woman saw knows likes.

27 / 28


Examples from Turkish morphology

◮ İstanbul-lu-laş-tır-ama-dık-lar-ımız-dan-mı-sınız?
‘Are you one of those we could not convert to an İstanbulite?’

◮ İstanbul-lu-laş-tır-ama-yabil-ecek-ler-imiz-den-miş-siniz-cesine.
‘As if you were of those whom we may consider not converting to an İstanbulite.’

ev-de-ki-nin-ki-ler-de-ki
house-LOC-REL-POS3s-REL-PLU-LOC-REL
28 / 28