Identifiability and Unmixing of Latent Parse Trees




SLIDE 1

Identifiability and Unmixing of Latent Parse Trees

Daniel Hsu, Sham Kakade, Percy Liang
NIPS 2012

Jan Gasthaus
Tea talk, January 8th, 2013

1 / 15

SLIDE 2

Parsing

2 / 15

SLIDE 6

Big Picture

Generative parsing models define joint distributions Pθ(x, z) over sentences x and their structure z.

Can we identify θ given only sentences (but not their structure, i.e. without supervision)?

The paper has two parts:

1. Identifiability of several models (PCFGs not identifiable!)
2. Parameter recovery: unmixing (for restricted PCFGs)
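To make the setup concrete, here is a minimal sketch of a generative parsing model, assuming a tiny PCFG. The grammar and its rule probabilities in `RULES` are invented for illustration and do not come from the paper. Sampling yields a tree z together with a sentence x, and the joint probability is the product of the probabilities of the rules used. An unsupervised learner would only get to see the words.

```python
import random

# Hypothetical toy PCFG (an assumption for illustration, not the paper's model):
# each nonterminal maps to a list of (right-hand side, rule probability).
RULES = {
    "S": [(("NP", "VP"), 1.0)],
    "NP": [(("the", "lady"), 0.5), (("Gatsby",), 0.3), (("Bayesians",), 0.2)],
    "VP": [(("sang",), 0.4), (("likes", "NP"), 0.6)],
}


def sample(symbol, rng):
    """Sample a derivation rooted at `symbol`.

    Returns (tree, words, prob): the structure z, the sentence x as a list
    of words, and the joint probability of the sampled derivation.
    """
    if symbol not in RULES:
        # Terminal symbol: it is emitted as a word with probability 1.
        return symbol, [symbol], 1.0
    expansions = RULES[symbol]
    weights = [p for _, p in expansions]
    rhs, prob = rng.choices(expansions, weights=weights)[0]
    children, words = [], []
    for child in rhs:
        subtree, subwords, subprob = sample(child, rng)
        children.append(subtree)
        words.extend(subwords)
        prob *= subprob  # joint probability = product of rule probabilities
    return (symbol, children), words, prob


rng = random.Random(0)
tree, words, prob = sample("S", rng)
print("x (observed):", " ".join(words))
print("z (latent):  ", tree)
print("P(x, z):     ", prob)
```

The identifiability question is whether the rule probabilities can be pinned down from the distribution of the printed sentences alone, with the trees hidden.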

3 / 15

SLIDE 7

Big Picture

the lady sang Gatsby likes Bayesians

4 / 15

SLIDE 8

Big Picture

5 / 15

SLIDE 10

PCFG model

6 / 15

SLIDE 11

Dependency Grammars

7 / 15

SLIDE 12

Identifiability

8 / 15

SLIDE 13

Identifiability

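$S_\Theta(\theta_0)$, the set of parameters whose moments match those of $\theta_0$, is defined by the moment constraints $h_{\theta_0}(\theta) = \mu(\theta) - \mu(\theta_0) = 0$.

Rows of the Jacobian of $h_{\theta_0}$ are directions of constraint violation.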
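The Jacobian view suggests a numerical check, sketched below under stated assumptions. If the Jacobian of the moment map µ at θ0 has full column rank, every direction in parameter space violates some constraint, so θ0 is locally identifiable. The two moment maps (`mu_product`, `mu_sum_product`) are hypothetical toy examples, not the models from the paper, and the finite-difference Jacobian and hand-rolled rank routine are illustrative choices.

```python
def numerical_jacobian(mu, theta, eps=1e-6):
    """Central-difference Jacobian of mu at theta (rows: moments, columns: parameters)."""
    n_moments, n_params = len(mu(theta)), len(theta)
    jac = [[0.0] * n_params for _ in range(n_moments)]
    for j in range(n_params):
        up, down = list(theta), list(theta)
        up[j] += eps
        down[j] -= eps
        mu_up, mu_down = mu(up), mu(down)
        for i in range(n_moments):
            jac[i][j] = (mu_up[i] - mu_down[i]) / (2 * eps)
    return jac


def matrix_rank(rows, tol=1e-6):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    a = [list(r) for r in rows]
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < tol:
            continue  # no usable pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


def locally_identifiable(mu, theta, tol=1e-6):
    """Full column rank of the Jacobian at theta implies local identifiability."""
    return matrix_rank(numerical_jacobian(mu, theta), tol) == len(theta)


# Hypothetical toy moment maps over theta = (a, b).
def mu_product(theta):
    a, b = theta
    return [a * b]  # one moment, two parameters: a curve of solutions


def mu_sum_product(theta):
    a, b = theta
    return [a + b, a * b]  # determines {a, b} up to swapping


theta0 = [0.3, 0.7]
print(locally_identifiable(mu_product, theta0))
print(locally_identifiable(mu_sum_product, theta0))
```

In the second toy map, swapping a and b gives the same moments, so the solution set contains two points. The rank test is a local statement: it can rule out a continuum of equivalent parameters around θ0, but not such discrete symmetries.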

9 / 15

SLIDE 15

Identifiability

10 / 15

SLIDE 16

Unmixing

11 / 15

SLIDE 17

Unmixing

12 / 15

SLIDE 18

Unmixing

13 / 15

SLIDE 19

Results

14 / 15

SLIDE 20

Conclusions

15 / 15