A variational method for the Rasch model (Frank Rijmen and Jiří Vomlel)

SLIDE 1

A variational method for the Rasch model

Frank Rijmen and Jiří Vomlel

Catholic University Leuven, Belgium
Academy of Sciences of the Czech Republic

Salzburg, December 3, 2005

  • F. Rijmen and J. Vomlel ()

Rasch model Salzburg, December, 3, 2005 1 / 12

SLIDE 2

Variables and parameters

Model variables

  • $Y_{n,i}$: binary response variable; its value indicates whether the answer of person $n$ to question $i$ was correct
  • $n = 1, \dots, N$: person index
  • $i = 1, \dots, I$: question index

Model parameters

  • $\delta_i$: difficulty of question $i$ (a fixed effect)
  • $\beta_n$: ability (knowledge level) of person $n$ (a random effect)

SLIDE 3

Models for the response variable Y

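Deterministic (threshold) response:

$$Y_{n,i} = \begin{cases} 1 & \text{if } \beta_n \ge \delta_i \\ 0 & \text{otherwise.} \end{cases}$$

Probabilistic (Rasch) response:

$$P(Y_{n,i} = 1) = \frac{\exp(\beta_n - \delta_i)}{1 + \exp(\beta_n - \delta_i)}$$

[Figure: $P(Y_{n,i} = 1 \mid \beta_n)$ as a function of $\beta_n$ for $\delta_i = -2$.]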
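As a minimal sketch (not part of the original slides), the Rasch response probability can be evaluated in Python as follows. The function name `rasch_prob` is an illustrative choice.

```python
import math


def rasch_prob(beta: float, delta: float) -> float:
    """P(Y = 1 | beta, delta) under the Rasch model.

    beta is the ability of the person, delta the difficulty of the question.
    """
    h = beta - delta
    # Evaluate the logistic function in a form that cannot overflow.
    if h >= 0:
        return 1.0 / (1.0 + math.exp(-h))
    e = math.exp(h)
    return e / (1.0 + e)
```

When ability equals difficulty the probability is exactly one half, and it increases monotonically with `beta`.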

SLIDE 4

Probability distribution for random effect βn

$$P(\beta_n) = N(0, \sigma^2),$$

a normal (Gaussian) distribution with mean zero and variance $\sigma^2$.

[Figure: density of the normal distribution over $\beta_n \in [-3, 3]$.]

SLIDE 5

Computations with the Rasch model

Prior: $N_\beta(0, 1)$

[Figure: the prior density $N_\beta(0, 1)$ over $\beta \in [-3, 3]$.]

SLIDE 6

Computations with the Rasch model

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 1 \mid \beta, \delta_1 = -2)$ and the posterior.]

$$N_\beta(0, 1) \cdot P(Y = 1 \mid \beta, \delta_1 = -2)$$

SLIDE 7

Computations with the Rasch model

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 0 \mid \beta, \delta_2 = 0)$ and the posterior.]

$$N_\beta(0, 1) \cdot P(Y = 1 \mid \beta, \delta_1 = -2) \cdot P(Y = 0 \mid \beta, \delta_2 = 0)$$

SLIDE 8

Computations with the Rasch model

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 0 \mid \beta, \delta_3 = +1)$ and the posterior.]

$$N_\beta(0, 1) \cdot P(Y = 1 \mid \beta, \delta_1 = -2) \cdot P(Y = 0 \mid \beta, \delta_2 = 0) \cdot P(Y = 0 \mid \beta, \delta_3 = +1)$$

SLIDE 9

Computations with the Rasch model

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 0 \mid \beta, \delta_4 = +2)$ and the posterior.]

$$N_\beta(0, 1) \cdot P(Y = 1 \mid \beta, \delta_1 = -2) \cdot P(Y = 0 \mid \beta, \delta_2 = 0) \cdot P(Y = 0 \mid \beta, \delta_3 = +1) \cdot P(Y = 0 \mid \beta, \delta_4 = +2)$$
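The product above can be evaluated pointwise on a grid of $\beta$ values. The following is a small sketch under the settings of this example (prior $N(0, 1)$, answers 1, 0, 0, 0 to questions with difficulties $-2, 0, +1, +2$); the function names and the grid spacing are illustrative assumptions, and the result is the posterior up to its normalizing constant.

```python
import math


def normal_pdf(beta, sigma=1.0):
    """Density of N(0, sigma^2) at beta."""
    return math.exp(-beta**2 / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))


def rasch_likelihood(y, beta, delta):
    """P(Y = y | beta, delta) for a binary answer y under the Rasch model."""
    p = 1.0 / (1.0 + math.exp(-(beta - delta)))
    return p if y == 1 else 1.0 - p


def unnormalized_posterior(beta, answers, sigma=1.0):
    """Prior density times the likelihood of all observed answers."""
    value = normal_pdf(beta, sigma)
    for y, delta in answers:
        value *= rasch_likelihood(y, beta, delta)
    return value


# (answer, difficulty) pairs from the worked example
answers = [(1, -2.0), (0, 0.0), (0, 1.0), (0, 2.0)]
grid = [-3.0 + 0.01 * k for k in range(601)]
curve = [unnormalized_posterior(b, answers) for b in grid]
peak, mode = max(zip(curve, grid))
```

With three of the four answers wrong, the mode of the curve moves below the prior mean of zero.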

SLIDE 10

Likelihood of the model given data

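Assume we have observed answers to $I$ questions from $N$ persons, i.e., we have data $y$:

$$y = \begin{pmatrix} y_{1,1} & y_{1,2} & \dots & y_{1,I} \\ y_{2,1} & y_{2,2} & \dots & y_{2,I} \\ \vdots & \vdots & & \vdots \\ y_{N,1} & y_{N,2} & \dots & y_{N,I} \end{pmatrix}$$

The task is to find model parameters $\sigma, \delta_1, \dots, \delta_I$ that maximize the likelihood

$$L = \prod_{n=1}^{N} \int N_{\beta_n}(0, \sigma^2) \cdot \prod_{i=1}^{I} P(y_{n,i} \mid \beta_n, \delta_i) \, d\beta_n$$

... but this integral does not have a closed-form solution!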

SLIDE 17

Approximations to the integral

  • Gaussian quadrature
  • Laplace approximation
  • Variational approximation
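To make one of these options concrete, here is a hedged sketch of the Laplace approximation for the integral of a single person: the log integrand is replaced by its second-order expansion at the mode, which gives a Gaussian integral. The slides do not spell this out; the Newton iteration, the function names, and the trapezoid rule used as a reference are assumptions made for illustration.

```python
import math


def softplus(h):
    """log(1 + exp(h)), evaluated without overflow."""
    return max(h, 0.0) + math.log1p(math.exp(-abs(h)))


def sigmoid(h):
    """Logistic function 1 / (1 + exp(-h))."""
    return math.exp(-softplus(-h))


def log_integrand(beta, y, delta, sigma):
    """log of N(beta; 0, sigma^2) * prod_i P(y_i | beta, delta_i)."""
    value = -0.5 * math.log(2.0 * math.pi * sigma**2) - beta**2 / (2.0 * sigma**2)
    for y_i, d_i in zip(y, delta):
        h = beta - d_i
        value += y_i * h - softplus(h)  # log P(y_i | h) under the Rasch model
    return value


def derivatives(beta, y, delta, sigma):
    """First and second derivative of log_integrand with respect to beta."""
    p = [sigmoid(beta - d_i) for d_i in delta]
    grad = -beta / sigma**2 + sum(y_i - p_i for y_i, p_i in zip(y, p))
    hess = -1.0 / sigma**2 - sum(p_i * (1.0 - p_i) for p_i in p)
    return grad, hess


def laplace_log_marginal(y, delta, sigma, iters=50):
    """Laplace approximation of the log integral; also returns the mode."""
    beta = 0.0
    for _ in range(iters):  # Newton steps on a strictly concave function
        grad, hess = derivatives(beta, y, delta, sigma)
        beta -= grad / hess
    _, hess = derivatives(beta, y, delta, sigma)
    log_value = log_integrand(beta, y, delta, sigma) + 0.5 * math.log(2.0 * math.pi / -hess)
    return log_value, beta


def trapezoid_log_marginal(y, delta, sigma, lo=-10.0, hi=10.0, steps=4000):
    """Reference value of the same log integral by the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(log_integrand(lo + k * width, y, delta, sigma))
    return math.log(total * width)
```

For the four-question example used earlier, `laplace_log_marginal([1, 0, 0, 0], [-2.0, 0.0, 1.0, 2.0], 1.0)` can be compared against `trapezoid_log_marginal` with the same arguments.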

SLIDE 20

Variational approximation

Let $h = \beta_n - \delta_i$. Then

$$P(Y_{n,i} = 1 \mid h) = \frac{\exp(h)}{1 + \exp(h)} = \frac{\exp(h/2)}{\exp(-h/2) + \exp(h/2)} = \exp\{h/2 - \log[\exp(h/2) + \exp(-h/2)]\} = \exp\{h/2 + f(h)\}$$

$f(h)$ is approximated by the first-order Taylor expansion $\tilde{f}(h, \xi)$ in the variable $h^2$ around the point $\xi$:

$$P(Y_{n,i} = 1 \mid h) \approx \exp\left\{ h/2 + f(\xi) + \left.\frac{\partial f(h)}{\partial (h^2)}\right|_{h=\xi} \cdot (h^2 - \xi^2) \right\} = \exp\{ h/2 + \tilde{f}(h, \xi) \} = \tilde{P}(Y_{n,i} = 1 \mid h, \xi)$$
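The derivative in the expansion can be written out explicitly. The following worked step is not on the slide; it uses $f(h) = -\log[\exp(h/2) + \exp(-h/2)]$ as defined by the identity above, and the symbol $\lambda(\xi)$ is introduced here only as shorthand.

```latex
\frac{\partial f(h)}{\partial (h^2)}
  = \frac{1}{2h} \, \frac{\partial f(h)}{\partial h}
  = \frac{1}{2h} \cdot \left( -\frac{1}{2} \tanh\frac{h}{2} \right)
  = -\frac{1}{4h} \tanh\frac{h}{2}

\tilde{f}(h, \xi) = f(\xi) - \lambda(\xi) \, (h^2 - \xi^2),
\qquad
\lambda(\xi) = \frac{1}{4\xi} \tanh\frac{\xi}{2}
```

Since $\lambda(\xi) > 0$ for every $\xi \ne 0$ (with limit $1/8$ as $\xi \to 0$), $\tilde{f}(h, \xi)$ is a downward-opening parabola in $h$.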

SLIDE 28

Variational approximation

$f(h) = -\log[\exp(h/2) + \exp(-h/2)]$ is a convex function in the variable $x = h^2$.

[Figure: $f$ plotted against $h^2$ for $h^2$ from 0 to 50.]

Because a convex function lies on or above each of its tangent lines, $f(h) \ge \tilde{f}(h, \xi)$ and, consequently, $\tilde{P}(Y_{n,i} = 1 \mid h, \xi)$ is a lower bound of $P(Y_{n,i} = 1 \mid h)$.

$\tilde{f}(h, \xi)$ is a quadratic function of $\beta$, therefore $\tilde{P}(Y_{n,i} = 1 \mid h, \xi) = \exp\{h/2 + \tilde{f}(h, \xi)\}$ is proportional to a Gaussian distribution $N(\nu, \tau^2)$.

SLIDE 31

Variational approximation

Approximation of $P(Y_{n,i} = y \mid \delta, \beta)$ by $\tilde{P}(Y_{n,i} = y \mid \delta, \beta, \xi)$

[Figure, two panels over the range $-6$ to $6$: $y = 0$ with $\xi = -6, -4, -2, 0, +2$, and $y = 1$ with $\xi = -2, 0, +2, +4, +6$.]
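The bound can be checked numerically. This is a sketch under two assumptions that go beyond the slides: $\xi$ is taken as the expansion point in $h = \beta - \delta$, and the case $y = 0$ uses $P(Y = 0 \mid h) = \exp\{-h/2 + f(h)\}$, which follows because $f$ is an even function. The names `f`, `lam`, `rasch` and `rasch_lower` are illustrative.

```python
import math


def f(h):
    """f(h) = -log(exp(h/2) + exp(-h/2)), evaluated without overflow."""
    a = abs(h) / 2.0
    return -(a + math.log1p(math.exp(-2.0 * a)))


def lam(xi):
    """lambda(xi) = tanh(xi/2) / (4 xi); the limit at xi = 0 is 1/8."""
    if abs(xi) < 1e-8:
        return 0.125
    return math.tanh(xi / 2.0) / (4.0 * xi)


def rasch(y, h):
    """Exact P(Y = y | h) written as exp{(2y - 1) h / 2 + f(h)}."""
    sign = 2 * y - 1
    return math.exp(sign * h / 2.0 + f(h))


def rasch_lower(y, h, xi):
    """Variational lower bound: f(h) replaced by its tangent in h^2 at xi."""
    sign = 2 * y - 1
    f_tilde = f(xi) - lam(xi) * (h * h - xi * xi)
    return math.exp(sign * h / 2.0 + f_tilde)
```

The bound touches the exact curve at $h = \pm\xi$ and stays below it elsewhere, so moving $\xi$ moves the region where the approximation is tight.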

SLIDE 32

Computations with the Rasch model using a variational approximation with ξi = 0.5 for i = 1, . . . , 4

Prior: $N_\beta(0, 1)$

[Figure: the prior density $N_\beta(0, 1)$ over $\beta \in [-3, 3]$.]

SLIDE 33

Computations with the Rasch model using a variational approximation with ξi = 0.5 for i = 1, . . . , 4

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 1 \mid \beta, \delta_1 = -2)$ and the posterior.]

$$N_\beta(0, 1) \cdot \tilde{P}(Y = 1 \mid \beta, \delta_1 = -2)$$

SLIDE 34

Computations with the Rasch model using a variational approximation with ξi = 0.5 for i = 1, . . . , 4

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 0 \mid \beta, \delta_2 = 0)$ and the posterior.]

$$N_\beta(0, 1) \cdot \tilde{P}(Y = 1 \mid \beta, \delta_1 = -2) \cdot \tilde{P}(Y = 0 \mid \beta, \delta_2 = 0)$$

SLIDE 35

Computations with the Rasch model using a variational approximation with ξi = 0.5 for i = 1, . . . , 4

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 0 \mid \beta, \delta_3 = +1)$ and the posterior.]

$$N_\beta(0, 1) \cdot \tilde{P}(Y = 1 \mid \beta, \delta_1 = -2) \cdot \tilde{P}(Y = 0 \mid \beta, \delta_2 = 0) \cdot \tilde{P}(Y = 0 \mid \beta, \delta_3 = +1)$$

SLIDE 36

Computations with the Rasch model using a variational approximation with ξi = 0.5 for i = 1, . . . , 4

[Figure, two panels over $\beta \in [-3, 3]$: $P(Y = 0 \mid \beta, \delta_4 = +2)$ and the posterior.]

$$N_\beta(0, 1) \cdot \tilde{P}(Y = 1 \mid \beta, \delta_1 = -2) \cdot \tilde{P}(Y = 0 \mid \beta, \delta_2 = 0) \cdot \tilde{P}(Y = 0 \mid \beta, \delta_3 = +1) \cdot \tilde{P}(Y = 0 \mid \beta, \delta_4 = +2)$$
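Because every factor in this product is the exponential of a quadratic in $\beta$, the product is proportional to a single Gaussian $N(\nu, \tau^2)$ whose parameters follow from collecting the coefficients of $\beta^2$ and $\beta$. The sketch below does this under stated assumptions: $\xi_i$ is the expansion point in $h = \beta - \delta_i$, $\lambda(\xi) = \tanh(\xi/2) / (4\xi)$ is the negated slope of $f$ in $h^2$, and the function names are illustrative. It is not claimed to reproduce the curves shown on the slides.

```python
import math


def lam(xi):
    """lambda(xi) = tanh(xi/2) / (4 xi); the limit at xi = 0 is 1/8."""
    return 0.125 if abs(xi) < 1e-8 else math.tanh(xi / 2.0) / (4.0 * xi)


def variational_posterior(y, delta, xi, sigma=1.0):
    """Mean nu and variance tau^2 of the Gaussian proportional to
    N(beta; 0, sigma^2) * prod_i P~(y_i | beta, delta_i, xi_i)."""
    precision = 1.0 / sigma**2  # coefficient of -beta^2 / 2 from the prior
    linear = 0.0                # coefficient of beta
    for y_i, d_i, xi_i in zip(y, delta, xi):
        l = lam(xi_i)
        precision += 2.0 * l
        linear += (2 * y_i - 1) / 2.0 + 2.0 * l * d_i
    return linear / precision, 1.0 / precision


# Settings of the worked example: answers 1, 0, 0, 0 and xi_i = 0.5
nu, tau2 = variational_posterior([1, 0, 0, 0], [-2.0, 0.0, 1.0, 2.0], [0.5] * 4)
```

Each answered question adds $2\lambda(\xi_i)$ to the precision, so the approximate posterior is always narrower than the prior.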

SLIDE 37

Optimization of the variational parameter ξ

In the example the value $\xi_i = 0.5$ was fixed for $i = 1, 2, 3, 4$, but we can find the optimal value of $\xi_i$ for each $i$.

The task was to find model parameters $\sigma, \delta_1, \dots, \delta_I$ that maximize the likelihood. If we substitute the lower bounds $\tilde{P}(y_{n,i} \mid \beta_n, \delta_i, \xi_{n,i})$, the likelihood is approximated by

$$\tilde{L} = \prod_{n=1}^{N} \int N_{\beta_n}(0, \sigma^2) \cdot \prod_{i=1}^{I} \tilde{P}(y_{n,i} \mid \beta_n, \delta_i, \xi_{n,i}) \, d\beta_n$$

... this integral has a closed-form solution!

Since $\tilde{P}(Y_i \mid \beta, \delta_i, \xi_i)$ is a lower bound for any value of $\xi_i$, we can choose the best value of $\xi_i$ as the one that maximizes the approximated likelihood. For example, we can use the EM algorithm to find the optimal parameters.
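A sketch of how the closed form and the optimization of $\xi$ might look for a single person is given below. It rests on assumptions not stated on the slides: $\xi_i$ is the expansion point in $h = \beta - \delta_i$, $\lambda(\xi) = \tanh(\xi/2)/(4\xi)$, and the update $\xi_i^2 = \tau^2 + (\nu - \delta_i)^2$ is the standard EM-style fixed point for this kind of bound rather than something the slides specify. All function names are illustrative.

```python
import math


def f(h):
    """f(h) = -log(exp(h/2) + exp(-h/2)), evaluated without overflow."""
    a = abs(h) / 2.0
    return -(a + math.log1p(math.exp(-2.0 * a)))


def lam(xi):
    """lambda(xi) = tanh(xi/2) / (4 xi); the limit at xi = 0 is 1/8."""
    return 0.125 if abs(xi) < 1e-8 else math.tanh(xi / 2.0) / (4.0 * xi)


def gaussian_form(y, delta, xi, sigma):
    """Coefficients (A, B, C) such that the log of the bounded integrand
    equals -A/2 * beta^2 + B * beta + C."""
    A = 1.0 / sigma**2
    B = 0.0
    C = -0.5 * math.log(2.0 * math.pi * sigma**2)
    for y_i, d_i, xi_i in zip(y, delta, xi):
        sign = 2 * y_i - 1
        l = lam(xi_i)
        A += 2.0 * l
        B += sign / 2.0 + 2.0 * l * d_i
        C += -sign * d_i / 2.0 + f(xi_i) - l * d_i**2 + l * xi_i**2
    return A, B, C


def log_bound(y, delta, xi, sigma):
    """Closed-form log of the integral of the bounded integrand."""
    A, B, C = gaussian_form(y, delta, xi, sigma)
    return C + B * B / (2.0 * A) + 0.5 * math.log(2.0 * math.pi / A)


def update_xi(y, delta, xi, sigma):
    """One fixed-point step: xi_i^2 = E[(beta - delta_i)^2] under N(nu, tau^2)."""
    A, B, _ = gaussian_form(y, delta, xi, sigma)
    nu, tau2 = B / A, 1.0 / A
    return [math.sqrt(tau2 + (nu - d_i) ** 2) for d_i in delta]


def log_exact(y, delta, sigma, lo=-12.0, hi=12.0, steps=4000):
    """Trapezoid-rule reference for the log of the exact integral."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        beta = lo + k * width
        log_value = -0.5 * math.log(2.0 * math.pi * sigma**2) - beta**2 / (2.0 * sigma**2)
        for y_i, d_i in zip(y, delta):
            log_value += (2 * y_i - 1) * (beta - d_i) / 2.0 + f(beta - d_i)
        total += (0.5 if k in (0, steps) else 1.0) * math.exp(log_value)
    return math.log(total * width)
```

Starting from `xi = [0.5] * 4` and calling `update_xi` repeatedly raises `log_bound` towards, but never above, the value returned by `log_exact`.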
