SLIDE 6 6
Eric Xing 11
KL and Log Likelihood
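Jensen's inequality:

$$
\begin{aligned}
\ell(\theta; x) = \log p(x \mid \theta)
 &= \log \sum_z p(x, z \mid \theta) \\
 &= \log \sum_z q(z \mid x)\, \frac{p(x, z \mid \theta)}{q(z \mid x)} \\
 &\ge \sum_z q(z \mid x) \log \frac{p(x, z \mid \theta)}{q(z \mid x)}
\end{aligned}
$$

$$
\Rightarrow\; \ell(\theta; x) \ge \langle \ell_c(\theta; x, z) \rangle_q + H_q = \mathcal{L}(q)
$$

KL and lower bound of likelihood:

$$
\begin{aligned}
\ell(\theta; x) = \log p(x \mid \theta)
 &= \log \frac{p(x, z \mid \theta)}{p(z \mid x, \theta)} \\
 &= \sum_z q(z) \log \frac{p(x, z \mid \theta)}{p(z \mid x, \theta)} \\
 &= \sum_z q(z) \log \frac{p(x, z \mid \theta)\, q(z)}{q(z)\, p(z \mid x, \theta)} \\
 &= \sum_z q(z) \log \frac{p(x, z \mid \theta)}{q(z)} + \sum_z q(z) \log \frac{q(z)}{p(z \mid x, \theta)}
\end{aligned}
$$

$$
\Rightarrow\; \ell(\theta; x) = \mathcal{L}(q) + \mathrm{KL}(q \,\|\, p)
$$

Setting q(z) = p(z | x, θ) closes the gap (cf. EM).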
[Figure: ln p(D) decomposed into the lower bound L(q) and the gap KL(q || p).]
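The decomposition of the log likelihood into L(q) plus KL(q || p) can be checked numerically on a tiny discrete model. The sketch below is only an illustration: the joint table `joint`, the choice of three latent states, and the helper names `lower_bound` and `kl` are all hypothetical and not taken from the slides. It assumes q is strictly positive on every state.

```python
import math

# Hypothetical toy model: for one fixed observation x, joint[z] = p(x, z | theta)
# over three latent states z. The numbers are made up for illustration.
joint = [0.10, 0.25, 0.05]

evidence = sum(joint)                      # p(x | theta) = sum_z p(x, z | theta)
log_lik = math.log(evidence)               # l(theta; x)
posterior = [p / evidence for p in joint]  # p(z | x, theta)


def lower_bound(q, joint):
    """L(q) = sum_z q(z) log( p(x, z | theta) / q(z) ); assumes q(z) > 0."""
    return sum(qz * math.log(pz / qz) for qz, pz in zip(q, joint))


def kl(q, p):
    """KL(q || p) = sum_z q(z) log( q(z) / p(z) ); assumes q(z), p(z) > 0."""
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p))


q = [0.5, 0.3, 0.2]                        # an arbitrary variational distribution

bound = lower_bound(q, joint)              # L(q), never exceeds log_lik
gap = kl(q, posterior)                     # KL(q || p(z | x, theta)), never negative
bound_at_posterior = lower_bound(posterior, joint)

print("log likelihood     :", log_lik)
print("L(q) + KL(q || p)  :", bound + gap)
print("L(q) at q = p(z|x) :", bound_at_posterior)
```

For any strictly positive q the first two printed values should agree up to floating-point error, and the third should match them as well, since choosing q equal to the posterior makes the KL term zero.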
A variational representation of probability distributions

$$
q^{*} = \arg\max_{q \in Q} \left\{ -\langle E \rangle_q + H_q \right\}
      = \arg\min_{q \in Q} \left\{ \langle E \rangle_q - H_q \right\}
$$

Difficulty: $H_q$ is intractable for general q.

"Solution": approximate $H_q$ and/or relax or tighten Q,

where Q is the equivalent set of realizable distributions, e.g., all valid parameterizations of exponential family distributions, or marginal polytopes [Wainwright et al. 2003].
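A small numerical sketch can make the variational representation concrete. It assumes, as an interpretation not stated on the slide, that E is an energy function with p(z) proportional to exp(-E(z)), so that the objective equals KL(q || p) - log Z. The energy table, the two-binary-variable setup, and the helper names `free_energy` and `factorized` are hypothetical. Restricting Q to fully factorized distributions is used here only as one example of a tractable family.

```python
import math

# Hypothetical energies E(z1, z2) for two binary variables (made-up numbers).
# Assumption: the target distribution is p(z) = exp(-E(z)) / Z.
energy = {(0, 0): 0.0, (0, 1): 1.5, (1, 0): 1.5, (1, 1): 0.2}

log_Z = math.log(sum(math.exp(-e) for e in energy.values()))
boltzmann = {z: math.exp(-e - log_Z) for z, e in energy.items()}


def free_energy(q):
    """<E>_q - H_q for a distribution q given as {state: probability}."""
    expected_energy = sum(q[z] * energy[z] for z in q)
    entropy = -sum(p * math.log(p) for p in q.values() if p > 0)
    return expected_energy - entropy


def factorized(a, b):
    """q(z1, z2) = q1(z1) * q2(z2) with q1(z1 = 1) = a and q2(z2 = 1) = b."""
    return {(i, j): (a if i else 1 - a) * (b if j else 1 - b)
            for i in (0, 1) for j in (0, 1)}


# Unrestricted Q: p itself attains the minimum, and the minimum is -log Z.
f_exact = free_energy(boltzmann)

# Restricted Q (fully factorized q): coarse grid search over the two parameters.
grid = [k / 100 for k in range(1, 100)]
f_factorized = min(free_energy(factorized(a, b)) for a in grid for b in grid)

print("-log Z                 :", -log_Z)
print("min over all q         :", f_exact)
print("min over factorized q  :", f_factorized)
```

Under these assumptions the unrestricted minimum coincides with -log Z, while the factorized family can only reach a larger value because this particular p is not a product distribution. The grid search stands in for a proper optimizer and is only practical at this toy scale.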