An N-gram Topic Model for Time-Stamped Documents
Shoaib Jameel and Wai Lam
The Chinese University of Hong Kong
Shoaib Jameel and Wai Lam ECIR-2013, Moscow, Russia
Outline

Introduction and Motivation
◮ The Bag-of-Words (BoW) assumption
◮ Temporal nature of data

◮ Temporal Topic Models
◮ N-gram Topic Models

◮ Background
  ⋆ Topics Over Time (TOT) Model - proposed earlier
  ⋆ Our proposed n-gram model
◮ Blei and Lafferty (David M. Blei and John D. Lafferty. 2006.) - Dynamic
◮ Knights et al. (Knights, D., Mozer, M., and Nicolov, N. 2009.) -
◮ Kawamae (Noriaki Kawamae. 2011.) - Trend Analysis Model - The
◮ Nodelman et al. (Uri Nodelman, Christian R. Shelton, and Daphne Koller.
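Topics Over Time (TOT): Gibbs sampling update for the topic of word i in document d

\[
P(z_i^{(d)} \mid \mathbf{w}, \mathbf{t}, \mathbf{z}_{\neg i}^{(d)}, \alpha, \beta, \Omega) \propto
\left(\alpha_{z_i^{(d)}} + q_{d z_i^{(d)}} - 1\right)
\times \frac{n_{z_i^{(d)} w_i^{(d)}} + \beta_{w_i^{(d)}} - 1}{\sum_{v=1}^{W} \left(n_{z_i^{(d)} v} + \beta_v\right) - 1}
\times \frac{\left(1 - t_i^{(d)}\right)^{\Omega_{z_i^{(d)} 1} - 1} \left(t_i^{(d)}\right)^{\Omega_{z_i^{(d)} 2} - 1}}{B\left(\Omega_{z_i^{(d)} 1}, \Omega_{z_i^{(d)} 2}\right)}
\]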
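The time-stamp factor of this update, (1 − t)^(Ω1 − 1) t^(Ω2 − 1) / B(Ω1, Ω2), can be sketched in a few lines of Python. This is only an illustration: the function name `beta_time_weight` is hypothetical, and it assumes that time stamps have already been rescaled to lie strictly between 0 and 1.

```python
import math


def beta_time_weight(t, omega1, omega2):
    """Return (1 - t)^(omega1 - 1) * t^(omega2 - 1) / B(omega1, omega2).

    Assumes t has been rescaled to the open interval (0, 1).
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    # log B(a, b) = lgamma(a) + lgamma(b) - lgamma(a + b)
    log_beta = math.lgamma(omega1) + math.lgamma(omega2) - math.lgamma(omega1 + omega2)
    # Work in log space to avoid overflow for large parameters.
    log_density = (
        (omega1 - 1.0) * math.log(1.0 - t)
        + (omega2 - 1.0) * math.log(t)
        - log_beta
    )
    return math.exp(log_density)
```

With Ω1 = Ω2 = 1 the factor is constant, so time stamps carry no information about the topic; the further the parameters move from 1, the more sharply a topic is localized in time.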
[Figure: graphical model of the proposed n-gram topic model. Nodes: α, θ, z_{i−1}, z_i, z_{i+1}, x_i, x_{i+1}, x_{i+2}, w_{i−1}, w_i, w_{i+1}, t_{i−1}, t_i, t_{i+1}, ψ, γ, φ, β, σ, δ, Ω. Plates: D, T, TW.]
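For each word i in document d:
◮ Draw \(x_i^{(d)}\) from Bernoulli\((\psi_{z_{i-1}^{(d)} w_{i-1}^{(d)}})\);
◮ Draw \(z_i^{(d)}\) from Discrete\((\theta^{(d)})\);
◮ Draw \(w_i^{(d)}\) from Discrete\((\sigma_{z_i^{(d)} w_{i-1}^{(d)}})\) if \(x_i^{(d)} = 1\); otherwise from Discrete\((\phi_{z_i^{(d)}})\);
◮ Draw \(t_i^{(d)}\) from Beta\((\Omega_{z_i^{(d)} 1}, \Omega_{z_i^{(d)} 2})\).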
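A minimal Python sketch of a per-word generative process of this kind is given below. It is an illustration under stated assumptions, not the authors' code: the function `generate_document` and its argument layout are hypothetical, the first word is assumed never to form a bigram, and the parameter order of `betavariate` is chosen to match a density of the form (1 − t)^(Ω1 − 1) t^(Ω2 − 1).

```python
import random


def generate_document(length, theta, phi, psi, sigma, omega, seed=0):
    """Sample (topic, bigram flag, word, time stamp) tuples for one document.

    theta: T topic probabilities for the document.
    phi[z]: W unigram word probabilities for topic z.
    psi[z][w]: probability that the word following (z, w) has x = 1.
    sigma[z][w]: W bigram word probabilities given topic z and previous word w.
    omega[z]: pair (omega1, omega2) of time parameters for topic z.
    """
    rng = random.Random(seed)
    topics = range(len(theta))
    vocab = range(len(phi[0]))
    tokens = []
    prev_z = prev_w = None
    for _ in range(length):
        # Assumption: the first word has no predecessor, so x is fixed to 0.
        x = 0 if prev_w is None else int(rng.random() < psi[prev_z][prev_w])
        z = rng.choices(topics, weights=theta)[0]
        # Bigram words depend on the previous word, unigram words only on the topic.
        word_dist = sigma[z][prev_w] if x == 1 else phi[z]
        w = rng.choices(vocab, weights=word_dist)[0]
        # Density (1 - t)^(omega1 - 1) * t^(omega2 - 1) is Beta(omega2, omega1) in t.
        omega1, omega2 = omega[z]
        t = rng.betavariate(omega2, omega1)
        tokens.append((z, x, w, t))
        prev_z, prev_w = z, w
    return tokens
```

Every token receives its own time stamp draw here; in a real corpus all words of a document share the observed document time stamp.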
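\[
P(z_i^{(d)}, x_i^{(d)} \mid \mathbf{w}, \mathbf{t}, \mathbf{x}_{\neg i}^{(d)}, \mathbf{z}_{\neg i}^{(d)}, \alpha, \beta, \gamma, \delta, \Omega) \propto
\left(\gamma_{x_i^{(d)}} + p_{z_{i-1}^{(d)} w_{i-1}^{(d)} x_i} - 1\right)
\left(\alpha_{z_i^{(d)}} + q_{d z_i^{(d)}} - 1\right)
\frac{\left(1 - t_i^{(d)}\right)^{\Omega_{z_i^{(d)} 1} - 1} \left(t_i^{(d)}\right)^{\Omega_{z_i^{(d)} 2} - 1}}{B\left(\Omega_{z_i^{(d)} 1}, \Omega_{z_i^{(d)} 2}\right)}
\times
\begin{cases}
\dfrac{\beta_{w_i^{(d)}} + n_{z_i^{(d)} w_i^{(d)}} - 1}{\sum_{v=1}^{W} \left(\beta_v + n_{z_i^{(d)} v}\right) - 1} & \text{if } x_i^{(d)} = 0 \\[2ex]
\dfrac{\delta_{w_i^{(d)}} + m_{z_i^{(d)} w_{i-1}^{(d)} w_i^{(d)}} - 1}{\sum_{v=1}^{W} \left(\delta_v + m_{z_i^{(d)} w_{i-1}^{(d)} v}\right) - 1} & \text{if } x_i^{(d)} = 1
\end{cases}
\tag{3}
\]

\[
\hat{\theta}_z^{(d)} = \frac{\alpha_z + q_{dz}}{\sum_{t=1}^{T} \left(\alpha_t + q_{dt}\right)} \tag{4}
\]

\[
\hat{\phi}_{zw} = \frac{\beta_w + n_{zw}}{\sum_{v=1}^{W} \left(\beta_v + n_{zv}\right)} \tag{5}
\]

\[
\hat{\psi}_{zwk} = \frac{\gamma_k + p_{zwk}}{\sum_{k'=0}^{1} \left(\gamma_{k'} + p_{zwk'}\right)} \tag{6}
\]

\[
\hat{\sigma}_{zwv} = \frac{\delta_v + m_{zwv}}{\sum_{v'=1}^{W} \left(\delta_{v'} + m_{zwv'}\right)} \tag{7}
\]

\[
\hat{\Omega}_{z1} = \bar{t}_z \left( \frac{\bar{t}_z \left(1 - \bar{t}_z\right)}{s_z^2} - 1 \right) \tag{8}
\]

\[
\hat{\Omega}_{z2} = \left(1 - \bar{t}_z\right) \left( \frac{\bar{t}_z \left(1 - \bar{t}_z\right)}{s_z^2} - 1 \right) \tag{9}
\]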
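The smoothed-count estimates and the method-of-moments update for the time parameters can be sketched as follows. This is a hedged illustration: the helper names `normalize_counts` and `beta_moments` are hypothetical, time stamps are assumed to be rescaled to (0, 1), and the use of the biased sample variance is an assumption rather than something stated on the slide.

```python
def normalize_counts(counts, prior):
    """Smoothed estimate (prior_i + count_i) / sum_j (prior_j + count_j).

    The same form covers the estimates of theta, phi, psi and sigma;
    only the count vector and the prior change.
    """
    total = sum(p + c for p, c in zip(prior, counts))
    return [(p + c) / total for p, c in zip(prior, counts)]


def beta_moments(timestamps):
    """Method-of-moments estimates (omega1, omega2) for one topic.

    timestamps: time stamps in (0, 1) of the words assigned to the topic.
    """
    n = len(timestamps)
    mean = sum(timestamps) / n
    # Assumption: biased (divide-by-n) sample variance.
    var = sum((t - mean) ** 2 for t in timestamps) / n
    if var == 0.0:
        raise ValueError("need at least two distinct time stamps")
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common
```

For example, `beta_moments([0.25, 0.75])` has mean 0.5 and variance 0.0625, so the common factor is 0.25 / 0.0625 − 1 = 3 and both parameters equal 1.5.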
Input: γ, δ, α, T, β, Corpus, MaxIteration
Output: Topic assignments for all the n-gram words with temporal information

Initialization: randomly initialize the n-gram topic assignment for all words
Zero all count variables
for iteration ← 1 to MaxIteration do
    for d ← 1 to D do
        for w ← 1 to N_d according to word order do
            Draw z_w^(d), x_w^(d) as defined in Equation 3
            if x_w^(d) = 0 then
                Update n_zw
            else
                Update m_zw
            end
            Update q_dz, p_zw
        end
    end
    for z ← 1 to T do
        Update Ω_z by the method of moments as in Equations 8 and 9
    end
end
Compute the posterior estimates of θ, φ, ψ, σ defined in Equations 4, 5, 6, 7
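The "Draw z, x" step of the algorithm can be sketched in Python as below. This is a simplified illustration, not the authors' implementation: the function names and the nested-list count layout are hypothetical, the priors are taken to be symmetric scalars, the counts passed in are assumed to already exclude the current token (which absorbs the "− 1" terms of Equation 3), and the word is assumed to have a predecessor.

```python
import math
import random


def beta_density(t, omega1, omega2):
    """(1 - t)^(omega1 - 1) * t^(omega2 - 1) / B(omega1, omega2) for t in (0, 1)."""
    log_b = math.lgamma(omega1) + math.lgamma(omega2) - math.lgamma(omega1 + omega2)
    return math.exp((omega1 - 1) * math.log(1 - t) + (omega2 - 1) * math.log(t) - log_b)


def conditional_weights(t, w, prev_w, prev_z, q_d, n, m, p,
                        alpha, beta, gamma, delta, omega):
    """Unnormalized weight of every (z, x) pair for one word.

    q_d[z]: topic counts in the document; n[z][v]: unigram counts;
    m[z][u][v]: bigram counts; p[z][u][k]: bigram-status counts.
    All counts are assumed to exclude the current token.
    """
    vocab_size = len(n[0])
    weights = {}
    for z in range(len(q_d)):
        # Document-topic term times the time-stamp term, shared by both x values.
        shared = (alpha + q_d[z]) * beta_density(t, *omega[z])
        for x in (0, 1):
            status = gamma + p[prev_z][prev_w][x]
            if x == 0:
                word = (beta + n[z][w]) / (vocab_size * beta + sum(n[z]))
            else:
                word = (delta + m[z][prev_w][w]) / (vocab_size * delta + sum(m[z][prev_w]))
            weights[(z, x)] = status * shared * word
    return weights


def draw_assignment(weights, rng):
    """Sample one (z, x) pair in proportion to its weight."""
    pairs = list(weights)
    return rng.choices(pairs, weights=[weights[k] for k in pairs])[0]
```

A full sampler would decrement the counts for the current token before calling `conditional_weights` and increment them again for the pair returned by `draw_assignment`.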
1 http://infomotions.com/etexts/gutenberg/dirs/etext04/suall11.txt
2 http://www.cs.nyu.edu/roweis/data.html
3 http://ai.stanford.edu/gal/Data/NIPS/
[Figure: topic "Mexican War" over time; x-axis: Year (1800-2000); panels: Our Model, TOT]
[Figure: topic "Panama Canal" over time; x-axis: Year (1800-2000); panels: Our Model, TOT]
NIPS-1987: cells, cell, model, response, firing, activity, input, neurons, stimulus, figure
NIPS-1988: network, learning, input, units, training, layer, hidden, weights, networks
NIPS-1995: data, model, algorithm, method, probability, models, problem, distribution, information
NIPS-1996: function, data, set, distribution, model, models, neural, probability, parameters, networks
NIPS-2004, NIPS-2005: algorithm, state, learning, time, algorithms, step, action, node, policy, learning, data, set, training, algorithm, test, number, kernel, classification, class, set, sequence
NIPS-1987: firing threshold; time delay; neural state; low conduction safety; correlogram peak; centric models; long channel; synaptic chip; frog sciatic nerve
NIPS-1988: neural networks; hidden units; hidden layer; neural network; training set; mit press; hidden unit; learning algorithm
NIPS-1995: linear algebra; input signals; gaussian filters; model matching; resistive line; input signal; analog vlsi; depth map; temporal precision
NIPS-1996: probability vector; relevant documents; continuous embedding; doubly stochastic matrix; probability vectors; binding energy; energy costs; variability index; learning bayesian; polynomial time
NIPS-2004, NIPS-2005: build stack; reinforcement learning; nash equilibrium; suit stack; synthetic items; compressed map; reward function; td networks; intrinsic reward; kernel cca; empirical risk; training sample; data clustering; random selection; gaussian regression; linear separators; covariance operator; line algorithm