Scalable Machine Learning
- 10. Distributed Inference and Applications
Alex Smola Yahoo! Research and ANU
http://alex.smola.org/teaching/berkeley2012 Stat 260 SP 12
USA airline
Singapore airline
Singapore food
USA food
Singapore university
Australia university
?
?
cluster probability cluster label instance
topic probability topic label instance
Cluster/ topic distributions
clustering: (0, 1) matrix topic model: stochastic matrix LSI: arbitrary matrices
Latent Dirichlet Allocation; Blei, Ng, Jordan, JMLR 2003
topic probability topic label instance
word model
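$$p(\theta \mid \alpha) = \frac{\Gamma\left(\sum_i \alpha_i\right)}{\prod_i \Gamma(\alpha_i)} \prod_i \theta_i^{\alpha_i - 1}$$
$$p(y \mid \theta) = \theta_y$$
$$p(x \mid \phi, y) = \phi_{x,y}$$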
integrate
topic label instance
exchangeable exchangeable
$$p(y_{d1}, \ldots, y_{dm} \mid \alpha)$$
$$\prod_{j=1}^{k} p(W \mid y_{id} = j, \beta)$$
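$$p(\theta) \propto p(X_{\mathrm{fake}} \mid \theta)$$
$$p(x \mid X) = \int p(x \mid \theta)\, p(\theta \mid X)\, d\theta \propto \int p(x \mid \theta)\, p(X \mid \theta)\, p(X_{\mathrm{fake}} \mid \theta)\, d\theta = \int p(\{x\} \cup X \cup X_{\mathrm{fake}} \mid \theta)\, d\theta$$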
look up closed form expansions
(Beta, binomial) (Dirichlet, multinomial) (Gamma, Poisson) (Wishart, Gauss)
http://en.wikipedia.org/wiki/Exponential_family
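$$p(\theta) = \exp\left(m_0 \langle \mu_0, \theta \rangle - m_0 g(\theta) - h(m_0 \mu_0, m_0)\right)$$
$$p(\theta \mid X) \propto \prod_{i=1}^{m} p(x_i \mid \theta)\, p(\theta) = \exp\left(\left\langle m_0 \mu_0 + \sum_{i=1}^{m} \phi(x_i), \theta \right\rangle - (m_0 + m)\, g(\theta) - h(m_0 \mu_0, m_0)\right)$$
$$p(X \mid \mu_0, m_0) = \exp\left(h(m_0 \mu_0 + m \mu[X], m_0 + m) - h(m_0 \mu_0, m_0)\right)$$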
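$$p(\theta \mid \alpha) = \frac{\Gamma\left(\sum_i \alpha_i\right)}{\prod_i \Gamma(\alpha_i)} \prod_i \theta_i^{\alpha_i - 1}$$
$$h(\alpha) = \sum_i \log \Gamma(\alpha_i) - \log \Gamma\left(\sum_i \alpha_i\right)$$
$$\exp\left(h(\alpha \cup X) - h(\alpha)\right) = \frac{\alpha_i + n_i}{\sum_i (\alpha_i + n_i)}$$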
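The ratio above can be checked numerically. The following is a minimal sketch, assuming the right-hand side is read as the probability that the next draw is outcome i after observing counts n. The names `h`, `predictive`, `alpha`, and `counts` are illustrative and the numbers are made up.

```python
import math

def h(alpha):
    """Dirichlet log-normalizer: sum_i log Gamma(alpha_i) - log Gamma(sum_i alpha_i)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))

def predictive(alpha, counts, i):
    """p(next draw = i | counts), computed as exp(h(posterior + e_i) - h(posterior))."""
    post = [a + n for a, n in zip(alpha, counts)]
    bumped = list(post)
    bumped[i] += 1.0  # add one more observation of outcome i
    return math.exp(h(bumped) - h(post))

# Hypothetical prior weights and observed counts (illustration only).
alpha = [0.5, 1.0, 2.0]
counts = [3, 0, 4]

probs = [predictive(alpha, counts, i) for i in range(len(alpha))]
total = sum(alpha) + sum(counts)
closed_form = [(a + n) / total for a, n in zip(alpha, counts)]
```

Both lists should agree up to floating-point error, which is the "look up closed form expansions" point: the integral over θ never has to be evaluated explicitly.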
topic label instance
exchangeable exchangeable
$$\frac{n^{-ij}(t, w) + \beta_t}{n^{-i}(t) + \sum_t \beta_t} \cdot \frac{n^{-ij}(t, d) + \alpha_t}{n^{-i}(d) + \sum_t \alpha_t}$$
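The conditional above can be turned into a single-machine collapsed Gibbs sampler. This is a minimal sketch rather than the implementation discussed in the lecture: it assumes symmetric scalar smoothers `alpha` and `beta` (so the sum over β becomes `vocab_size * beta`), drops the document denominator because it does not depend on the topic, and uses a toy corpus of integer word ids. All function and variable names are illustrative.

```python
import random

def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, sweeps=50, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of documents (lists of word ids)."""
    rng = random.Random(seed)
    n_tw = [[0] * vocab_size for _ in range(num_topics)]  # n(t, w)
    n_td = [[0] * num_topics for _ in docs]               # n(t, d), indexed [d][t]
    n_t = [0] * num_topics                                # n(t)
    z = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(num_topics)
            z_d.append(t)
            n_tw[t][w] += 1
            n_td[d][t] += 1
            n_t[t] += 1
        z.append(z_d)
    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for j, w in enumerate(doc):
                # Remove token (d, j) from the counts: these are the "-ij" counts.
                t = z[d][j]
                n_tw[t][w] -= 1
                n_td[d][t] -= 1
                n_t[t] -= 1
                # Unnormalized conditional for each candidate topic k.
                weights = [
                    (n_tw[k][w] + beta) / (n_t[k] + vocab_size * beta) * (n_td[d][k] + alpha)
                    for k in range(num_topics)
                ]
                t = rng.choices(range(num_topics), weights=weights)[0]
                z[d][j] = t
                n_tw[t][w] += 1
                n_td[d][t] += 1
                n_t[t] += 1
    return z, n_tw, n_td, n_t

# Toy corpus over a vocabulary of 5 word ids (illustration only).
docs = [[0, 1, 2, 0], [3, 4, 3, 4], [0, 1, 3]]
z, n_tw, n_td, n_t = gibbs_lda(docs, num_topics=2, vocab_size=5)
```

Every token update touches the shared tables `n_tw` and `n_t`, which is why the later slides worry about locking and synchronization once the sampler is spread over threads and machines.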
$$p(t \mid w_{ij}) \propto \beta_w \frac{\alpha_t}{n(t) + \bar\beta} + \beta_w \frac{n(t, d = i)}{n(t) + \bar\beta} + \frac{n(t, w = w_{ij}) \left[n(t, d = i) + \alpha_t\right]}{n(t) + \bar\beta}$$
How quickly the three terms change, in order: slow; changes rapidly; moderately fast
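The three-term split is an exact rewrite of the numerator (n(t, w) + β_w)(n(t, d = i) + α_t). A small sketch that checks this on made-up counts is given below; the function names and the numbers are assumptions for illustration.

```python
def full_conditional(t, w_count, d_count, n_t, alpha, beta_w, beta_bar):
    """Unsplit weight: (n(t,w) + beta_w) * (n(t,d) + alpha_t) / (n(t) + beta_bar)."""
    return (w_count[t] + beta_w) * (d_count[t] + alpha[t]) / (n_t[t] + beta_bar)

def three_buckets(t, w_count, d_count, n_t, alpha, beta_w, beta_bar):
    """The same weight split into smoothing, document, and word terms."""
    denom = n_t[t] + beta_bar
    smoothing = beta_w * alpha[t] / denom                 # depends only on global counts
    document = beta_w * d_count[t] / denom                # zero unless topic t occurs in the document
    word = w_count[t] * (d_count[t] + alpha[t]) / denom   # zero unless word w occurs in topic t
    return smoothing, document, word

# Hypothetical counts for 4 topics (illustration only).
alpha = [0.1, 0.1, 0.1, 0.1]
beta_w, beta_bar = 0.01, 50.0
n_t = [120, 80, 300, 10]      # n(t)
d_count = [3, 0, 5, 0]        # n(t, d = i)
w_count = [0, 7, 2, 0]        # n(t, w = w_ij)

splits = [three_buckets(t, w_count, d_count, n_t, alpha, beta_w, beta_bar) for t in range(4)]
fulls = [full_conditional(t, w_count, d_count, n_t, alpha, beta_w, beta_bar) for t in range(4)]
```

Because the document and word terms vanish for most topics, a sampler only needs to iterate over the nonzero entries of those two buckets and can cache the total of the smoothing bucket.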
table out
blocking network bound memory inefficient continuous sync barrier free
concurrent cpu hdd net
minimal view
(almost) avoids stalling for locks in the sampler
tokens topics file combiner count updater diagnostics &
file topics sampler sampler sampler sampler sampler
Intel Threading Building Blocks
joint state table
(read from disk if already assigned - hotstart)
(worst case we lose 1 iteration out of 1000)
a single machine dies.
parameters θ and ψ
problem in Y
not a typical sample)
p(X, Y |α, β)
Hal Daume; Joey Gonzalez
problem in θ and ψ
(this breaks for large models)
Hoffman, Blei, Bach (in VW)
p(X, ψ, θ|α, β)
distribution by tractable factors
statistics)
(this breaks for large models)
Blei, Ng, Jordan
$$\log p(x) \geq \log p(x) - D(q(y) \,\|\, p(y \mid x)) = \int dq(y) \left[\log p(x) + \log p(y \mid x) - \log q(y)\right] = \int dq(y) \log p(x, y) + H[q]$$
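The bound can be verified on a tiny discrete example. The sketch below assumes a latent variable y with three values and a made-up joint table p(x, y) for one fixed observation x; `elbo` and the numbers are illustrative.

```python
import math

def elbo(joint, q):
    """sum_y q(y) log p(x, y) + H[q] for a discrete latent variable y."""
    return sum(qy * (math.log(pxy) - math.log(qy)) for pxy, qy in zip(joint, q) if qy > 0)

# Hypothetical joint probabilities p(x, y) for one fixed x and y in {0, 1, 2}.
joint = [0.10, 0.25, 0.05]
evidence = sum(joint)                        # p(x)
posterior = [p / evidence for p in joint]    # p(y | x)
log_px = math.log(evidence)

bound_uniform = elbo(joint, [1.0 / 3.0] * 3)
bound_posterior = elbo(joint, posterior)
```

The bound is tight exactly when q equals the posterior p(y | x), where the divergence term is zero; any other q gives a strictly smaller value.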
Can be done in parallel
Can be done in parallel
(only aggregate statistics)
independent*
(this breaks for large models)
2 2 1
*for the right model
parameters θ and ψ
yij|X,Y-ij at a time from
(variables lock each other)
p(X, Y |α, β)
Griffiths & Steyvers 2005
$$\frac{n^{-ij}(t, w) + \beta_t}{n^{-i}(t) + \sum_t \beta_t} \cdot \frac{n^{-ij}(t, d) + \alpha_t}{n^{-i}(d) + \sum_t \alpha_t}$$
machine
between machines
Asuncion, Smyth, Welling, ... UCI Mimno, McCallum, ... UMass
$$\frac{n^{-ij}(t, w) + \beta_t}{n^{-i}(t) + \sum_t \beta_t} \cdot \frac{n^{-ij}(t, d) + \alpha_t}{n^{-i}(d) + \sum_t \alpha_t}$$
(delayed updates from samplers)
(Welling, Asuncion, et al. 2008)
its own sufficient statistics)
synchronizer quality
Ahmed, Gonzalez, et al., 2012
$$\frac{n^{-ij}(t, w) + \beta_t}{n^{-i}(t) + \sum_t \beta_t} \cdot \frac{n^{-ij}(t, d) + \alpha_t}{n^{-i}(d) + \sum_t \alpha_t}$$
data likelihood
distribution is too uneven
p(X, Y |α, β)
Canini, Shi, Griffiths, 2009 Ahmed et al., 2011
$$p(X, Y \mid \alpha, \beta) = \prod_{i=1}^{m} p(x_i, y_i \mid x_1, y_1, \ldots, x_{i-1}, y_{i-1}, \alpha, \beta)$$
$$y_i \sim p(y_i \mid x_i, x_1, y_1, \ldots, x_{i-1}, y_{i-1}, \alpha, \beta)$$
$$p(x_{i+1} \mid x_1, y_1, \ldots, x_i, y_i, \alpha, \beta)$$
parallelization is open problem
is messy
(integration over y), e.g. as part of sampler
algorithm with log loss ...
Uncollapsed Variational approximation Collapsed natural parameters Collapsed topic assignments Optimization
too costly easy parallelization big memory footprint
too costly easy to optimize big memory footprint difficult parallelization Sampling slow mixing conditionally independent n.a. fast mixing difficult parallelization approximate inference by delayed updates particle filtering sequential sampling difficult
global state is too large does not fit into memory network load & barriers does not fit into memory local state is too large
stream local data from disk asynchronous synchronization partial view
Receive from global state:
x ← x + (x_global − x_old)
x_old ← x_global
Send to global state:
δ ← x − x_old
x_old ← x
x_global ← x_global + δ
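The update rules above can be sketched for a single scalar statistic as follows. This assumes the five assignments split into a receive step (the first two) and a send step (the last three); the class `Replica` and its method names are hypothetical, and a real system would apply the same arithmetic to whole count tables over the network.

```python
class Replica:
    """Local copy x of a shared statistic, plus the last synchronized value x_old."""

    def __init__(self, value=0.0):
        self.x = value
        self.x_old = value

    def pull(self, x_global):
        """Receive: fold in what others changed while keeping unsent local changes."""
        self.x += x_global - self.x_old
        self.x_old = x_global

    def push(self, x_global):
        """Send: add the local change since the last sync to the global value."""
        delta = self.x - self.x_old
        self.x_old = self.x
        return x_global + delta

# Two samplers start from the same global count and update it independently.
x_global = 10.0
a, b = Replica(x_global), Replica(x_global)
a.x += 3.0   # local sampling on machine a
b.x += 5.0   # local sampling on machine b

x_global = a.push(x_global)
x_global = b.push(x_global)
a.pull(x_global)
b.pull(x_global)
```

After one push and one pull per replica, both local copies and the global value agree at 18, and no step required the two replicas to wait for each other.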
$$m(x) = \operatorname*{argmin}_{m \in M} h(x, m)$$
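A minimal sketch of this assignment rule, assuming h is a hash of the (machine, key) pair; the use of SHA-256, the machine names, and the key names are illustrative choices, not taken from the slides. This style of assignment is commonly known as rendezvous (highest-random-weight) hashing.

```python
import hashlib

def h(key, machine):
    """Pseudo-random score for a (key, machine) pair; SHA-256 is an arbitrary choice."""
    digest = hashlib.sha256(f"{machine}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def assign(key, machines):
    """m(x) = argmin over machines of h(x, m)."""
    return min(machines, key=lambda m: h(key, m))

# Hypothetical pool of machines and keys (illustration only).
machines = ["m0", "m1", "m2", "m3"]
keys = [f"word{i}" for i in range(200)]

before = {k: assign(k, machines) for k in keys}
after = {k: assign(k, machines[:-1]) for k in keys}   # machine "m3" leaves the pool
moved = [k for k in keys if before[k] != after[k]]
```

Every client can compute the owner of a key locally, without a lookup table, and removing a machine only relocates the keys that machine owned.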
0.78 < eff. < 0.89
(no global locks are needed)
y’
current copy global state
times (think 1000x)
time local data is accessed
y
tokens topics file combiner count updater diagnostics &
file topics sampler sampler sampler sampler sampler
Sha and Jordan 2008; Zhu, Ahmed and Xing 2009)
health but not about sports and politics
determine segments
needed
Girolami & Kaban, 2003; Wallach, 2006; Wang & McCallum, 2007
timestamp
lexical similarity, distributional similarity
2008; Zhu, Ahmed and Xing 2009)
φ2 φ1 φ3
Assume that the total smoother weight is constant
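$$p(y \mid Y, \alpha) = \frac{n(y) + \alpha_y}{n + \sum_{y'} \alpha_{y'}}$$
$$p(y \mid Y, \alpha) = \frac{n(y)}{n + \sum_{y'} \alpha_{y'}} \quad \text{and} \quad p(\text{new} \mid Y, \alpha) = \frac{\alpha}{n + \alpha}$$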
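The two probabilities above define a sequential sampling scheme: an existing cluster is chosen in proportion to its count n(y), and a new cluster in proportion to α. A minimal sketch, with illustrative names and an arbitrary seed:

```python
import random

def crp_assignments(num_items, alpha, seed=0):
    """Sample cluster labels one item at a time (alpha must be positive)."""
    rng = random.Random(seed)
    counts = []   # counts[y] = n(y), number of items already in cluster y
    labels = []
    for _ in range(num_items):
        # Existing clusters get weight n(y); the last slot (a new cluster) gets alpha.
        weights = counts + [alpha]
        y = rng.choices(range(len(weights)), weights=weights)[0]
        if y == len(counts):
            counts.append(0)   # open a new cluster
        counts[y] += 1
        labels.append(y)
    return labels, counts

labels, counts = crp_assignments(num_items=100, alpha=1.0)
```

Large clusters attract more new items than small ones, and the number of clusters is not fixed in advance.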
Generative Process
φ2 φ1 φ3
the rich get richer
[Figure: clusters φ1,1, φ2,1, φ3,1 at epoch T=1 with counts m'1,1=2, m'2,1=3, m'3,1=1 carried over to T=2; sample φ1,2 ~ P(.|φ1,1); at T=2 the clusters are φ1,2, φ2,2, φ3,1 plus a new cluster φ4,2; at T=3 the clusters are φ1,2, φ2,2, φ4,2 with count m'2,3]
W= ∞ λ = ∞
DPM
W=4 λ = .4
TDPM
W= 0 λ = ? (any)
Independent DPMs
power law
[Figure: topic proportion over time; x-axis: Day, y-axis: Proportion; topics: Baseball, Finance, Jobs, Dating]
[Figure: timelines of user activity, annotated "show ads now" and "too late"]
Input: Car Deals van job Hiring diet Hiring Salary Diet calories Auto Price Used inspection Flight London Hotel weather Diet Calories Recipe chocolate Movies Theatre Art gallery School Supplies Loan college
Output: CARS, Art, Diet, Jobs, Travel, College finance
[Figure: graphical model over w_ij, z_ij, θ_i, α, φ_k, β, with time-indexed copies α^t, θ_i^t, φ_k^t, β^t for steps t−1, t, t+1]
time dependent user interest user actions actions per topic
job Career Business Assistant Hiring Part-time Receptionist Car Blue Book Kelley Prices Small Speed large Bank Online Credit Card debt portfolio Finance Chase Recipe Chocolate Pizza Food Chicken Milk Butter Powder
All, month, week
Time t, t+1
Food Chicken pizza recipe job hiring Part-time Opening salary food chicken Pizza mileage Kelly recipe cuisine
Diet Cars Job Finance
Prior for user actions at time t: μ μ2 μ3
Long-term, short-term
Food Chicken
Pizza mileage
Car speed offer Camry accord career
At time t, at time t+1
Generative Process
short-term priors
job Career Business Assistant Hiring Part-time Receptionist Car Altima Accord Blue Book Kelley Prices Small Speed Bank Online Credit Card debt portfolio Finance Chase Recipe Chocolate Pizza Food Chicken Milk Butter Powder
[Figure: User 1 process, User 2 process, User 3 process, and Global process at times t, t+1, t+2, t+3; counts m, m', n, n']
[Figure: topic proportion over time; x-axis: Day, y-axis: Proportion; topics: Baseball, Finance, Jobs, Dating]
[Figure: topic proportion over time; x-axis: Day, y-axis: Proportion; topics: Baseball, Dating, Celebrity, Health]
Celebrity: Snooki, Tom Cruise, Katie Holmes, Pinkett, Kudrow, Hollywood
Baseball: League, baseball, basketball, doublehead, Bergesen, Griffey, bullpen, Greinke
Health: skin, body, fingers, cells, toes, wrinkle, layers
Dating: women, men, dating, singles, personals, seeking, match
Jobs: job, career, business, assistant, hiring, part-time, receptionist
Finance: financial, Thomson, chart, real, Stock, Trading, currency
[Figure: Dataset-2 results by bucket (>1000, [1000,600], [600,400], [400,200], [200,100], [100,60], [60,40], [40,20], <20), comparing baseline, TLDA, and TLDA+Baseline; value axis from 50 to 62]
All machines: sample Z for users
Barrier
All machines: write counts to memcached
One machine: collect counts and sample; all other machines: do nothing
Barrier
All machines: read from memcached
UEFA-soccer: Champions, Goal, Coach, Striker, Midfield, penalty, Juventus, AC Milan, Lazio, Ronaldo, Lyon
Tax-Bill: Tax, Billion, Cut, Plan, Budget, Economy, Bush, Senate, Fleischer, White House, Republican
Border-Tension: Nuclear, Border, Dialogue, Diplomatic, militant, Insurgency, missile, Pakistan, India, Kashmir, New Delhi, Islamabad, Musharraf, Vajpayee
Sports: games, Won, Team, Final, Season, League, held
Politics: Government, Minister, Authorities, Opposition, Officials, Leaders, group
Accidents: Police, Attack, run, man, group, arrested, move
γ
using Gibbs Sampling for each particle
(resample some assignments for diversity, too)
In our case predictive likelihood yields weights
Filter threads update particles
Root (games: 1, league: 4); particles 1, 2, 3
Node contents: (empty), league: 5, minister: 1, games: 0, season: 2
Initial tree (ready for threads)
Root (games: 1, league: 4); particles 1, 2, 3
Node contents: (empty), league: 5, games: 3, minister: 7, games: 0, season: 2
0 = get(1, 'games'); set(2, 'games', 3); set(3, 'minister', 7)
Resampling copies particles
Root (games: 1, league: 4); particles 2,1 and 3
Node contents: (empty), league: 5, games: 3, minister: 7, games: 0, season: 2
copy(2, 1)
Prune unused branches
Root (games: 1, league: 4); particles 2,1 and 3
Node contents: (empty), league: 5, games: 3, minister: 7, games: 0, season: 2
Collapse long branches
Root (games: 1, league: 4); particles 2,1 and 3
Node contents before: league: 5, games: 3, minister: 7
Node 2,1 after: games: 3, season: 2, league: 5
maintain_prune(); maintain_collapse()
Create new leaves
Root (games: 1, league: 4); particle 3
Node contents: minister: 7, games: 3, season: 2, league: 5
branch(1); branch(2)
New leaves 1 and 2: (empty), (empty)
New initial tree (ready for threads)
Root (games: 1, league: 4); particle 3
Node contents: minister: 7, games: 3, season: 2, league: 5
Leaves 1 and 2: (empty), (empty)
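The get, set, and branch operations on the tree can be sketched with parent pointers: each node stores only the entries that differ from its ancestors, and a lookup walks toward the root. This is a simplified sketch, not the lecture's implementation. The slides address particles by id, as in get(1, 'games'), whereas this sketch calls methods on node objects. The class `Node` is hypothetical, and the tree shape (an intermediate node holding games: 0 and season: 2 above particles 1 and 2) is an assumption read off the flattened slide.

```python
class Node:
    """Tree node that stores local overrides and inherits everything else from its parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.local = {}

    def get(self, key, default=0):
        """Return the nearest value for key on the path from this node to the root."""
        node = self
        while node is not None:
            if key in node.local:
                return node.local[key]
            node = node.parent
        return default

    def set(self, key, value):
        """Write locally; siblings and ancestors are unaffected."""
        self.local[key] = value

    def branch(self):
        """Create an empty child that inherits all current values."""
        return Node(parent=self)

# Rebuild the example tree (shape assumed from the slide).
root = Node()
root.set("games", 1)
root.set("league", 4)
inner = root.branch()
inner.set("games", 0)
inner.set("season", 2)
p1 = inner.branch()               # particle 1: (empty)
p2 = inner.branch()               # particle 2
p2.set("league", 5)
p3 = root.branch()                # particle 3
p3.set("minister", 1)

games_seen_by_p1 = p1.get("games")
p2.set("games", 3)
p3.set("minister", 7)
```

Copying a particle then amounts to branching from its node, so particles that share most of their state also share the storage for it.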
Root
1
India: [(I-P tension,3),(Tax bills,1)] Pakistan: [(I-P tension,2),(Tax bills,1)] Congress: [(I-P tension,1),(Tax bills,1)]
2 3
(empty) Congress: [(I-P tension,0),(Tax bills,2)] Bush: [(I-P tension,1),(Tax bills,2)] India: [(Tax bills,0)] India: [(I-P tension,2)] US: [(I-P tension,1),(Tax bills,1)]
Extended Inheritance Tree
[(I-P tension,2),(Tax bills,1)] = get_list(1,'India') set_entry(3,'India','Tax bills',0)
Note: "I-P tension" is short for "India-Pakistan tension"
time entities topics story words 0.84 0.90 0.86 0.75
TOPICS
Sports: games won team final season league held
Politics: government minister authorities leaders group
Unrest: police attack run man group arrested move
STORYLINES
India-Pakistan tension: nuclear border dialogue diplomatic militant insurgency missile Pakistan India Kashmir New Delhi Islamabad Musharraf Vajpayee
UEFA-soccer: champions goal leg coach striker midfield penalty Juventus AC Milan Real Madrid Milan Lazio Ronaldo Lyon
Tax bills: tax billion cut plan budget economy lawmakers Bush Senate US Congress Fleischer White House Republican
India-Pakistan tension: nuclear border dialogue diplomatic militant insurgency missile Pakistan India Kashmir New Delhi Islamabad Musharraf Vajpayee
Middle-east conflict: Peace Roadmap Suicide Violence Settlements bombing Israel Palestinian West bank Sharon Hamas Arafat
North Korea nuclear: nuclear summit warning policy missile program North Korea South Korea U.S Bush Pyongyang
“Show similar stories by topic” “Show similar stories, require the word nuclear”
Build a model to describe both collections of data
Visualization
Classification
Structured browsing
Ω1 Ω2 β1 β2 βk-1 βk φ1,1 φ1,2 φ1,k φ2,1 φ2,2 φ2,k Ideology 1 Views Ideology 2 Views Topics
λ 1−λ λ 1−λ
palestinian israeli peace year political process state end right
government
need conflict way security palestinian israeli Peace political
process end security conflict way government people time year force negotiation bush US president american sharon administration prime pressure policy washington powell minister colin visit internal policy statement express pro previous package work transfer european arafat state leader roadmap election month iraq yasir senior involvement clinton terrorism
US role
Palestinian View Israeli View
roadmap phase security ceasefire state plan international step authority end settlement implementation obligation stop expansion commitment fulfill unit illegal present previous assassination meet forward process force terrorism unit provide confidence element interim discussion union succee point build positive recognize present timetable
Roadmap process
syria syrian negotiate lebanon deal conference concession asad agreement regional
track negotiation official leadership position withdrawal time victory present second stand circumstance represent sense talk strategy issue participant parti negotiator peace strategic plo hizballah islamic neighbor territorial radical iran relation think
greater conventional intifada affect jihad time
Arab Involvement