Learning Temporal Point Processes via Reinforcement Learning
Shuang Li1, Shuai Xiao2, Shixiang Zhu1, Nan Du3, Yao Xie1, Le Song1,2
1Georgia Institute of Technology 2Ant Financial 3Google Brain
Motivation
• Event data, e.g. tweets/retweets, crime events, earthquakes
• Learn the temporal pattern of event data
• Event time is random
• Complex dependency structure
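[Figure: timeline of posts by user David (1:00 pm "Cool picture", 1:18 pm "Funny joke", 1:30 pm "Dinner together?"). Past event times $t_1, t_2, t_3$ form the history $\mathcal{H}_t$; the next event time $t$ is unknown. Counting process: $N(t) \in \{0\} \cup \mathbb{Z}^+$.]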
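Point process            Intensity $\lambda_\theta(t \mid \mathcal{H}_t)$                Temporal pattern
Poisson                  constant                                                 [timeline figure]
Inhomogeneous Poisson    $\lambda(t)$                                             [timeline figure]
Hawkes                   $\mu + \alpha \sum_{t_i < t} \exp(-|t - t_i|)$           [timeline figure]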
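As a rough illustration of the Hawkes row, the intensity can be read as a baseline rate plus an exponentially decaying contribution from every past event. The sketch below is only an assumption-laden example: the function name `hawkes_intensity`, the parameter values `mu=0.5` and `alpha=0.8`, and the event times are made up, and the decay rate is fixed at 1 as in the table.

```python
import math

def hawkes_intensity(t, history, mu=0.5, alpha=0.8):
    """Evaluate mu + alpha * sum_{t_i < t} exp(-|t - t_i|) at time t.

    mu and alpha are illustrative values, not taken from the slides.
    """
    # Only events strictly before t contribute to the intensity.
    return mu + alpha * sum(math.exp(-abs(t - t_i)) for t_i in history if t_i < t)

events = [1.0, 1.3, 1.5]  # hypothetical event times

print(hawkes_intensity(0.5, events))  # no past events, so this is just mu (0.5)
print(hawkes_intensity(1.6, events))  # raised by the three recent events
```

Each new event pushes the intensity up, which is how the Hawkes process produces bursts of events close together in time.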
• Model the conditional intensity
• Learn the model by maximizing the likelihood
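To make the maximum-likelihood step concrete, here is a minimal sketch for the unit-decay Hawkes intensity. It uses the standard point-process log-likelihood on an observation window $[0, T]$, namely the sum of log-intensities at the event times minus the integral of the intensity over the window. The function names, the grid search, and the event times are illustrative assumptions; a real fit would normally use a gradient-based optimizer.

```python
import math

def hawkes_log_likelihood(events, T, mu, alpha):
    """Log-likelihood of sorted event times on [0, T] for the intensity
    mu + alpha * sum_{t_j < t} exp(-(t - t_j))."""
    log_term = 0.0
    for i, t_i in enumerate(events):
        # Intensity at t_i, using only the events that came before it.
        lam = mu + alpha * sum(math.exp(-(t_i - t_j)) for t_j in events[:i])
        log_term += math.log(lam)
    # Closed-form integral of the intensity over [0, T].
    compensator = mu * T + alpha * sum(1.0 - math.exp(-(T - t_i)) for t_i in events)
    return log_term - compensator

def fit_by_grid_search(events, T, grid):
    """Return the (mu, alpha) pair on the grid with the highest log-likelihood."""
    candidates = ((mu, alpha) for mu in grid for alpha in grid)
    return max(candidates, key=lambda p: hawkes_log_likelihood(events, T, p[0], p[1]))

events = [0.4, 0.5, 0.7, 2.9, 3.0, 3.2]   # hypothetical, sorted event times
grid = [0.1 * k for k in range(1, 21)]    # candidate values 0.1, 0.2, ..., 2.0
mu_hat, alpha_hat = fit_by_grid_search(events, T=4.0, grid=grid)
print(mu_hat, alpha_hat)
```

The code assumes the event times are sorted and distinct, so that `events[:i]` is exactly the history before the i-th event.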
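[Figure: timeline with events $t_1, t_2, t_3$ forming the history $\mathcal{H}_t$; the next event time $t$ follows the density $f^*(t) := f(t \mid \mathcal{H}_t)$. Figure label: "imitate".]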
[Figure: policy $\pi_\theta(a \mid s_t)$ parameterized by an LSTM with hidden states $h_0, h_1, h_2, h_3$ and actions $a_1, a_2, a_3$.]
• Reward: $\max_{r \in \mathcal{F}}$ over expected cumulative rewards of the form $\mathbb{E}\big[\sum_i r(a_i)\big]$
• Optimal reward: (mean embedding of expert intensity function) − (mean embedding of policy intensity function)
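One way to picture a reward built from these two mean embeddings is the empirical sketch below. It is not taken from the slides: the Gaussian kernel, the `bandwidth` parameter, the averaging over sequences, and all function names are assumptions chosen for illustration. Each embedding is estimated by summing a kernel centered at every event of a sequence and averaging over sequences; the reward at a time `t` is the expert estimate minus the policy estimate.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y); the kernel choice is an assumption."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mean_embedding(sequences, t, bandwidth=1.0):
    """Average over sequences of sum_i k(t_i, t), evaluated at time t."""
    total = sum(gaussian_kernel(t_i, t, bandwidth) for seq in sequences for t_i in seq)
    return total / len(sequences)

def reward(t, expert_seqs, policy_seqs, bandwidth=1.0):
    """Expert mean embedding minus policy mean embedding at time t."""
    return mean_embedding(expert_seqs, t, bandwidth) - mean_embedding(policy_seqs, t, bandwidth)

expert_seqs = [[1.0, 1.3, 1.5], [0.9, 1.4]]   # hypothetical observed sequences
policy_seqs = [[3.0, 3.5], [2.8, 3.2, 3.9]]   # hypothetical generated sequences

print(reward(1.2, expert_seqs, policy_seqs))  # positive: experts are active here, the policy is not
print(reward(3.3, expert_seqs, policy_seqs))  # negative: the policy places events where experts do not
```

Under this reading, generated events earn a high reward when they fall where expert events are dense, and the reward vanishes everywhere once the two sets of sequences match.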
Policy Gradient
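[Figure: the policy $\pi_\theta(a \mid s_t)$ generates actions $a_1, a_2, a_3, a_4$, which receive rewards $r^*(a_1), r^*(a_2), r^*(a_3), r^*(a_4)$; the optimal reward is updated against the expert events $t_1, t_2, t_3$.]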
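The policy gradient step can be sketched with a basic REINFORCE estimator. To keep the example short it deliberately swaps the LSTM policy for a one-parameter policy that draws each inter-event time from an exponential distribution with rate `exp(theta)`, and it fixes the number of actions per episode at four. The reward function, learning rate, and all names are hypothetical.

```python
import math
import random

def grad_log_prob(theta, a):
    """d/dtheta of log(rate * exp(-rate * a)) with rate = exp(theta)."""
    return 1.0 - math.exp(theta) * a

def reinforce_gradient(theta, reward_fn, n_episodes=200, n_actions=4, rng=None):
    """Monte Carlo policy gradient estimate using reward-to-go."""
    rng = rng or random.Random(0)
    rate = math.exp(theta)
    grad = 0.0
    for _ in range(n_episodes):
        # Actions are inter-event times; event times are their running sum.
        actions = [rng.expovariate(rate) for _ in range(n_actions)]
        times, t = [], 0.0
        for a in actions:
            t += a
            times.append(t)
        rewards = [reward_fn(t_i) for t_i in times]
        for i, a in enumerate(actions):
            # An action only influences the rewards of its own and later events.
            grad += grad_log_prob(theta, a) * sum(rewards[i:])
    return grad / n_episodes

# Hypothetical reward that favors events near time 2.0.
reward_fn = lambda t: math.exp(-(t - 2.0) ** 2)

theta = 0.0
rng = random.Random(0)
for _ in range(50):
    theta += 0.05 * reinforce_gradient(theta, reward_fn, rng=rng)  # gradient ascent
print(theta)
```

In the full method the fixed `reward_fn` would be replaced by the learned reward, which is re-estimated as the policy changes.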
• Our method: RLPP
• Baselines:
  • State-of-the-art methods: RMTPP (Du et al., KDD 2016), WGANTPP (Xiao et al., NIPS 2017)
  • Parametric baselines: Inhomogeneous Poisson (IP), Hawkes (SE), Self-correcting (SC)
• Comparison of learned empirical intensity

[Figure: three panels plotting Intensity against Time Index for the methods Real, RLPP, WGAN, IP, SE, RMTPP, and SC.]

• Comparison of runtime

Method        RLPP   WGANTPP   RMTPP   SE   SC   IP
Time (min)    80     1560      60      2    2    2
Ratio         40x    780x      30x     1x   1x   1x