SLIDE 1

Predicting Temporal Sets with Deep Neural Networks

Le Yu, Leilei Sun*, Bowen Du, Chuanren Liu, Hui Xiong, Weifeng Lv Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’20), August 23–27, 2020, Virtual Event, CA, USA.

SLIDE 2

CONTENTS

01 Background
02 Formalization
03 Motivation
04 Methodology
05 Experiments

SLIDE 3

Background

SLIDE 4

Background

Temporal sets appear in many scenarios: baskets in shopping, drugs in healthcare, places in traveling, and courses in schools.
SLIDE 5

Related Work

Existing methods usually follow a two-stage strategy (Yu et al. 2016; Hu and He 2019): (1) set embedding, then (2) sequential behavior learning. This two-stage approach often leads to information loss.
SLIDE 6

A unique perspective

How would the model perform if we predicted temporal sets from a different perspective?

Existing methods: set embedding + sequential behavior learning, which leads to information loss.
This work: learn the relationships among elements directly, leveraging as much of the information in element relationships as possible.

SLIDE 7

Formalization

⚫ Let $\mathbb{V} = \{v_1, v_2, \cdots, v_o\}$ and $\mathbb{W} = \{w_1, w_2, \cdots, w_n\}$ denote the sets of $o$ users and $n$ elements, and let a set $T \subset \mathbb{W}$ denote a collection of elements.
⚫ Given: a sequence of sets $\mathbb{T}_j = \{T_j^1, T_j^2, \cdots, T_j^U\}$ that records the historical behaviors of user $v_j \in \mathbb{V}$.
⚫ Goal: predict the next-period set of $v_j$,
$$\hat{T}_j^{U+1} = g\big(T_j^1, T_j^2, \cdots, T_j^U, \mathbf{X}\big),$$
where $\mathbf{X}$ denotes the trainable parameters.
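To make the formalization concrete, here is a minimal sketch of how a user's history of sets can be represented. The element names and the frequency heuristic are purely illustrative stand-ins for the learned function $g$, not the paper's model:

```python
from collections import Counter

# A user's history: a chronologically ordered sequence of sets T^1 ... T^U.
history = [{"milk", "bread"}, {"milk", "eggs"}, {"bread", "butter", "eggs"}]

def predict_next_set(history, top_k):
    """Toy stand-in for g(...): rank elements by how often they appeared
    in past sets and return the top-k as the predicted next-period set."""
    counts = Counter(e for s in history for e in s)
    return {e for e, _ in counts.most_common(top_k)}

predicted = predict_next_set(history, top_k=3)  # {"milk", "bread", "eggs"}
```

The point of the abstraction is that both the input (a sequence of sets) and the output (a set) are unordered collections, which is what makes the problem different from ordinary sequence prediction.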

SLIDE 8

Our Method: DNNTSP

⚫ Learn set-level element relationships
⚫ Learn temporal dependencies of elements across the sequence of sets
⚫ Fuse static and dynamic information with a gated updating mechanism

SLIDE 9

Element Relationship Learning

Learn element relationships within the same set.

1) Weighted Graphs Construction: for each user $v_j$ and each timestamp, connect the elements that appear in the same set; then aggregate the pairwise co-occurrence counts over the sequence (including self-loops) and normalize them. In the slide's example: (a) edges from the first set, e.g. $(w_{j,1}, w_{j,2})$, $(w_{j,1}, w_{j,3})$; (b) edges from the second set, e.g. $(w_{j,1}, w_{j,3})$, $(w_{j,1}, w_{j,4})$; (c) aggregated counts, e.g. $(w_{j,1}, w_{j,2}, 2)$, $(w_{j,1}, w_{j,3}, 3)$, with self-loops $(w_{j,1}, w_{j,1}, 1)$; (d) normalized weights, e.g. $(w_{j,1}, w_{j,2}, 0.67)$, $(w_{j,1}, w_{j,3}, 1.0)$, $(w_{j,1}, w_{j,1}, 0.33)$.

2) Weighted Convolutions on Dynamic Graphs: propagate information over each graph with trainable convolutional parameters $\mathbf{X}_m$:
$$\mathbf{d}_{j,k}^{u,m+1} = \tau\Big(\mathbf{c}_m + \sum_{l \in \mathcal{O}_{j,k}^{u} \cup \{k\}} B_j^u[k, l] \cdot \mathbf{X}_m \mathbf{d}_{j,l}^{u,m}\Big),$$
where $\tau$ is a non-linear activation, $\mathcal{O}_{j,k}^{u}$ denotes the neighbors of element $w_k$ in user $v_j$'s graph at timestamp $u$, and $B_j^u[k, l]$ is the normalized edge weight.
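Both steps can be sketched in plain Python. This is an illustrative reading of the slide (function names are mine; the normalize-by-maximum rule follows the slide's example, where counts of 2, 3, 1 become 0.67, 1.0, 0.33), not the paper's implementation:

```python
from collections import defaultdict
from itertools import permutations

def build_weighted_graph(sets):
    """1) Count directed co-occurrences (plus self-loops) of elements that
    appear in the same set, then normalize by the maximum count."""
    counts = defaultdict(int)
    for s in sets:
        for u in s:
            counts[(u, u)] += 1          # self-loop
        for u, v in permutations(s, 2):  # both directions of each pair
            counts[(u, v)] += 1
    max_count = max(counts.values())
    return {edge: c / max_count for edge, c in counts.items()}

def weighted_conv(B, features, X, c, tau=lambda v: max(0.0, v)):
    """2) One weighted-convolution step over the graph:
    d_k' = tau(c + sum_l B[k, l] * X @ d_l), summing over all nodes l
    whose edge weight to k is non-zero (self-loop included in B)."""
    out = {}
    for k in features:
        acc = list(c)
        for l, d_l in features.items():
            w = B.get((k, l), 0.0)
            if w:
                for i, row in enumerate(X):
                    acc[i] += w * sum(x * d for x, d in zip(row, d_l))
        out[k] = [tau(v) for v in acc]
    return out
```

With `sets = [{"a", "b"}, {"a", "b"}, {"a", "c"}]`, the self-loop `("a", "a")` gets the maximum count 3 and hence weight 1.0, while `("a", "b")` gets 2/3, mirroring the slide's 0.67.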

SLIDE 10

Attention-based Temporal Dependency Learning

Learn the temporal dependencies of each element across the sequence of sets with masked self-attention:
$$\mathbf{a}_{j,k} = \mathrm{softmax}\left(\frac{(\mathbf{D}_{j,k}\mathbf{X}_r)(\mathbf{D}_{j,k}\mathbf{X}_l)^\top}{\sqrt{G''}} + \mathbf{N}_j\right) \cdot (\mathbf{D}_{j,k}\mathbf{X}_w),$$
where $\mathbf{X}_r$, $\mathbf{X}_l$ and $\mathbf{X}_w$ play the roles of query, key and value projections, $G''$ is the projection dimension, and $\mathbf{N}_j$ is a mask matrix with
$$N_j[u, u'] = \begin{cases} 0, & \text{if } u \le u' \\ -\infty, & \text{otherwise.} \end{cases}$$
The per-timestamp representations are then aggregated with $\mathbf{x}_{agg}$, a trainable parameter that learns the importance of different timestamps:
$$\mathcal{A}_{j,k} = \big(\mathbf{a}_{j,k} \cdot \mathbf{x}_{agg}^\top\big)^\top \cdot \mathbf{a}_{j,k}.$$
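The attention step can be sketched as standard masked scaled dot-product self-attention over one element's per-timestamp features. This is a generic sketch in pure Python, with the rows of `D` as timestamps and the mask built exactly as the slide prints it (0 where $u \le u'$, $-\infty$ elsewhere); all names are illustrative:

```python
import math

def deck_mask(U):
    """N[u][u'] = 0 if u <= u' else -inf, as printed on the slide."""
    return [[0.0 if u <= u2 else float("-inf") for u2 in range(U)]
            for u in range(U)]

def masked_self_attention(D, Xq, Xk, Xv, mask):
    """softmax((D Xq)(D Xk)^T / sqrt(dim) + mask) @ (D Xv)."""
    def matmul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]
    Q, K, V = matmul(D, Xq), matmul(D, Xk), matmul(D, Xv)
    dim = len(Q[0])
    out = []
    for t, q in enumerate(Q):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) + mask[t][u]
                  for u, k in enumerate(K)]
        m = max(scores)                       # stabilize the softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]             # attention weights for row t
        out.append([sum(wi * vi[d] for wi, vi in zip(w, V))
                    for d in range(len(V[0]))])
    return out
```

With identity projections and the slide's mask, the last timestamp can only attend to itself, so its output equals its own value row; earlier timestamps mix information from the allowed positions.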

SLIDE 11

Gated Information Fusing

Mine shared patterns and fuse the static and dynamic representations of elements with a gated updating mechanism:
$$\mathbf{F}_{j,J}^{k,\mathrm{update}} = \big(1 - \gamma_{j,J}^{k} \cdot \delta_{J}^{k}\big) \cdot \mathbf{F}_{j,J}^{k} + \big(\gamma_{j,J}^{k} \cdot \delta_{J}^{k}\big) \cdot \mathcal{A}_{j,k},$$
where $\delta_{J}^{k}$ is an indicator vector that controls the importance of the static and dynamic representations.
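Per element, the update is a convex combination of the static embedding and the dynamic representation, steered by the gate $\gamma$ and the indicator $\delta$. A one-function sketch (names mine, scalar gate for simplicity):

```python
def gated_update(f_static, a_dynamic, gamma, delta):
    """F' = (1 - gamma*delta) * F + (gamma*delta) * A, element-wise.
    gamma: gate value, delta: indicator; both in [0, 1] here."""
    g = gamma * delta
    return [(1 - g) * f + g * a for f, a in zip(f_static, a_dynamic)]
```

When $\gamma \cdot \delta = 0$ the static representation passes through unchanged; when it is 1 the dynamic representation fully replaces it.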

SLIDE 12

The Learning Process

Prediction:
$$\hat{\mathbf{z}}_j = \mathrm{sigmoid}\big(\mathbf{F}_j^{\mathrm{update}} \mathbf{x}_p + \mathbf{c}_p\big)$$

Loss function (binary cross-entropy over users and elements, with regularization):
$$L = -\frac{1}{o} \sum_{j=1}^{o} \frac{1}{n} \sum_{k=1}^{n} \Big[ z_{j,k} \log \hat{z}_{j,k} + (1 - z_{j,k}) \log\big(1 - \hat{z}_{j,k}\big) \Big] + \mu \lVert \mathbf{X} \rVert_2$$
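The data term of the loss is a plain element-wise binary cross-entropy averaged over users and elements. A sketch of that term only (pure Python, omitting the $\mu \lVert \mathbf{X} \rVert_2$ regularizer; the `eps` clamp is mine, for numerical safety):

```python
import math

def bce_loss(z, z_hat, eps=1e-12):
    """-(1/o) sum_j (1/n) sum_k [z log z_hat + (1-z) log(1-z_hat)].
    z (targets in {0,1}) and z_hat (predictions in (0,1)) are o x n
    matrices given as lists of lists of floats."""
    o = len(z)
    total = 0.0
    for z_j, p_j in zip(z, z_hat):
        n = len(z_j)
        total += sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                     for t, p in zip(z_j, p_j)) / n
    return -total / o
```

For a single user and element with target 1 and prediction 0.5 the loss is $\ln 2 \approx 0.693$, and it approaches 0 as the prediction approaches the target.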

SLIDE 13

Datasets and Baselines

⚫ Four datasets are used for evaluation: TaFeng, DC, TaoBao and TMS.
⚫ Baselines include both classical and state-of-the-art methods: TOP, PersonalTOP, ElementTransfer, DREAM and Sets2Sets.
⚫ Three evaluation metrics: Recall, NDCG and PHR.
⚫ Statistics of the datasets are given in a table on the slide.
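The three metrics can be sketched for ranked top-$K$ predictions as follows. These are common formulations (exact tie-breaking and cutoff conventions may differ from the paper's evaluation code; all names are mine):

```python
import math

def recall_at_k(pred_ranked, truth, k):
    """|top-k predictions ∩ ground-truth set| / |ground-truth set|."""
    return len(set(pred_ranked[:k]) & truth) / len(truth)

def ndcg_at_k(pred_ranked, truth, k):
    """Discounted gain of hits in the top-k, normalized by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, e in enumerate(pred_ranked[:k]) if e in truth)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(truth))))
    return dcg / idcg

def phr(all_preds, all_truths, k):
    """Personal Hit Ratio: fraction of users with >= 1 hit in their top-k."""
    hits = sum(1 for p, t in zip(all_preds, all_truths) if set(p[:k]) & t)
    return hits / len(all_preds)
```

Recall and NDCG score each user's prediction quality, while PHR measures how many users receive at least one correct element.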

SLIDE 14

Experimental Results

Our model outperforms existing methods by a significant margin.

SLIDE 15

Experimental Results

Our model is applicable to scenarios with sparse data.
• Our model is better than Sets2Sets- when no empirical information is added.
• Our method outperforms Sets2Sets when incorporating the component for modelling repeated behaviors.

SLIDE 16

Applications

Given a user's historical sets over times $1, \ldots, U$, predicting the set at time $U+1$ supports:
• next-basket recommendation (customer)
• automatic treatment (patient)
• travel-package recommendation (tourist)
• courses planning (student)

SLIDE 17

Our Team

Website: https://www.brilliantasus.com/

SLIDE 18

Thanks!

Contact: Le Yu, yule@buaa.edu.cn

Reference: Le Yu, Leilei Sun*, Bowen Du, Chuanren Liu, Hui Xiong, Weifeng Lv. Predicting Temporal Sets with Deep Neural Networks. KDD 2020.

Code and data: https://github.com/yule-BUAA/DNNTSP