SLIDE 1

A Reinforcement Learning and Synthetic Data Approach to Mobile Notification Management

Rowan Sutton, Kieran Fraser, Owen Conlan ADAPT Centre, Trinity College Dublin

The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.

SLIDE 2

www.adaptcentre.ie

Content

❖ Motivation
❖ Research Design
❖ Experiment Implementation
❖ Results
❖ Limitations & Future Work
❖ Conclusion

SLIDE 3

Motivation: Anecdotal

SLIDE 4

Motivation: SOTA

  • Growing number of notifications pushed at users (Pielot, M. et al, 2014).
  • Large number of incoming notifications = negative user emotions (Sahami Shirazi, A. et al, 2014).
  • Notification delivery not smart (Mehrotra, A. et al, 2016).
  • Unnecessary notifications may dramatically decrease productivity (Iqbal, S. T. et al, 2010).

SLIDE 5

Motivation: Observed Problem

SLIDE 6

Research Design: Gathering Data

WeAreUs Android App

❖ Experience Sampling Method
❖ Moments of notification interest, moments of phone usage interest
❖ Anonymised & Synthesised

SLIDE 7

Research Design: Gathering Data

15 participants over 3 months

  • 31,329 notifications logged
  • 291 questionnaire responses
  • 4,940 smartphone usage logs

SLIDE 8

Research Design: Data Analysis

SLIDE 9

Research Design: Data Analysis

SLIDE 10

Research Design: Data Analysis

SLIDE 11

Research Design: Data Analysis

SLIDE 12

Research Design: Data Analysis

SLIDE 13

Research Design: Synthesising Data

SLIDE 14

Research Design: Synthesising Data

SLIDE 15

Research Design: Synthesising Data

SLIDE 16

Research Design: Synthesising Data

SLIDE 17

Research Design: Synthesising Data

Train on Real, Test on Synthetic (1)

RMSE

F1 scores differ in the range 0.02 – 0.07, indicating that the synthetic data imitates real-world data.

1. Esteban, C., Hyland, S.L., Ratsch, G.: Real-valued (medical) time series generation
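
The train-on-real, test-on-synthetic check can be illustrated with a small sketch. It is an illustration under assumptions only: the toy records, the `fit_majority` lookup classifier and the two-feature state are hypothetical and are not the models or data used in the study. The idea is to fit a classifier on real records, then compare its F1 score on held-out real data with its F1 score on synthetic data; a small gap suggests the synthetic data behaves like the real data.

```python
from collections import Counter, defaultdict

def fit_majority(records):
    """Learn the most common label per feature tuple (hypothetical toy classifier)."""
    votes = defaultdict(Counter)
    for features, label in records:
        votes[features][label] += 1
    overall = Counter(label for _, label in records).most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in votes.items()}
    # Unseen feature tuples fall back to the overall majority label
    return lambda features: table.get(features, overall)

def f1_score(predict, records, positive="open"):
    """F1 for the positive class, computed from true/false positives and false negatives."""
    tp = sum(1 for f, y in records if predict(f) == positive and y == positive)
    fp = sum(1 for f, y in records if predict(f) == positive and y != positive)
    fn = sum(1 for f, y in records if predict(f) != positive and y == positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up records for illustration: (app, time-of-day) -> user action
real_train = [(("chat", "morning"), "open"), (("chat", "morning"), "open"),
              (("game", "night"), "dismiss"), (("news", "morning"), "dismiss")]
real_test = [(("chat", "morning"), "open"), (("game", "night"), "dismiss")]
synthetic = [(("chat", "morning"), "open"), (("game", "night"), "dismiss"),
             (("news", "morning"), "open")]

predict = fit_majority(real_train)         # train on real
f1_real = f1_score(predict, real_test)     # baseline: test on held-out real
f1_synth = f1_score(predict, synthetic)    # test on synthetic
gap = abs(f1_real - f1_synth)
print(f"F1 real={f1_real:.2f} synthetic={f1_synth:.2f} gap={gap:.2f}")
```

Using a held-out real set as the baseline is one reasonable reading of "scores differ"; the slide does not say which baseline was used.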

SLIDE 18

Research Design: Reinforcement Learning

OpenAI Gym
Open source toolkit for “developing and comparing reinforcement learning algorithms” (1)

Gym-Push
Custom OpenAI Gym environment simulating push-notification overload on mobile device users

1. https://gym.openai.com/

SLIDE 19

Research Design: Reinforcement Learning

Gym-Push
Custom OpenAI Gym environment simulating push-notification overload on mobile device users

State
Context + Notification Features

Action
Open / Dismiss the notification
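
To make the state/action framing concrete, here is a minimal sketch of a notification environment with a Gym-style `reset`/`step` interface. It does not import `gym` and is not the Gym-Push source: the class name `ToyPushEnv`, the example notifications and the +1/-1 reward scheme are all assumptions made for illustration.

```python
import random

ACTIONS = ("open", "dismiss")  # action index 0 = open, 1 = dismiss

class ToyPushEnv:
    """Hypothetical stand-in for a notification environment with a Gym-style API."""

    def __init__(self, notifications, seed=0):
        # Each entry is (state, user_action); state = context + notification features
        self.notifications = list(notifications)
        self.rng = random.Random(seed)
        self.index = 0

    def reset(self):
        """Shuffle the notifications and return the first state."""
        self.rng.shuffle(self.notifications)
        self.index = 0
        return self.notifications[0][0]

    def step(self, action):
        """Compare the agent's action with what the user did and move on."""
        _, user_action = self.notifications[self.index]
        # Assumed reward scheme: +1 if the agent matches the user, otherwise -1
        reward = 1 if ACTIONS[action] == user_action else -1
        self.index += 1
        done = self.index >= len(self.notifications)
        next_state = None if done else self.notifications[self.index][0]
        return next_state, reward, done, {}

env = ToyPushEnv([
    (("chat", "message", "morning", "Mon"), "open"),
    (("game", "promo", "night", "Sat"), "dismiss"),
])
state = env.reset()
total_reward, done = 0, False
while not done:
    action = 0 if state[0] == "chat" else 1  # hand-written policy, illustration only
    state, reward, done, _ = env.step(action)
    total_reward += reward
```

An agent trained against such an interface only ever sees states and rewards, which is what lets the same agent code run on real or synthetic notification logs.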

SLIDE 20

Experiment Implementation

Q-learning Agent

  • Learns a policy to maximise total reward
  • Creates a q-table to track the quality of state->action pairs
  • Updates q-values according to Watkins' one-step Q-learning algorithm (1)
  • Can explore or exploit (ε)

Deep Q-learning Agent

  • Replaces the q-table with a DNN
  • Takes the state as input and outputs an action
  • Weights optimised based on the Huber Loss function (2)
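
The core quantities behind both agents can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the learning rate `alpha`, discount `gamma`, Huber threshold `delta`, the reward of 1.0 and the example states are assumed values. It shows the one-step Q-learning update on a q-table, ε-greedy action selection, and the Huber loss that a deep Q-learning agent would minimise over the same temporal-difference error.

```python
import random
from collections import defaultdict

ACTIONS = ["open", "dismiss"]

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit the q-table."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One-step Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = 0.0 if next_state is None else max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error

def huber_loss(error, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond it (delta=1.0 is an assumed value)."""
    if abs(error) <= delta:
        return 0.5 * error ** 2
    return delta * (abs(error) - 0.5 * delta)

q = defaultdict(float)  # q-table: unseen (state, action) pairs start at 0
rng = random.Random(0)
state = ("chat", "message", "morning", "Mon")
next_state = ("game", "promo", "night", "Sat")

# Assumed reward of 1.0 for correctly opening the notification
first_error = q_update(q, state, "open", reward=1.0, next_state=next_state)
greedy_action = choose_action(q, state, epsilon=0.0, rng=rng)  # epsilon=0: always exploit
small_loss = huber_loss(0.4)  # quadratic regime
large_loss = huber_loss(2.5)  # linear regime
```

The Huber loss grows linearly for large errors, so an occasional badly mispredicted notification pulls on the network weights less than it would under a squared-error loss.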

SLIDE 21

Experiment Implementation

Individual User (Synth & Balanced)

  • Comprises ≈6000 synthetic notifications
  • Split into sets of size: 50, 100, 250, 500, 1000, 2500, 5000
  • Balanced

Individual User (Real & Balanced)

  • Comprises ≈6000 real notifications
  • Split into sets of size: 50, 100, 250, 500, 1000, 2500, 5000
  • Balanced

Individual User (Real & Unbalanced)

  • Comprises ≈6000 real notifications
  • Split into sets of size: 50, 100, 250, 500, 1000, 2500, 5000
  • Unbalanced

Multiple Users (Real & Unbalanced)

  • Comprises ≈1000 real notifications
  • Unbalanced
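
One way to build balanced sets of the listed sizes is to sample the same number of opened and dismissed notifications for each set. The sketch below is an assumption about how such sets could be drawn, not a description of the authors' procedure; the `balanced_subset` helper and the made-up pool of 6000 records are hypothetical.

```python
import random

def balanced_subset(records, size, rng):
    """Draw `size` records with an equal number of 'open' and 'dismiss' labels."""
    per_class = size // 2
    opened = [r for r in records if r[1] == "open"]
    dismissed = [r for r in records if r[1] == "dismiss"]
    if min(len(opened), len(dismissed)) < per_class:
        raise ValueError("not enough records of one class for a balanced subset")
    subset = rng.sample(opened, per_class) + rng.sample(dismissed, per_class)
    rng.shuffle(subset)  # avoid all 'open' records coming first
    return subset

rng = random.Random(42)
# Made-up pool: 3000 opened and 3000 dismissed notifications, features reduced to an id
pool = [((i,), "open") for i in range(3000)] + [((i,), "dismiss") for i in range(3000)]
sizes = [50, 100, 250, 500, 1000, 2500, 5000]
subsets = {n: balanced_subset(pool, n, rng) for n in sizes}
```

For the unbalanced variants, a plain `rng.sample(pool, n)` would keep whatever class ratio the user's log happens to have.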

SLIDE 22

Experiment Implementation

❖ Evaluating agents' ability to correctly predict the user action of opening or dismissing a notification
❖ Feature set: { app, category, time-of-day, day-of-week }
❖ Evaluate with 10-fold cross validation
❖ Accuracy
❖ Precision – important when the cost of a false positive is high, e.g. agent predicts user wants to see it, delivers -> they end up dismissing it
❖ Recall – important when the cost of a false negative is high, e.g. agent predicts user doesn’t need to see it, caches it -> they miss an important message
❖ F1
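
The evaluation protocol above can be sketched as follows. This is a minimal pure-Python illustration, assuming "open" is the positive class and using made-up labels; the helper names `k_fold_indices` and `classification_metrics` are hypothetical, and no shuffling is applied before splitting.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def classification_metrics(y_true, y_pred, positive="open"):
    """Accuracy, precision, recall and F1 with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # delivered, then dismissed
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # cached, but was wanted
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

# Made-up user actions and agent predictions, for illustration only
y_true = ["open", "open", "dismiss", "dismiss", "open"]
y_pred = ["open", "dismiss", "dismiss", "open", "open"]
metrics = classification_metrics(y_true, y_pred)
folds = list(k_fold_indices(25, k=10))
```

In a full run, the agent would be trained on each `train_idx` split, scored on the matching `test_idx` split, and the four metrics averaged over the ten folds.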

SLIDE 23

Results: Q-learning on synthetic data

SLIDE 24

Results: Q-learning on synthetic data

SLIDE 25

Results: Q-learning on synthetic data

SLIDE 26

Results: Q-learning on real data

SLIDE 27

Results: Q-learning on real data

SLIDE 28

Results: State Space Impact

Synthetic Data
Real Data

SLIDE 29

Applied Research: Observed User Problem

SLIDE 30

Applied Research: Q-learning Solution

SLIDE 31

Results: DQN on synthetic data

SLIDE 32

Results: DQN on synthetic data

SLIDE 33

Results: DQN on synthetic data

SLIDE 34

Results: DQN on real data

SLIDE 35

Results: DQN on real data

SLIDE 36

Applied Research: Observed User Problem

SLIDE 37

Applied Research: DQN Solution

SLIDE 38

Results: Feature importance

SLIDE 39

Results: Train on Synthetic, Test on Real

SLIDE 40

Results: Train on Synthetic, Test on Real

SLIDE 41

Results: Multiple Users

SLIDE 42

Results: Multiple Users

SLIDE 43

Limitations & Future Work

Limitations

  • Small set of users
  • Restricted set of features e.g. ticker text not used
  • Fixed hyper-parameters

Future Work

  • Generative modeling applied to text
  • Exploring other RL algorithms e.g. HER, IMPALA
  • Larger user study

SLIDE 44

Future Work – Conditional Ticker Text Generation

SLIDE 45

Future Work – Autonomous Personalised Notifications

SLIDE 46

Future Work – Autonomous Personalised Notifications

SLIDE 47

Future Work – Autonomous Personalised Notifications

SLIDE 48

Conclusion

❖ Shareable notification data set
❖ OpenAI Gym environment for training on notifications
❖ Two methods of RL applied to notification management
❖ Evaluations illustrate agents achieve comparable performance to SOTA

SLIDE 49

EvalUMAP

http://evalumap.adaptcentre.ie/

SLIDE 50

Thank you. Questions?

Demo: https://review2019.github.io
Email: kieran.fraser@adaptcentre.ie