A Reinforcement Learning and Synthetic Data Approach to Mobile Notification Management
Rowan Sutton, Kieran Fraser, Owen Conlan
ADAPT Centre, Trinity College Dublin
The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.

Content www.adaptcentre.ie
❖ Motivation
❖ Research Design
❖ Experiment Implementation
❖ Results
❖ Limitations & Future Work
❖ Conclusion

Motivation: Anecdotal

Motivation: SOTA
❖ Growing number of notifications pushed at users (Pielot, M. et al, 2014).
❖ Notification delivery not smart (Mehrotra, A. et al, 2016).
❖ Unnecessary notifications may dramatically decrease user productivity (Iqbal, S. T. et al, 2010).
❖ Large no. of incoming notifications = negative emotions (Sahami Shirazi, A. et al, 2014).

Motivation: Observed Problem

Research Design: Gathering Data
WeAreUs Android App
❖ Experience Sampling Method
❖ Moments of notification interest, moments of phone usage interest
❖ Anonymised & Synthesised

Research Design: Gathering Data
❖ 15 participants over 3 months
❖ 31,329 notifications logged
❖ 291 questionnaire responses
❖ 4,940 smartphone usage logs

Research Design: Data Analysis

Research Design: Synthesising Data

Research Design: Synthesising Data
Train on Real, Test on Synthetic [1]
RMSE F1 scores differ in range 0.02 – 0.07, indicating that the synthetic data imitates real-world data.
[1] Esteban, C., Hyland, S.L., Ratsch, G.: Real-valued (medical) time series generation

Research Design: Reinforcement Learning
OpenAI Gym
❖ Open source toolkit for “developing and comparing reinforcement learning algorithms” [1]
Gym-Push
❖ Custom OpenAI Gym environment simulating push-notification overload on mobile device users
[1] https://gym.openai.com/

Research Design: Reinforcement Learning
Gym-Push
❖ Custom OpenAI Gym environment simulating push-notification overload on mobile device users
❖ State: Context + Notification Features
❖ Action: Open / Dismiss the notification
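The state/action framing above can be sketched as a small Gym-style environment. This is not the Gym-Push code: the class name `ToyPushEnv`, the `Notification` fields, and the +1/-1 reward scheme are assumptions made for illustration, and the sketch deliberately avoids importing Gym so it runs with the standard library only.

```python
import random
from dataclasses import dataclass

OPEN, DISMISS = 1, 0  # the two actions named on the slide


@dataclass(frozen=True)
class Notification:
    # Hypothetical record layout; field names are illustrative only.
    app: str
    category: str
    time_of_day: str
    day_of_week: str
    opened: bool  # the action the user actually took


class ToyPushEnv:
    """Hypothetical stand-in with a Gym-like reset/step interface."""

    def __init__(self, notifications, seed=0):
        self.notifications = list(notifications)
        self.rng = random.Random(seed)
        self.index = 0

    def _state(self):
        # State = context + notification features, as a hashable tuple.
        n = self.notifications[self.index]
        return (n.app, n.category, n.time_of_day, n.day_of_week)

    def reset(self):
        # Start a new episode over a shuffled copy of the log.
        self.rng.shuffle(self.notifications)
        self.index = 0
        return self._state()

    def step(self, action):
        # Assumed reward: +1 if the agent matches the user's action, else -1.
        n = self.notifications[self.index]
        truth = OPEN if n.opened else DISMISS
        reward = 1 if action == truth else -1
        self.index += 1
        done = self.index >= len(self.notifications)
        next_state = None if done else self._state()
        return next_state, reward, done, {}
```

An agent would call `reset()` once per episode and then `step(action)` until `done` is true, accumulating the reward.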

Experiment Implementation
Q-learning Agent
• Learn a policy to maximise total reward
• Create q-table to track quality of state->action pairs
• Updates q-values according to Watkins one-step Q-learning algorithm (1)
• Can explore or exploit (ε)
Deep Q-learning Agent
• Replaces the q-table with a DNN
• Takes the state as input and output is an action
• Weights optimised based on the Huber Loss function (2)
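The two agents above rest on two standard formulas: the one-step Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ max Q(s',·) − Q(s,a)], and the Huber loss. A minimal sketch of both follows, assuming a two-action problem; the function names and the default values of `alpha`, `gamma`, and `delta` are illustrative and are not taken from the slides.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = dismiss, 1 = open


def make_q_table():
    # Unseen states start with a q-value of 0.0 for every action.
    return defaultdict(lambda: [0.0 for _ in ACTIONS])


def choose_action(q_table, state, epsilon, rng):
    # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q_table[state]
    return max(ACTIONS, key=lambda a: values[a])


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # One-step Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a').
    # A next_state of None marks the end of an episode (no bootstrap term).
    best_next = 0.0 if next_state is None else max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])


def huber_loss(error, delta=1.0):
    # Quadratic for small errors, linear for large ones.
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * error * error
    return delta * (abs_err - 0.5 * delta)
```

In a deep Q-learning agent the table lookup is replaced by a network forward pass, and `error` would be the difference between the network's q-value and the same `td_target`.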

Experiment Implementation
Individual User (Synth & Balanced)
• Comprised of ≈ 6000 synthetic notifications
• Split into sets of size: 50, 100, 250, 500, 1000, 2500, 5000
• Balanced
Individual User (Real & Balanced)
• Comprised of ≈ 6000 real notifications
• Split into sets of size: 50, 100, 250, 500, 1000, 2500, 5000
• Balanced
Individual User (Real & Unbalanced)
• Comprised of ≈ 6000 real notifications
• Split into sets of size: 50, 100, 250, 500, 1000, 2500, 5000
• Unbalanced
Multiple Users (Real & Unbalanced)
• Comprised of ≈ 1000 real notifications
• Unbalanced
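The slides do not say how the balanced sets were built. One plausible approach, shown here only as an assumption, is to draw an equal number of opened and dismissed notifications for each set size. The helper name `balanced_subset` and the `label_of` callback are hypothetical.

```python
import random

# Set sizes listed on the slide.
SET_SIZES = (50, 100, 250, 500, 1000, 2500, 5000)


def balanced_subset(records, size, label_of, rng):
    """Sample `size` records with (near) equal opened/dismissed counts.

    `label_of(record)` returns True for an opened notification.
    Assumed procedure: random downsampling without replacement.
    """
    opened = [r for r in records if label_of(r)]
    dismissed = [r for r in records if not label_of(r)]
    n_open = size // 2
    n_dismiss = size - n_open
    if len(opened) < n_open or len(dismissed) < n_dismiss:
        raise ValueError("not enough records in one class for this size")
    subset = rng.sample(opened, n_open) + rng.sample(dismissed, n_dismiss)
    rng.shuffle(subset)  # avoid class-ordered data
    return subset
```

For example, `balanced_subset(log, 50, lambda r: r["opened"], random.Random(0))` would return 25 opened and 25 dismissed records from a list of dictionaries named `log`.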

Experiment Implementation
❖ Evaluating the agent's ability to correctly predict the user action of open/dismiss notification
❖ Feature set: { app, category, time-of-day, day-of-week }
❖ Evaluate with 10-fold cross validation
❖ Accuracy
❖ Precision: important when the cost of a false positive is high, e.g. agent predicts user wants to see it, delivers -> they end up dismissing it
❖ Recall: important when the cost of a false negative is high, e.g. agent predicts user doesn’t need to see it, caches it -> they miss an important message
❖ F1
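The metrics listed above follow directly from the confusion counts, with "open" treated as the positive class. The sketch below computes them and generates 10-fold index splits using only the standard library; the function names are illustrative and this is not the authors' evaluation code.

```python
def classification_scores(y_true, y_pred, positive=1):
    # Count true/false positives and negatives for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of the notifications delivered, how many were opened.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the notifications the user opened, how many were delivered.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples, k=10):
    # Yield (train_indices, test_indices) for k contiguous folds.
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

In practice the data would be shuffled before splitting, and the four scores would be averaged over the ten folds.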

Results: Q-learning on synthetic data

Results: Q-learning on real data

Results: State Space Impact (panels: Synthetic Data, Real Data)

Applied Research: Observed User Problem

Applied Research: Q-learning Solution

Results: DQN on synthetic data

Results: DQN on real data

Applied Research: Observed User Problem

Applied Research: DQN Solution

Results: Feature importance

Results: Train on Synthetic, Test on Real

Results: Multiple Users

Limitations & Future Work
Limitations
• Small set of users
• Restricted set of features e.g. ticker text not used
• Fixed hyper-parameters
Future Work
• Generative modeling applied to text
• Exploring other RL algorithms e.g. HER, IMPALA
• Larger user study

Future Work: Conditional Ticker Text Generation

Future Work: Autonomous Personalised Notifications

Conclusion
❖ OpenAI Gym environment for training on notifications
❖ Shareable notification data set
❖ Two methods of RL applied to notification management
❖ Evaluations illustrate agents achieve comparable performance to SOTA

EvalUMAP: http://evalumap.adaptcentre.ie/

Thank you. Questions?
Demo: https://review2019.github.io
Email: kieran.fraser@adaptcentre.ie
