Privacy Preserving Bandits Joint work with: Mohammad Malekzadeh - PowerPoint PPT Presentation

Privacy Preserving Bandits
Joint work with:
● Mohammad Malekzadeh (QMUL/Brave)
● Hamed Haddadi (ICL/Brave)
● Ben Livshits (ICL/Brave)
● Dimitrios Athanasakis (Brave)
02.03.2020 @dimmu

Why this is an important topic
Personalization is ubiquitous
● Many sites/apps offer personalized experiences
● Advertising (arguably the single biggest application of personalization) fuels the internet.
Personalization is often invasive
● Tracking all over the internet
● Why is my being a fan of My Little Pony relevant to the pricing of my plane tickets?
● Some info gets REALLY personal

Real-time Ad bidding
Image source: The Economist, "Big tech faces competition and privacy concerns in Brussels"
https://www.economist.com/briefing/2019/03/23/big-tech-faces-competition-and-privacy-concerns-in-brussels

Let’s learn everything locally
Great for privacy
● No data ever leaves the user’s device, therefore fewer things to worry about from a privacy perspective.
● Eventually the local model will learn a very accurate recommendation policy for the user.
Not so good for utility
● It may take a long time for the local model to learn a useful recommendation policy
● What happens when new personalization options appear?

Online advertising and bandits
Earning
● Given what we know about the user, how can we maximise their engagement?
● Should we display an ad for product X to user Y?
Learning
● What are the user’s interests?
● Have the interests of the user changed?

Problem Definition
[Diagram: State(t) → Action(t), with K actions A_1 : P_1, A_2 : P_2, ..., A_K : P_K]
● Complexity? K actions, D context dimensions
● data tuple = ( S = [S_0, S_1, …, S_D], A ∈ {1, 2, ..., K}, R ∈ {0, 1} )
● Privacy first!

State? What state?
● “brave://histograms”
● Example:
○ Past 100 page visits? (%)

Research Question
● How can we enable an agent to know its user faster and better?
● Choose the best CBA
● Warm start, instead of Cold!

Slight Problem
How can we use user data to initialize a warm model without violating a user’s privacy?

Can you recognize yourself by your own data?
[Figure: vanilla model inversion vs. model inversion on noised data]

Can we quantify privacy?
● Crowd-blending (Gehrke et al 2011)
● Differential Privacy (Dwork & Roth 2013)

Our approach: ESA + LinUCB

State Space
● Histograms
○ D-dimensional vector of real numbers
○ Its sum is 1
○ It’s rounded to F decimal places
● 10^F stars into D bars, e.g. if we set D=10:
○ with F=1 we have ~100K possible states
○ with F=2 it is ~4T
Number of possible states is too large
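The state counts on this slide follow from a stars-and-bars argument: a histogram rounded to F decimal places distributes 10^F units over D bins, which gives C(10^F + D - 1, D - 1) possibilities. A minimal sketch of that count (the function name is ours, not from the talk):

```python
from math import comb


def num_histogram_states(d: int, f: int) -> int:
    """Count D-bin histograms whose entries are multiples of 10**-F and sum to 1.

    Stars and bars: place 10**F indistinguishable units into D bins.
    """
    stars = 10 ** f
    return comb(stars + d - 1, d - 1)


if __name__ == "__main__":
    for d, f in [(3, 1), (10, 1), (10, 2)]:
        print(f"D={d}, F={f}: {num_histogram_states(d, f):,} states")
```

For D=3, F=1 this gives the 66 states used in the encoding example, and for D=10 it gives roughly 100K (F=1) and roughly 4T (F=2).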

Encoding
[Figure: example states and their clusters; marker size shows the value]
● e.g. D=3, F=1
● 66 possible states
● 6 clusters
○ Locality-sensitive hashing
● 3 bits
This helps increase the size of the crowd a user can blend into.
E.g. D=10 → 10 bits: 4T → 1K
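One way to picture the encoding step is random-hyperplane (SimHash-style) locality-sensitive hashing: each bit records which side of a random hyperplane the histogram falls on, so nearby histograms tend to share a code. This is only an illustrative sketch. The slide names locality-sensitive hashing but not a specific scheme, so the hyperplane construction, the centring on the uniform histogram, and all function names below are assumptions.

```python
import random


def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> list[list[float]]:
    """Draw n_bits random Gaussian normal vectors, one per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def encode(histogram: list[float], hyperplanes: list[list[float]]) -> int:
    """Map a histogram to an integer code with len(hyperplanes) bits."""
    dim = len(histogram)
    # Assumption: centre on the uniform histogram so hyperplanes through the
    # origin actually split the simplex instead of leaving it on one side.
    centered = [h - 1.0 / dim for h in histogram]
    code = 0
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, centered))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


if __name__ == "__main__":
    planes = make_hyperplanes(dim=3, n_bits=3, seed=1)
    print(encode([0.5, 0.3, 0.2], planes))
    print(encode([0.5, 0.2, 0.3], planes))
```

With n_bits bits there are at most 2^n_bits codes, which is how a D=10 context with about 4T raw states collapses to about 1K codes at 10 bits.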

Shuffling
● Anonymization: remove metadata (e.g. IP address) received from local agents
● Shuffling: gather tuples received from different sources into batches and shuffle their order.
● Thresholding: remove tuples whose encoded context vector frequency in the batch is less than a defined threshold.
● Yes, that means throwing away potentially useful data for the sake of privacy
● This happens in an SGX secure enclave
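The three shuffler steps can be sketched in a few lines. The report layout (a dict with `meta`, `code`, `action` and `reward` keys) and the function name are hypothetical, and the sketch does not model the secure enclave the real shuffler runs in.

```python
import random
from collections import Counter


def shuffle_and_threshold(reports, threshold, seed=None):
    """Anonymize, shuffle and threshold one batch of agent reports.

    Each report is assumed to be a dict with keys 'meta', 'code',
    'action' and 'reward' (an illustrative layout, not the authors').
    """
    # Anonymization: keep only (code, action, reward); metadata is dropped.
    tuples = [(r["code"], r["action"], r["reward"]) for r in reports]

    # Shuffling: destroy arrival order so tuples cannot be linked to senders.
    rng = random.Random(seed)
    rng.shuffle(tuples)

    # Thresholding: drop tuples whose encoded context is too rare in the batch.
    counts = Counter(code for code, _, _ in tuples)
    return [t for t in tuples if counts[t[0]] >= threshold]


if __name__ == "__main__":
    batch = [
        {"meta": "10.0.0.1", "code": 5, "action": 1, "reward": 1},
        {"meta": "10.0.0.2", "code": 5, "action": 2, "reward": 0},
        {"meta": "10.0.0.3", "code": 5, "action": 1, "reward": 1},
        {"meta": "10.0.0.4", "code": 9, "action": 3, "reward": 0},
    ]
    print(shuffle_and_threshold(batch, threshold=2, seed=0))
```

In this example the single tuple with code 9 is discarded, which is the utility cost the slide mentions.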

Model updates
● Updates are performed using standard LinUCB update rules on the data the shuffler releases.
● Agents can then update their local models according to the globally updated weights
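For reference, a minimal sketch of the standard (disjoint) LinUCB rules for one arm: A ← A + x xᵀ, b ← b + r x, θ = A⁻¹ b, and score = θᵀx + α √(xᵀ A⁻¹ x). To stay dependency-free this version tracks A⁻¹ directly with a Sherman-Morrison rank-one update. That is an implementation choice made here, not something stated in the talk, and the class and function names are hypothetical.

```python
import math


class LinUCBArm:
    """Per-arm LinUCB state: inverse design matrix A_inv and reward vector b."""

    def __init__(self, dim: int):
        # A starts as the identity, so A_inv does too.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.a_inv]

    def ucb(self, x, alpha: float) -> float:
        """Estimated reward plus exploration bonus for context x."""
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, a_inv_x)), 0.0))
        return mean + alpha * width

    def update(self, x, reward: float) -> None:
        """Apply A += x x^T (via Sherman-Morrison on A_inv) and b += reward * x."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def choose_action(arms, x, alpha: float = 1.0) -> int:
    """Pick the arm with the highest upper confidence bound."""
    return max(range(len(arms)), key=lambda k: arms[k].ucb(x, alpha))


if __name__ == "__main__":
    arms = [LinUCBArm(dim=2) for _ in range(3)]
    arms[1].update([1.0, 0.0], reward=1.0)
    print(choose_action(arms, [1.0, 0.0]))
```

In the setting of the talk, the server would apply `update` to each tuple the shuffler releases, and agents would then pull the resulting weights.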

Privacy Model
● Crowd-Blending + Sampling ⇒ Differential Privacy
○ iid random sampling with probability p
[Equation and plot: Ɛ_DP as a function of Ɛ_CB and p]
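A minimal sketch of how the sampling probability translates into a differential privacy budget. It assumes the crowd-blending bound takes the form Ɛ_DP = ln(p · (2 − p)/(1 − p) · e^{Ɛ_CB} + (1 − p)), as attributed to Gehrke et al. That form is an assumption here and should be checked against the paper. The accompanying δ term is omitted, and the function name is hypothetical.

```python
import math


def epsilon_dp(p: float, eps_cb: float = 0.0) -> float:
    """Differential privacy epsilon from sampling probability p and crowd-blending eps_cb.

    Assumed form (to be verified against Gehrke et al.):
        eps_dp = ln( p * (2 - p) / (1 - p) * exp(eps_cb) + (1 - p) )
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p * ((2.0 - p) / (1.0 - p)) * math.exp(eps_cb) + (1.0 - p))


if __name__ == "__main__":
    for p in (0.1, 0.25, 0.5, 0.75):
        print(f"p={p}: eps_dp={epsilon_dp(p):.3f}")
```

Under this assumed form, lower sampling probabilities give a smaller Ɛ_DP, at the cost of fewer tuples reaching the shuffler.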

Evaluation
Algorithm
● Linear UCB
Context
● Histograms
Environment
● Synthetic Datasets
○ Linear and nonlinear randomly initialized mapping functions
■ Input: a histogram
■ Output: a stochastic preference model
● Real Multi-Label Datasets
○ Input: a binary vector (features)
○ Output: a binary vector (labels)
● Criteo Ad Recommendation Dataset
○ Input: Integer values (unknown features)
○ Output: a one-hot vector (product category)
Github: https://github.com/mmalekzadeh/privacy-preserving-bandits

Results: Synthetic Data
● Left: effect of available actions on expected reward for varying numbers of users
● Bottom: effect of the dimensionality of the context on expected reward

Results: Multi-Label Classification
● MediaMill: d=20, |A|=40, ~44,000 instances
● TextMining: d=20, |A|=20, ~28,500 instances

Results: Ad. Recommendation (Criteo)
● k=32
● k=128
|A|=40, d=10, u=3,000 agents

Some Remarks
● The Criteo ad recommendation experiments are somewhat strange but surely interesting
● ESA is making a comeback (ESA Revisited)
● Also SMPC for bandits
● Feel free to play around with the notebooks.
Personal Notes
● Mohammad will be looking for a job soon.
● Pleasantly surprised to see some remote presentations.
Also stickers, again
Github: https://github.com/mmalekzadeh/privacy-preserving-bandits

Let’s keep in touch
1. Poster #15
2. Working on privacy? Let’s talk. Have experience in the adtech ecosystem? We’d like to hear from you.
3. We’re always looking for great engineers: https://brave.com/careers/
Also @dimmu


