Differentially-Private Federated Linear Bandits (PowerPoint PPT Presentation)



SLIDE 1

Differentially-Private Federated Linear Bandits, Dubey and Pentland, June 2020

Outline:

Introduction: Federated Learning, Contextual Bandits, Summary

Background: Contextual Bandits, Federated Bandits, Optimism, Cooperation, Differential Privacy

Method: Algorithm Design, Algorithm, Regret Guarantees

Conclusion

Differentially-Private Federated Linear Bandits

Abhimanyu Dubey and Alex Pentland

Media Lab and Institute for Data Systems and Society (IDSS) Massachusetts Institute of Technology dubeya@mit.edu

June 2020

SLIDE 2

Federated Learning

Figure: Federated Learning (courtesy blogs.nvidia.com).

SLIDE 3

Federated Learning

Advantages:
◮ Agents have small personal datasets, resulting in weak local models.
◮ The federated learning model allows each agent to leverage the stronger joint model trained on data from all agents.
◮ Federated learning is designed to be private:
  ◮ No raw data leaves any agent.
  ◮ All messages sent to the server must keep user data private.

Challenges:
◮ Communication-utility tradeoff: frequent communication can be expensive and non-private, but grants higher utility.
◮ Performance guarantees are non-trivial to obtain for private algorithms.

SLIDE 4

Multi-Armed Bandits

Figure: Multi-armed bandit (courtesy lilianweng.github.io).

SLIDE 5

Contextual Bandits

◮ The most fundamental reinforcement learning problem and a basic framework for studying sequential decision-making.
◮ Contextual bandits have numerous applications:
  ◮ Recommender systems in e-commerce.
  ◮ Portfolio selection and management.
  ◮ Channel selection in distributed communication systems.
  ◮ Information retrieval and caching.
  ◮ Power schedules for current limiting in electric vehicle batteries.

SLIDE 6

Summary of Contributions

◮ We study the contextual bandit in a differentially-private federated setting.
◮ We provide the first differentially-private algorithms for both centralized and decentralized federated learning for the multi-agent contextual bandit.
◮ We prove rigorous bounds on the utility of our algorithms, matching near-optimal rates in terms of regret (utility) and lying only a factor of O(√(1/ε)) from the optimal rate in terms of privacy.
◮ We additionally shed some light on the communication-utility tradeoff, and provide design guidelines for practitioners in real-world settings.

SLIDE 7

Single-Agent Contextual Bandits

◮ In each round t, the agent is given a decision set Dt.
◮ They select an action xt ∈ Dt and obtain a reward yt such that yt = (θ∗)⊤xt + εt, where εt is i.i.d. noise and θ∗ is an unknown (but fixed) parameter vector.
◮ The objective of the problem is to minimize regret:

  R(T) = Σ_{t=1}^{T} [ (θ∗)⊤x∗_t − (θ∗)⊤xt ],  where x∗_t = arg max_{x∈Dt} x⊤θ∗.
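The regret definition above can be sketched directly in code. Everything below (the parameter theta_star, the three-action decision set, the random policy) is a hypothetical toy instance, not from the paper; it only illustrates that a policy which keeps picking suboptimal actions accumulates regret.

```python
import random

# Hypothetical toy instance with d = 2 and a decision set that is the
# same every round (an assumption made only to keep the sketch short).
theta_star = [0.8, 0.6]
decision_set = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def regret_of_actions(actions, T):
    """Cumulative regret R(T) = sum over t of <theta*, x*_t> - <theta*, x_t>."""
    best = max(dot(theta_star, x) for x in decision_set)  # value of x*_t
    return sum(best - dot(theta_star, x) for x in actions[:T])

random.seed(0)
# A uniformly random policy: each suboptimal pick adds a constant gap,
# so its regret grows linearly in T.
actions = [random.choice(decision_set) for _ in range(1000)]
regret = regret_of_actions(actions, 1000)
```

Playing the optimal action every round gives zero regret, which is a quick sanity check on the definition.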

SLIDE 8

Federated Contextual Bandits

◮ M agents are each solving the same contextual bandit in parallel.
◮ Each agent m ∈ [M] receives their own (unique) decision sets, and selects actions independently of other agents.
◮ Agents communicate with each other following fixed protocols:
  ◮ Centralized Setting: Agents synchronize via a central server, i.e., they send synchronization requests to the server, and the server acts as an intermediary.
  ◮ Decentralized Setting: Agents directly communicate with each other over an undirected network via peer-to-peer messages.
◮ The objective of the problem is to minimize group regret:

  RM(T) = Σ_{m∈[M]} Σ_{t=1}^{T} [ (θ∗)⊤x∗_{m,t} − (θ∗)⊤x_{m,t} ],  where x∗_{m,t} = arg max_{x∈D_{m,t}} x⊤θ∗.

SLIDE 9

The Upper Confidence Bound (UCB) Algorithm

◮ “Optimism in the face of uncertainty” strategy, i.e., be optimistic about an arm when we are uncertain of its utility.
◮ In the multi-armed setting, for each arm k we compute:

  UCBk(t) = ( Σ_{i=1}^{nk(t−1)} r_k^i ) / nk(t−1)  [empirical mean]  +  √( 2 ln(t−1) / nk(t−1) )  [exploration bonus].

◮ Choose the arm with the largest UCBk(t).
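A minimal sketch of this index, assuming rewards are stored per arm (the reward histories and round number below are made up for illustration):

```python
import math

def ucb_index(rewards_for_arm, t):
    """UCB_k(t): empirical mean of arm k plus its exploration bonus.
    rewards_for_arm holds the n_k(t-1) rewards observed from this arm so far."""
    n_k = len(rewards_for_arm)
    if n_k == 0:
        return float("inf")  # unexplored arms are maximally optimistic
    empirical_mean = sum(rewards_for_arm) / n_k
    exploration_bonus = math.sqrt(2 * math.log(t - 1) / n_k)
    return empirical_mean + exploration_bonus

# Pick the arm maximizing the index over two hypothetical reward histories.
history = {0: [1.0, 0.0, 1.0], 1: [0.0, 0.0]}
t = 6  # current round
chosen = max(history, key=lambda k: ucb_index(history[k], t))
```

Note how the bonus shrinks as an arm is pulled more often, so well-explored arms are judged mostly by their empirical mean.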

SLIDE 10

The Upper Confidence Bound (UCB) Algorithm

◮ In the contextual bandit case, we construct an analog to the UCB in the form of a confidence set Et.
◮ Et is a region of Rd that contains θ∗ with high probability.
◮ The action is taken optimistically with respect to Et, i.e.,

  xt = arg max_{x∈Dt} max_{θ∈Et} ⟨x, θ⟩.
SLIDE 11

The Upper Confidence Bound (UCB) Algorithm

◮ How do we construct a reasonable Et?
◮ We look to the classic linear prediction problem: linear regression. Given X<t = [x1 x2 ... x_{t−1}]⊤ and y<t = [y1 y2 ... y_{t−1}]⊤, consider:

  θ̂t := arg min_{θ∈Rd} ‖X<t θ − y<t‖₂² + θ⊤Htθ.

◮ The regression solution is given by θ̂t := (Gt + Ht)⁻¹ X<t⊤ y<t, where Gt = X<t⊤ X<t is the Gram matrix of actions, and Ht is a regularizer.
◮ Since we know the finite-sample behavior of linear regression, we can center Et around the estimate θ̂t to obtain a reasonable algorithm.
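The closed-form estimate θ̂t = (Gt + Ht)⁻¹ X<t⊤ y<t can be sketched directly. This toy version fixes d = 2 and takes Ht = λI (a standard ridge choice, assumed here only for concreteness); the data is noise-free so the estimate visibly approaches the true parameter:

```python
def gram(X):
    """G = X^T X for a list of 2-dimensional row vectors."""
    return [[sum(x[i] * x[j] for x in X) for j in range(2)] for i in range(2)]

def ridge_estimate(X, y, lam=1.0):
    """theta_hat = (G + lam*I)^{-1} X^T y via an explicit 2x2 inverse."""
    G = gram(X)
    V = [[G[0][0] + lam, G[0][1]], [G[1][0], G[1][1] + lam]]  # G + H
    b = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(2)]  # X^T y
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    inv = [[V[1][1] / det, -V[0][1] / det], [-V[1][0] / det, V[0][0] / det]]
    return [inv[i][0] * b[0] + inv[i][1] * b[1] for i in range(2)]

# Noise-free rewards from theta* = (2, -1): the regularizer shrinks the
# estimate toward 0, but it approaches theta* as more samples arrive.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]] * 50
y = [2.0 * a + (-1.0) * b for a, b in X]
theta_hat = ridge_estimate(X, y)
```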

SLIDE 12

The Upper Confidence Bound (UCB) Algorithm

◮ We can therefore set Et as follows (for some fixed βt):

  Et := { θ ∈ Rd : (θ − θ̂t)⊤ (Gt + Ht) (θ − θ̂t) ≤ βt }.

◮ Et is an ellipsoid centered at θ̂t, and βt determines its “radius”.
◮ The UCB can be given as

  UCBt(x) = ⟨θ̂t, x⟩ + βt √( x⊤ (Gt + Ht)⁻¹ x ).
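The optimistic score can be sketched as follows. The matrix, parameter estimate, βt, and decision set below are hand-picked illustrative values, not from the paper; the diagonal (Gt + Ht)⁻¹ is chosen so that one direction is clearly less explored than the other:

```python
import math

def ucb_score(x, theta_hat, V_inv, beta):
    """UCB_t(x) = <theta_hat, x> + beta * sqrt(x^T (G_t + H_t)^{-1} x)."""
    mean = sum(th * xi for th, xi in zip(theta_hat, x))
    quad = sum(x[i] * sum(V_inv[i][j] * x[j] for j in range(2)) for i in range(2))
    return mean + beta * math.sqrt(quad)

theta_hat = [0.9, 0.2]
V_inv = [[0.5, 0.0], [0.0, 0.05]]  # inverse of diag(2, 20): direction 2 is well-explored
beta = 1.0

# The optimistic action maximizes the score over the decision set.
decisions = [[1.0, 0.0], [0.0, 1.0]]
best = max(decisions, key=lambda x: ucb_score(x, theta_hat, V_inv, beta))
```

The uncertainty term √(x⊤(Gt + Ht)⁻¹x) is exactly the width of the ellipsoid Et in direction x, which is why maximizing the score is equivalent to being optimistic over Et.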
SLIDE 13

Fundamentals of Cooperation

◮ For n samples (x1, y1), ..., (xn, yn), the error in the linear regression estimate decreases at the rate O(√(1/n)).
◮ At any time t, each agent has t local samples with which to make their linear regression estimate, i.e., the error rate is O(√(1/t)).
◮ However, there are M agents in total; therefore, if the agents share all their observations with each other, they can achieve a rate of O(√(1/(Mt))).
◮ This can be achieved if agents synchronize every round, but privacy and computational constraints make this infeasible.
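The O(√(1/n)) scaling is easy to observe numerically. The snippet below uses scalar mean estimation as an illustrative stand-in for the regression setting (all constants are arbitrary): pooling M agents' samples visibly shrinks the average error.

```python
import random

# Toy check of the O(sqrt(1/n)) rate: estimate theta* = 0.5 from noisy
# samples y_i = theta* + noise, averaging the absolute error over trials.
random.seed(1)
theta_star = 0.5

def mean_abs_error(n, trials=100):
    """Average |estimate - theta*| over several independent trials."""
    total = 0.0
    for _ in range(trials):
        samples = [theta_star + random.gauss(0.0, 1.0) for _ in range(n)]
        total += abs(sum(samples) / n - theta_star)
    return total / trials

t, M = 100, 9
err_local = mean_abs_error(t)       # one agent working alone: t samples
err_pooled = mean_abs_error(M * t)  # all M agents pooling: M*t samples
```

With M = 9 the pooled error should come out roughly 3 times smaller, matching the √M improvement.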

SLIDE 14

(ε, δ)-Differential Privacy

◮ Differential privacy is the widely-accepted standard for statistical privacy.
◮ Let D and D′ be two datasets that differ in one entry, and let A be an algorithm whose outputs lie in a set S.
◮ A is (ε, δ)-differentially private with respect to its inputs if, for any subset S′ of S and any such pair of datasets D, D′, we have:

  P(A(D) ∈ S′) ≤ e^ε · P(A(D′) ∈ S′) + δ.

◮ Basically, the likelihood of A producing any specific output should not fluctuate by more than a factor of e^ε ≈ (1 + ε) due to the presence of any specific entry in its input.
◮ If the dataset D contains data from different users, this implies that A is sufficiently insensitive to the presence of any specific user's data.
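One standard way to achieve (ε, δ)-differential privacy for a real-valued statistic is the Gaussian mechanism. The paper privatizes the Gram matrix and bias vector with a tree-based Gaussian-noise scheme; the scalar sketch below only illustrates the basic noise calibration, with all concrete numbers chosen for illustration:

```python
import math
import random

def gaussian_mechanism(value, sens, eps, delta):
    """Add N(0, sigma^2) noise calibrated to L2-sensitivity `sens`, using the
    standard sigma = sens * sqrt(2 ln(1.25/delta)) / eps calibration."""
    sigma = sens * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    return value + random.gauss(0.0, sigma)

random.seed(0)
true_sum = 42.0  # e.g. a sum of rewards, each bounded in [0, 1], so sensitivity 1
private_sum = gaussian_mechanism(true_sum, sens=1.0, eps=1.0, delta=1e-5)
```

Smaller ε (stronger privacy) forces larger σ, which is exactly the privacy-utility tension that shows up as the 1/ε factor in the regret bounds later.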

SLIDE 15

Algorithm Design

◮ The algorithm for any agent depends entirely on the regularized Gram matrix Vt = Gt + Ht and the bias vector ut = X<t⊤ y<t = Σ_{τ<t} xτ · yτ.
◮ The cooperative versions of these parameters enjoy linearity, i.e., they can be given by Vt = Σ_{m∈[M]} Vm,t and ut = Σ_{m∈[M]} um,t.
◮ To achieve cooperation, the server can, at each round, simply sum up the individual parameters to obtain Vt and ut. There are two major issues with this, however:
  ◮ O(T) communication complexity.
  ◮ The algorithm is not differentially private.

SLIDE 16

Algorithm Design

◮ However, if each agent m ∈ [M] constructs alternate Ṽm,t and ũm,t that are (ε, δ)-differentially private, then we can construct estimates Ṽt = Σ_{m∈[M]} Ṽm,t and ũt = Σ_{m∈[M]} ũm,t that are also private.
◮ Moreover, we can reduce synchronization to O(log T) rounds by carefully selecting the synchronization rounds, without degrading the regret too much.

SLIDE 17

Algorithm

◮ Each agent m ∈ [M] takes actions using their local Vm,t and um,t at each round, and obtains xm,t and ym,t.
◮ If log det(Vm,t + xm,t xm,t⊤) ≥ Dt for some sequence Dt, then agent m sends a synchronization request to the central server.
◮ When the server receives a synchronization request, it obtains the privatized Ṽm,t and ũm,t from each agent, sums them up to obtain Ṽt and ũt, and transmits these back to each agent.
◮ Each agent then uses the new (privatized) Gram matrix and bias parameters.
◮ By carefully selecting Dt, we can obtain the desired level of privacy and communication.
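The trigger-and-reset loop above can be sketched for a single agent in d = 2. The constant threshold, action sequence, and identity reset are all arbitrary illustrative choices (a real run would use the paper's sequence Dt and receive the pooled, privatized parameters from the server):

```python
import math

def logdet2(V):
    """log det of a 2x2 matrix."""
    return math.log(V[0][0] * V[1][1] - V[0][1] * V[1][0])

def updated(V, x):
    """Rank-one update V + x x^T for a 2x2 matrix and 2-vector."""
    return [[V[i][j] + x[i] * x[j] for j in range(2)] for i in range(2)]

def should_sync(V_local, x, threshold):
    """The slide's trigger: sync when log det(V + x x^T) crosses a threshold."""
    return logdet2(updated(V_local, x)) >= threshold

V = [[1.0, 0.0], [0.0, 1.0]]  # fresh regularized Gram matrix
syncs = 0
# Repeated plays of the same direction grow det(V) until a sync triggers.
for _ in range(10):
    x = [1.0, 0.0]
    if should_sync(V, x, threshold=math.log(8.0)):
        syncs += 1
        V = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for receiving the pooled, privatized V
    V = updated(V, x)
```

Because det(V) grows multiplicatively with information gain, a log-det trigger fires only O(log T) times, which is where the communication bound comes from.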

SLIDE 18

Regret Guarantees

FedUCB obtains O(d^{3/4} √(MT/ε)) regret in the centralized setting and O(d^{3/4} √((χ̄(Gγ) · γ) MT/ε)) regret in the decentralized setting.
◮ ε controls the level of privacy, i.e., for a perfectly private system, ε → 0, leading to O(T) regret. A lower bound exists which implies that such a dependency is inevitable.
◮ In the non-private centralized case, our algorithm obtains O(√(MT)) regret, which matches the optimal rate.
◮ In the decentralized case, our bound depends on the clique number χ̄(Gγ) of the γ-power of the communication graph G:
  ◮ If diam(G) ≤ γ, this factor is 1 (since there is only one clique).
  ◮ If G is a line graph (the worst case), this factor is O(M/γ).

SLIDE 19

Conclusion

We provide new results on federated bandit learning under differential privacy. Future research directions include:
◮ Our analysis is still sub-optimal in the decentralized setting owing to a subsampling argument. Future work can use an alternative technique to obtain sharper rates.
◮ The current setting assumes full cooperation (i.e., all agents have the same bandit problem); this can be generalized to non-identical bandit problems (currently in preparation).
◮ The far more ubiquitous problem is supervised learning, and extending our results to that setting is a natural next step (currently in preparation).

SLIDE 20

Thank You!