Optimal Sequential Resource Sharing and Exchange in Multi-Agent Systems
Yuanzhang Xiao
Advisor: Prof. Mihaela van der Schaar
Electrical Engineering Department, UCLA
Ph.D. defense, March 3, 2014
Sequential resource sharing/exchange in multi-agent systems
– Sequential: agents' current decisions affect the future.
– Multi-agent: multiple agents influence each other.
– New tools and formalisms are needed!
Three classes of resource sharing/exchange problems
[Overview table: the classes differ in how each agent's payoff is coupled with the others' actions (e.g., through constraints), in how actions are monitored, and in the state dynamics.]
First class of problems: resource sharing with strong negative externality
[Roadmap slide: the same overview table, highlighting resource sharing with strong negative externality.]
A general resource sharing scenario:
[Figure: agents 1, ..., N share a resource (wireless spectrum); each agent's action (power level) creates interference for the others and yields an instantaneous payoff (throughput).]
1. Agent i chooses its action (power level).
Long-term payoff: the discounted average of the instantaneous payoffs (throughputs).
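A hedged reconstruction of this long-term payoff in the standard discounted form, assuming \delta is the discount factor used throughout the deck and u_i(a^t) is agent i's instantaneous payoff (throughput) under the action profile a^t at time t:

U_i \;=\; (1-\delta)\sum_{t=0}^{\infty} \delta^{t}\, u_i(a^{t}).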
Design problem:
– maximize a social welfare function,
– subject to minimum payoff guarantees,
– using deviation-proof policies.
Formally, a policy is deviation-proof if, for every agent and every unilateral deviation, the deviating agent's long-term payoff does not increase.
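A sketch of this design problem, under the assumption that W denotes the social welfare function, \gamma_i agent i's minimum payoff guarantee, and U_i(\pi) agent i's long-term payoff under policy \pi (the exact notation on the slide is not preserved):

\max_{\pi \ \text{deviation-proof}} \; W\big(U_1(\pi),\dots,U_N(\pi)\big) \quad \text{s.t.} \quad U_i(\pi) \ge \gamma_i \ \ \forall i,

where deviation-proofness means U_i(\pi_i, \pi_{-i}) \ge U_i(\pi_i', \pi_{-i}) for every agent i and every unilateral deviation \pi_i'.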
Resource sharing with strong negative externalities
[Figure: payoff region in the (agent 1's payoff, agent 2's payoff) plane, comparing constant resource usage levels with time-varying resource usage levels.]
Example application domains:
– Communication networks
– Residential demand-side management, etc.
Comparison with Network Utility Maximization (F. Kelly, M. Chiang, S. Low, etc.):
– NUM: assumes jointly concave utilities and solves for the optimal action (a static operating point); with strong externalities the problem is not jointly concave in general, so the resulting operating points are inefficient.
– Our work: solves for the optimal policy.
Comparison with Markov decision processes (D. Bertsekas, J. Tsitsiklis, E. Altman, etc.):
– MDPs: a single decision maker.
– Our work: multiple self-interested agents.
Comparison with existing repeated-game theory (Fudenberg, Levine, Maskin 1994):
– Existing theory: not constructive; requires signals proportional to the cardinality of the action sets; high overhead.
– Our work: constructive; the number of signals does not grow with the cardinality of the action sets (exploits the strong externality); low overhead.
Why not simply use round-robin TDMA to achieve the Pareto boundary?
– Because of discounting (impatience, delay-sensitivity).
[Figure: payoff region in the (agent 1's payoff, agent 2's payoff) plane.]
A simple example abstracted from wireless communication.
Round-robin TDMA policies (and variants):
– e.g., a cycle length of 8 with schedule 12332333 achieves 0.29 (a 13% loss).
Longer cycles to approach the optimal policy?
Longer cycles to approach the optimal nonstationary policy?
– The number of non-trivial cyclic policies (each user has at least one slot) grows exponentially with the number of users: it is lower bounded by N^(L-N) (N: number of users, L: cycle length).
– In the 3-user example, achieving within ~10% of the optimal nonstationary policy requires a cycle length of 8, i.e., 5796 policies to search over.
– With a moderate number of users (N = 10) and the cycle length needed for good performance (L = 20), there are more than 10^10 (ten billion!) policies.
– Optimal nonstationary policy: complexity linear in the number of users.
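A quick sketch to reproduce the counts quoted above (the exact number of length-L schedules in which each of the N users gets at least one slot follows from inclusion-exclusion; N^(L-N) is the simple lower bound cited on the slide):

from math import comb

def num_nontrivial_schedules(N, L):
    # Number of length-L transmit schedules over N users in which every
    # user appears at least once, counted by inclusion-exclusion.
    return sum((-1) ** k * comb(N, k) * (N - k) ** L for k in range(N + 1))

print(num_nontrivial_schedules(3, 8))   # 5796, as quoted for the 3-user example
print(10 ** (20 - 10))                  # 10^10: the N^(L-N) lower bound for N=10, L=20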
Moral:
– The optimal policy is not cyclic.
Good news:
– We construct a simple, intuitive, and general algorithm to build such policies.
– Complexity: linear, versus the exponential complexity of round-robin variants.
How to make the schedule deviation-proof? (e.g., 122 122 122 may be, but 1122222 1122222 may not be)
– Revert to an inefficient Nash equilibrium when a deviation is detected?
– No: with imperfect monitoring the punishment will be triggered even without deviations, so the agents cannot stay on the Pareto boundary.
[Figure: payoff region in the (agent 1's payoff, agent 2's payoff) plane.]
Design in three steps:
– Step 1: Identify the set of Pareto optimal equilibrium payoffs. (Challenging!)
– Step 2: Select the optimal operating point. (Relatively easy given Step 1.)
– Step 3: Construct the optimal spectrum sharing policy. (Challenging!)
[Figure: payoff region in the (agent 1's payoff, agent 2's payoff) plane.]
Assumption (strong negative externality): for any action profile, the payoff vector lies below the hyperplane determined by the agents' maximum payoffs.
[Figure: payoff region in the (agent 1's payoff, agent 2's payoff) plane with the hyperplane through the maximum payoffs.]
Monitoring model:
– r(a): resource usage status, increasing in each a_i.
– The resource usage status is observed with additive noise of infinite support.
– From the noisy observation, a binary distress / no-distress signal is generated.
When agent i is active, agent j's relative benefit from deviation:
– the payoff gain from the deviation, weighed against the probability of detecting the deviation.
Characterizing the Pareto optimal equilibrium payoffs:
– Hyperplane (from the strong externalities) + incentive constraints = a part of the hyperplane (easily computed).
– Conditions on the discount factor (delay sensitivity) under which this characterization holds.
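The exact condition on the slide is not preserved; the standard shape of such a condition in repeated games with imperfect monitoring, assuming g_j is agent j's payoff gain from deviation, p_j the probability that the deviation triggers the distress signal, and \Delta_j the resulting drop in continuation payoff, would be:

(1-\delta)\, g_j \;\le\; \delta\, p_j\, \Delta_j \quad\Longleftrightarrow\quad \delta \;\ge\; \frac{g_j}{g_j + p_j \Delta_j},

i.e., the agents must be patient enough (insensitive enough to delay) for the threat of a lower continuation payoff to outweigh the one-shot gain.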
Decompose the target payoff profile by an action profile and continuation payoffs:
– Decomposition: target payoff = (1 - δ) × instantaneous payoff + δ × continuation payoff.
– Incentive constraints (IC): for all agents and all unilateral deviations, the decomposition must make deviation unprofitable.
Comparison with Bellman equations in MDPs:
– MDPs: single-agent actions; values; single-valued value functions.
– Repeated games: multi-agent action profiles; value profiles; set-valued value correspondences.
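In standard APS notation (a hedged reconstruction, assuming \rho(y \mid a) is the distribution of the public signal y under action profile a and w(y) the continuation payoff profile), the decomposition and the incentive constraints read:

v = (1-\delta)\, u(a) + \delta \sum_{y} \rho(y \mid a)\, w(y),

v_i \ge (1-\delta)\, u_i(a_i', a_{-i}) + \delta \sum_{y} \rho(y \mid a_i', a_{-i})\, w_i(y) \quad \forall i,\ \forall a_i'.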
Consider a set W of payoff profiles and a discount factor δ. A pair (action profile, continuation payoff function) is admissible with respect to W and δ if it decomposes the payoff, satisfies the incentive constraints, and keeps all continuation payoffs inside W. W is self-generating if every payoff in W can be decomposed by an admissible pair.
All payoffs in a self-generating set are equilibrium payoffs! (Abreu, Pearce, Stacchetti 1990, APS)
APS proposed a set-valued value iteration to compute the set W of all equilibrium payoffs. Given a discount factor δ:
– Choose an initial set containing all equilibrium payoffs. (How??)
– Check whether a payoff can be decomposed: find an admissible pair. This is a feasibility-checking problem and may explore the entire action space. (Is it even feasible??)
– Even if we could compute W, how to construct the policy??
We analytically determine W!
Consider W of the following form:
[Figure: the candidate set W in the (agent 1's payoff, agent 2's payoff) plane.]
For W of the assumed form:
– Checking whether a payoff can be decomposed reduces to finding an admissible pair satisfying linear constraints.
– Find the lower bound on the discount factor δ for which W is self-generating.
Decompose the target payoff profile:
– Continuation payoff when no distress signal is received.
– Continuation payoff when a distress signal is received: lower, so agent 2 has no incentive to deviate.
[Figure: the self-generating set in the (agent 1's payoff, agent 2's payoff) plane, with the target payoff decomposed by the chosen action profile and the two continuation payoffs.]
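With the binary distress signal, the decomposition pictured here specializes as follows (a sketch, writing q(a) for the probability of the distress signal under the prescribed action profile a):

v = (1-\delta)\, u(a) + \delta \big[ (1 - q(a))\, w^{\mathrm{nd}} + q(a)\, w^{\mathrm{d}} \big],

with w^{d} below w^{nd} for the inactive agents, so that a deviation, which raises the resource usage and hence the distress probability, lowers the expected continuation payoff and is deterred.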
Decompose the target payoff profile:
– Both continuation payoff vectors lie in the self-generating set, so they should also be decomposable.
– Recursive decomposition: repeating the decomposition on the continuation payoffs generates the nonstationary policy.
[Figure: the self-generating set with the target payoff and the two continuation payoffs.]
For example, decompose one of the continuation payoffs in the same way: it is in turn decomposed by an action profile and its own continuation payoffs, one for each signal realization.
[Figure: the recursive decomposition illustrated in the (agent 1's payoff, agent 2's payoff) plane.]
Step 2 – selecting the optimal operating point:
– The feasible payoffs are described by linear equalities and inequalities.
– The welfare objective is usually jointly concave over this set, so this is a convex optimization problem.
– The constraints are linear, so dual decomposition yields distributed algorithms.
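A sketch of the dual decomposition alluded to above, assuming Step 2 maximizes a concave welfare W(v) over linear constraints A v \le b obtained in Step 1:

L(v, \lambda) = W(v) - \lambda^{\top}(A v - b), \qquad \lambda \ge 0.

Each agent can maximize the Lagrangian over its own payoff coordinate given the current prices \lambda, and the prices are updated by projected subgradient ascent, \lambda \leftarrow [\lambda + \alpha (A v - b)]_{+}; this separation is what makes the Step 2 computation distributable.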
Suppose that after Steps 1 and 2, we have found the optimal operating point. How to achieve it?
Step 3: Construct the optimal spectrum sharing policy. (Challenging!)
[Figure: payoff region in the (SU 1's payoff, SU 2's payoff) plane with the optimal operating point.]
The low-complexity online algorithm run by each user:
– Define each user's "distance from target".
– The user with the longest distance transmits.
– Distances are updated analytically.
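A minimal sketch of the longest-distance-first idea (not the thesis algorithm verbatim: the slide does not spell out the distance definition or the update, so this version hypothetically takes the distance to be the remaining continuation target and updates it with the standard decomposition v' = (v - (1 - delta) u(a)) / delta):

def schedule(targets, payoff_when_active, delta, num_slots):
    """targets[i]: user i's target long-term payoff.
    payoff_when_active[i]: user i's instantaneous payoff when it transmits alone.
    delta: discount factor. Returns the transmit schedule (list of user indices)."""
    n = len(targets)
    # Hypothetical "distance from target": the payoff still owed to each user.
    distance = list(targets)
    order = []
    for t in range(num_slots):
        # The user with the longest (normalized) distance transmits in this slot.
        i = max(range(n), key=lambda k: distance[k] / payoff_when_active[k])
        order.append(i)
        # Analytic update: remaining continuation targets after one slot,
        # v' = (v - (1 - delta) * u(a)) / delta, where only user i earns payoff now.
        for k in range(n):
            earned = payoff_when_active[k] if k == i else 0.0
            distance[k] = (distance[k] - (1.0 - delta) * earned) / delta
    return order

# Usage sketch: three homogeneous users, as in the deck's example.
print(schedule([0.3, 0.3, 0.3], [1.0, 1.0, 1.0], delta=0.95, num_slots=12))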
Theorem: The algorithm converges to the desired Pareto optimal point in logarithmic time.
Details:
– The distance between the throughput achieved at time t and the target operating point decreases exponentially, i.e., convergence in logarithmic time.
Theorem: Dynamic entry and exit of agents does not affect the convergence rate of the existing agents!
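The "exponential decrease implies logarithmic convergence time" step, spelled out (a sketch; the constant C and the rate \rho are whatever the theorem provides):

\|x_t - x^{\ast}\| \le C\rho^{t},\ \rho \in (0,1) \;\Rightarrow\; \|x_t - x^{\ast}\| \le \varepsilon \ \text{for all}\ t \ge \frac{\log(C/\varepsilon)}{\log(1/\rho)},

so the time to reach accuracy \varepsilon grows only logarithmically in 1/\varepsilon.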
Overhead:
Message exchange before run-time:
– maximum payoffs of all the users,
– boundary of the self-generating set,
– relative benefits from deviation,
– probability of the distress signal.
Message exchange at run-time:
– The total amount of message exchange is bounded and does not increase with time!
– Other algorithms (e.g., NUM): unbounded message exchange.
Y. Xiao and M. van der Schaar, "Dynamic Spectrum Sharing Among Repeatedly Interacting Selfish Users With Imperfect Monitoring," IEEE JSAC, special issue on Cognitive Radio Systems, vol. 30, no. 10, pp. 1890-1899, Nov. 2012.
Application: spectrum sharing among femto-cells
[Figure: interference to the macro-cell and interference among femto-cells.]
– Each femto-cell maximizes its own payoff (e.g., throughput),
– subject to interference temperature constraints imposed by the macro-cell.
Benchmarks:
– Constant policies: transmit at fixed power levels simultaneously. (Jianwei Huang, Randall Berry, and Michael Honig, "Distributed interference compensation for wireless networks," IEEE JSAC, 2006; "... Optimality and algorithm," IEEE JSAC, 2011.)
– Punish-forgive (PF) policies: the same as constant policies when there is no distress signal. ("... -proof strategies," IEEE Trans. Wireless Commun., 2009.)
– Round-robin TDMA policies.
Simulation: fixed minimum throughput guarantees of 0.5 bits/s/Hz.
[Figure: average throughput (bits/s/Hz) vs. number of users (2-14) for the Constant, PF, Round-robin, and Proposed policies; the proposed policy triples the spectrum efficiency.]
A framework of cost minimization:
– Design problem: optimize a social welfare function subject to minimum payoff requirements (guarantees).
Design in three steps (cost minimization):
– Step 1: Identify the set of feasible operating points achievable by deviation-proof policies. (Even more challenging!)
– Step 2: Select the optimal operating point. (Relatively easy given Step 1.)
– Step 3: Construct the optimal resource sharing policy. (Same challenges as before.)
[Figure: payoff region in the (agent 1's payoff, agent 2's payoff) plane, showing a feasible operating point and the minimum payoff requirements.]
Y. Xiao and M. van der Schaar, "Energy-Efficient Nonstationary Spectrum Sharing," accepted by IEEE Transactions on Communications. Available at: http://arxiv.org/abs/1211.4174
Benchmarks:
– Jianwei Huang, Randall Berry, and Michael Honig, "Distributed interference compensation for wireless networks," IEEE JSAC, 2006.
– "... -proof strategies," IEEE Trans. Wireless Commun., 2009.
– "... Optimality and algorithm," IEEE JSAC, 2011.
– "... Axioms, algorithms, and analysis," IEEE/ACM Trans. Netw., 2012.
Simulation setup: 1 BS with a minimum throughput requirement of 1 bit/s/Hz; 2-15 femto-cells, each with a minimum throughput requirement of 0.5 bit/s/Hz.
Small number of femto-cells:
[Figure: average energy consumption (mW) vs. number of femto-cells (2-12) for the Stationary, Round-robin, and Proposed policies; roughly 50% energy saving; the stationary policy is infeasible beyond 5 femto-cells.]
Same setup, large number of femto-cells:
[Figure: average energy consumption (mW) vs. number of femto-cells (12-15) for the Round-robin and Proposed policies; roughly 90% energy saving.]
Relaxing the assumptions (extensions):
– The payoff vector lying below the hyperplane determined by the agents' maximum payoffs: not necessary.
– The resource usage status being increasing in each a_i: not necessary.
– The noise and the signal can be more general; the signal is still binary.
Summary of the proposed approach:
– Huge performance gain in spectrum sharing.
– Solutions applicable to many engineering systems.
Second class of problems: resource exchange with imperfect monitoring
[Roadmap slide: the same overview table, highlighting resource exchange with imperfect monitoring.]
A resource exchange scenario:
[Figure: agents 1, ..., N each act both as clients and as servers, exchanging services with one another.]
– Clients monitor the service they receive, with errors.
– Rating mechanisms: we design a rating mechanism that achieves the social optimum under monitoring errors.
Third class of problems: resource sharing with dynamic private states
[Roadmap slide: the same overview table, highlighting resource sharing with dynamic private states.]
A resource sharing scenario with dynamic private states:
[Figure: agents 1, ..., N, each with its own (private) traffic, share a resource (total bandwidth); each agent's action is its bandwidth, and its instantaneous payoff is its throughput.]
Long-term payoff: the discounted average of the instantaneous payoffs.
Three classes of resource sharing/exchange problems
[Roadmap slide: overview table of the three problem classes.]
Comparison with distributed optimization/consensus (A. Ozdaglar, A. Nedich, etc.):
– Distributed optimization/consensus: agents jointly optimize a common objective (not suitable for resource sharing among self-interested agents); solves for the optimal action.
– Our work: individual payoffs; solves for the optimal policy.
Illustration: a simple network with three homogeneous users.
[Figure: transmitter-receiver pairs (Tx i, Rx i) with cross interference.]
– Direct channel gains: 1; cross channel gains: 0.25; noise power at the users' receivers: 5 mW.
– All users discount throughput and energy consumption by the same discount factor.
– Channel gains are fixed. No PU. State: channel conditions (fixed), PU activity (always idle). Action: transmit power levels.
Stationary policy: all users transmit at fixed power levels simultaneously.
– Instantaneous power levels: (186, 186, 186) mW.
– Average energy consumption: (186, 186, 186) mW.
A simple nonstationary policy: round-robin TDMA (cycle = 3).
– Transmit schedule: 123 123 123 ... (actions are time-dependent).
– Instantaneous power levels: (33, 144, 1432) mW; power levels increase with the delay (the position in the cycle).
– Average energy consumption: (17, 44, 263) mW.
Better...
Performance improvement by increasing the cycle length. Round-robin (cycle = 4):
– Optimal transmit schedule: 1233 1233 1233 ...
– Instantaneous power levels: (43, 212, 249) mW; power levels increase with the delay (the position in the cycle), but the difference between user 2 and user 3 is small (user 3 has two slots).
– Average energy consumption: (20, 58, 66) mW.
Illustration – Optimal nonstationary policies
– The optimal policy is NOT cyclic.
– Transmit schedule: 123323213231...
– Instantaneous power levels: (108, 108, 108) mW.
– Performance gains (total average energy consumption reduction): 80% compared to the stationary policy; 67% compared to round-robin TDMA of cycle 3; 25% compared to round-robin TDMA of cycle 4.
Longer cycles to approach the optimal nonstationary policy?
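A quick consistency check of the quoted reductions against the per-user averages on the two preceding slides (a sketch; small rounding differences are expected):

# Totals of the average energy consumption quoted on the previous slides (mW).
stationary = 186 + 186 + 186      # 558
tdma3      = 17 + 44 + 263        # 324
tdma4      = 20 + 58 + 66         # 144

# The quoted reductions (80%, 67%, 25%) all point to roughly the same total
# for the optimal nonstationary policy, about 110 mW.
print(stationary * (1 - 0.80))    # ~111.6
print(tdma3 * (1 - 0.67))         # ~106.9
print(tdma4 * (1 - 0.25))         # 108.0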
– The continuation payoffs can themselves be decomposed.
– Different continuation payoff function -> different decomposition -> nonstationary policy!
Self-generating set: a set of payoff vectors in which every payoff vector can be decomposed by an action profile such that the continuation payoff vectors lie in the set.
All payoffs in the self-generating set are equilibrium payoffs!
5 journal papers accepted as the first author:
– "... video," accepted subject to minor revision by IEEE JSTSP, special issue on Visual Signal Processing for Wireless Networks.
– Y. Xiao and M. van der Schaar, "Energy-Efficient Nonstationary Spectrum Sharing," accepted by IEEE Trans. Commun.; available at arXiv.
– Y. Xiao and M. van der Schaar, "Dynamic Spectrum Sharing Among Repeatedly Interacting Selfish Users With Imperfect Monitoring," IEEE JSAC, special issue on Cognitive Radio Systems, Nov. 2012.
– "...: Theory and Applications in Communications," IEEE Trans. Commun., Oct. 2012.
– Y. Xiao, J. Park, and M. van der Schaar, "Intervention in Power Control Games with Selfish Users," IEEE JSTSP, special issue on Game Theory in Signal Processing, Apr. 2012.
3 journal papers submitted as the first author, including:
– "... Platforms with Imperfect Monitoring," submitted. Available at: http://arxiv.org/abs/1310.2323
– "... Pricing Policies in Public and Private Wireless Networks," submitted. Available at: http://arxiv.org/abs/1011.3580
Other journal papers as the 2nd or 3rd author:
– "... Participation in Direct Load Scheduling Programs," submitted. Available at: http://arxiv.org/abs/1310.0402
– "... for Demand Side Management in Smart Grids," submitted. Available at: http://arxiv.org/abs/1311.1887
– "... Players Using Limited Monitoring," submitted. Available at: http://arxiv.org/abs/1309.0262
– "... Policies for Delay-Constrained Video Streaming: Application to Video over Internet-of-Things-Enabled Networks," accepted by IEEE JSAC, Special Issue on Adaptive Media Streaming.
– "... Information, Imperfect Monitoring and Costly Communication: Design Framework," IEEE Trans. Commun., Aug. 2013.
– "... Incomplete Information: Application to Flow Control," IEEE Trans. Commun., Aug. 2013.