A Mean Field Games Formulation of Network Based Auction Dynamics




  1. A Mean Field Games Formulation of Network Based Auction Dynamics Peter E. Caines McGill University Information and Control in Networks Lund, October 2012 Joint work with Peng Jia

  2. Co-Authors Minyi Huang Roland Malhamé Peng Jia

  3. Collaborators & Students Arman Kizilkale Arthur Lazarte Zhongjing Ma Mojtaba Nourian

  4. Basic Ideas of Mean Field Games

  5. Part 1 – CDMA Power Control Base Station & Individual Agents

  6. Part 1 – CDMA Power Control Lognormal channel attenuation: the $i$th channel, $1 \le i \le N$, evolves as $dx_i = -a(x_i + b)\,dt + \sigma\,dw_i$ (Charalambous, Menemenlis; 1999). Channel attenuation × transmitted power $= e^{x_i(t)} p_i(t)$. Signal to interference ratio (Agent $i$) at the base station: $\mathrm{SIR}_i = e^{x_i} p_i \big/ \big( (\beta/N) \sum_{j \neq i} e^{x_j} p_j + \eta \big)$. How to optimize all the individual SIRs? It is self-defeating for everyone to increase their power. Humans display the "Cocktail Party Effect": tune hearing to the frequency of a friend's voice (E. Colin Cherry).
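
The SIR expression above is simple to evaluate numerically. The sketch below samples channel states from the stationary law of the channel equation and computes each agent's SIR; every numerical value (population size, channel parameters, power levels) is an illustrative assumption rather than a figure from the talk.

```python
import numpy as np

# A minimal numerical sketch of the per-agent SIR on this slide.  All parameter
# values (N, a, b, sigma, beta, eta) and the power levels are assumptions.
rng = np.random.default_rng(0)

N = 50                        # number of agents
a, b, sigma = 0.1, 1.0, 0.2   # channel parameters in dx_i = -a(x_i + b) dt + sigma dw_i
beta, eta = 1.0, 0.1          # interference scaling and background noise

# Sample channel states from the stationary law of the OU channel: N(-b, sigma^2 / (2a)).
x = -b + (sigma / np.sqrt(2.0 * a)) * rng.standard_normal(N)
p = rng.uniform(0.5, 1.5, size=N)    # assumed transmitted power levels
received = np.exp(x) * p             # attenuation times power for each agent

def sir(i: int) -> float:
    """SIR_i = e^{x_i} p_i / ((beta/N) * sum_{j != i} e^{x_j} p_j + eta)."""
    interference = (beta / N) * (received.sum() - received[i])
    return received[i] / (interference + eta)

print([round(sir(i), 3) for i in range(5)])
```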

  7. Part 1 – CDMA Power Control Can maximize $\sum_{i=1}^{N} \mathrm{SIR}_i$ with centralized control (HCM, 2004). Since centralized control is not feasible for complex systems, how can such systems be optimized using decentralized control? Idea: Use large population properties of the system together with basic notions of game theory. Massive game theoretic control systems: large ensembles of partially regulated competing agents. Fundamental issue: the relation between the actions of each individual agent and the resulting mass behavior.

  8. Part 2 – Basic LQG Game Problem Individual Agent's Dynamics: $dx_i = (a_i x_i + b u_i)\,dt + \sigma_i\,dw_i$, $1 \le i \le N$ (scalar case only for simplicity of notation). $x_i$: state of the $i$th agent; $u_i$: control; $w_i$: disturbance (standard Wiener process); $N$: population size.

  9. Part 2 – Basic LQG Game Problem Individual Agent's Cost: $J_i(u_i, \nu) \triangleq E \int_0^\infty e^{-\rho t} [(x_i - \nu)^2 + r u_i^2]\,dt$. Basic case: $\nu \triangleq \gamma \big( \frac{1}{N} \sum_{k \neq i} x_k + \eta \big)$. Main features: Agents are coupled via their costs. Tracked process $\nu$: (i) stochastic, (ii) depends on other agents' control laws, (iii) not feasible for $x_i$ to track all $x_k$ trajectories for large $N$.

  10. Part 2 – Large Popn. Models with Game Theory Features Economic models: Cournot-Nash equilibria (Lambson) Advertising competition: game models (Erickson) Wireless network resource allocation: (Alpcan et al., Altman, HCM) Admission control in communication networks: (Ma, MC) Public health: voluntary vaccination games (Bauch & Earn) Biology: stochastic PDE swarming models (Bertozzi et al.) Sociology: urban economics (Brock and Durlauf et al.) Renewable Energy: Charging control of PEVs (Ma et al.)

  11. Part 2 – Preliminary Optimal LQG Tracking LQG Tracking: Take $x^*$ (bounded, continuous) for the scalar model $dx_i = a_i x_i\,dt + b u_i\,dt + \sigma_i\,dw_i$, with cost $J_i(u_i, x^*) = E \int_0^\infty e^{-\rho t} [(x_i - x^*)^2 + r u_i^2]\,dt$. Riccati Equation: $\rho \Pi_i = 2 a_i \Pi_i - \frac{b^2}{r} \Pi_i^2 + 1$, $\Pi_i > 0$. Set $\beta_1 = -a_i + \frac{b^2}{r}\Pi_i$, $\beta_2 = -a_i + \frac{b^2}{r}\Pi_i + \rho$, and assume $\beta_1 > 0$. Mass Offset Control: $\rho s_i = \frac{ds_i}{dt} + a_i s_i - \frac{b^2}{r}\Pi_i s_i - x^*$. Optimal Tracking Control: $u_i = -\frac{b}{r}(\Pi_i x_i + s_i)$. The boundedness condition on $x^*$ implies existence of a unique solution $s_i$.
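
For a constant reference $x^*$ the Riccati equation and the offset equation on this slide have closed-form solutions, which the following sketch computes; the parameter values are illustrative assumptions.

```python
import numpy as np

# Sketch of the scalar tracking solution on this slide: the positive root of the
# algebraic Riccati equation and the bounded offset s_i for a constant reference
# x*.  All parameter values are illustrative assumptions.
a_i, b, r, rho = 0.5, 1.0, 1.0, 0.05

# rho*Pi = 2*a_i*Pi - (b^2/r)*Pi^2 + 1   <=>   (b^2/r)*Pi^2 + (rho - 2*a_i)*Pi - 1 = 0
c2, c1, c0 = b**2 / r, rho - 2.0 * a_i, -1.0
Pi = (-c1 + np.sqrt(c1**2 - 4.0 * c2 * c0)) / (2.0 * c2)   # positive root

beta1 = -a_i + (b**2 / r) * Pi     # assumed > 0 (closed-loop stability)
beta2 = beta1 + rho

# For constant x*, the bounded solution of rho*s = ds/dt + a_i*s - (b^2/r)*Pi*s - x*,
# i.e. ds/dt = beta2*s + x*, is the constant s_i = -x*/beta2.
x_star = 1.0
s_i = -x_star / beta2

def control(x: float) -> float:
    """Optimal tracking control u_i = -(b/r) (Pi x_i + s_i)."""
    return -(b / r) * (Pi * x + s_i)

print(f"Pi = {Pi:.4f}, beta1 = {beta1:.4f}, s_i = {s_i:.4f}, u_i(0) = {control(0.0):.4f}")
```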

  12. Part 2 – Key Intuition When the tracked signal is replaced by the deterministic mean state of the mass of agents: Agent’s feedback = feedback of agent’s local stochastic state + feedback of deterministic mass offset Think Globally, Act Locally (Geddes, Alinsky, Rudie-Wonham)

  13. Part 2 – LQG-NCE Equation Scheme The Fundamental NCE Equation System. Continuum of systems: $a \in A$; common $b$ for simplicity: $\rho s_a = \frac{ds_a}{dt} + a s_a - \frac{b^2}{r}\Pi_a s_a - x^*$, $\frac{d\bar{x}_a}{dt} = \big(a - \frac{b^2}{r}\Pi_a\big)\bar{x}_a - \frac{b^2}{r} s_a$, $\bar{x}(t) = \int_A \bar{x}_a(t)\,dF(a)$, $x^*(t) = \gamma(\bar{x}(t) + \eta)$, $t \ge 0$. Riccati Equation: $\rho \Pi_a = 2 a \Pi_a - \frac{b^2}{r}\Pi_a^2 + 1$, $\Pi_a > 0$. The individual control action $u_a = -\frac{b}{r}(\Pi_a x_a + s_a)$ is optimal w.r.t. the tracked $x^*$. Does there exist a solution $(\bar{x}_a, s_a, x^*;\ a \in A)$? Yes: Fixed Point Theorem.
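
The existence question on this slide can be illustrated numerically. The sketch below looks for a constant (steady-state) $x^*$ by iterating the NCE equations over a finite grid of $a$ values that stands in for the continuum $A$; the parameter values and the uniform weighting in place of $dF(a)$ are assumptions made for illustration only.

```python
import numpy as np

# Steady-state fixed-point iteration for the NCE equation system on this slide,
# with a finite grid of dynamics parameters a approximating (A, dF).  All
# numerical values are illustrative assumptions.
b, r, rho, gamma, eta = 1.0, 1.0, 0.05, 0.6, 0.25
a_grid = np.linspace(0.2, 0.8, 7)

def riccati(a):
    """Positive root of rho*Pi = 2*a*Pi - (b^2/r)*Pi^2 + 1."""
    c2, c1, c0 = b**2 / r, rho - 2.0 * a, -1.0
    return (-c1 + np.sqrt(c1**2 - 4.0 * c2 * c0)) / (2.0 * c2)

Pi = riccati(a_grid)
beta1 = -a_grid + (b**2 / r) * Pi
beta2 = beta1 + rho

x_star = 0.0
for _ in range(200):
    s = -x_star / beta2                   # bounded offset for each a (constant x*)
    xbar_a = -(b**2 / r) * s / beta1      # steady state of d(xbar_a)/dt = -beta1*xbar_a - (b^2/r)*s
    xbar = xbar_a.mean()                  # integral over dF(a), uniform weights here
    x_new = gamma * (xbar + eta)          # update of the tracked signal
    if abs(x_new - x_star) < 1e-12:
        x_star = x_new
        break
    x_star = x_new

print(f"fixed point x* = {x_star:.6f}")
```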

  14. Part 2 – NCE Feedback Control Proposed MF Solution to the Large Population LQG Game Problem. The finite system of $N$ agents with dynamics $dx_i = a_i x_i\,dt + b u_i\,dt + \sigma_i\,dw_i$, $1 \le i \le N$, $t \ge 0$. Let $u_{-i} \triangleq (u_1, \cdots, u_{i-1}, u_{i+1}, \cdots, u_N)$; then the individual cost is $J_i(u_i, u_{-i}) \triangleq E \int_0^\infty e^{-\rho t} \big\{ \big[x_i - \gamma\big(\frac{1}{N}\sum_{k \neq i} x_k + \eta\big)\big]^2 + r u_i^2 \big\}\,dt$. Algorithm: For the $i$th agent with parameter $(a_i, b)$ compute: • $x^*$ using the NCE Equation System; • $\rho \Pi_i = 2 a_i \Pi_i - \frac{b^2}{r}\Pi_i^2 + 1$; • $\rho s_i = \frac{ds_i}{dt} + a_i s_i - \frac{b^2}{r}\Pi_i s_i - x^*$; • $u_i = -\frac{b}{r}(\Pi_i x_i + s_i)$.
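
To make the decentralized structure of the algorithm concrete, here is a small simulation sketch for the special case of identical agents, where the steady-state $x^*$ is available in closed form; the population size, horizon, all parameter values, and the Euler-Maruyama discretization are assumptions made for illustration.

```python
import numpy as np

# Sketch of the finite-N NCE algorithm on this slide for identical agents
# (a_i = a, common b), where the steady-state x* has a closed form.  All
# numerical values are illustrative assumptions.
rng = np.random.default_rng(1)
N, T, dt = 200, 20.0, 0.01
a, b, r, rho = 0.5, 1.0, 1.0, 0.05
gamma, eta, sigma = 0.6, 0.25, 0.3

c2, c1 = b**2 / r, rho - 2.0 * a
Pi = (-c1 + np.sqrt(c1**2 + 4.0 * c2)) / (2.0 * c2)    # positive Riccati root
beta1, beta2 = -a + c2 * Pi, -a + c2 * Pi + rho
K = c2 / (beta1 * beta2)                   # steady-state gain from x* to the mean state
x_star = gamma * eta / (1.0 - gamma * K)   # solves x* = gamma*(K*x* + eta)
s = -x_star / beta2                        # mass offset for constant x*

x = rng.standard_normal(N)                 # initial agent states
for _ in range(int(T / dt)):
    u = -(b / r) * (Pi * x + s)            # each agent feeds back only its own state
    x = x + (a * x + b * u) * dt + sigma * np.sqrt(dt) * rng.standard_normal(N)

# Consistency check: the tracked signal should reproduce itself through the population.
print(f"x* = {x_star:.4f}, gamma*(mean(x) + eta) = {gamma * (x.mean() + eta):.4f}")
```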

  15. Part 2 – Saddle Point Nash Equilibrium Agent y is a maximizer; agent x is a minimizer. [Figure: saddle-point cost surface over the (x, y) plane.]

  16. Part 2 – Nash Equilibrium The Information Pattern: $\mathcal{F}_N \triangleq \sigma(x_j(\tau);\ \tau \le t,\ 1 \le j \le N)$, $\mathcal{F}_i \triangleq \sigma(x_i(\tau);\ \tau \le t)$. $\mathcal{F}_N$-adapted controls: $\mathcal{U}$; $\mathcal{F}_i$-adapted controls: $\mathcal{U}_{loc,i}$. The Equilibria: The set of controls $\mathcal{U}^0 = \{u_i^0;\ u_i^0 \text{ adapted to } \mathcal{U}_{loc,i},\ 1 \le i \le N\}$ generates a Nash equilibrium w.r.t. the costs $\{J_i;\ 1 \le i \le N\}$ if, for each $i$, $J_i(u_i^0, u_{-i}^0) = \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0)$.

  17. Part 2 – ε-Nash Equilibrium ε-Nash Equilibria: Given $\varepsilon > 0$, the set of controls $\mathcal{U}^0 = \{u_i^0;\ 1 \le i \le N\}$ generates an $\varepsilon$-Nash equilibrium w.r.t. the costs $\{J_i;\ 1 \le i \le N\}$ if, for each $i$, $J_i(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0) \le J_i(u_i^0, u_{-i}^0)$.

  18. Part 2 – NCE Control: First Main Result Theorem 1: (MH, PEC, RPM, 2003) Subject to technical conditions, the NCE Equations have a unique solution for which the NCE Control Algorithm generates a set of controls $\mathcal{U}^N_{nce} = \{u_i^0;\ 1 \le i \le N\}$, $1 \le N < \infty$, where $u_i^0 = -\frac{b}{r}(\Pi_i x_i + s_i)$, such that (i) all agent systems $S(A_i)$, $1 \le i \le N$, are second order stable; (ii) $\{\mathcal{U}^N_{nce};\ 1 \le N < \infty\}$ yields an $\varepsilon$-Nash equilibrium for all $\varepsilon$, i.e. $\forall \varepsilon > 0\ \exists N(\varepsilon)$ s.t. $\forall N \ge N(\varepsilon)$, $J_i(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0) \le J_i(u_i^0, u_{-i}^0)$, where $u_i \in \mathcal{U}$ is adapted to $\mathcal{F}_N$.

  19. Network Based Auctions and Applications of MFG

  20. Part 3 – Network Based Auction: Overview Game theoretic methods for market pricing and resource allocation on distributed networks. Two-level network structure. Lower level: quantized progressive second price auctions with fixed local quantities. Higher level: cooperative consensus allocation of local quantities. Convergence and efficiency analysis of network based auctions. Applications of Mean Field Games to auctions and networks.

  21. Part 3 – ISO / RTO (Independent System Operator / Regional Transmission Organization)

  22. Part 3 – Hydro-Québec 60 hydroelectric generating stations 36,971 MW installed capacity 175 TWh storage capacity 579 dams, 97 control structures www.hydroforthefuture.com

  23. Part 3 – Worldwide Examples of Extreme Price Volatility Illinois [1], East US [2], Ontario [1], The Netherlands [1], New Zealand [3], West Texas [4]. References: [1] Cho & Meyn, 2010; [2] http://www.ferc.gov; [3] http://www.treasury.govt.nz; [4] Giberson, 2008.

  24. Part 3 – Quantized PSP Auctions (Jia & Caines 2011) A non-cooperative game; $N$ buyer agents bid for a divisible resource $C$. Given a finite price set $B_p^0$, each buyer agent $BA_i$ makes a quantized bid $s_i = (p_i, q_i) = (\text{price}, \text{quantity})$, $p_i \in B_p^0$. A bid profile is $s = (s_1, \cdots, s_N)$. $\theta_i : \mathbb{R}_+ \to \mathbb{R}_+$ is the valuation function, and $\theta_i'$ is the (decreasing) demand function. A market price function (MPF) for $BA_i$ is $P_i(z, s_{-i}) = \inf\{ y \ge 0 : C - \sum_{p_k > y,\, k \neq i} q_k \ge z \}$. Objective: Design a market mechanism (i.e., an assignment of allocations) and find a bidding rule for each agent which individually maximizes its utility function, leads to a Nash equilibrium, and is socially efficient (i.e. maximizes the sum of individual utilities).
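
Because the MPF is a step function of $y$ that can only change value at the opponents' bid prices, its infimum can be found by checking those finitely many candidate prices. The sketch below does exactly that; the capacity and the opponent bid profile are illustrative assumptions.

```python
from typing import List, Tuple

Bid = Tuple[float, float]   # (price, quantity)

def market_price(z: float, others: List[Bid], C: float) -> float:
    """P_i(z, s_{-i}) = inf{ y >= 0 : C - sum_{p_k > y, k != i} q_k >= z }."""
    candidates = sorted({0.0} | {p for p, _ in others})
    for y in candidates:
        if C - sum(q for p, q in others if p > y) >= z:
            return y
    return float("inf")   # demand z cannot be met even above every opponent's price

# Illustrative assumptions: total capacity and the opponents' bid profile s_{-i}.
C = 100.0
others: List[Bid] = [(3.0, 40.0), (2.0, 30.0), (1.0, 50.0)]
for z in (10.0, 40.0, 80.0):
    print(f"P_i({z:.0f}, s_-i) = {market_price(z, others, C)}")
```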

  25. Part 3 – PSP Mechanism (celebrated VCG mechanism) The PSP allocation rule and cost function are defined as: $a_i(s) = a_i((p_i, q_i), s_{-i}) = \min\big\{ q_i,\ \frac{q_i}{\sum_{k: p_k = p_i} q_k}\, Q_i(p_i, s_{-i}) \big\}$ (reasonable: MPF-constrained allocation), $c_i(s) = \sum_{j \neq i} p_j \big[ a_j((0,0), s_{-i}) - a_j(s_i, s_{-i}) \big]$ (reasonable: corresponding to opportunity costs), where $Q_i(y, s_{-i})$ is the available quantity at price $y$ given $s_{-i}$. Then $BA_i$'s utility function is $u_i(s) = \theta_i(a_i(s)) - c_i(s)$.
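
The following sketch implements the allocation and opportunity-cost payment as reconstructed above (ties at the same price split pro rata to the requested quantities). The capacity and the three-agent bid profile are illustrative assumptions.

```python
from typing import Dict, Tuple

Bid = Tuple[float, float]    # (price, quantity)

def available(i: int, y: float, s: Dict[int, Bid], C: float) -> float:
    """Q_i(y, s_{-i}): capacity left after higher-priced opponents are served."""
    return max(0.0, C - sum(q for j, (p, q) in s.items() if j != i and p > y))

def allocation(i: int, s: Dict[int, Bid], C: float) -> float:
    """a_i(s): requested quantity capped by i's pro-rata share of Q_i at price p_i."""
    p_i, q_i = s[i]
    if q_i == 0.0:
        return 0.0
    tied = sum(q for p, q in s.values() if p == p_i)   # split ties pro rata
    return min(q_i, (q_i / tied) * available(i, p_i, s, C))

def payment(i: int, s: Dict[int, Bid], C: float) -> float:
    """c_i(s) = sum_{j != i} p_j [ a_j((0,0), s_{-i}) - a_j(s_i, s_{-i}) ]."""
    s_without_i = {**s, i: (0.0, 0.0)}
    return sum(s[j][0] * (allocation(j, s_without_i, C) - allocation(j, s, C))
               for j in s if j != i)

# Illustrative assumptions: capacity and a three-agent bid profile.
C = 100.0
s: Dict[int, Bid] = {1: (3.0, 40.0), 2: (2.0, 50.0), 3: (1.0, 50.0)}
for i in s:
    print(f"agent {i}: allocation = {allocation(i, s, C):.1f}, payment = {payment(i, s, C):.1f}")
```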

  26. Part 3 – Best Reply Given $s_{-i}$ and an elastic $\theta_i$, utility maximization implies the best (bid) reply $v_i = \sup\{ q \ge 0 : \theta_i'(q) > P_i(q, s_{-i}) \}$, $w_i = \theta_i'(v_i) \in \mathbb{R}_+$.
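
The supremum defining the best reply can be approximated by a grid search once the demand function and the market price function are known. In the sketch below both curves are illustrative assumptions (a linear demand and a piecewise-constant market price).

```python
import numpy as np

def demand(q):
    """theta_i'(q): an assumed linear, decreasing demand function."""
    return np.maximum(0.0, 5.0 - 0.05 * q)

def market_price(q):
    """P_i(q, s_{-i}): an assumed piecewise-constant market price function."""
    return np.where(q <= 30.0, 1.0, np.where(q <= 60.0, 2.0, 3.0))

q_grid = np.linspace(0.0, 100.0, 10001)
feasible = q_grid[demand(q_grid) > market_price(q_grid)]
v_i = float(feasible.max()) if feasible.size else 0.0   # v_i ~ sup{q : theta'(q) > P_i(q)}
w_i = float(demand(v_i))                                # w_i = theta_i'(v_i)
print(f"best reply (price, quantity) = ({w_i:.3f}, {v_i:.2f})")
```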

  27. Part 3 – Quantized Strategies A generic buyer (e.g., Agent 2): Applies the same utility function and allocation rule as PSP. Makes the quantized price and quantity bid $p_i^k \in B_p^0$, $q_i^k = (\theta_i')^{-1}(p_i^k)$, $1 \le i \le N$, $k \ge 0$, where there is no bid fee. Bids are made synchronously.
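
A quantized bid simply reads the quantity off the inverse demand curve at each admissible grid price. The sketch below reuses the illustrative linear demand from the previous sketch; the price grid $B_p^0$ is also an assumption.

```python
# Quantized bids: prices restricted to a finite grid B_p^0, quantities given by
# q_i^k = (theta_i')^{-1}(p_i^k).  Price grid and demand curve are assumptions.
B_p0 = [0.5 * k for k in range(1, 11)]     # quantized price set {0.5, 1.0, ..., 5.0}

def inverse_demand(p: float) -> float:
    """(theta_i')^{-1}(p) for the assumed demand theta_i'(q) = 5 - 0.05 q."""
    return max(0.0, (5.0 - p) / 0.05)

bids = [(p, inverse_demand(p)) for p in B_p0]
print(bids[:3])   # the lowest few admissible (price, quantity) pairs
```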
