continuous action spaces



  1. CS 331: Artificial Intelligence
     Game Theory III

     Continuous Action Spaces
     • Previously, we only allowed the players to choose from a finite set of actions
     • Today, we'll see how to calculate Nash Equilibria when we have a continuous action space

     Tragedy of the Commons (Hardin 1968)
     • Illustrates the conflict for resources between individual interests and the common good
     • If citizens respond only to private incentives, public goods will be underprovided and public resources overutilized

     Tragedy of the Commons
     • n farmers in a village graze goats on the commons to eventually fatten and sell
     • The more goats they graze, the less well fed they are
     • And so the less money they get when they sell them

     Tragedy of the Commons (Formalized)
     • n farmers
     • $g_i$ goats allowed to graze on the commons by the i-th farmer
     • Assume goats are continuously divisible, i.e. $g_i \in [0, 36]$
     • Total number of goats in the village is $G = g_1 + \dots + g_n$
     • Strategy profile $(g_1, g_2, \dots, g_n)$

     Payoff for Goats
     • Payoff for farmer i = price per goat × number of goats:
       $$g_i \sqrt{36 - G} = g_i \sqrt{36 - \sum_{j=1}^{n} g_j}$$
     • Note: price per goat = 0 if $G > 36$
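The payoff definition can be sketched in a few lines of Python. This is only an illustration, assuming the price per goat is sqrt(36 - G) for G ≤ 36 and 0 otherwise; the function names `price_per_goat` and `payoff` and the 3-farmer profile are hypothetical and do not come from the slides.

```python
import math
from typing import Sequence


def price_per_goat(total_goats: float) -> float:
    """Assumed price rule: sqrt(36 - G), dropping to 0 once G > 36."""
    if total_goats > 36.0:
        return 0.0
    return math.sqrt(36.0 - total_goats)


def payoff(i: int, profile: Sequence[float]) -> float:
    """Payoff to farmer i under the strategy profile (g_1, ..., g_n)."""
    total_goats = sum(profile)  # G = g_1 + ... + g_n
    return profile[i] * price_per_goat(total_goats)


# Hypothetical profile: three farmers grazing 4 goats each, so G = 12.
example_profile = [4.0, 4.0, 4.0]
example_payoff = payoff(0, example_profile)  # 4 * sqrt(36 - 12)
print(example_payoff)
```

Each farmer's payoff depends on the whole profile through G, which is what makes this a game rather than n separate optimization problems.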

  2. Calculating the Nash Equilibrium
     • Suppose a Nash Equilibrium exists using the strategy profile $(g_1^*, g_2^*, \dots, g_n^*)$
     • This means that $g_i^*$ maximizes the payoff to farmer i, assuming the other players play $(g_1^*, \dots, g_{i-1}^*, g_{i+1}^*, \dots, g_n^*)$
     • Define $G_{-i}^* = \sum_{j \ne i} g_j^*$
     • Therefore
       $$g_i^* = \arg\max_{g_i} \; g_i \sqrt{36 - g_i - G_{-i}^*}$$
     • Use calculus to compute $g_i^*$!

     Calculating the Nash Equilibrium
     Set the derivative of the payoff with respect to $g_i$ to zero at $g_i = g_i^*$:
       $$\frac{\partial}{\partial g_i} \left[ g_i \sqrt{36 - g_i - G_{-i}^*} \right] = 0$$
       $$\sqrt{36 - g_i^* - G_{-i}^*} - \frac{g_i^*}{2\sqrt{36 - g_i^* - G_{-i}^*}} = 0$$
       $$\sqrt{36 - g_i^* - G_{-i}^*} = \frac{g_i^*}{2\sqrt{36 - g_i^* - G_{-i}^*}}$$
       $$2(36 - g_i^* - G_{-i}^*) = g_i^*$$
       $$72 - 2g_i^* - 2G_{-i}^* = g_i^*$$
       $$72 - 2G_{-i}^* = 3g_i^*$$
       $$g_i^* = 24 - \tfrac{2}{3} G_{-i}^*$$

     Calculating the Nash Equilibrium
       $$g_1^* = 24 - \tfrac{2}{3}(g_2^* + g_3^* + g_4^* + \dots + g_n^*)$$
       $$g_2^* = 24 - \tfrac{2}{3}(g_1^* + g_3^* + g_4^* + \dots + g_n^*)$$
       $$g_3^* = 24 - \tfrac{2}{3}(g_1^* + g_2^* + g_4^* + \dots + g_n^*)$$
       $$\vdots$$
       $$g_n^* = 24 - \tfrac{2}{3}(g_1^* + g_2^* + g_3^* + \dots + g_{n-1}^*)$$
     Could use Linear Programming, but notice the symmetry in these equations. It turns out that $g_1^* = g_2^* = \dots = g_n^*$.
     If you don't believe me, try solving the 2 farmer case:
       $$g_1^* = 24 - \tfrac{2}{3} g_2^*$$
       $$g_2^* = 24 - \tfrac{2}{3} g_1^*$$

     Calculating the Nash Equilibrium
     Write $g^* = g_1^* = g_2^* = \dots = g_n^*$. Then:
       $$g^* = 24 - \tfrac{2}{3}(n - 1) g^*$$
       $$3g^* = 72 - 2(n - 1) g^*$$
       $$3g^* + 2(n - 1) g^* = 72$$
       $$g^* (3 + 2n - 2) = 72$$
       $$g^* = \frac{72}{2n + 1}$$

     Calculating the Nash Equilibrium
     • At the Nash Equilibrium, a rational farmer grazes $\frac{72}{2n+1}$ goats
     • How many goats in total will be grazed?
       $$n \cdot \frac{72}{2n+1} = \frac{72n}{2n+1} = 36 - \frac{36}{2n+1}$$
     • Note that as $n \to \infty$, 36 goats will be grazed (remember that we allow goats to be continuously divisible)

     The Tragedy
     • How much profit per farmer?
       $$\text{Payoff to a farmer} = \frac{72}{2n+1} \sqrt{36 - \frac{72n}{2n+1}}$$
     • Suppose there are 24 farmers, then the payoff would be about 1.26 cents
     • If they all got together and agreed on 1 goat each, then the payoff would have been about 3.46 cents:
       $$\text{Payoff to a farmer} = \sqrt{36 - 24} = \sqrt{12} \approx 3.46$$
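The equilibrium and the two payoff figures can be checked numerically. The sketch below is one way to do that in Python, assuming the payoff g_i * sqrt(36 - G) and the best-response rule g_i = 24 - (2/3) * (goats grazed by the others). The helper names (`best_response`, `symmetric_equilibrium`, `payoff`), the grid resolution, and the iteration count are illustrative choices, not part of the slides.

```python
import math


def payoff(own_goats: float, total_goats: float) -> float:
    """Payoff g_i * sqrt(36 - G); the price is 0 once G > 36."""
    if total_goats > 36.0:
        return 0.0
    return own_goats * math.sqrt(36.0 - total_goats)


def best_response(others_total: float) -> float:
    """First-order condition g_i = 24 - (2/3) * G_{-i}, clipped at 0."""
    return max(0.0, 24.0 - (2.0 / 3.0) * others_total)


def symmetric_equilibrium(n: int) -> float:
    """Closed-form symmetric equilibrium g* = 72 / (2n + 1)."""
    return 72.0 / (2 * n + 1)


# Two-farmer case: iterate the two best-response equations to a fixed point.
g1 = g2 = 0.0
for _ in range(200):
    g1, g2 = best_response(g2), best_response(g1)

# 24-farmer case: g* should be a fixed point of the best-response rule.
n = 24
g_star = symmetric_equilibrium(n)
others = (n - 1) * g_star
fixed_point_gap = abs(best_response(others) - g_star)

# Coarse grid search over unilateral deviations in [0, 36], step 0.001.
best_on_grid = max(
    (k / 1000.0 for k in range(0, 36001)),
    key=lambda g: payoff(g, g + others),
)

nash_payoff = payoff(g_star, n * g_star)  # everyone plays g*
coop_payoff = payoff(1.0, n * 1.0)        # everyone agrees on 1 goat
print(round(nash_payoff, 2), round(coop_payoff, 2))  # 1.26 3.46
```

The simultaneous best-response iteration is used only for the two-farmer case, where each update shrinks the error by a factor of 2/3. For larger n the same naive iteration does not converge, so the sketch relies on the closed form and a fixed-point check instead.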

  3. What Went Wrong?
     • Rational behavior leads to sub-optimal solutions
     • Maximizing one's utility is not the same as maximizing social welfare
     • To solve this problem, we can define the rules of the game to ensure that social welfare is not disregarded
     • This is why mechanism design is important since it involves defining the rules of the game

     Conclusions on Game Theory
     • Sylvia Nasar's (author of the biography "A Beautiful Mind") synopsis of John Nash's remarks on winning the Nobel prize:
       "…he [Nash] felt that game theory was like string theory, a subject of great intrinsic intellectual interest that the world wishes to imagine can be of some utility. He said it with enough skepticism in his voice to make it funny."

     Conclusions on Game Theory
     • Game theory is mathematically elegant but there are problems in applying it to real world problems:
       – Assumes opponents will play the equilibrium strategy
       – What to do with multiple Nash equilibria?
       – Computing Nash equilibria for complex games is nasty (perhaps even intractable)
       – Players have non-stationary policies
       – Lots of other assumptions that don't hold…
     • Game theory is used mainly to analyze environments at equilibrium rather than to control agents within an environment
     • Also good for designing environments (mechanism design)

     What you should know
     • How to calculate Nash Equilibria for a continuous action space game like the Tragedy of the Commons
     • Why the Tragedy of the Commons is tragic
     • Why game theory has difficulties being applied to real world problems
