
Inverse Game Theory: Learning Utilities in Succinct Games
Hesam Nikpey, Pooya Shati. Social and Economical Networks, Dr. Fazli, Spring 96-97.
Paper: Volodymyr Kuleshov and Okke Schrijvers.


  1. Inverse Game Theory: Learning Utilities in Succinct Games
     - Hesam Nikpey, Pooya Shati
     - Social and Economical Networks, Dr. Fazli, Spring 96-97

  2. PAPER
     - Inverse Game Theory: Learning Utilities in Succinct Games
     - Volodymyr Kuleshov and Okke Schrijvers
     - WINE 2015 conference

  3. OUTLINE
     - Problem Introduction
     - Related Works
     - Equilibrium Concepts
     - Succinct Games
     - Rationalizing a Game
     - Learning Utilities

  4. PROBLEM INTRODUCTION
     - Classic Game Theory
     - Inverse Game Theory
     - Succinct Games

  5. APPLICATIONS
     - Economics: designing mechanisms
     - Machine learning: helicopter autopilots
     - Developing predictive techniques
     - Forecasting the agents' behavior

  6. RELATED WORKS
     - Computer science:
       - Computational complexity of rationalizing stable matchings
       - Correlated equilibria
     - Economics:
       - Inferring utilities of bidders in online ad auctions
       - Rationalizing agent behavior

  7. NASH EQUILIBRIUM
     - Each player chooses a mixed strategy: $p_i \in \Delta(A_i)$
     - No player is interested in changing her choice:
       $$\forall q_i \in \Delta(A_i):\quad u_i(p_i, p_{-i}) \ge u_i(q_i, p_{-i})$$
     - [Figure labels: $p_1$, $p_2$]
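
The Nash condition on this slide can be checked mechanically for a small game. The following is a minimal sketch, not taken from the slides: the function names `expected_utility` and `is_nash` and the matching-pennies payoffs are illustrative assumptions. It only tests pure deviations, which suffices because a player's expected utility is linear in her own mixed strategy.

```python
from itertools import product


def expected_utility(payoff, strategies):
    """Expected payoff when every player mixes independently.

    payoff maps an action profile (tuple of action indices) to a number;
    strategies[i][a] is the probability that player i plays action a.
    """
    total = 0.0
    for profile in product(*[range(len(s)) for s in strategies]):
        prob = 1.0
        for player, action in enumerate(profile):
            prob *= strategies[player][action]
        total += prob * payoff[profile]
    return total


def is_nash(payoffs, strategies, tol=1e-9):
    """True if no player gains by switching to any pure action."""
    for i, payoff in enumerate(payoffs):
        current = expected_utility(payoff, strategies)
        for action in range(len(strategies[i])):
            pure = [0.0] * len(strategies[i])
            pure[action] = 1.0
            deviated = list(strategies)
            deviated[i] = pure
            if expected_utility(payoff, deviated) > current + tol:
                return False
    return True


# Hypothetical example: matching pennies (zero-sum, two actions each).
mp1 = {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1}
mp2 = {profile: -value for profile, value in mp1.items()}
uniform = [[0.5, 0.5], [0.5, 0.5]]
lopsided = [[1.0, 0.0], [0.5, 0.5]]
```

With these inputs, `is_nash([mp1, mp2], uniform)` holds, while `lopsided` fails because the second player can exploit the first player's pure strategy.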

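  8. CORRELATED EQUILIBRIUM
     - $p$ is a joint distribution over outcomes, not necessarily a product of distributions
     - Equilibrium defined as, for every player $i$ and every pair of actions $a^i_j, a^i_k \in A_i$:
       $$\sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_j, a_{-i}) \ge \sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_k, a_{-i})$$
     - [Figure: matrix of joint probabilities with entries $p_{1,1}, p_{1,2}, \dots, p_{1,|A_i|}, \dots, p_{|A_j|,1}, \dots, p_{|A_j|,|A_i|}$]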
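
The correlated-equilibrium inequalities are linear in the joint distribution, so they can be verified by direct summation. Below is a small sketch under stated assumptions: the helper name `is_correlated_equilibrium` and the game-of-chicken payoffs are illustrative and do not come from the slides.

```python
from itertools import product


def is_correlated_equilibrium(payoffs, joint, num_actions, tol=1e-9):
    """Check every (player i, recommended j, deviation k) inequality.

    joint maps action profiles to probabilities (missing profiles count as 0);
    payoffs[i] maps action profiles to player i's utility.
    """
    n = len(num_actions)
    for i in range(n):
        others = [range(num_actions[m]) for m in range(n) if m != i]
        for j, k in product(range(num_actions[i]), repeat=2):
            gain = 0.0
            for rest in product(*others):
                recommended = rest[:i] + (j,) + rest[i:]
                deviation = rest[:i] + (k,) + rest[i:]
                gain += joint.get(recommended, 0.0) * (
                    payoffs[i][recommended] - payoffs[i][deviation]
                )
            if gain < -tol:  # following the recommendation is worse
                return False
    return True


# Hypothetical game of chicken: action 0 = dare, action 1 = chicken.
u1 = {(0, 0): 0, (0, 1): 7, (1, 0): 2, (1, 1): 6}
u2 = {(0, 0): 0, (0, 1): 2, (1, 0): 7, (1, 1): 6}
third = 1.0 / 3.0
traffic_light = {(0, 1): third, (1, 0): third, (1, 1): third}
crash = {(0, 0): 1.0}
```

Here `traffic_light` is not a product distribution, yet it satisfies all the inequalities; `crash` does not, since a player told to dare would rather swerve.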

  9. POLYNOMIAL MIXTURE OF PRODUCTS (PMP)
     - A specific kind of correlated equilibrium
     - The probability distribution is a sum of products of distributions:
       $$p = \sum_{k=1}^{K} q_k$$
     - where $K$ is polynomial in the input size and every $q_k$ is a product of distributions
     - Every game has an easy-to-compute PMP equilibrium
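
To make the compactness of a mixture of products concrete, here is a small sketch of how one might store such a distribution and evaluate the probability of a single outcome without ever building the full joint table. The explicit mixture weights and the name `pmp_probability` are assumptions made for illustration; the slide folds any weights into the components themselves.

```python
def pmp_probability(components, profile):
    """Probability of an action profile under a mixture of product distributions.

    components is a list of (weight, marginals) pairs, where marginals[i][a]
    is the probability that player i plays action a in that component.
    """
    total = 0.0
    for weight, marginals in components:
        prob = weight
        for player, action in enumerate(profile):
            prob *= marginals[player][action]
        total += prob
    return total


# Hypothetical two-player example: both play action 0, or both play action 1.
components = [
    (0.5, [[1.0, 0.0], [1.0, 0.0]]),
    (0.5, [[0.0, 1.0], [0.0, 1.0]]),
]
```

The storage is K times the size of one product distribution, whereas an explicit joint table grows exponentially with the number of players.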

  10. SUCCINCT GAMES
     - Every player's utility is determined by a limited number of observations
     - Interesting because only a small number of parameters is required to represent the utility
     - Covers a vast number of games

  11. LINEAR SUCCINCT GAMES (Definition)
     - A set of (not necessarily disjoint) factors for every player and a utility for every factor
     - $G := \left[ (A_i)_{i=1}^{n},\ (v_i)_{i=1}^{n},\ (O_i)_{i=1}^{n} \right]$
     - $O_i \in \{0,1\}^{m \times d}$, $v_i \in \mathbb{R}^{d}$
     - $\forall i:\ u_i = O_i v_i$
     - $O_i$ is dimensionally large but has a compact representation

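  12. GRAPHICAL GAMES (Succinct Games, Example 1)
     - Player $i$'s utility depends solely on her own and her neighbors' actions
     - Outcome matrix:
       $$(O_i)_{a,\, a_{N(i)}} = \begin{cases} 1 & \text{if } a \text{ and } a_{N(i)} \text{ agree on the actions of } N(i) \\ 0 & \text{otherwise} \end{cases}$$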
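
The outcome matrix of a graphical game can be generated directly from a player's neighborhood. The sketch below assumes a hypothetical three-player game with binary actions in which player 0's neighborhood is players 0 and 1; the names `outcome_matrix` and `utilities` are illustrative. It then evaluates the utility vector as the matrix-vector product of the outcome matrix and the factor values.

```python
from itertools import product


def outcome_matrix(num_actions, neighborhood):
    """Rows are full action profiles, columns are local assignments (factors).

    An entry is 1 exactly when the profile, restricted to the neighborhood,
    equals the local assignment.
    """
    profiles = list(product(*[range(m) for m in num_actions]))
    factors = list(product(*[range(num_actions[p]) for p in neighborhood]))
    matrix = [
        [1 if tuple(a[p] for p in neighborhood) == o else 0 for o in factors]
        for a in profiles
    ]
    return profiles, factors, matrix


def utilities(matrix, values):
    """Utility of every profile: outcome matrix times factor values."""
    return [sum(x * w for x, w in zip(row, values)) for row in matrix]


profiles, factors, O0 = outcome_matrix([2, 2, 2], neighborhood=(0, 1))
v0 = [3.0, 0.0, 5.0, 1.0]  # hypothetical value for each local assignment
u0 = utilities(O0, v0)
```

Player 0's utility has 8 entries but is described by only 4 numbers, and it does not change when player 2 changes her action.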

  13. CONGESTION GAMES (Succinct Games, Example 2)
     - Players choose from possible subsets of the set of resources
     - Each player pays the cost of its chosen resources according to the function:
       $$\sum_{e \in a_i} d_e(l_e)$$
     - where $d_e$ is $e$'s cost function and $l_e$ is the number of players using $e$
     - Generalizes network flow games
     - Outcome matrix:
       $$(O_i)_{a,\,(e,L)} = \begin{cases} 1 & \text{if } e \in a_i \text{ and } l_e(a) = L \\ 0 & \text{otherwise} \end{cases}$$
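
As a rough illustration of the congestion-game factors, the sketch below computes a player's cost and the set of (resource, load) factors that an action profile triggers. The resource names, the cost functions, and the helper names `loads`, `player_cost`, and `triggered_factors` are all hypothetical.

```python
from collections import Counter


def loads(profile):
    """Number of players using each resource under the given profile."""
    return Counter(e for chosen in profile for e in chosen)


def player_cost(profile, i, cost_functions):
    """Sum over player i's resources of the resource cost at its current load."""
    load = loads(profile)
    return sum(cost_functions[e](load[e]) for e in profile[i])


def triggered_factors(profile, i):
    """Factors (resource, load) switched on for player i by this profile."""
    load = loads(profile)
    return {(e, load[e]) for e in profile[i]}


# Hypothetical instance: three players, two resources.
cost_functions = {"x": lambda load: float(load), "y": lambda load: 3.0}
profile = [{"x"}, {"x", "y"}, {"y"}]
cost_1 = player_cost(profile, 1, cost_functions)
factors_1 = triggered_factors(profile, 1)
# Assigning each factor (e, L) the value d_e(L) reproduces the same cost.
cost_from_factors = sum(cost_functions[e](L) for e, L in factors_1)
```

The number of factors is at most the number of resources times the number of players, far fewer than the number of action profiles.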

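  14. RATIONALIZING A GAME
     - First we write the correlated equilibrium as a linear constraint:
       $$\sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_j, a_{-i}) \ge \sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_k, a_{-i})$$
       $$\Rightarrow\quad p^T C_{ijk}\, u_i = p^T C_{ijk}\, O_i v_i \ge 0$$
     - where the rows of $C_{ijk}$ index the outcome at which $p$ is evaluated, the columns index the outcome at which $u_i$ is evaluated, and
       $$(C_{ijk})_{(a_{\text{row}},\, a_{\text{col}})} = \begin{cases} 1 & \text{if } a_{\text{row}} = a_{\text{col}} = (a^i_j, a_{-i}) \\ -1 & \text{if } a_{\text{row}} = (a^i_j, a_{-i}) \text{ and } a_{\text{col}} = (a^i_k, a_{-i}) \\ 0 & \text{otherwise} \end{cases}$$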

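  15. RATIONALIZING A GAME (Example)
     - For player 1 in a two-player game with two actions each, with $j = 1$ and $k = 2$:
       $$p(a^1_1, a^2_1)\, u_1(a^1_1, a^2_1) + p(a^1_1, a^2_2)\, u_1(a^1_1, a^2_2) \ge p(a^1_1, a^2_1)\, u_1(a^1_2, a^2_1) + p(a^1_1, a^2_2)\, u_1(a^1_2, a^2_2)$$
     - In matrix form:
       $$\begin{pmatrix} q_1 & q_2 & q_3 & q_4 \end{pmatrix} \begin{pmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} u_1(a^1_1, a^2_1) \\ u_1(a^1_1, a^2_2) \\ u_1(a^1_2, a^2_1) \\ u_1(a^1_2, a^2_2) \end{pmatrix} \ge 0$$
     - where:
       - $q_1 = p(a^1_1, a^2_1)$
       - $q_2 = p(a^1_1, a^2_2)$
       - $q_3 = p(a^1_2, a^2_1)$
       - $q_4 = p(a^1_2, a^2_2)$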
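
The 4-by-4 matrix in this example can be generated for any player and any pair of actions. The sketch below is one possible construction, with rows indexing the outcome whose probability is read and columns indexing the outcome whose utility is read; the names `deviation_matrix` and `quadratic_form` and the numeric inputs are illustrative assumptions, and actions are numbered from 0 rather than 1.

```python
from itertools import product


def deviation_matrix(num_actions, i, j, k):
    """Matrix C with p^T C u = sum over a_-i of p(j, a_-i) * (u(j, a_-i) - u(k, a_-i))."""
    profiles = list(product(*[range(m) for m in num_actions]))
    index = {a: r for r, a in enumerate(profiles)}
    C = [[0] * len(profiles) for _ in profiles]
    for a in profiles:
        if a[i] != j:
            continue  # only rows where player i is recommended action j
        deviated = a[:i] + (k,) + a[i + 1:]
        C[index[a]][index[a]] += 1          # utility of following
        C[index[a]][index[deviated]] -= 1   # utility of deviating to k
    return C


def quadratic_form(p, C, u):
    """Evaluate p^T C u for plain Python lists."""
    return sum(p[r] * C[r][c] * u[c] for r in range(len(p)) for c in range(len(u)))


C = deviation_matrix([2, 2], i=0, j=0, k=1)
p = [0.25, 0.25, 0.25, 0.25]   # hypothetical observed distribution
u = [1.0, -1.0, -1.0, 1.0]     # hypothetical utilities (matching pennies)
slack = quadratic_form(p, C, u)
```

A non-negative `slack` means this particular constraint is satisfied; rationalizing the game requires it for every player and every ordered pair of actions.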

  16. NON-DEGENERACY CONDITION
     - To avoid trivial, uninteresting solutions like $v_i = 0$, we add the condition:
       $$\forall i:\ \sum_{k=1}^{d} v_{ik} = 1$$
     - Furthermore, by adding constraints or tweaking the objective function of the optimization problem:
       - We can limit the answer space
       - We can add conditions based on prior knowledge of valuations and their coupling
       - We can encourage properties like sparsity and entropy

  17. INVERSE-UTILITY PROBLEM (Formal Definition)
     - A set of $L$ partially observed succinct $n$-player games:
       $$G_l = \left[ (A_{il})_{i=1}^{n},\ \cdot\ ,\ (O_{il})_{i=1}^{n} \right] \quad \text{for } l \in \{1, 2, \dots, L\}$$
     - Each with an equilibrium: $(p_l)_{l=1}^{L}$
     - Find $(v_i)_{i=1}^{n}$
     - Such that $\forall l, i, j, k:\ p_l^T C_{ijkl}\, O_{il}\, v_i \ge 0$

  18. COMPUTABILITY PROPERTY
     - We need to compute $c_{ijk}^T = p^T C_{ijk} O_i$ efficiently
     - In games that possess this property, computing the probability of each factor is feasible:
     - The following sum can be computed in polynomial time for any factor $o$, product distribution $p$, and action $a^i_j$, where $A_i(o)$ denotes the set of outcomes that trigger factor $o$:
       $$\sum_{a_{-i} :\, (a^i_j, a_{-i}) \in A_i(o)} p(a_{-i})$$

  19. CONGESTION GAMES (Computability Property, Example)
     - Each factor is a tuple $(e, L)$, meaning that player $i$ and $L - 1$ other players use resource $e$
     - The answer for the case $e \notin a^i_j$ is trivial
     - Otherwise we use dynamic programming to compute the probability that a sum of Bernoulli random variables equals $L - 1$
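
The dynamic program mentioned on this slide is the standard recurrence for the distribution of a sum of independent, non-identical Bernoulli variables. A minimal sketch follows; the function names and the example probabilities are assumptions, and each input probability stands for the chance that one of the other players picks a subset containing the resource.

```python
def count_distribution(use_probabilities):
    """dist[c] = probability that exactly c of the independent events occur."""
    dist = [1.0]  # with no players considered, the count is 0 with certainty
    for q in use_probabilities:
        updated = [0.0] * (len(dist) + 1)
        for count, mass in enumerate(dist):
            updated[count] += mass * (1.0 - q)   # this player skips the resource
            updated[count + 1] += mass * q       # this player uses the resource
        dist = updated
    return dist


def factor_probability(use_probabilities_of_others, load):
    """Probability that exactly load - 1 other players use the resource."""
    dist = count_distribution(use_probabilities_of_others)
    others = load - 1
    return dist[others] if 0 <= others < len(dist) else 0.0


two_coin_flips = count_distribution([0.5, 0.5])
alone = factor_probability([0.2, 0.5, 0.9], load=1)
```

The table has one entry per possible count and is updated once per player, so the running time is quadratic in the number of players rather than exponential.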

  20. LEARNING UTILITIES: Computing $c_{ijk}^T$ (1)
     - We had
       $$\sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_j, a_{-i}) \ge \sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_k, a_{-i})$$
     - Rewriting the left-hand side:
       $$\sum_{a_{-i}} p(a^i_j, a_{-i}) \sum_{o \in T_i(a^i_j, a_{-i})} v_i(o)$$
       $$= \sum_{o \in O_i}\ \sum_{a_{-i} :\, (a^i_j, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i})\, v_i(o)$$
       $$= \sum_{o \in O_i} v_i(o) \sum_{a_{-i} :\, (a^i_j, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i})$$
     - where $T_i(a) = \{\, o \mid (O_i)_{a,o} = 1 \,\}$ is the set of factors triggered by $a$, and $A_i(o)$ is the set of outcomes that trigger factor $o$
     - Similarly, for the right-hand side we have:
       $$\sum_{o \in O_i} v_i(o) \sum_{a_{-i} :\, (a^i_k, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i})$$

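  21. LEARNING UTILITIES: Computing $c_{ijk}^T$ (2)
     - Subtracting the two results we have:
       $$\sum_{o \in O_i} v_i(o) \left[ \sum_{a_{-i} :\, (a^i_j, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i}) - \sum_{a_{-i} :\, (a^i_k, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i}) \right] \ge 0$$
     - Since $p$ is a product of distributions, $p(a^i_j, a_{-i}) = p(a^i_j)\, p(a_{-i})$, so we can factor $p(a^i_j)$ out (if $p(a^i_j) = 0$ the constraint holds trivially):
       $$\sum_{o \in O_i} v_i(o) \left[ \sum_{a_{-i} :\, (a^i_j, a_{-i}) \in A_i(o)} p(a_{-i}) - \sum_{a_{-i} :\, (a^i_k, a_{-i}) \in A_i(o)} p(a_{-i}) \right] \ge 0$$
     - The remaining inequality is the dot product of $v_i$ and another vector (namely $c_{ijk}^T$, up to the positive factor $p(a^i_j)$), which we know how to compute efficiently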
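
The bracketed difference of sums in this derivation gives one coefficient per factor. The sketch below computes those coefficients by brute-force enumeration of the other players' actions, purely to make the definition concrete; it is exponential in the number of players and is not the efficient method the slides rely on. The name `constraint_vector`, the `triggered` callback, and the example game are illustrative assumptions.

```python
from itertools import product


def constraint_vector(marginals, i, j, k, triggered):
    """Coefficient per factor: P(factor on | i plays j) - P(factor on | i plays k).

    marginals[m][a] is player m's probability of action a (product distribution);
    triggered(profile) returns the set of factors switched on by a full profile.
    """
    others = [m for m in range(len(marginals)) if m != i]
    coefficients = {}
    for rest in product(*[range(len(marginals[m])) for m in others]):
        prob = 1.0
        for m, action in zip(others, rest):
            prob *= marginals[m][action]
        for own_action, sign in ((j, 1.0), (k, -1.0)):
            profile = rest[:i] + (own_action,) + rest[i:]
            for factor in triggered(profile):
                coefficients[factor] = coefficients.get(factor, 0.0) + sign * prob
    return coefficients


# Hypothetical normal-form game viewed as succinct: each profile is its own factor.
marginals = [[0.5, 0.5], [0.25, 0.75]]
c = constraint_vector(marginals, i=0, j=0, k=1, triggered=lambda profile: {profile})
v = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}  # candidate valuation
slack = sum(c[factor] * v[factor] for factor in c)
```

With these numbers the slack is negative, so this candidate valuation does not rationalize player 0 putting weight on action 0 against the given opponent strategy.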

  22. LEARNING UTILITIES: Optimization Problem
     - Combining these linear programs for every game yields valid valuations for each player:
       $$\text{Minimize } \sum_{i=1}^{n} f(v_i)$$
       $$\text{Subject to } c_{ijk}^T v_i \ge 0 \quad \forall i, j, k$$
       $$\mathbf{1}^T v_i = 1 \quad \forall i$$
     - Of course, the resulting program is not necessarily feasible
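
Solving the program itself calls for an LP solver, which is outside the scope of this sketch. What can be shown with the standard library alone is the feasibility test that any returned valuation has to pass: normalization plus non-negativity of every constraint. The function name `check_valuation` and the constraint vectors below are hypothetical.

```python
def check_valuation(v, constraint_vectors, tol=1e-9):
    """Return (feasible, worst_slack) for one player's candidate valuation.

    feasible is True when the entries of v sum to 1 and every dot product
    c . v is non-negative; worst_slack is the smallest dot product.
    """
    normalized = abs(sum(v) - 1.0) <= tol
    slacks = [sum(ci * vi for ci, vi in zip(c, v)) for c in constraint_vectors]
    worst = min(slacks) if slacks else 0.0
    return normalized and worst >= -tol, worst


# Hypothetical constraints encoding v[0] >= v[1] >= v[2].
constraints = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]
ok_good, slack_good = check_valuation([0.5, 0.3, 0.2], constraints)
ok_zero, _ = check_valuation([0.0, 0.0, 0.0], constraints)
ok_bad, slack_bad = check_valuation([0.2, 0.3, 0.5], constraints)
```

The all-zero vector satisfies every inequality but fails normalization, which is exactly the degenerate solution that the sum-to-one condition rules out.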

  23. INVERSE-GAME PROBLEM (Formal Definition)
     - A set of $L$ partially observed succinct $n$-player games:
       $$G_l = \left[ (A_{il})_{i=1}^{n},\ \cdot\ ,\ \cdot\ \right] \quad \text{for } l \in \{1, 2, \dots, L\}$$
     - Each with an equilibrium: $(p_l)_{l=1}^{L}$
     - Each with a set of candidate structures: $(S_l)_{l=1}^{L}$
     - Find $(v_i)_{i=1}^{n}$ and choose a structure $(O_{ilh})_{i=1}^{n}$ for each game
     - Such that $\forall l, i, j, k:\ p_l^T C_{ijkl}\, O_{ilh}\, v_i \ge 0$

  24. INVERSE-GAME PROBLEM: NP-HARDNESS (Proof Sketch)
     - Reduction from 3-SAT to a sequence of graphical games
     - For every variable, a vertex with true and false actions, plus one base player with only one action
     - For every clause, a game with three candidate structures, each containing a single edge between one of the clause's literals and the base node
     - Nodes for positive literals play true and nodes for negative literals play false, both as pure strategies

  25. THANKS FOR YOUR ATTENTION Q & A
