Inverse Game Theory: Learning Utilities in Succinct Games (presentation by Hesam Nikpey and Pooya Shati)


  1. Inverse Game Theory: Learning Utilities in Succinct Games. Hesam Nikpey, Pooya Shati. Social and Economical Networks, Dr. Fazli, Spring 96-97

  2. PAPER: Inverse Game Theory: Learning Utilities in Succinct Games. Volodymyr Kuleshov and Okke Schrijvers. WINE 2015 conference

  3. OUTLINE
     - Problem Introduction
     - Related Works
     - Equilibrium Concepts
     - Succinct Games
     - Rationalizing a Game
     - Learning Utilities

  4. PROBLEM INTRODUCTION
     - Classic Game Theory
     - Inverse Game Theory
     - Succinct Games

  5. APPLICATIONS
     - Economics: designing mechanisms
     - Machine learning: helicopter autopilots
     - Developing predictive techniques
     - Forecasting the agents' behavior

  6. RELATED WORKS
     - Computer science:
       - Computational complexity of rationalizing stable matchings
       - Correlated equilibria
     - Economics:
       - Inferring utilities of bidders in online ad auctions
       - Rationalizing agent behavior

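  7. NASH EQUILIBRIUM
     - Each player chooses a mixed strategy: $p_i \in \Delta(A_i)$, a distribution over her action set $A_i$
     - No player is interested in changing her choice: $\forall q_i \in \Delta(A_i):\ u_i(p_i, p_{-i}) \ge u_i(q_i, p_{-i})$
     - (Figure: the mixed strategies $p_1$ and $p_2$ of the two players.)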
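
The Nash condition above can be checked numerically for a small two-player game. The sketch below is only an illustration: the function names and the matching-pennies payoffs are assumptions, not part of the slides. Because expected utility is linear in $q_i$, it is enough to test deviations to pure strategies.

```python
def expected_utility(u, p1, p2):
    """Expected utility sum_{a,b} p1[a] * p2[b] * u[a][b] under a product distribution."""
    return sum(p1[a] * p2[b] * u[a][b]
               for a in range(len(p1)) for b in range(len(p2)))


def is_nash(u1, u2, p1, p2, tol=1e-9):
    """Check u_i(p_i, p_-i) >= u_i(q_i, p_-i) for every pure deviation q_i."""
    n1, n2 = len(p1), len(p2)
    base1 = expected_utility(u1, p1, p2)
    base2 = expected_utility(u2, p1, p2)
    for a in range(n1):
        pure = [1.0 if x == a else 0.0 for x in range(n1)]
        if expected_utility(u1, pure, p2) > base1 + tol:
            return False  # player 1 gains by deviating to action a
    for b in range(n2):
        pure = [1.0 if x == b else 0.0 for x in range(n2)]
        if expected_utility(u2, p1, pure) > base2 + tol:
            return False  # player 2 gains by deviating to action b
    return True


# Hypothetical example: matching pennies.
u1 = [[1, -1], [-1, 1]]
u2 = [[-1, 1], [1, -1]]
print(is_nash(u1, u2, [0.5, 0.5], [0.5, 0.5]))  # True: uniform mixing
print(is_nash(u1, u2, [1.0, 0.0], [1.0, 0.0]))  # False: player 2 wants to switch
```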

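  8. CORRELATED EQUILIBRIUM
     - $p$ is a joint distribution over outcomes, not necessarily a product of distributions
     - Equilibrium defined as: $\sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_j, a_{-i}) \ge \sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_k, a_{-i})$
     - (Figure: the joint distribution $p$ shown as a matrix with entries $p_{1,1}, p_{1,2}, \dots, p_{|A_j|,|A_i|}$.)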
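
As a rough illustration of the correlated equilibrium inequality, the sketch below checks it for every player and every pair of actions in a two-player game. The function name and the "chicken" payoffs are assumptions chosen for the example; `p[a][b]` is the joint probability of the outcome in which player 1 plays `a` and player 2 plays `b`.

```python
def is_correlated_eq(u1, u2, p, tol=1e-9):
    """Check sum_{a_-i} p(a_j, a_-i) * (u_i(a_j, a_-i) - u_i(a_k, a_-i)) >= 0 for all i, j, k."""
    n1, n2 = len(p), len(p[0])
    for j in range(n1):          # action recommended to player 1
        for k in range(n1):      # action she could deviate to
            gain = sum(p[j][b] * (u1[j][b] - u1[k][b]) for b in range(n2))
            if gain < -tol:
                return False
    for j in range(n2):          # action recommended to player 2
        for k in range(n2):
            gain = sum(p[a][j] * (u2[a][j] - u2[a][k]) for a in range(n1))
            if gain < -tol:
                return False
    return True


# Hypothetical game of chicken: action 0 = dare, action 1 = chicken out.
u1 = [[0, 7], [2, 6]]
u2 = [[0, 2], [7, 6]]
third = 1.0 / 3.0
print(is_correlated_eq(u1, u2, [[0.0, third], [third, third]]))  # True
print(is_correlated_eq(u1, u2, [[0.0, 0.0], [0.0, 1.0]]))        # False
```

The first distribution is not a product of two marginals, which is exactly the extra freedom a correlated equilibrium allows.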

  9. POLYNOMIAL MIXTURE OF PRODUCTS (PMP)
     - A specific kind of correlated equilibrium
     - The probability distribution is a sum of products of distributions: $p = \sum_{k=1}^{K} q_k$
     - Where $K$ is polynomial in the input size and every $q_k$ is a product of distributions
     - Every game has an easy-to-compute PMP equilibrium

  10. SUCCINCT GAMES
      - Every player's utility is determined by a limited number of observations
      - Interesting because of the small number of parameters required to represent the utility
      - Covers a vast number of games

  11. LINEAR SUCCINCT GAMES (Succinct Games: Definition)
      - A set of (not necessarily disjoint) factors for every player and a utility for every factor
      - $G := [(A_i)_{i=1}^{n}, (v_i)_{i=1}^{n}, (O_i)_{i=1}^{n}]$
      - $O_i \in \{0,1\}^{m \times d}$, $v_i \in \mathbb{R}^{d}$
      - $\forall i:\ u_i = O_i v_i$
      - $O_i$ is dimensionally large but has a compact representation
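
The relation $u_i = O_i v_i$ can be illustrated with a toy instance. In the sketch below, the outcome matrix and the factor values are made-up numbers and `utilities` is a hypothetical helper. Each row of `O_i` marks the factors triggered by one outcome, so the utility of an outcome is the sum of the values of its triggered factors.

```python
def utilities(O, v):
    """Compute u = O v for a 0/1 outcome matrix O (one row per outcome) and factor values v."""
    return [sum(entry * value for entry, value in zip(row, v)) for row in O]


# Hypothetical player with m = 4 outcomes and d = 3 factors.
O_i = [
    [1, 0, 0],  # outcome 0 triggers factor 0
    [1, 1, 0],  # outcome 1 triggers factors 0 and 1 (factors may overlap)
    [0, 1, 0],  # outcome 2 triggers factor 1
    [0, 0, 1],  # outcome 3 triggers factor 2
]
v_i = [0.5, 0.25, 0.25]
print(utilities(O_i, v_i))  # [0.5, 0.75, 0.25, 0.25]
```

In a real succinct game the matrix would never be materialized like this, since $m$ grows exponentially with the number of players; only the compact rule that generates its entries is stored.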

  12. GRAPHICAL GAMES (Succinct Games: Example 1)
      - Player $i$'s utility depends solely on her own and her neighbors' actions
      - $O_i(a, a_{N(i)}) = \begin{cases} 1 & \text{if } a \text{ and } a_{N(i)} \text{ agree on the actions of } N(i) \\ 0 & \text{otherwise} \end{cases}$

  13. CONGESTION GAMES (Succinct Games: Example 2)
      - Players choose from possible subsets of the set of resources
      - Each player pays the cost of its chosen resources according to the function: $\sum_{e \in a_i} d_e(l_e)$
      - Where $d_e$ is $e$'s cost function and $l_e$ is the number of players using $e$
      - A generalization of network flow games
      - $O_i(a, (e, L)) = \begin{cases} 1 & \text{if } e \in a_i \text{ and } l_e(a) = L \\ 0 & \text{otherwise} \end{cases}$
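
A small sketch of the congestion-game cost and of the factors $(e, L)$ an outcome triggers. The resource names, cost functions and helper names are assumptions made for the example only.

```python
def loads(profile):
    """Number of players using each resource (l_e) in the given action profile."""
    load = {}
    for chosen in profile:
        for e in chosen:
            load[e] = load.get(e, 0) + 1
    return load


def congestion_cost(profile, i, delay):
    """Cost paid by player i: sum over e in a_i of d_e(l_e)."""
    load = loads(profile)
    return sum(delay[e](load[e]) for e in profile[i])


def triggered_factors(profile, i):
    """Factors (e, L) with O_i(a, (e, L)) = 1, i.e. e in a_i and l_e(a) = L."""
    load = loads(profile)
    return [(e, load[e]) for e in sorted(profile[i])]


# Hypothetical instance: three players, resources "x", "y", "z".
profile = [{"x", "y"}, {"y"}, {"y", "z"}]
delay = {"x": lambda l: 2 * l, "y": lambda l: l * l, "z": lambda l: 5}
print(congestion_cost(profile, 0, delay))  # 2*1 + 3*3 = 11
print(triggered_factors(profile, 0))       # [('x', 1), ('y', 3)]
```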

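  14. RATIONALIZING A GAME
      - First we write the correlated equilibrium as a linear constraint:
        $\sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_j, a_{-i}) \ge \sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_k, a_{-i})$
        $\rightarrow\ p^{T} C_{ijk} u_i = p^{T} C_{ijk} O_i v_i \ge 0$
      - Where the rows of $C_{ijk}$ index the entries of $p$, its columns index the entries of $u_i$, and
        $C_{ijk}(a^{row}, a^{col}) = \begin{cases} 1 & \text{if } a^{row} = a^{col} = (a^i_j, a_{-i}) \\ -1 & \text{if } a^{row} = (a^i_j, a_{-i}) \text{ and } a^{col} = (a^i_k, a_{-i}) \\ 0 & \text{otherwise} \end{cases}$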

  15. RATIONALIZING A GAME: Example
      - $p(a^1_1, a^2_1)\, u_1(a^1_1, a^2_1) + p(a^1_1, a^2_2)\, u_1(a^1_1, a^2_2) \ge p(a^1_1, a^2_1)\, u_1(a^1_2, a^2_1) + p(a^1_1, a^2_2)\, u_1(a^1_2, a^2_2)$
      - In matrix form:
        $\begin{pmatrix} q_1 & q_2 & q_3 & q_4 \end{pmatrix} \begin{pmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} u_1(a^1_1, a^2_1) \\ u_1(a^1_1, a^2_2) \\ u_1(a^1_2, a^2_1) \\ u_1(a^1_2, a^2_2) \end{pmatrix} \ge 0$
      - Where:
        - $q_1 = p(a^1_1, a^2_1)$
        - $q_2 = p(a^1_1, a^2_2)$
        - $q_3 = p(a^1_2, a^2_1)$
        - $q_4 = p(a^1_2, a^2_2)$
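
The 4×4 matrix in this example can be generated mechanically. The sketch below is one possible construction, under the assumption that outcomes are ordered lexicographically, that rows index the entries of $p$, and that columns index the entries of $u_i$; `build_C` and `constraint_value` are hypothetical names, and players and actions are 0-indexed.

```python
from itertools import product


def build_C(num_actions, i, j, k):
    """Matrix C_ijk such that p^T C u_i is the left side minus the right side of the constraint."""
    outcomes = list(product(*[range(n) for n in num_actions]))
    index = {a: r for r, a in enumerate(outcomes)}
    m = len(outcomes)
    C = [[0] * m for _ in range(m)]
    for a in outcomes:
        if a[i] != j:
            continue  # only outcomes where player i plays action j contribute
        deviated = a[:i] + (k,) + a[i + 1:]
        C[index[a]][index[a]] += 1         # +p(a_j, a_-i) * u_i(a_j, a_-i)
        C[index[a]][index[deviated]] -= 1  # -p(a_j, a_-i) * u_i(a_k, a_-i)
    return C


def constraint_value(p, C, u):
    """Evaluate p^T C u for flat vectors p and u."""
    m = len(p)
    return sum(p[r] * C[r][c] * u[c] for r in range(m) for c in range(m))


C = build_C([2, 2], 0, 0, 1)  # player 1, recommended action 1, deviation to action 2
print(C)  # [[1, 0, -1, 0], [0, 1, 0, -1], [0, 0, 0, 0], [0, 0, 0, 0]]
print(constraint_value([0.25] * 4, C, [3, 1, 0, 2]))  # 0.5, so this constraint holds
```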

  16. NON-DEGENERACY CONDITION
      - To avoid trivial, uninteresting solutions like $v_i = 0$, we add the condition: $\forall i:\ \sum_{k=1}^{d} v_{ik} = 1$
      - Furthermore, by adding constraints or tweaking the objective function of the optimization problem:
        - We can limit the answer space
        - We can add conditions based on prior knowledge of valuations and their coupling
        - We can encourage properties like sparsity and entropy

  17. INVERSE-UTILITY PROBLEM: FORMAL DEFINITION
      - A set of $L$ partially observed succinct $n$-player games: $G_l = [(A_{il})_{i=1}^{n}, \cdot, (O_{il})_{i=1}^{n}]$ for $l \in \{1, 2, \dots, L\}$ (the utilities are not observed)
      - Each with an equilibrium: $(p_l)_{l=1}^{L}$
      - Find $(v_i)_{i=1}^{n}$
      - Such that $\forall l, i, j, k:\ p_l^{T} C_{ijkl} O_{il} v_i \ge 0$

  18. COMPUTABILITY PROPERTY
      - We need to compute $c_{ijk}^{T} = p^{T} C_{ijk} O_i$ efficiently
      - In games that possess this property, computing the probability of each factor is feasible:
      - The following sum can be computed in polynomial time for any factor $o$, product distribution $p$ and action $a^i_j$:
        $\sum_{a_{-i} : (a^i_j, a_{-i}) \in A_i(o)} p(a_{-i})$
      - Here $A_i(o) = \{a \mid O_i(a, o) = 1\}$ is the set of outcomes that trigger the factor $o$

  19. CONGESTION GAMES (Computability Property: Example)
      - Each factor is a tuple $(e, L)$, meaning that player $i$ and $L - 1$ other players used the resource $e$
      - The answer for the case $e \notin a^i_j$ is trivial
      - Otherwise we use dynamic programming to compute the probability that a sum of Bernoulli random variables equals $L - 1$
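
The dynamic program mentioned on this slide can be sketched as follows, assuming each other player uses resource $e$ independently with a known probability (which holds under a product distribution). The function name and the probabilities are illustrative only.

```python
def prob_exactly(probs, count):
    """P(sum of independent Bernoulli(probs[t]) variables == count), by dynamic programming."""
    dist = [1.0]  # dist[s] = probability that the players seen so far contribute exactly s uses
    for q in probs:
        new = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            new[s] += mass * (1.0 - q)  # this player does not use the resource
            new[s + 1] += mass * q      # this player uses the resource
        dist = new
    return dist[count] if 0 <= count < len(dist) else 0.0


# Hypothetical factor (e, L) with L = 3: two of the three other players must use e.
others_use_e = [0.5, 0.5, 0.5]
print(prob_exactly(others_use_e, 2))  # 0.375
```

The table has at most one entry per possible count, so the running time is quadratic in the number of players rather than exponential in it.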

  20. LEARNING UTILITIES: (1) Computing $c_{ijk}^{T}$
      - We had: $\sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_j, a_{-i}) \ge \sum_{a_{-i}} p(a^i_j, a_{-i})\, u_i(a^i_k, a_{-i})$
      - Rewriting the left-hand side:
        $\sum_{a_{-i}} p(a^i_j, a_{-i}) \sum_{o \in T_i(a^i_j, a_{-i})} v_i(o)$
        $= \sum_{o \in O_i} \sum_{a_{-i} : (a^i_j, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i})\, v_i(o)$
        $= \sum_{o \in O_i} v_i(o) \sum_{a_{-i} : (a^i_j, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i})$
      - Where $T_i(a) = \{o \mid O_i(a, o) = 1\}$ represents the set of factors triggered by $a$, and $A_i(o) = \{a \mid O_i(a, o) = 1\}$ the set of outcomes that trigger $o$
      - Similarly, for the right-hand side we have:
        $\sum_{o \in O_i} v_i(o) \sum_{a_{-i} : (a^i_k, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i})$

  21. LEARNING UTILITIES: (2) Computing $c_{ijk}^{T}$
      - Subtracting the two results we have:
        $\sum_{o \in O_i} v_i(o) \Big[ \sum_{a_{-i} : (a^i_j, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i}) - \sum_{a_{-i} : (a^i_k, a_{-i}) \in A_i(o)} p(a^i_j, a_{-i}) \Big] \ge 0$
      - Since $p$ is a product of distributions, $p(a^i_j, a_{-i}) = p_i(a^i_j)\, p(a_{-i})$, so the common factor $p_i(a^i_j)$ can be factored out (if it is zero, the constraint holds trivially):
        $\sum_{o \in O_i} v_i(o) \Big[ \sum_{a_{-i} : (a^i_j, a_{-i}) \in A_i(o)} p(a_{-i}) - \sum_{a_{-i} : (a^i_k, a_{-i}) \in A_i(o)} p(a_{-i}) \Big] \ge 0$
      - The remaining inequality is the dot product of $v_i$ and another vector (namely $c_{ijk}^{T}$), which we know how to compute efficiently
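
To make the vector $c_{ijk}$ concrete, the sketch below computes its entries by brute force for a tiny game. This is not the efficient method: it enumerates every $a_{-i}$, which is exponential in the number of players, and stands in for the polynomial-time sums the computability property provides. The names `c_vector` and `triggers`, and the example game in which every outcome is its own factor, are assumptions.

```python
from itertools import product


def c_vector(num_actions, marginals, i, j, k, factors, triggers):
    """c[o] = sum_{a_-i: (a_j, a_-i) in A_i(o)} p(a_-i) - sum_{a_-i: (a_k, a_-i) in A_i(o)} p(a_-i)."""
    others = [r for r in range(len(num_actions)) if r != i]
    c = [0.0] * len(factors)
    for a_minus in product(*[range(num_actions[r]) for r in others]):
        weight = 1.0  # p(a_-i) under the product distribution
        for r, act in zip(others, a_minus):
            weight *= marginals[r][act]
        for own, sign in ((j, 1.0), (k, -1.0)):
            a = list(a_minus)
            a.insert(i, own)  # full profile with player i playing `own`
            for idx, o in enumerate(factors):
                if triggers(tuple(a), o):
                    c[idx] += sign * weight
    return c


# Hypothetical 2x2 game in which each outcome is its own factor.
factors = [(0, 0), (0, 1), (1, 0), (1, 1)]
marginals = [[0.5, 0.5], [0.25, 0.75]]
c = c_vector([2, 2], marginals, 0, 0, 1, factors, lambda a, o: a == o)
print(c)  # [0.25, 0.75, -0.25, -0.75]
```

The equilibrium constraint for this $(i, j, k)$ is then simply the requirement that the dot product of `c` with the player's factor values is non-negative.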

  22. LEARNING UTILITIES: Optimization Problem
      - Combining these linear programs for every game results in valid valuations for each player:
        Minimize $\sum_{i=1}^{n} f(v_i)$
        Subject to $c_{ijk}^{T} v_i \ge 0 \quad \forall i, j, k$
        and $\mathbf{1}^{T} v_i = 1 \quad \forall i$
      - Of course, the resulting program is not necessarily feasible
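
The constraints of this program are easy to verify for a candidate valuation, even without an LP solver. The sketch below only checks feasibility of a given $v_i$; it does not solve the optimization problem, and the constraint vectors and candidates are made-up numbers. It also shows why the normalization matters: the all-zero vector passes every inequality.

```python
def check_candidate(constraints, v, tol=1e-9):
    """Return (inequalities_ok, normalized) for one player's candidate valuation v."""
    inequalities_ok = all(
        sum(c_o * v_o for c_o, v_o in zip(c, v)) >= -tol for c in constraints
    )
    normalized = abs(sum(v) - 1.0) <= tol
    return inequalities_ok, normalized


# Hypothetical constraint vectors c_ijk for a single player with four factors.
constraints = [
    [0.25, 0.75, -0.25, -0.75],
    [-0.5, 0.5, 0.5, -0.5],
]
print(check_candidate(constraints, [0.5, 0.5, 0.0, 0.0]))  # (True, True)
print(check_candidate(constraints, [0.0, 0.0, 0.0, 0.0]))  # (True, False): trivial solution
print(check_candidate(constraints, [0.0, 0.0, 0.5, 0.5]))  # (False, True)
```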

  23. INVERSE-GAME PROBLEM: FORMAL DEFINITION
      - A set of $L$ partially observed succinct $n$-player games: $G_l = [(A_{il})_{i=1}^{n}, \cdot, \cdot]$ for $l \in \{1, 2, \dots, L\}$ (neither the utilities nor the structure are observed)
      - Each with an equilibrium: $(p_l)_{l=1}^{L}$
      - Each with a set of candidate structures: $(S_l)_{l=1}^{L}$
      - Find $(v_i)_{i=1}^{n}$ and choose a structure $(O_{ilg})_{i=1}^{n}$ from the candidates for each game
      - Such that $\forall l, i, j, k:\ p_l^{T} C_{ijkl} O_{ilg} v_i \ge 0$

  24. INVERSE-GAME PROBLEM: NP-HARDNESS (Proof Sketch)
      - Reduction from 3-SAT to a sequence of graphical games
      - For every variable, a vertex with true and false actions, plus one base player with only one action
      - For every clause, a game with three candidate structures, each containing a single edge between one of the literals and the base node
      - Positive nodes play true, and negative nodes play false, both as pure strategies

  25. THANKS FOR YOUR ATTENTION Q & A
