Sparsified Linear Programming for Zero-Sum Equilibrium Finding


  1. Sparsified Linear Programming for Zero-Sum Equilibrium Finding
     Brian Zhang (1) and Tuomas Sandholm (1, 2, 3, 4)
     (1) Carnegie Mellon University; (2) Strategic Machine, Inc.; (3) Strategy Robot, Inc.; (4) Optimized Markets, Inc.

  2. Imperfect-information games

  3. Extensive form
     • Information sets
     • Metrics of game size, for “Coin Toss” [Brown & Sandholm ’17]:
       – Sequences: 4 + 2 = 6
       – Terminal nodes: 6
     • [Figure: “Coin Toss” game tree with chance node C, P1 and P2 decision nodes, and leaf payoffs -0.5, +0.5, +1, +1, -1, -1]
     • In general:
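
The two size metrics can be checked by walking a small game tree. The sketch below is an illustration only: the tree encoding, the information-set names, and the assignment of payoffs to leaves are assumptions reconstructed from the slide, not the authors' code. Empty sequences are left out so that the count matches the slide's 4 + 2.

```python
# Hypothetical encoding of "Coin Toss". A node is either a float (terminal
# payoff to P1) or a tuple (player, information_set, {action: child}).
# Payoff placement is an assumption; the counts do not depend on it.

def guess_node(coin):
    # P2 does not observe the coin, so both guess nodes share one information set.
    return ("P2", "guess", {
        "heads": -1.0 if coin == "heads" else 1.0,
        "tails": -1.0 if coin == "tails" else 1.0,
    })

COIN_TOSS = ("C", None, {
    "heads": ("P1", "saw_heads", {"sell": 0.5, "play": guess_node("heads")}),
    "tails": ("P1", "saw_tails", {"sell": -0.5, "play": guess_node("tails")}),
})

def game_size(root):
    """Return (# P1 sequences, # P2 sequences, # terminal nodes)."""
    # With perfect recall, a nonempty sequence is identified by its last
    # (information set, action) pair.
    sequences = {"P1": set(), "P2": set()}
    terminals = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if isinstance(node, float):
            terminals += 1
            continue
        player, infoset, children = node
        for action, child in children.items():
            if player in sequences:  # chance ("C") has no sequences
                sequences[player].add((infoset, action))
            stack.append(child)
    return len(sequences["P1"]), len(sequences["P2"]), terminals

print(game_size(COIN_TOSS))  # (4, 2, 6)
```

The result agrees with the slide: 4 + 2 = 6 sequences and 6 terminal nodes.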

  4. Solving (zero-sum) imperfect-information games
     • Modern variants of Counterfactual Regret Minimization (CFR) [Zinkevich et al. ’07; Brown & Sandholm ’19]
       – Convergence rate: O(1/ε²)
       – Iteration time: O(# terminal nodes) in worst case; O(# sequences) w/ game-specific ideas
       – Space*: O(# sequences)
       – Speed in practice**: Really fast
     • First-order methods [Hoda et al. ’10; Kroer et al. ’18]
       – Convergence rate: O(1/ε) or even O(log(1/ε)) [Gilpin et al. ’12]
       – Iteration time: O(# terminal nodes) in worst case; O(# sequences) w/ game-specific ideas
       – Space*: O(# sequences)
       – Speed in practice**: Almost as fast as modern CFR variants
     • Linear programming [Koller et al. ’94]
       – Convergence rate: O(polylog(1/ε))
       – Iteration time: poly(# terminal nodes)
       – Space*: poly(# terminal nodes)
       – Speed in practice**: Fast
     • Our contribution: improvements to the LP method
       – Convergence rate: O(log²(1/ε))
       – Iteration time: O(# terminal nodes) in worst case; Õ(# sequences) in many practical cases
       – Space*: O(# terminal nodes) in worst case; Õ(# sequences) in many practical cases
       – Speed in practice**: Really fast
     *assuming payoff matrix given implicitly
     **assuming scalability for memory

  5. Extensive-form games as LPs [Koller et al. ’94]
     • Sequence-form bilinear saddle-point problem
     • Dual of inner minimization ⇒ LP
       – nnz(A) = # terminal nodes; A = payoff matrix
       – nnz(B) = # P1 sequences
       – nnz(C) = # P2 sequences
     Not great…
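
The slide's formulas did not survive in the transcript. For reference, the standard sequence-form formulation can be written as below. A, B and C are the matrices named on the slide. The right-hand sides b and c, the strategy vectors x and y, and the dual variable z are notation assumed here.

```latex
% Sequence-form bilinear saddle-point problem (P1 maximizes, P2 minimizes)
\max_{x \ge 0,\; Bx = b} \;\; \min_{y \ge 0,\; Cy = c} \; x^\top A y

% Dualizing the inner minimization over y gives a single LP
\max_{x,\, z} \; c^\top z
\quad \text{s.t.} \quad Bx = b, \quad C^\top z \le A^\top x, \quad x \ge 0
```

The LP's constraint matrix contains A, B and C, so its nonzero count is dominated by nnz(A) = # terminal nodes.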

  6. Fast linear programming [Yen et al., 2015]
     • Iteration time: O(nnz(constraint matrix))
     • Convergence rate: O(log²(1/ε))

  7. Fast linear programming: Adapting to Games
     • Iteration time: O(nnz(constraint matrix)), which for games is O(# terminal nodes)
     • Convergence rate: O(log²(1/ε))
     • Problem: Returns an infeasible solution
     • Solution: Normalize strategy after returning
     • Theorem: This doesn’t hurt convergence substantially
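
The transcript does not say how the normalization is carried out. One natural choice, shown here purely as an assumption, is to renormalize the returned values at each information set and rebuild the sequence-form vector top-down. The function name, the data layout, and the example numbers are all hypothetical.

```python
def normalize(x, infosets):
    """Turn an approximately feasible sequence-form vector into a feasible one.

    x        : dict mapping each nonempty sequence to a (possibly infeasible) value
    infosets : list of (parent_sequence, [child_sequences]) in top-down order;
               parent_sequence None stands for the empty sequence
    """
    fixed = {None: 1.0}  # the empty sequence always has value 1
    for parent, children in infosets:
        total = sum(max(x[s], 0.0) for s in children)
        for s in children:
            # Behavioral probability at this information set; fall back to
            # uniform if the solver put no mass here.
            prob = max(x[s], 0.0) / total if total > 0 else 1.0 / len(children)
            fixed[s] = fixed[parent] * prob
    return fixed

# Hypothetical player with actions a, b at the root and c, d after playing a.
infosets = [(None, ["a", "b"]), ("a", ["c", "d"])]
x = {"a": 0.62, "b": 0.41, "c": 0.30, "d": 0.35}  # a + b != 1 and c + d != a
fixed = normalize(x, infosets)
```

After normalization, the values at each information set sum to the value of the parent sequence, which is exactly the sequence-form feasibility condition.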

  8. Factoring the payoff matrix
     Suppose the payoff matrix A were factorable… Then:
     Goal: Given A implicitly, factor it.
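
The formulas on this slide are missing from the transcript. As a sketch under assumed notation (the names Â, U, V and w are not from the source), suppose A splits into a sparse remainder plus a product of two thin sparse factors. Any expression involving Aᵀx can then be rewritten with an auxiliary variable w:

```latex
A = \hat{A} + U V^\top
\quad\Longrightarrow\quad
A^\top x = \hat{A}^\top x + V w, \qquad w = U^\top x
```

The rewritten form stores nnz(Â) + nnz(U) + nnz(V) nonzeros instead of nnz(A), which is the sense in which a factored payoff matrix gives a sparser LP.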

  9. What about low-rank factorization?
     e.g., singular value decomposition (SVD)
     [Figure: A = (rank-1 term) + (remainder with a zero block); labels: “Rank 1”, “Two subproblems”]

  10. Factorization algorithm
     • Idea: Think about singular value decomposition, and adapt it
     • When ‖⋅‖ is the 2-norm, this is power iteration
     • How to solve it?
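
The objective on this slide is missing from the transcript. Assuming it is a rank-1 fit of the form min ‖A − u vᵀ‖ over u and v, the 2-norm case can be sketched as alternating exact updates, which is power iteration up to scaling. The function name and the scaling convention below are assumptions, and the code assumes the iterates never become all-zero.

```python
def rank1_power_iteration(A, iters=50):
    """Alternating least-squares fit of A ~ u v^T (A given as a list of rows)."""
    m, n = len(A), len(A[0])
    u = [1.0] * m
    v = [0.0] * n
    for _ in range(iters):
        # Best v for fixed u in the least-squares sense: v = A^T u / (u . u)
        uu = sum(ui * ui for ui in u)
        v = [sum(A[i][j] * u[i] for i in range(m)) / uu for j in range(n)]
        # Best u for fixed v: u = A v / (v . v)
        vv = sum(vj * vj for vj in v)
        u = [sum(A[i][j] * v[j] for j in range(n)) / vv for i in range(m)]
    return u, v

# A rank-1 example: the outer product of [2, 1, 3] and [1, 2].
A = [[2.0, 4.0], [1.0, 2.0], [3.0, 6.0]]
u, v = rank1_power_iteration(A)
error = max(abs(u[i] * v[j] - A[i][j]) for i in range(3) for j in range(2))
```

On an exactly rank-1 matrix the fit reproduces A up to floating-point error. On a general matrix it converges toward the leading singular pair, which is dense in general and therefore not directly useful for sparsification.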

  11. Exact solutions to [formula not preserved in the transcript]
     • 2-norm: v = Au (power iteration)
     • 1-norm: Meng & Xu ’12
     • 0-norm:
     Is the 1-norm better because it is convex? Not really… the overall factorization problem is NP-hard no matter what [Gillis and Vavasis ’18]
     Key: 0-norm computation can be done implicitly! (i.e., without storing the whole payoff matrix!)
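
The 0-norm formula is missing from the transcript. Under the assumption that the subproblem is "for fixed u, choose v to minimize the number of nonzeros in A − u vᵀ", each entry of v can be chosen independently as the most common ratio A[i][j] / u[i] in its column. The sketch below stores A explicitly for readability, so it does not demonstrate the implicit computation the slide refers to. All names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def best_v_zero_norm(A, u):
    """For fixed u, pick each v[j] to minimize nnz of column j of A - u v^T."""
    v = []
    for j in range(len(A[0])):
        # Entry (i, j) of the residual vanishes exactly when v[j] == A[i][j] / u[i],
        # so the most frequent ratio zeroes out the most entries.
        ratios = Counter(Fraction(A[i][j]) / Fraction(u[i])
                         for i in range(len(A)) if u[i] != 0)
        v.append(ratios.most_common(1)[0][0] if ratios else Fraction(0))
    return v

def residual_nnz(A, u, v):
    """Count the nonzero entries of A - u v^T."""
    return sum(1 for i in range(len(A)) for j in range(len(A[0]))
               if A[i][j] - u[i] * v[j] != 0)

# A 4x4 matrix that is constant except for one entry.
A = [[3] * 4 for _ in range(4)]
A[3][3] = 5
u = [1, 1, 1, 1]
v = best_v_zero_norm(A, u)
remaining = residual_nnz(A, u, v)
```

Here A has 16 nonzeros, while the factored form needs 4 for u, 4 for v, and 1 for the remainder. Exact rational arithmetic is used so that equal ratios compare equal.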

  12. So, what have we managed?
     Matrix factorization ⇒ much sparser LP
     • Best case: # nonzero elements = O(# sequences)
     • Upper triangular matrices (e.g. Poker): Õ(# sequences)
     Does it work in practice? Yes!
     • Experiment 1: Wide variety of games
       – Some games factorable, some not
       – LP solver faster than CFR in all cases
       – Commercial solver (Gurobi) faster than Yen et al., despite theoretical guarantees

  13. So, what have we managed? (continued)
     Does it work in practice? Yes!
     • Experiment 2: No-limit Texas Hold’em river endgames
       – size of payoff matrix reduced >50x
       – memory usage of LP solver reduced by ~20x, time usage by ~5x
       – now feasible as an alternative to poker-specific CFR

  14. Experiment 2

  15. So, what have we managed?
     • LP algorithm for game solving with good theoretical guarantees and strong practical performance
     • Moral/Takeaway: LP can be practical for solving even very large games!

  16. Thank you!
