

SLIDE 1

Competing Against Nash Equilibria in Adversarially Changing Zero-Sum Games

Adrian Rivera Cardoso

Joint work with: Jacob Abernethy, He Wang, Huan Xu Georgia Institute of Technology

June 7, 2019


SLIDE 2

Overview

1. Matrix Games
2. Online Matrix Games
3. An Impossibility Result
4. Good News


SLIDE 3

Matrix Games

One of the canonical problems in game theory is the zero-sum matrix game. Finding a Nash Equilibrium is core to many problems in statistics, optimization, and economics.

Setup:
• Player 1 chooses a probability distribution x over d1 actions.
• Player 2 chooses a probability distribution y over d2 actions.
• The payoffs are specified by a matrix A ∈ R^(d1×d2): Aij encodes the loss of Player 1 (= the reward of Player 2) when they play actions i and j, respectively.
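To make the setup concrete, here is a small self-contained illustration; the matching-pennies payoff matrix and the variable names are our own example, not taken from the slides.

```python
import numpy as np

# Matching pennies as an example payoff matrix (our own illustration):
# A[i, j] is Player 1's loss (= Player 2's reward) when Player 1 plays
# pure action i and Player 2 plays pure action j.
A = np.array([[ 1., -1.],
              [-1.,  1.]])

x = np.array([0.5, 0.5])   # Player 1's mixed strategy over d1 = 2 actions
y = np.array([0.7, 0.3])   # Player 2's mixed strategy over d2 = 2 actions

expected_loss = x @ A @ y  # x^T A y: Player 1's expected loss
print(expected_loss)       # 0.0 -- uniform play by Player 1 neutralizes any y
```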


SLIDE 4

Matrix Games

The goal is to find a Nash Equilibrium (NE) of the game. A NE is a pair (x∗, y∗) such that for all x ∈ ∆d1, y ∈ ∆d2 it holds that

$$x^{*\top} A y \;\le\; x^{*\top} A y^{*} \;\le\; x^{\top} A y^{*}.$$

(x∗)⊤Ay∗ is called the value of the game, and it holds that

$$x^{*\top} A y^{*} \;=\; \min_{x \in \Delta_{d_1}} \max_{y \in \Delta_{d_2}} x^\top A y \;=\; \max_{y \in \Delta_{d_2}} \min_{x \in \Delta_{d_1}} x^\top A y.$$

How do we find a NE? Run two online convex optimization (OCO) algorithms in parallel and average the history of iterates, as sketched below. But what if the payoff matrix changes with time?
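As a concrete sketch of this recipe (assuming multiplicative weights as the OCO algorithm for both players; the step size, horizon, and function names are our own choices, not prescribed by the slides), averaging the iterates yields an approximate NE of a fixed game:

```python
import numpy as np

def approx_nash(A, T=5000, eta=0.05):
    """Approximate a NE of the fixed zero-sum game A (A[i, j] = loss of
    Player 1 = reward of Player 2) by running multiplicative weights for
    both players and averaging the history of iterates."""
    d1, d2 = A.shape
    x = np.ones(d1) / d1                      # Player 1 (minimizer)
    y = np.ones(d2) / d2                      # Player 2 (maximizer)
    x_sum, y_sum = np.zeros(d1), np.zeros(d2)
    for _ in range(T):
        x_sum += x
        y_sum += y
        loss_x = A @ y                        # per-action loss for Player 1
        gain_y = A.T @ x                      # per-action reward for Player 2
        x = x * np.exp(-eta * loss_x)
        x /= x.sum()
        y = y * np.exp(eta * gain_y)
        y /= y.sum()
    return x_sum / T, y_sum / T

# Rock-paper-scissors: the unique NE is uniform play for both players.
A = np.array([[ 0.,  1., -1.],
              [-1.,  0.,  1.],
              [ 1., -1.,  0.]])
x_star, y_star = approx_nash(A)
print(np.round(x_star, 2), np.round(y_star, 2))   # both approximately uniform
```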


SLIDE 5

Online Matrix Games: Problem Setup

Two players play a sequence of matrix games for T time steps:

• In step t they must each choose a distribution over actions, xt ∈ ∆d1 and yt ∈ ∆d2.
• An adversary chooses the payoff matrix At.
• They receive loss/reward equal to xt⊤Atyt and observe At.
• Using this new information they choose xt+1, yt+1.

Their goal is to achieve sublinear Nash Equilibrium Regret:

$$\text{NE.Regret} \;=\; \left|\,\sum_{t=1}^{T} x_t^\top A_t y_t \;-\; \min_{x \in \Delta_{d_1}} \max_{y \in \Delta_{d_2}} \sum_{t=1}^{T} x^\top A_t y \,\right|$$
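Note that the comparator in NE.Regret is the minimax value of the game with cumulative matrix Σt At, so NE.Regret can be evaluated offline once the sequence is known. Below is a minimal sketch of such an evaluation; the scipy LP formulation and the function names are our own choices, not from the slides.

```python
import numpy as np
from scipy.optimize import linprog

def game_value(B):
    """min_x max_y x^T B y for the zero-sum game with loss matrix B,
    solved as the standard LP: minimize v s.t. B^T x <= v, x in simplex."""
    d1, d2 = B.shape
    c = np.concatenate([np.zeros(d1), [1.0]])                    # minimize v
    A_ub = np.hstack([B.T, -np.ones((d2, 1))])                   # (B^T x)_j - v <= 0
    b_ub = np.zeros(d2)
    A_eq = np.concatenate([np.ones(d1), [0.0]]).reshape(1, -1)   # sum(x) = 1
    b_eq = [1.0]
    bounds = [(0, None)] * d1 + [(None, None)]                   # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

def ne_regret(xs, ys, As):
    """NE.Regret of plays (x_t, y_t) against the matrices A_t."""
    realized = sum(x @ A @ y for x, A, y in zip(xs, As, ys))
    comparator = game_value(sum(As))   # min_x max_y x^T (sum_t A_t) y
    return abs(realized - comparator)
```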


SLIDE 6

An Impossibility Result

We know that when At = A for all t = 1, . . . , T, if each player minimizes its own Individual Regret,

$$\sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in X} \sum_{t=1}^{T} f_t(x),$$

and we average their iterates, we find a NE. Is minimizing Individual Regret still a good strategy when the payoff matrix At changes over time?
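For comparison with NE.Regret, Player 1's Individual Regret is easy to evaluate: since the loss ft(x) = x⊤Atyt is linear in x, the best fixed strategy in hindsight is a pure action. A minimal sketch (the function name is our own):

```python
import numpy as np

def individual_regret_p1(xs, ys, As):
    """Player 1's Individual Regret: realized loss minus the loss of the
    best fixed x in hindsight. With f_t(x) = x^T A_t y_t linear in x, the
    minimum over the simplex is attained at a pure action, i.e. at the
    smallest coordinate of the cumulative loss vector sum_t A_t y_t."""
    realized = sum(x @ A @ y for x, A, y in zip(xs, As, ys))
    cum_loss = sum(A @ y for A, y in zip(As, ys))
    return realized - cum_loss.min()
```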


SLIDE 7

An Impossibility Result

Theorem

Consider any algorithm that selects a sequence of xt, yt pairs given the past payoff matrices A1, . . . , At−1. Consider the following three objectives:

$$\left|\,\sum_{t=1}^{T} x_t^\top A_t y_t \;-\; \min_{x \in \Delta_{d_1}} \max_{y \in \Delta_{d_2}} \sum_{t=1}^{T} x^\top A_t y \,\right| \;=\; o(T), \qquad (1)$$

$$\sum_{t=1}^{T} x_t^\top A_t y_t \;-\; \min_{x \in \Delta_{d_1}} \sum_{t=1}^{T} x^\top A_t y_t \;=\; o(T), \qquad (2)$$

$$\max_{y \in \Delta_{d_2}} \sum_{t=1}^{T} x_t^\top A_t y \;-\; \sum_{t=1}^{T} x_t^\top A_t y_t \;=\; o(T). \qquad (3)$$

Then there exists an (adversarially chosen) sequence A1, A2, . . . such that not all of (1), (2), and (3) are true. (Here (1) is the NE Regret from the previous slides, while (2) and (3) are the two players' Individual Regrets.)


SLIDE 8

Good News

Theorem

There exists an algorithm (see paper or poster) that guarantees: NE.Regret ≤ O(√(T ln T) + max{ln(d1), ln(d2)} √T).


SLIDE 9

Some Preliminary Results

Our algorithm seems to be useful for training GANs. [Figure: different algorithms used for training GANs on the mixture-of-Gaussians data set.]


SLIDE 10

Thank you!

See you at Pacific Ballroom 151 from 6:30-9:00 pm
