No-Regret Learning in Convex Games
Geoff Gordon, Amy Greenwald, Casey Marks, and Martin Zinkevich
No-Regret Learning in Convex Games – p. 1
Introduction
The connection between regret and equilibria is well understood in matrix games. Most ...
$\{\ldots\}_{i=1}^{N},\ \{R_i\}_{i=1}^{N},\ \{r_i\}_{i=1}^{N}$
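[Equation residue: the joint action at round $t$, $(a_1^{(t)}, a_2^{(t)}, \ldots, a_N^{(t)})$; a regret term indexed by $(i, \phi)$ and defined through $r_i$ and the other players' actions (index $-i$); and a no-regret condition taken over $\phi \in \Phi$ and horizon $T$ whose limit lies in $(-\infty, 0]$.]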
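[Equation residue: a quantity indexed by $t$, defined as $\sup_{\phi \in \Phi}$ of a $t$-indexed term; the remaining symbols are not recoverable.]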
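Note that:

$$\sum_{t=1}^{T} m_t \cdot C_t \;\le\; \sum_{t=1}^{T} m_t \cdot C + f(T, \mathcal{C}, LB^{\top}) \qquad \forall C \in \mathcal{C}$$

where $LB^{\top} = \{\, l b^{\top} \mid l \in L,\ b \in B \,\}$, and $f$ is sublinear in $T$. So

$$\sum_{t=1}^{T} l_t^{\top} C_t B(a_t) \;\le\; \sum_{t=1}^{T} l_t^{\top} C B(a_t) + f(T, \mathcal{C}, LB^{\top}) \qquad \forall C \in \mathcal{C}$$

But, since $C_t B(a_t) = \phi_{C_t}(a_t) = a_t$, and since each $\phi \in \Phi$ can be represented as $\phi(a) = C B(a)$ with $C \in \mathcal{C}$, this implies

$$\sum_{t=1}^{T} l_t^{\top} a_t \;\le\; \sum_{t=1}^{T} l_t^{\top} \phi(a_t) + f(T, \mathcal{C}, LB^{\top}) \qquad \forall \phi \in \Phi$$

which is exactly the required no-Φ-regret guarantee.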
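The step from the first inequality to the second uses the fact that a matrix inner product with a rank-one cost collapses to a bilinear form: if $m_t = l_t B(a_t)^{\top}$ (an assumption here, inferred from the definition of $LB^{\top}$ rather than stated on this slide), then $m_t \cdot C = l_t^{\top} C B(a_t)$. The following minimal Python sketch checks that identity numerically on small made-up values; the helper names and the numbers are purely illustrative and not taken from the talk.

```python
def outer(l, b):
    """Return the rank-one matrix l b^T as a list of rows."""
    return [[li * bj for bj in b] for li in l]


def frobenius(m, c):
    """Matrix inner product m . c = sum over i, j of m[i][j] * c[i][j]."""
    return sum(mij * cij
               for mrow, crow in zip(m, c)
               for mij, cij in zip(mrow, crow))


def bilinear(l, c, b):
    """Compute the scalar l^T C b."""
    return sum(li * sum(cij * bj for cij, bj in zip(crow, b))
               for li, crow in zip(l, c))


# Illustrative values only (hypothetical loss vector, feature vector, matrix).
l = [1.0, -2.0]                 # stands in for l_t
b = [0.5, 0.25, 2.0]            # stands in for B(a_t)
C = [[1.0, 0.0, 3.0],
     [2.0, 4.0, -1.0]]          # stands in for some C in the feasible set

lhs = frobenius(outer(l, b), C)  # m_t . C with m_t = l_t B(a_t)^T
rhs = bilinear(l, C, b)          # l_t^T C B(a_t)
print(lhs, rhs)
```

Because the identity holds for every $C$, any regret bound proved for the linear costs $m_t$ over the set of matrices transfers directly to the bilinear losses $l_t^{\top} C B(a_t)$.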
PROOF: Every swap transformation in the ODP can be