Fictitious Play beats Simplex for fractional packing and covering




slide-1
SLIDE 1

Fictitious Play beats Simplex for fractional packing and covering

Christos Koufogiannakis and Neal E. Young University of California, Riverside June 28, 2007

slide-2
SLIDE 2

fractional packing and covering

Linear programming with non-negative coefficients. Equivalent to solving a zero-sum matrix game A with non-negative coefficients:

Theorem (von Neumann's Min-Max Theorem, 1928)

$$\min_{x} \max_{i} A_i x \;=\; \max_{\hat{x}} \min_{j} A^T_j \hat{x}$$

$x$: mixed strategy for min (column) player
$\hat{x}$: mixed strategy for max (row) player
$i$: row, $j$: column

◮ How to compute $(1 \pm \varepsilon)$-optimal $x$ and $\hat{x}$ quickly?

◮ Simplex algorithm: $\Omega(n^3)$ time for dense $n \times n$ matrix.

This talk: $O(n^2 + n \log(n)/\varepsilon^2)$ time.

slide-3
SLIDE 3

practical performance versus simplex

[Plot: speedup over simplex (log scale, 0.0625 to 16384) versus n = number of rows and columns (1024 to 65536), with one curve each for epsilon = 0.02, 0.01, and 0.005.]

slide-4
SLIDE 4

playing a zero-sum game

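◮ $x$ = mixed strategy for min
◮ $A_i x$ = payoff if max plays row $i$ against mixed strategy $x$

$$x = (.5,\ 0,\ .5), \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad Ax = \begin{pmatrix} .5 \\ .5 \\ .5 \end{pmatrix} \leftarrow \text{max gets} \le .5$$

Min plays $x = (.5, 0, .5)$, max gets at most .5 ⇒ game value ≤ .5.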

slide-5
SLIDE 5

playing a zero-sum game

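◮ $x$ = mixed strategy for min
◮ $A_i x$ = payoff if max plays row $i$ against mixed strategy $x$
◮ $\hat{x}$ = mixed strategy for max
◮ $A^T_j \hat{x}$ = payoff if min plays column $j$ against mixed strategy $\hat{x}$

$$\hat{x} = \begin{pmatrix} .2 \\ .4 \\ .4 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad A^T\hat{x} = (.6,\ .8,\ .4)$$

Max plays $\hat{x} = (.2, .4, .4)$, min pays at least .4 ⇒ game value ≥ .4.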

slide-6
SLIDE 6

playing a zero-sum game

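◮ $x$ = mixed strategy for min
◮ $A_i x$ = payoff if max plays row $i$ against mixed strategy $x$
◮ $\hat{x}$ = mixed strategy for max
◮ $A^T_j \hat{x}$ = payoff if min plays column $j$ against mixed strategy $\hat{x}$

$$x = (.5,\ 0,\ .5), \qquad \hat{x} = \begin{pmatrix} .2 \\ .4 \\ .4 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad Ax = \begin{pmatrix} .5 \\ .5 \\ .5 \end{pmatrix}, \qquad A^T\hat{x} = (.6,\ .8,\ .4)$$

Min plays $x = (.5, 0, .5)$, max gets at most .5 ⇒ game value ≤ .5.
Max plays $\hat{x} = (.2, .4, .4)$, min pays at least .4 ⇒ game value ≥ .4.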
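The two bounds above can be checked numerically. The following is a minimal Python sketch, assuming the slide's example matrix has 0 in every cell that appears blank; the helper names `row_payoffs` and `col_payoffs` are illustrative and not from the talk.

```python
# Example matrix from the slide; blank cells are ASSUMED to be 0.
A = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]

def row_payoffs(A, x):
    """A_i x for every row i: max's payoff for row i against min's mix x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def col_payoffs(A, x_hat):
    """A^T_j x_hat for every column j: min's cost for column j against x_hat."""
    return [sum(A[i][j] * x_hat[i] for i in range(len(A)))
            for j in range(len(A[0]))]

x = [0.5, 0.0, 0.5]        # mixed strategy for min
x_hat = [0.2, 0.4, 0.4]    # mixed strategy for max

upper = max(row_payoffs(A, x))      # game value is at most this
lower = min(col_payoffs(A, x_hat))  # game value is at least this
print("lower bound:", lower, "upper bound:", upper)
```

Any pair of mixed strategies yields such a sandwich; the strategies are near-optimal when the two bounds are close.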

slide-7
SLIDE 7

mixed strategies via fictitious play (Brown, Robinson 1951)

Repeated play. In each round each player plays a single pure strategy, chosen by considering only the opponent's past plays.

◮ $x_j$ = #times column $j$ played so far.
◮ $\hat{x}_i$ = #times row $i$ played so far.

... note $|x| = |\hat{x}| \neq 1$

e.g. in the 21st round:

$$x = (8,\ 1,\ 11), \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$

Robinson's update rule ($x/|x|$, $\hat{x}/|\hat{x}|$ converge to optimal):

◮ Max plays best row against $x$.
◮ Min plays best col against $\hat{x}$.

slide-8
SLIDE 8

mixed strategies via fictitious play (Brown, Robinson 1951)

Repeated play. In each round each player plays a single pure strategy, chosen by considering only the opponent's past plays.

◮ $x_j$ = #times column $j$ played so far.
◮ $\hat{x}_i$ = #times row $i$ played so far.

... note $|x| = |\hat{x}| \neq 1$

e.g. in the 21st round:

$$x = (8,\ 1,\ 11), \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad Ax = \begin{pmatrix} 8 \\ 9 \\ 12 \end{pmatrix}$$

← max plays best row against $x$ (row 3, payoff 12)

Robinson's update rule ($x/|x|$, $\hat{x}/|\hat{x}|$ converge to optimal):

◮ Max plays best row against $x$.
◮ Min plays best col against $\hat{x}$.

slide-9
SLIDE 9

mixed strategies via fictitious play (Brown, Robinson 1951)

Repeated play. In each round each player plays a single pure strategy, chosen by considering only the opponent's past plays.

◮ $x_j$ = #times column $j$ played so far.
◮ $\hat{x}_i$ = #times row $i$ played so far.

... note $|x| = |\hat{x}| \neq 1$

e.g. in the 21st round:

$$\hat{x} = \begin{pmatrix} 1 \\ 10 \\ 9 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad A^T\hat{x} = (11,\ 19,\ 9)$$

↑ min plays best col against $\hat{x}$ (column 3, payoff 9)

Robinson's update rule ($x/|x|$, $\hat{x}/|\hat{x}|$ converge to optimal):

◮ Max plays best row against $x$.
◮ Min plays best col against $\hat{x}$.

slide-10
SLIDE 10

mixed strategies via fictitious play (Brown, Robinson 1951)

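Repeated play. In each round each player plays a single pure strategy, chosen by considering only the opponent's past plays.

◮ $x_j$ = #times column $j$ played so far.
◮ $\hat{x}_i$ = #times row $i$ played so far.

... note $|x| = |\hat{x}| \neq 1$

e.g. in the 21st round:

$$x = (8,\ 1,\ 11), \qquad \hat{x} = \begin{pmatrix} 1 \\ 10 \\ 9 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad Ax = \begin{pmatrix} 8 \\ 9 \\ 12 \end{pmatrix}, \qquad A^T\hat{x} = (11,\ 19,\ 9)$$

← max plays best row against $x$ (row 3, payoff 12)
↑ min plays best col against $\hat{x}$ (column 3, payoff 9)

Robinson's update rule ($x/|x|$, $\hat{x}/|\hat{x}|$ converge to optimal):

◮ Max plays best row against $x$.
◮ Min plays best col against $\hat{x}$.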
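Robinson's update rule is short enough to sketch directly. The Python below is an illustrative version, assuming the example matrix has 0 in its blank cells, that both players move simultaneously, and that ties go to the lowest index; none of these details is stated on the slides, and `fictitious_play` is a hypothetical name.

```python
# Example matrix from the slides; blank cells are ASSUMED to be 0.
A = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]

def fictitious_play(A, rounds):
    """Plain fictitious play: each player best-responds to the opponent's counts."""
    n_rows, n_cols = len(A), len(A[0])
    x = [0] * n_cols          # x[j] = number of times column j was played
    x_hat = [0] * n_rows      # x_hat[i] = number of times row i was played
    Ax = [0] * n_rows         # Ax[i] = A_i x, kept up to date incrementally
    ATx_hat = [0] * n_cols    # ATx_hat[j] = A^T_j x_hat
    for _ in range(rounds):
        # Both choices use only the opponent's past plays (ties: lowest index).
        i = max(range(n_rows), key=lambda r: Ax[r])        # best row against x
        j = min(range(n_cols), key=lambda c: ATx_hat[c])   # best col against x_hat
        x[j] += 1
        x_hat[i] += 1
        for r in range(n_rows):
            Ax[r] += A[r][j]
        for c in range(n_cols):
            ATx_hat[c] += A[i][c]
    return x, x_hat, Ax, ATx_hat

x, x_hat, Ax, ATx_hat = fictitious_play(A, rounds=2000)
T = sum(x)
# min_j A^T_j x_hat / T <= game value <= max_i A_i x / T
print(min(ATx_hat) / T, max(Ax) / T)
```

The printed pair brackets the game value, which is 0.5 for this matrix under the zero-fill assumption.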

slide-11
SLIDE 11

algorithm = smoothed fictitious play

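random play from exp. distribution (à la Grigoriadis/Khachiyan 1995, expert advice)

e.g. in round 201, with $\varepsilon = .1$:

$$x = (80,\ 10,\ 110), \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad Ax = \begin{pmatrix} 80 \\ 90 \\ 120 \end{pmatrix}, \qquad p = \begin{pmatrix} e^{8} \\ e^{9} \\ e^{12} \end{pmatrix}$$

◮ max plays random row $i$ from distribution $p/|p|$, where $p_i = \exp(\varepsilon A_i x)$, concentrated on best rows against $x$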

slide-12
SLIDE 12

algorithm = smoothed fictitious play

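random play from exp. distribution (à la Grigoriadis/Khachiyan 1995, expert advice)

e.g. in round 201, with $\varepsilon = .1$:

$$\hat{x} = \begin{pmatrix} 10 \\ 100 \\ 90 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad A^T\hat{x} = (110,\ 190,\ 90), \qquad \hat{p} = (e^{-11},\ e^{-19},\ e^{-9})$$

◮ min plays random column $j$ from distribution $\hat{p}/|\hat{p}|$, where $\hat{p}_j = \exp(-\varepsilon A^T_j \hat{x})$, concentrated on best columns against $\hat{x}$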

slide-13
SLIDE 13

algorithm = smoothed fictitious play

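random play from exp. distribution (à la Grigoriadis/Khachiyan 1995, expert advice)

e.g. in round 201, with $\varepsilon = .1$:

$$x = (80,\ 10,\ 110), \qquad \hat{x} = \begin{pmatrix} 10 \\ 100 \\ 90 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad Ax = \begin{pmatrix} 80 \\ 90 \\ 120 \end{pmatrix}, \qquad p = \begin{pmatrix} e^{8} \\ e^{9} \\ e^{12} \end{pmatrix}$$

$$A^T\hat{x} = (110,\ 190,\ 90), \qquad \hat{p} = (e^{-11},\ e^{-19},\ e^{-9})$$

◮ max plays random row $i$ from distribution $p/|p|$, where $p_i = \exp(\varepsilon A_i x)$, concentrated on best rows against $x$

◮ min plays random column $j$ from distribution $\hat{p}/|\hat{p}|$, where $\hat{p}_j = \exp(-\varepsilon A^T_j \hat{x})$, concentrated on best columns against $\hat{x}$

STOP when $\max_i A_i x \approx \ln(n)/\varepsilon^2$ or $\min_j A^T_j \hat{x} \approx \ln(n)/\varepsilon^2$.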
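The smoothed rule can be sketched in a few lines of Python. This is a naive version that recomputes both distributions every round, so it does not achieve the running time claimed in the talk. Assumptions not taken from the slides: blank matrix cells are 0, n is taken as max(rows, columns), the "≈" in the stopping rule is read as "≥", weights are shifted by their max/min only for numerical stability, and `smoothed_fictitious_play`, `seed` and `max_rounds` are hypothetical names.

```python
import math
import random

# Example matrix from the slides; blank cells are ASSUMED to be 0.
A = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]

def smoothed_fictitious_play(A, eps, seed=0, max_rounds=10**6):
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(A[0])
    cap = math.log(max(n_rows, n_cols)) / eps**2   # ln(n)/eps^2 (n is an assumption)
    x, x_hat = [0] * n_cols, [0] * n_rows
    Ax, ATx_hat = [0] * n_rows, [0] * n_cols
    for _ in range(max_rounds):
        if max(Ax) >= cap or min(ATx_hat) >= cap:   # "approximately" read as >=
            break
        # p_i proportional to exp(eps * A_i x); shifting by the max leaves
        # the normalized distribution unchanged and avoids overflow.
        top = max(Ax)
        p = [math.exp(eps * (v - top)) for v in Ax]
        # p_hat_j proportional to exp(-eps * A^T_j x_hat), shifted by the min.
        low = min(ATx_hat)
        p_hat = [math.exp(-eps * (v - low)) for v in ATx_hat]
        i = rng.choices(range(n_rows), weights=p)[0]       # max's random row
        j = rng.choices(range(n_cols), weights=p_hat)[0]   # min's random column
        x[j] += 1
        x_hat[i] += 1
        for r in range(n_rows):
            Ax[r] += A[r][j]
        for c in range(n_cols):
            ATx_hat[c] += A[i][c]
    return x, x_hat, Ax, ATx_hat

x, x_hat, Ax, ATx_hat = smoothed_fictitious_play(A, eps=0.1)
T = sum(x)
print("rounds:", T, "bounds on game value:", min(ATx_hat) / T, max(Ax) / T)
```

The normalized counts `x/T` and `x_hat/T` are the returned mixed strategies; the printed bounds show how far apart they still are when the run stops.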

slide-14
SLIDE 14

correctness

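With high probability, mixed strategies $x/|x|$ for min and $\hat{x}/|\hat{x}|$ for max are $(1 \pm O(\varepsilon))$-optimal.

Proof.

Recall $p_i = \exp(\varepsilon A_i x)$, $\hat{p}_j = \exp(-\varepsilon A^T_j \hat{x})$; min plays from $\hat{p}$, max from $p$.

By algebra:

$$\frac{|p'| \times |\hat{p}'|}{|p| \times |\hat{p}|} \;\approx\; 1 + \varepsilon\, \frac{p^T}{|p|} A\, \Delta x \;-\; \varepsilon\, \frac{\hat{p}^T}{|\hat{p}|} A^T \Delta\hat{x}.$$

By update rule, $E[\Delta x] = \hat{p}/|\hat{p}|$ and $E[\Delta\hat{x}] = p/|p|$

⇒ expectation of r.h.s. equals 1 (i.e., $|p| \times |\hat{p}|$ non-increasing)
⇒ (w.h.p.) $|p| \times |\hat{p}| = n^{O(1)}$
⇒ $\max_i A_i x \le \min_j A^T_j \hat{x} + O(\ln(n)/\varepsilon)$.

Stopping condition and weak duality ⇒ $(1 \pm O(\varepsilon))$-optimal.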
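The "by algebra" step and the last implication can be spelled out as follows. This is a sketch of the intermediate steps, assuming entries of $A$ lie in $[0,1]$ so that the first-order approximation $e^{z} \approx 1 + z$ applies; the slides do not show these steps, and terms of order $\varepsilon^2$ are dropped.

```latex
% One round changes x by \Delta x and \hat{x} by \Delta\hat{x}.
\begin{align*}
p'_i &= p_i \, e^{\varepsilon A_i \Delta x} \approx p_i \,(1 + \varepsilon A_i \Delta x)
  &&\Rightarrow\quad \frac{|p'|}{|p|} \approx 1 + \varepsilon\, \frac{p^T}{|p|} A\, \Delta x, \\
\hat{p}'_j &= \hat{p}_j \, e^{-\varepsilon A^T_j \Delta\hat{x}} \approx \hat{p}_j \,(1 - \varepsilon A^T_j \Delta\hat{x})
  &&\Rightarrow\quad \frac{|\hat{p}'|}{|\hat{p}|} \approx 1 - \varepsilon\, \frac{\hat{p}^T}{|\hat{p}|} A^T \Delta\hat{x}.
\end{align*}
% Multiply the two ratios and drop the \varepsilon^2 term to get the slide's formula.
% Taking expectations with E[\Delta x] = \hat{p}/|\hat{p}| and E[\Delta\hat{x}] = p/|p|:
\begin{align*}
E[\text{r.h.s.}] &= 1 + \varepsilon\, \frac{p^T A\, \hat{p}}{|p|\,|\hat{p}|}
  - \varepsilon\, \frac{\hat{p}^T A^T p}{|\hat{p}|\,|p|} = 1,
  \qquad\text{since } p^T A\, \hat{p} = \hat{p}^T A^T p .
\end{align*}
% Each sum is at least its largest term, which gives the final bound:
\begin{align*}
e^{\varepsilon \max_i A_i x} \cdot e^{-\varepsilon \min_j A^T_j \hat{x}}
  \;\le\; |p| \times |\hat{p}| \;=\; n^{O(1)}
  \quad\Rightarrow\quad
  \max_i A_i x \;\le\; \min_j A^T_j \hat{x} + O(\ln(n)/\varepsilon).
\end{align*}
```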

slide-15
SLIDE 15

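implementation in time $O(n^2 + n \log(n)/\varepsilon^2)$

◮ max plays random $i$ from $p$, where $p_i = \exp(\varepsilon A_i x)$
◮ min plays random $j$ from $\hat{p}$, where $\hat{p}_j = \exp(-\varepsilon A^T_j \hat{x})$

STOP when $\max_i A_i x \approx \ln(n)/\varepsilon^2$ or $\min_j A^T_j \hat{x} \approx \ln(n)/\varepsilon^2$.

Bottleneck is maintaining $p$, $\hat{p}$ (i.e., $Ax$, $A^T\hat{x}$):

$$\Delta x = (0,\ {+1},\ 0), \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad \Delta Ax = \begin{pmatrix} 0 \\ +1 \\ +1 \end{pmatrix}$$

Do work for each increase in a row payoff $A_i x$... but $A_i x \le \ln(n)/\varepsilon^2$, so total work $O(n \log(n)/\varepsilon^2)$.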

slide-16
SLIDE 16

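implementation in time $O(n^2 + n \log(n)/\varepsilon^2)$

◮ max plays random $i$ from $p$, where $p_i = \exp(\varepsilon A_i x)$
◮ min plays random $j$ from $\hat{p}$, where $\hat{p}_j = \exp(-\varepsilon A^T_j \hat{x})$

STOP when $\max_i A_i x \approx \ln(n)/\varepsilon^2$ or $\min_j A^T_j \hat{x} \approx \ln(n)/\varepsilon^2$.

Bottleneck is maintaining $p$, $\hat{p}$ (i.e., $Ax$, $A^T\hat{x}$):

$$\Delta\hat{x} = \begin{pmatrix} 0 \\ 0 \\ +1 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad \Delta A^T\hat{x} = (0,\ {+1},\ {+1})$$

Do work for each increase in a row payoff $A_i x$... or a column payoff $A^T_j \hat{x}$... (?!)
but $A_i x \le \ln(n)/\varepsilon^2$, so total work $O(n \log(n)/\varepsilon^2)$.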

slide-17
SLIDE 17

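implementation in time $O(n^2 + n \log(n)/\varepsilon^2)$

◮ max plays random $i$ from $p$, where $p_i = \exp(\varepsilon A_i x)$
◮ min plays random $j$ from $\hat{p}$, where $\hat{p}_j = \exp(-\varepsilon A^T_j \hat{x})$

STOP when $\max_i A_i x \approx \ln(n)/\varepsilon^2$ or $\min_j A^T_j \hat{x} \approx \ln(n)/\varepsilon^2$.

Bottleneck is maintaining $p$, $\hat{p}$ (i.e., $Ax$, $A^T\hat{x}$):

$$\Delta\hat{x} = \begin{pmatrix} 0 \\ 0 \\ +1 \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad \Delta A^T\hat{x} = (0,\ {+1},\ {+1})$$

Do work for each increase in a row payoff $A_i x$... or a column payoff $A^T_j \hat{x}$... (?!)
but $A_i x \le \ln(n)/\varepsilon^2$, so total work $O(n \log(n)/\varepsilon^2)$.

fix: delete column $j$ when $A^T_j \hat{x} \ge \ln(n)/\varepsilon^2$... ($O(n^2)$ time)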
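The bookkeeping idea (touch only the payoffs that actually increase, and delete a column once its payoff reaches the cap) can be illustrated for a 0/1 matrix. The Python below is a simplified sketch under assumptions of mine: blank matrix cells are 0, adjacency lists stand in for whatever data structure the authors use, and the class name `PayoffTracker` and the `work` counter are hypothetical. It does not cover the random sampling step.

```python
# Example matrix from the slides; blank cells are ASSUMED to be 0.
A = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]

class PayoffTracker:
    """Maintains Ax and A^T x_hat for a 0/1 matrix, counting payoff increments."""

    def __init__(self, A, cap):
        n_rows, n_cols = len(A), len(A[0])
        self.cap = cap  # plays the role of ln(n)/eps^2
        # O(n^2) preprocessing: nonzero positions per column and per row.
        self.rows_of_col = [[i for i in range(n_rows) if A[i][j]]
                            for j in range(n_cols)]
        self.cols_of_row = [{j for j in range(n_cols) if A[i][j]}
                            for i in range(n_rows)]
        self.Ax = [0] * n_rows
        self.ATx_hat = [0] * n_cols
        self.work = 0   # number of individual payoff increments performed

    def play_column(self, j):
        """Min plays column j: only rows with A[i][j] = 1 change."""
        for i in self.rows_of_col[j]:
            self.Ax[i] += 1
            self.work += 1

    def play_row(self, i):
        """Max plays row i: only still-live columns with A[i][j] = 1 change."""
        for j in list(self.cols_of_row[i]):
            self.ATx_hat[j] += 1
            self.work += 1
            if self.ATx_hat[j] >= self.cap:
                self._delete_column(j)

    def _delete_column(self, j):
        # O(n) per deletion, O(n^2) over all columns.
        for i in self.rows_of_col[j]:
            self.cols_of_row[i].discard(j)

tracker = PayoffTracker(A, cap=3)
for _ in range(4):
    tracker.play_row(1)     # the fourth call does no work: columns 0 and 1 are gone
tracker.play_column(1)
print(tracker.Ax, tracker.ATx_hat, tracker.work)
```

After a column is deleted its stored payoff stays frozen at the cap, so each column contributes at most `cap` increments to the total work.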

slide-18
SLIDE 18

generalizing to any non-negative matrix A

◮ adapt ideas for width-independence (Garg/Könemann 1998)
◮ random sampling to deal with small $A_{ij}$
◮ preprocess matrix: approximately sort within each row & column

running time for $N$ non-zeros, $r$ rows, $c$ cols: $O(N + (r + c) \log(N)/\varepsilon^2)$.

slide-19
SLIDE 19

practical performance

◮ first implementation: $10n^2 + 75\, n \log(n)/\varepsilon^2$ basic ops
◮ simplex (GLPK): at least $5n^3$ basic ops for $\varepsilon \le 0.05$

[Plot: speedup over simplex (log scale, 0.0625 to 16384) versus n = number of rows and columns (1024 to 65536), with one curve each for epsilon = 0.02, 0.01, and 0.005.]
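The two operation counts quoted on this slide give a rough predicted speedup. The Python sketch below simply divides one by the other; it assumes the logarithm is natural (the slide does not state the base) and treats simplex's "at least $5n^3$" as an exact count, so the result is only a back-of-the-envelope estimate and not a reproduction of the measured plot. The function name `estimated_speedup` is illustrative.

```python
import math

def estimated_speedup(n, eps):
    """Ratio of the quoted op counts: simplex (5 n^3) over the algorithm."""
    simplex_ops = 5 * n**3
    # Natural log is an ASSUMPTION; the slide only writes log(n).
    algorithm_ops = 10 * n**2 + 75 * n * math.log(n) / eps**2
    return simplex_ops / algorithm_ops

for n in (1024, 4096, 16384, 65536):
    row = [round(estimated_speedup(n, eps), 1) for eps in (0.02, 0.01, 0.005)]
    print(n, row)
```

Under these assumptions the estimate grows roughly quadratically in n while the $n \log(n)/\varepsilon^2$ term dominates, and it shrinks as $\varepsilon$ gets smaller.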

slide-20
SLIDE 20

conclusion

For dense matrices with thousands of rows and columns, the algorithm finds a near-optimal solution much faster than Simplex!

open problems:

◮ improve Luby & Nisan's parallel algorithm (1993)
◮ mixed packing/covering problems
◮ implicitly defined problems (e.g. multicommodity flow)
◮ dynamic problems