Efficient Algorithms for Online Decision Problems
Dave Buchfuhrer January 15, 2009
The Model

◮ In this model, we have n experts
◮ Every round, we must pick an expert
◮ After this choice, the cost of each expert is revealed
◮ The goal is to minimize the total cost incurred

[Figure: experts e1–e4 with their costs revealed round by round, beginning .2, .5, .1, .8]

Total cost: 1.9
Limit to Single Expert
Purely Random Strategies are Bad
Following the Best Track Record

I’m feeling good about this one!
Damnit!
Failure of Follow the Leader

At each step t in Follow the Leader, we fall into one of two cases:

Case 1: we increase our total cost by at most the same amount as the best strategy
Case 2: we increase our total cost by at most 1 more than the cost increase of the best strategy
Example
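The animated example is not preserved in this extraction, but the failure mode it illustrates can be sketched with a standard adversarial instance (this concrete two-expert cost sequence is an illustrative assumption, not taken from the slides): costs alternate so that yesterday's leader is always wrong today, and Follow the Leader pays every round while the best fixed expert pays only half the rounds.

```python
# Follow the Leader on an adversarial alternating cost sequence.
# Hypothetical 2-expert instance (not from the slides): costs alternate
# (1, 0), (0, 1), (1, 0), ... so yesterday's leader is always wrong today.

def follow_the_leader(cost_rounds):
    n = len(cost_rounds[0])
    totals = [0.0] * n          # cumulative cost of each expert
    paid = 0.0                  # cost incurred by Follow the Leader
    for costs in cost_rounds:
        # Pick the expert with the smallest cumulative cost (ties by index).
        leader = min(range(n), key=lambda i: (totals[i], i))
        paid += costs[leader]
        for i in range(n):
            totals[i] += costs[i]
    return paid, min(totals)

rounds = [(1, 0) if t % 2 == 0 else (0, 1) for t in range(10)]
paid, best = follow_the_leader(rounds)
```

Here Follow the Leader pays in all 10 rounds while the best single expert pays only 5, matching the bound "best cost + number of wrong leader guesses" on the next slide.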
Reason for Failure

So the total cost of Follow the Leader is at most

    best cost + (# times the leader guess was wrong)
      ≤ final leader’s cost + (# times the leader guess changed)
k-Armed Bandit Connection

◮ Confidence intervals helped with k-armed bandits
◮ Here, we’ll just fudge the numbers to prevent leader changes
◮ We add a random perturbation pert[i] to each expert i
Adding Randomness
Too Much Randomness?
Getting it Right

In order to do well, we add to each expert a random perturbation with exponential density εe^{εx} on negative values x ≤ 0. We hope that:

◮ The expected number of leader changes is small compared to the final leader cost
◮ The final leader cost is close to the min cost
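Such a perturbation is easy to sample: a draw from the density εe^{εx} on x ≤ 0 is just the negative of a standard Exponential(rate = ε) draw. A minimal sketch (the array names `totals` and `perts` are illustrative, not from the slides):

```python
import random

def negative_exponential(eps, rng):
    # Density eps * e^(eps * x) for x <= 0: the negative of an
    # Exponential(rate = eps) random variable.
    return -rng.expovariate(eps)

def perturbed_leader(totals, perts):
    # Follow the perturbed leader: the expert minimizing
    # cumulative cost + perturbation.
    return min(range(len(totals)), key=lambda i: totals[i] + perts[i])

rng = random.Random(0)
eps = 0.5
perts = [negative_exponential(eps, rng) for _ in range(4)]
leader = perturbed_leader([0.0, 0.0, 0.0, 0.0], perts)
```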
Number of Leader Changes

We wish to show that

    E[# changes of leader] ≤ ε · E[total cost]

which shows us that

    E[total cost] ≤ E[final leader cost] + ε · E[total cost]

giving us

    E[total cost] ≤ (1/(1 − ε)) · E[final leader cost]
Chance of Changing Leader

◮ If expert i is the current leader, consider his current costs, as compared to the costs of all other experts, as well as their perturbations
◮ Given this info, i must have a sufficiently small perturbation to be leader
◮ Since the exponential distribution is memoryless, the chance that it’s c smaller than necessary depends only on c
◮ This chance happens to be greater than 1 − εc
Leader Change

◮ So there’s only an εc chance of the leader leading by a margin of less than c
◮ Let c_t be the current leader’s next cost at time t
◮ Σ_t c_t = total cost
◮ So the expected total number of changes is at most ε · (total cost)
Final Leader Cost

This leaves us with the need to bound E[final leader cost], as the final leader is not necessarily optimal.

◮ Our leader can be worse than the best expert by at most the biggest perturbation
◮ Because the distribution is exponential, the expected max perturbation grows only logarithmically
◮ In particular, we get a bound of (1 + ln n)/ε
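The logarithmic growth can be checked in closed form: the expected maximum of n i.i.d. Exponential(rate = ε) draws is H_n/ε, where H_n = 1 + 1/2 + … + 1/n, and H_n ≤ 1 + ln n. (The harmonic-number formula is a standard fact about exponentials, not stated on the slides.)

```python
import math

def expected_max_exponential(n, eps):
    # E[max of n i.i.d. Exponential(rate = eps)] = H_n / eps,
    # where H_n is the n-th harmonic number.
    return sum(1.0 / k for k in range(1, n + 1)) / eps

eps = 0.1
# Pair each expected max with the slide's bound (1 + ln n)/eps.
bounds = [(expected_max_exponential(n, eps), (1 + math.log(n)) / eps)
          for n in (2, 10, 1000)]
```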
Tying it Together

Combining the bound on the number of wrong guesses with the bound on the error in our final guess, we get

    E[total cost] · (1 − ε) ≤ min cost + (1 + ln n)/ε

which shows an interesting tradeoff between ε and 1 − ε when balancing the amount of randomness.
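To see the tradeoff concretely, rearrange to E[total cost] ≤ (min cost + (1 + ln n)/ε) / (1 − ε) and scan ε over (0, 1): too little randomness inflates the (1 + ln n)/ε term, too much inflates 1/(1 − ε). The specific min cost and n below are illustrative.

```python
import math

def cost_bound(min_cost, n, eps):
    # Rearranged bound: E[total cost] <= (min_cost + (1 + ln n)/eps) / (1 - eps).
    return (min_cost + (1 + math.log(n)) / eps) / (1 - eps)

min_cost, n = 100.0, 4
grid = [k / 100 for k in range(1, 100)]          # eps in {0.01, ..., 0.99}
best_eps = min(grid, key=lambda e: cost_bound(min_cost, n, e))
```

An interior ε minimizes the bound; both extremes (ε near 0 or near 1) blow it up.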
Refreshing the Randomness
Linear Generalization

◮ Fix some D ⊂ ℝ^n
◮ At time t, choose some d_t ∈ D
◮ After d_t is chosen, a vector s_t is revealed
◮ The cost incurred is d_t · s_t
◮ We wish to compete with the best fixed choice d_t = d ∀t
◮ In the 4-expert case, D = {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)} and the s_t are the cost vectors
Algorithm for Linear Generalization

With this generalization, the same algorithm works:

◮ Choose a random vector p_t
◮ Find the d ∈ D that minimizes d · p_t + Σ_{i<t} d · s_i and choose it
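The two steps above can be sketched generically (a minimal sketch under the assumptions that D is given as an explicit finite list and that each coordinate of p_t uses the negative-exponential perturbation from earlier; the function name is illustrative):

```python
import random

def fpl_choose(D, history, eps, rng):
    """Follow the Perturbed Leader for the linear generalization.

    D       -- finite list of decision vectors d
    history -- previously revealed cost vectors s_1 .. s_{t-1}
    eps     -- perturbation rate
    """
    n = len(D[0])
    # Random perturbation vector p_t, density eps*e^(eps*x) per coordinate.
    p = [-rng.expovariate(eps) for _ in range(n)]
    # Cumulative revealed costs, sum over s_i for i < t.
    cum = [sum(s[j] for s in history) for j in range(n)]

    def perturbed_cost(d):
        # d . p_t + sum_{i<t} d . s_i  =  d . (p_t + cumulative costs)
        return sum(d[j] * (cum[j] + p[j]) for j in range(n))

    return min(D, key=perturbed_cost)

# The 4-expert case: D is the set of standard basis vectors.
D = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
history = [(0.2, 0.5, 0.1, 0.8), (0.5, 0.3, 0.6, 0.9)]
choice = fpl_choose(D, history, eps=0.5, rng=random.Random(1))
```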
Other Problems in this Framework

The linear generalization covers many interesting online problems, such as online shortest paths:

◮ We are given a graph with 2 labeled vertices s and t
◮ Every round, we pick a path from s to t
◮ Afterward, all edge weights are revealed
◮ We wish to minimize the sum of all path lengths
◮ We are competing against the optimal fixed path choice
◮ Here d ∈ D is a vector indicating the edges contained in a path, and s_t represents the edge weights
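The reduction can be made concrete with a toy graph (the two-path layout and the edge-to-path assignment are illustrative assumptions; the weights echo the example on the next slide): each path becomes a 0/1 indicator vector over edges, and its length is a dot product with the revealed weight vector.

```python
# Hypothetical graph: edges e0..e2 form one s->t path, e3..e5 the other.
paths = {
    "top": (1, 1, 1, 0, 0, 0),
    "bottom": (0, 0, 0, 1, 1, 1),
}

def path_length(d, weights):
    # Cost of decision d under revealed weights s_t is the dot product d . s_t.
    return sum(di * wi for di, wi in zip(d, weights))

weights = (0.1, 0.1, 0.1, 0.1, 0.1, 1.0)
lengths = {name: path_length(d, weights) for name, d in paths.items()}
```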
Online Shortest Paths Example

[Figure: a small graph from s to t with six edges; the revealed edge weights per round are (.1, .1, .1, .1, .1, 1), then (.1, .1, 1, .1, 1, 1), then (1, 1, 1, 1, 1, .1)]
Follow the Leader

[Figure: the same graph with cumulative edge weights 1.2, 1.2, 2.1, 1.2, 2.1, 2.1]