
SLIDE 1

Multiagent Problem Formulation

José M. Vidal
Department of Computer Science and Engineering, University of South Carolina
January 5, 2010

Abstract: We cover the most popular formal models for representing agents and multiagent problems.

SLIDES 2–7

Introduction

Why Study Multiagent Systems?

Multiagent systems are everywhere!

- Internet: peer-to-peer programs (BitTorrent), web applications (REST), social networks, the routers themselves.
- Economics: just-in-time manufacturing and procurement, sourcing, ad auctions.
- Political Science and Sociology: negotiations among self-interested parties.
- Nanofabrication and MEMS: sensor networks.
- Biology: social insects, ontogeny, neurology.
SLIDES 8–10

Why Study Multiagent Systems? (continued)

- Science: How stuff works.
- Engineering: How to build stuff.
- Multiagent Systems: We want to build systems of, mostly, artificial agents. To do this we need to understand the science and math.

SLIDE 11

Fundamentals of Multiagent Systems

- Theory: Game Theory, Economics, Sociology, Biology, AI, multiagent algorithms.
- Practice: NetLogo.

SLIDES 12–16

History

- 1970s: AI Boom
- 1980s: AI Bust, Blackboard Systems, DAI
- 1990s: The Web, Multiagent Systems
- 2000s: Ad auctions, Algorithmic Game Theory, Social Networks, REST
- 2010s: ?

SLIDE 17

Grading

- Problem sets.
- Tests.

SLIDE 18

Utility

Our Model: The Utility Function

ui : S → ℜ

Each agent i has a utility function ui that maps every state of the world to a real number.

SLIDE 19

Utility Requirements

- Reflexive: ui(s) ≥ ui(s).
- Transitive: if ui(a) ≥ ui(b) and ui(b) ≥ ui(c) then ui(a) ≥ ui(c).
- Comparable: ∀a, b either ui(a) ≥ ui(b) or ui(b) ≥ ui(a).

SLIDE 20

Utility is Not Money

Which one do you prefer?

1. A 50/50 chance at winning $10, or
2. $5 for sure?

SLIDE 21

Utility is Not Money

Which one do you prefer?

1. A 50/50 chance at winning $1,000,000, or
2. $500,000 for sure?

Most people take the gamble at small stakes but the sure $500,000 at large stakes, so utility is not a linear function of money.

SLIDES 22–23

Expected Utility

E[ui, s, a] = ∑_{s′ ∈ S} T(s, a, s′) ui(s′),

where T(s, a, s′) is the probability of reaching s′ from s by taking action a.

SLIDE 24

Maximum Expected Utility

πi(s) = argmax_{a ∈ A} E[ui, s, a]
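A minimal Python sketch of expected utility and the maximum-expected-utility policy. The dict-of-dicts MDP encoding and all names (T, u, expected_utility, meu_policy) are illustrative assumptions, not from the slides:

```python
# T[s][a][s2] = probability of reaching s2 from s by taking action a (toy numbers).
T = {
    's1': {'a1': {'s1': 0.2, 's2': 0.8},
           'a2': {'s1': 0.9, 's2': 0.1}},
    's2': {'a1': {'s2': 1.0}},
}
u = {'s1': 0.0, 's2': 1.0}  # ui(s): utility of each state (made up)

def expected_utility(T, u, s, a):
    """E[u, s, a] = sum over s2 of T(s, a, s2) * u(s2)."""
    return sum(p * u[s2] for s2, p in T[s][a].items())

def meu_policy(T, u, s):
    """pi(s) = argmax over available actions a of E[u, s, a]."""
    return max(T[s], key=lambda a: expected_utility(T, u, s, a))

print(meu_policy(T, u, 's1'))  # 'a1': E = 0.8 beats a2's E = 0.1
```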

SLIDE 25

Value of Information

The value of information that tells the agent it is not in s but in t instead:

E[ui, t, πi(t)] − E[ui, t, πi(s)]
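Using the hypothetical helpers from the sketch above, the value of information is just a difference of two expected utilities:

```python
def value_of_information(T, u, s, t):
    """Gain from learning the true state is t when the agent believed it was s:
    E[u, t, pi(t)] - E[u, t, pi(s)].
    Note: assumes the action pi(s) is also available in state t."""
    return (expected_utility(T, u, t, meu_policy(T, u, t))
            - expected_utility(T, u, t, meu_policy(T, u, s)))
```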

SLIDE 26

Markov Decision Processes: The Model

Markovian Assumption

Andrey Markov, 1856–1922. The Markovian assumption: the next state depends only on the current state and the action taken, not on the history of how we got there.

SLIDE 27

Markov Decision Process

[Figure: an example MDP over states s1–s4, with directed edges labeled by actions a1–a4 and their transition probabilities (e.g., a1: .8, a1: .2, a4: 1); one state carries a reward of 1.]

SLIDES 28–29

What To Do?

- The agent receives a reward on arriving at each state.
- It must take an action at each step.
- Should it take a high reward now, or move toward states with higher rewards?

SLIDE 30

Discount Future Rewards

Let γ be a discount factor; then the reward starting at s0 is

γ⁰r(s0) + γ¹r(s1) + γ²r(s2) + ···
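To make the geometric discounting concrete, a tiny sketch (the reward sequence is made up):

```python
gamma = 0.5
rewards = [1.0, 0.0, 2.0, 1.0]  # r(s0), r(s1), ... along one hypothetical trajectory
discounted = sum(gamma**t * r for t, r in enumerate(rewards))
print(discounted)  # 1.0 + 0.0 + 0.5 + 0.125 = 1.625
```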

SLIDES 31–33

Define Utility

u(s) = r(s) + γ max_a ∑_{s′} T(s, a, s′) u(s′)

Then it is easy to calculate the optimal policy:

π∗(s) = argmax_a ∑_{s′} T(s, a, s′) u(s′)

But how do we calculate u(s)?

SLIDE 34

Markov Decision Processes: The Solution

Bellman Update

Richard Bellman, 1920–1984. Inventor of dynamic programming.
SLIDE 35

value-iteration(T, r, γ, ε)
  repeat
    u ← u′
    δ ← 0
    for s ∈ S
      do u′(s) ← r(s) + γ max_a ∑_{s′} T(s, a, s′) u(s′)
         if |u′(s) − u(s)| > δ
           then δ ← |u′(s) − u(s)|
  until δ < ε(1 − γ)/γ
  return u
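A runnable rendering of the pseudocode, reusing the illustrative dict-of-dicts MDP encoding from the expected-utility sketch (all names assumed, not from the slides):

```python
def value_iteration(T, r, gamma, eps):
    """Iterate the Bellman update until the largest change delta
    drops below eps * (1 - gamma) / gamma, then return u."""
    u_new = {s: 0.0 for s in T}               # u'
    while True:
        u = dict(u_new)                        # u <- u'
        delta = 0.0
        for s in T:
            u_new[s] = r[s] + gamma * max(
                sum(p * u[s2] for s2, p in T[s][a].items())
                for a in T[s])
            delta = max(delta, abs(u_new[s] - u[s]))
        if delta < eps * (1 - gamma) / gamma:
            return u

r = {'s1': 0.0, 's2': 1.0}                     # made-up rewards
print(value_iteration(T, r, gamma=0.5, eps=1e-4))
```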

SLIDES 36–41

Value Iteration Example

[Figure: successive value-iteration sweeps on the example MDP of slide 27 with γ = .5. The reward-1 state seeds the first sweep (e.g., .4 = .5(.8)(1) and .45 = .5(.9)(1) for its neighbors); after a few sweeps the values settle at approximately .234, .57, 1.2, and .57.]
slide-42
SLIDE 42

Multiagent Problem Formulation Markov Decision Processes Extensions

Multiagent MDPs

Instead of individual actions use a vector of actions. T(s,a,s′) becomes T(s, a,s′). r(s) becomes ri(s).

slide-43
SLIDE 43

Multiagent Problem Formulation Markov Decision Processes Extensions

Multiagent MDPs

Instead of individual actions use a vector of actions. T(s,a,s′) becomes T(s, a,s′). r(s) becomes ri(s). But, the other agents are messing up my rewards!
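A hedged sketch of the joint-action encoding: the transition table is keyed on a tuple with one action per agent, and each agent gets its own reward table (all names and numbers invented for illustration):

```python
# T2[s][joint_a][s2] = Pr(s2 | s, joint_a); joint_a = (agent 1's action, agent 2's action)
T2 = {
    's1': {('a1', 'b1'): {'s2': 1.0},
           ('a1', 'b2'): {'s1': 0.7, 's2': 0.3}},
    's2': {('a1', 'b1'): {'s2': 1.0}},
}
# Per-agent rewards r2[i][s]: here the agents' interests conflict,
# which is exactly the "other agents are messing up my rewards" problem.
r2 = {1: {'s1': 0.0, 's2': 1.0},
      2: {'s1': 1.0, 's2': 0.0}}
```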

SLIDE 44

Markov Decision Processes: Partially Observable MDPs

When the Agent Can't See Everything

Use a Partially Observable MDP (POMDP):

- Belief state b: a probability distribution over states.
- Observation model O(s, o) → [0, 1]: the probability of observing o in state s.

SLIDE 45

Belief Update

An agent with beliefs b takes action a and now observes o. It can update its beliefs to

∀s′  b′(s′) = α O(s′, o) ∑_s T(s, a, s′) b(s)    (1)

where α is a normalizing constant.

SLIDE 46

Build an MDP with Belief States

τ(b, a, b′) = ∑_{s′} O(s′, o) ∑_s T(s, a, s′) b(s)  if b′ satisfies (1) for some o, and 0 otherwise,

with reward function

ρ(b) = ∑_s b(s) r(s)
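A hedged sketch of belief update (1) and the belief reward ρ, reusing the earlier dict encodings (O, belief_update, and belief_reward are invented names; O[s][o] = Pr(o | s)):

```python
def belief_update(b, a, o, T, O):
    """b'(s2) = alpha * O(s2, o) * sum_s T(s, a, s2) * b(s), normalized by alpha.
    Assumes observation o has nonzero probability under (b, a)."""
    b_new = {s2: O[s2][o] * sum(T[s].get(a, {}).get(s2, 0.0) * b[s] for s in b)
             for s2 in T}
    alpha = 1.0 / sum(b_new.values())   # normalize so the new beliefs sum to 1
    return {s2: alpha * p for s2, p in b_new.items()}

def belief_reward(b, r):
    """rho(b) = sum_s b(s) * r(s)."""
    return sum(b[s] * r[s] for s in b)
```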
SLIDE 47

Still Problematic

- There is now one belief state for each possible b(s), and these are continuous values, so the belief MDP has infinitely many states.
- There are value iteration algorithms that use regions of the belief space to solve these problems.
- In practice it is better to use dynamic decision networks.

SLIDE 48

Planning

AI Planning

- Operators with pre-conditions and effects.
- Planning is a special case of an MDP, with a (sometimes) more succinct representation.
- Many planning algorithms exist; most, it turns out, use a graphical representation.

SLIDE 49

Hierarchical Planning

- Divide and conquer: build plans using big (general) operators, then make each operator its own planning problem. For example, first plan plane, taxi, car; then plan how to get from the gate to the taxi.
- The plan hierarchy can be used to share partial plans: if one agent plans to be in Charleston, then the agent in Columbia knows it won't bump into him; there is no need to know exactly where in Charleston.

SLIDE 50

Summary

- Utility functions and MDPs are the most-used models: powerful, and analyzable in very small cases.
- Most multiagent research tries to solve the large cases.