Introduction Approach Experiments
Reasoning about Hypothetical Agent Behaviours and their Parameters
Stefano Albrecht and Peter Stone
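Action probabilities at each time step, as functions of the parameters $p_1$, $p_2$:
$P(a_j^0 \mid H^0, \theta_j, p_1, p_2)$
$P(a_j^1 \mid H^1, \theta_j, p_1, p_2)$
$P(a_j^2 \mid H^2, \theta_j, p_1, p_2)$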
Figure: three panels, each plotted over $p \in [0, 1]$
Prior: belief density $P(p \mid H_i^{t-1}, \theta_j)$, with $p^*$ marked
Likelihood of past action $a_j^{t-1}$ given type $\theta_j$: $f(p) = P(a_j^{t-1} \mid H^{t-1}, \theta_j, p)$, shown as samples from $f$ and the fitted $\hat{f}$
Posterior (blue): belief density $P(p \mid H_i^t, \theta_j)$, shown as samples from $\hat{g}$ and the fitted $\hat{h}$, with $p^*$ marked
Generate estimate $p^t$ from the posterior
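The prior, likelihood, and posterior panels above can be sketched on a discretised parameter grid. This is a minimal illustration, not the authors' method: the slides fit curves ($\hat{f}$, $\hat{h}$) to samples, whereas the sketch below simply evaluates everything on a fixed grid. The function names, the grid size, the toy likelihood, and the choice of the posterior mean as the estimate $p^t$ are all assumptions made for the example.

```python
def grid(n):
    """Return n evenly spaced cell midpoints in [0, 1]."""
    return [(k + 0.5) / n for k in range(n)]


def bayes_update(prior, likelihood, points):
    """Multiply prior density by the likelihood and renormalise on the grid."""
    width = 1.0 / len(points)
    unnorm = [d * likelihood(p) for d, p in zip(prior, points)]
    total = sum(unnorm) * width
    if total == 0:
        raise ValueError("likelihood is zero everywhere on the grid")
    return [u / total for u in unnorm]


def posterior_mean(density, points):
    """One possible way to turn the posterior into a point estimate (assumption)."""
    width = 1.0 / len(points)
    return sum(p * d for p, d in zip(points, density)) * width


def toy_likelihood(p, peak=0.7):
    """Hypothetical f(p): the observed action is most probable near p = 0.7."""
    return 1.0 - abs(p - peak)


points = grid(100)
belief = [1.0] * len(points)      # uniform prior density on [0, 1]
for _ in range(5):                # five observed actions, same toy likelihood
    belief = bayes_update(belief, toy_likelihood, points)
estimate = posterior_mean(belief, points)
```

After a few updates the belief mass concentrates around the peak of the toy likelihood, and the estimate moves from the prior mean of 0.5 towards it.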
Blue = our agent, red = other agent
Goal: collect all items in minimal time
Agents and items have skill levels ∈ [0, 1]
⇒ Have to coordinate skills
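To make "have to coordinate skills" concrete, here is a small sketch under an assumed loading rule: a group of agents can load an item only if their skill levels sum to at least the item's level. The slide does not spell this rule out, so the rule itself, the name `can_load`, and the example levels are assumptions for illustration only.

```python
def can_load(agent_levels, item_level):
    """Assumed rule: the combined skill of the loading agents must reach the item's level."""
    return sum(agent_levels) >= item_level


blue, red = 0.4, 0.3   # hypothetical skill levels in [0, 1]
item = 0.6             # hypothetical item level in [0, 1]

alone = can_load([blue], item)          # Blue tries on its own
together = can_load([blue, red], item)  # Blue and Red load jointly
```

Under this assumed rule, neither agent can collect the item alone, so Blue benefits from predicting where Red will go.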
Red has one of 4 types:
$\theta_j^{L1}$: Search for item, try to load
$\theta_j^{L2}$: Search for feasible item, try to load
$\theta_j^{F1}$: Search for agent, load item closest to agent
$\theta_j^{F2}$: Search for agent, load closest feasible item
Each type has 3 parameters: level $p_1$, view radius $p_2$, view angle $p_3$
Blue does not know true type, parameter values, or meaning of parameters
Uses MCTS to plan own actions
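One way to picture Blue's hypothesis space is as one (type, parameter estimate) pair per type, together with a belief over the types. The sketch below only sets up that bookkeeping. The type labels mirror the four types above; the class name `Hypothesis`, the assumption that all three parameters are normalised to [0, 1], the random initial estimates, and the uniform initial belief are illustrative choices rather than details taken from the slides.

```python
import random
from dataclasses import dataclass, field

TYPE_NAMES = ["L1", "L2", "F1", "F2"]                 # the four types of Red
PARAM_NAMES = ["level", "view_radius", "view_angle"]  # p1, p2, p3


@dataclass
class Hypothesis:
    type_name: str
    params: dict = field(default_factory=dict)  # current estimate of p1..p3
    belief: float = 0.0                         # probability assigned to this type


def init_hypotheses(rng):
    """Random parameter estimates in [0, 1) and a uniform belief over types (assumptions)."""
    n = len(TYPE_NAMES)
    return [
        Hypothesis(name, {p: rng.random() for p in PARAM_NAMES}, 1.0 / n)
        for name in TYPE_NAMES
    ]


hypotheses = init_hypotheses(random.Random(0))
```

Keeping a separate parameter estimate per type matters because the same observed action can be explained by different parameter values under different types.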
2 agents, 5 items, 10x10 world
  Starting with random parameter estimates
  First video without updating
  Second video with updating, using bandit selection and EGO
3 agents, 10 items, 15x15 world
  Starting with random parameter estimates
  First video without updating
  Second video with updating, using bandit selection and EGO
Figure: average seconds (log scale) needed per parameter update for a single type, comparing AGA, ABU, and EGO
Figure: mean error in estimates of view radius $p_2$ for the true type in the 15x15 world (updating all types in each time step), plotted over time steps for AGA, ABU, and EGO
Figure: average belief $P(\theta_j^* \mid H^t)$ for the true type $\theta_j^*$ in the 10x10 world (updating all types in each time step), plotted over time steps for AGA, ABU, EGO, Cor, and Rnd
1: Observe action $a_j^{t-1}$
2: Select a subset $\Phi \subset \Theta_j$ for parameter updates
3: For each $\theta_j \in \Phi$:
4:
5:
6: Set $p^t = p^{t-1}$ for all $\theta_j \notin \Phi$
7: For each $\theta_j \in \Theta_j$, update belief $P(\theta_j \mid H^t)$
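The loop above can be sketched as follows. Steps 4 and 5 are blank on the slide, so the sketch takes the per-type parameter update as a caller-supplied function rather than guessing its contents. The belief update in step 7 is written as a plain Bayes rule, $P(\theta_j \mid H^t) \propto P(a_j^{t-1} \mid H^{t-1}, \theta_j, p^t)\, P(\theta_j \mid H^{t-1})$, which is an assumption because the slide does not show the formula. All names (`step`, `select_subset`, `update_params`, `action_likelihood`) and the toy two-type setup are hypothetical.

```python
def step(beliefs, params, action, history,
         select_subset, update_params, action_likelihood):
    """One time step of the loop; returns (new_beliefs, new_params).

    beliefs: dict mapping type -> P(type | H^{t-1})
    params:  dict mapping type -> parameter estimate p^{t-1}
    """
    # Step 2: choose which types get a parameter update this step.
    phi = set(select_subset(beliefs))

    # Steps 3-6: update estimates for types in phi, carry the rest over unchanged.
    new_params = {}
    for theta, p_old in params.items():
        if theta in phi:
            new_params[theta] = update_params(theta, p_old, action, history)
        else:
            new_params[theta] = p_old

    # Step 7: Bayes rule over types, using the current parameter estimates.
    unnorm = {
        theta: action_likelihood(action, history, theta, new_params[theta]) * b
        for theta, b in beliefs.items()
    }
    total = sum(unnorm.values())
    if total == 0:  # no type explains the action; keep the old belief
        return dict(beliefs), new_params
    return {theta: w / total for theta, w in unnorm.items()}, new_params


# Toy setup: type "A" loads with probability p, type "B" with probability 1 - p.
def toy_likelihood(action, history, theta, p):
    prob_load = p if theta == "A" else 1.0 - p
    return prob_load if action == "load" else 1.0 - prob_load


def greedy_subset(beliefs):
    # Stand-in for the slides' bandit selection: update only the most probable type.
    return [max(beliefs, key=beliefs.get)]


def nudge_up(theta, p_old, action, history):
    # Placeholder for steps 4-5: move the estimate up by a fixed amount.
    return min(1.0, p_old + 0.25)


new_beliefs, new_params = step(
    {"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}, "load", [],
    greedy_subset, nudge_up, toy_likelihood,
)
```

In this toy run only type "A" is selected for an update, so its estimate rises to 0.75 while "B" keeps 0.5, and the belief shifts towards "A" because it now explains the observed action better.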