SLIDE 1
SLIDE 2
SLIDE 3
SLIDE 4
Markov Decision Process
[Figure: agent-environment loop. In state s_k the agent applies action a_k and receives reward r_k; the environment returns the next state s_{k+1} and reward r_{k+1}.]
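The loop in the diagram can be sketched in a few lines of Python. This is only an illustration: `ChainEnv`, its reward scheme, and `run_episode` are hypothetical names invented here and do not come from the slides.

```python
class ChainEnv:
    """Hypothetical 5-state chain: action 1 moves right, any other action moves left."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Given the current s_k and the action a_k, return (s_{k+1}, r_{k+1}, done).
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0  # assumed reward: 1 on reaching the last state
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Roll out one episode; return a list of (s_k, a_k, r_{k+1}, s_{k+1}) tuples."""
    s = env.reset()
    trajectory = []
    for _ in range(max_steps):
        a = policy(s)                  # agent: state -> action
        s_next, r, done = env.step(a)  # environment: action -> next state, reward
        trajectory.append((s, a, r, s_next))
        s = s_next
        if done:
            break
    return trajectory


# Usage: a policy that always moves right reaches the last state in 4 steps.
trajectory = run_episode(ChainEnv(), policy=lambda s: 1)
print(len(trajectory))  # 4
```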
SLIDE 5
SLIDE 6
SLIDE 7
SLIDE 8
SLIDE 9
SLIDE 10
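[Figure: actor-critic agent. Inside the agent, the critic (value function) takes state s_k and reward r_k and produces a TD error that drives the actor (policy); the actor sends action a_k to the environment, which returns r_{k+1} and s_{k+1}.]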
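One common way to realize this diagram is a tabular actor-critic update in which the critic's TD error scales both the value update and the policy update. The sketch below assumes a softmax policy over action preferences; the function names, step sizes, and discount factor are illustrative choices, not values taken from the slides.

```python
import math


def softmax(preferences):
    """Actor's policy: turn action preferences into probabilities."""
    m = max(preferences)
    exps = [math.exp(p - m) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(V, prefs, s, a, r_next, s_next, done,
                      alpha_critic=0.1, alpha_actor=0.1, gamma=0.9):
    """Apply one update for the transition (s, a, r_next, s_next); return the TD error."""
    # Critic: TD error = r_{k+1} + gamma * V(s_{k+1}) - V(s_k)
    target = r_next if done else r_next + gamma * V[s_next]
    td_error = target - V[s]

    probs = softmax(prefs[s])       # policy before the update
    V[s] += alpha_critic * td_error  # critic update

    # Actor: move preferences along the gradient of log pi(a|s), scaled by the TD error.
    for b in range(len(prefs[s])):
        grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
        prefs[s][b] += alpha_actor * td_error * grad_log_pi
    return td_error


# Usage: two states, two actions; a rewarded terminal transition from state 0.
V = [0.0, 0.0]
prefs = [[0.0, 0.0], [0.0, 0.0]]
delta = actor_critic_step(V, prefs, s=0, a=1, r_next=1.0, s_next=1, done=True)
```

A positive TD error raises both the value of the visited state and the probability of the action just taken; a negative one lowers them.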
SLIDE 11
SLIDE 12
SLIDE 13
SLIDE 14
SLIDE 15
- L. Busoniu, R. Babuska, and B. De Schutter, “A comprehensive survey of multiagent reinforcement learning,” IEEE Trans. Systems, Man and Cybernetics-Part C: Applications and Reviews, vol. 38, no. 2, Mar. 2008.
SLIDE 16
SLIDE 17
SLIDE 18
[Figure: techniques that multiagent RL draws on: temporal-difference RL, game theory, direct policy search.]
SLIDE 19
Agent awareness \ Task type   Cooperative             Competitive            Mixed
Independent                   Coordination-free       Opponent-independent   Agent-independent
Tracking                      Coordination-based      -                      Agent-tracking
Aware                         Indirect coordination   Opponent-aware         Agent-aware
SLIDE 20
SLIDE 21
Q     L2    S2    R2
L1    10    -5     0
S1    -5   -10    -5
R1   -10    -5    10
[Figure: agents 1 and 2 in front of an obstacle; agent 1 chooses among L1, S1, R1 and agent 2 among L2, S2, R2.]
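The coordination problem in this table can be made concrete with a short sketch. It uses the slide's common Q-values, with the one entry that does not survive in the slide text, (L1, R2), assumed to be 0. The tie-breaking rule `pick_by_convention` is a hypothetical illustration of a shared ordering, not a method stated on the slide.

```python
ACTIONS_1 = ["L1", "S1", "R1"]
ACTIONS_2 = ["L2", "S2", "R2"]

# Common Q-values for the pictured state; the (L1, R2) entry is an assumption.
Q = {
    ("L1", "L2"): 10,  ("L1", "S2"): -5,  ("L1", "R2"): 0,
    ("S1", "L2"): -5,  ("S1", "S2"): -10, ("S1", "R2"): -5,
    ("R1", "L2"): -10, ("R1", "S2"): -5,  ("R1", "R2"): 10,
}


def greedy_joint_actions(q):
    """Return every joint action that attains the maximum common Q-value."""
    best = max(q.values())
    return sorted(ja for ja, v in q.items() if v == best)


def pick_by_convention(q, order_1, order_2):
    """Break ties with a fixed action ordering that both agents are assumed to share."""
    return min(greedy_joint_actions(q),
               key=lambda ja: (order_1.index(ja[0]), order_2.index(ja[1])))


print(greedy_joint_actions(Q))                      # [('L1', 'L2'), ('R1', 'R2')]
print(pick_by_convention(Q, ACTIONS_1, ACTIONS_2))  # ('L1', 'L2')
```

Two joint actions are equally good, so agents that break the tie independently can end up at (R1, L2), worth -10 here. A shared ordering is one simple way to avoid that.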
SLIDE 22
- C. Guestrin, M. Lagoudakis, and R. Parr, “Coordinated reinforcement learning,” in Proc.
Int’l Conf. Machine Learning (ICML-02), Jul. 2002.
[Figure: coordination graph over agents 1, 2, 3, 4 with local value functions Q1, Q2, Q3, Q4 and the function f4.]
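In coordinated RL the joint Q-function is written as a sum of local terms, and the maximizing joint action is found by eliminating agents one at a time. The sketch below reads f4 as the function left after maximizing out agent 4. The scopes Q1(a1, a2), Q2(a2, a4), Q3(a1, a3), Q4(a3, a4), the binary action sets, and all payoff values are assumptions made for illustration; the slide text does not give them.

```python
from itertools import product

A = (0, 1)  # assumed binary action set for every agent


# Hypothetical local Q-functions (scopes and values are assumptions).
def q1(a1, a2): return 3.0 if a1 == a2 else 0.0
def q2(a2, a4): return 2.0 if a2 != a4 else 0.0
def q3(a1, a3): return 1.0 * a1 + 0.5 * a3
def q4(a3, a4): return 4.0 if (a3, a4) == (1, 1) else 1.0


def total(a1, a2, a3, a4):
    """Joint Q-value as the sum of the local terms."""
    return q1(a1, a2) + q2(a2, a4) + q3(a1, a3) + q4(a3, a4)


def eliminate_agent_4():
    """f4(a2, a3) = max over a4 of Q2(a2, a4) + Q4(a3, a4), plus agent 4's best response."""
    f4, best_a4 = {}, {}
    for a2, a3 in product(A, A):
        vals = {a4: q2(a2, a4) + q4(a3, a4) for a4 in A}
        a4_star = max(vals, key=vals.get)
        f4[(a2, a3)] = vals[a4_star]
        best_a4[(a2, a3)] = a4_star
    return f4, best_a4


def solve():
    f4, best_a4 = eliminate_agent_4()
    # The remaining problem involves only agents 1-3; it is small enough to enumerate.
    a1, a2, a3 = max(product(A, A, A),
                     key=lambda a: q1(a[0], a[1]) + q3(a[0], a[2]) + f4[(a[1], a[2])])
    a4 = best_a4[(a2, a3)]  # agent 4 looks up its best response
    return (a1, a2, a3, a4), total(a1, a2, a3, a4)


joint_action, value = solve()
print(joint_action, value)  # (0, 0, 1, 1) 9.5
```

Agent 4's terms are maximized once per (a2, a3) pair rather than once per full joint action, which is where the savings come from on larger graphs.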
SLIDE 23
SLIDE 24
SLIDE 25
Q1    L2    R2
L1     0     1
R1   -10    10

Q2    L2    R2
L1     0    -1
R1    10   -10
[Figure: agents 1 and 2, with actions L1, R1 and L2, R2.]
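For a competitive task like this one, where Q2 is the negative of Q1, a cautious agent can pick the action whose worst-case value is highest. The sketch below does this over pure actions only. It uses the slide's values, with the entry that does not survive in the slide text, (L1, L2), assumed to be 0 in both tables. `maximin_action` is a name invented here.

```python
ACTIONS_1 = ["L1", "R1"]
ACTIONS_2 = ["L2", "R2"]

# Agent 1's Q-values; the (L1, L2) entry is an assumption.
Q1 = {("L1", "L2"): 0, ("L1", "R2"): 1,
      ("R1", "L2"): -10, ("R1", "R2"): 10}
# Zero-sum task: agent 2's Q-values are the negation of agent 1's.
Q2 = {ja: -v for ja, v in Q1.items()}


def maximin_action(q, my_actions, opp_actions):
    """Pick the pure action with the best worst-case value against the opponent."""
    worst = {a: min(q[(a, o)] for o in opp_actions) for a in my_actions}
    best = max(worst, key=worst.get)
    return best, worst[best]


print(maximin_action(Q1, ACTIONS_1, ACTIONS_2))  # ('L1', 0)
```

Agent 1 prefers L1 because it guarantees at least 0, whereas R1 can yield -10. Solving for mixed strategies in general needs a linear program, which this sketch does not attempt.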
SLIDE 26
- M. L. Littman, “Markov games as a framework for multi-agent reinforcement learning,”
in Proc. Int’l Conf. Machine Learning (ICML-94), Jul. 1994.
SLIDE 27
SLIDE 28
Q1    L2    R2
L1     0     3
R1     2     0

Q2    L2    R2
L1     0     2
R1     3     0
[Figure: agents 1 and 2 between a Left Room and a Right Room, with actions L1, R1 and L2, R2.]
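One way to read these two tables is as a general-sum game with more than one equilibrium. The sketch below enumerates the pure-strategy Nash equilibria. It uses the slide's values, with the entries that do not survive in the slide text, (L1, L2) and (R1, R2), assumed to be 0 for both agents. `pure_nash_equilibria` is a name invented here.

```python
from itertools import product

ACTIONS_1 = ["L1", "R1"]
ACTIONS_2 = ["L2", "R2"]

# The two zero entries per table are assumptions (both agents choose the same room).
Q1 = {("L1", "L2"): 0, ("L1", "R2"): 3, ("R1", "L2"): 2, ("R1", "R2"): 0}
Q2 = {("L1", "L2"): 0, ("L1", "R2"): 2, ("R1", "L2"): 3, ("R1", "R2"): 0}


def pure_nash_equilibria(q1, q2, actions_1, actions_2):
    """Return joint actions from which neither agent gains by deviating alone."""
    equilibria = []
    for a1, a2 in product(actions_1, actions_2):
        best_for_1 = all(q1[(a1, a2)] >= q1[(b, a2)] for b in actions_1)
        best_for_2 = all(q2[(a1, a2)] >= q2[(a1, b)] for b in actions_2)
        if best_for_1 and best_for_2:
            equilibria.append((a1, a2))
    return equilibria


print(pure_nash_equilibria(Q1, Q2, ACTIONS_1, ACTIONS_2))
# [('L1', 'R2'), ('R1', 'L2')]
```

Both equilibria send the agents to different rooms, but agent 1 does better in the first and agent 2 in the second, so the agents still have to agree on which one to play.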
SLIDE 29
SLIDE 30
SLIDE 31