Markov Decision Process - PowerPoint PPT Presentation

Dec 06, 2023

SLIDE 1

SLIDE 2

SLIDE 3

SLIDE 4

[Figure: Markov decision process. The agent receives state s_k and reward r_k from the environment and applies action a_k; the environment returns reward r_{k+1} and state s_{k+1}.]
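
The agent-environment loop on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the `TwoStateEnv` class, its reward rule, and the `run_episode` helper are hypothetical and do not come from the slides. It shows the order of events in one step: the agent sees s_k and r_k, picks a_k, and the environment answers with s_{k+1} and r_{k+1}.

```python
class TwoStateEnv:
    """Hypothetical toy environment (an assumption): states {0, 1}, actions {0, 1}."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # Action 1 flips the state, action 0 keeps it.
        next_state = self.state ^ action
        # Assumed reward rule: 1.0 for landing in state 1, else 0.0.
        reward = 1.0 if next_state == 1 else 0.0
        self.state = next_state
        return next_state, reward


def run_episode(env, policy, steps=5):
    """Run the interaction loop and record (s_k, a_k, r_{k+1}, s_{k+1}) tuples."""
    s_k, r_k = env.state, 0.0
    trajectory = []
    for _ in range(steps):
        a_k = policy(s_k, r_k)            # agent chooses an action
        s_next, r_next = env.step(a_k)    # environment responds
        trajectory.append((s_k, a_k, r_next, s_next))
        s_k, r_k = s_next, r_next         # advance k -> k+1
    return trajectory


# A fixed policy that always tries to reach (or stay in) state 1.
trajectory = run_episode(TwoStateEnv(), lambda s, r: 1 - s)
```

The policy here is hand-written rather than learned; a learning agent would update it from the recorded rewards.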

SLIDE 5

SLIDE 6

SLIDE 7

SLIDE 8

SLIDE 9

SLIDE 10

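[Figure: Actor-critic agent. Inside the agent, the critic (value function) uses state s_k and reward r_k to produce a TD error, which drives the actor (policy); the actor applies action a_k to the environment, which returns reward r_{k+1} and state s_{k+1}.]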
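
One common way to realize the actor-critic structure on this slide is a tabular one-step update, sketched below. The slides do not specify the update rules, so the softmax policy over action preferences, the step sizes `alpha_v` and `alpha_pi`, and the discount `gamma` are all assumptions made for illustration. The critic computes the TD error from its value function, and the same error scales the actor's policy update.

```python
import math


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(V, prefs, s, a, r_next, s_next,
                      gamma=0.9, alpha_v=0.1, alpha_pi=0.1):
    """One tabular actor-critic update; hyperparameters are assumed values."""
    # Critic: TD error from the current value estimates.
    td_error = r_next + gamma * V[s_next] - V[s]
    # Critic update: move V(s) toward the one-step target.
    V[s] += alpha_v * td_error
    # Actor update: gradient of log softmax policy, scaled by the TD error.
    probs = softmax(prefs[s])
    for b in range(len(prefs[s])):
        grad = (1.0 if b == a else 0.0) - probs[b]
        prefs[s][b] += alpha_pi * td_error * grad
    return td_error


V = [0.0, 0.0]                        # value function, one entry per state
prefs = [[0.0, 0.0], [0.0, 0.0]]      # action preferences per state
delta = actor_critic_step(V, prefs, s=0, a=1, r_next=1.0, s_next=1)
```

With all estimates initialized to zero, a positive reward gives a positive TD error, so the value of the visited state and the preference for the chosen action both increase.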

SLIDE 11

SLIDE 12

SLIDE 13

SLIDE 14

SLIDE 15

L. Busoniu, R. Babuska, and B. De Schutter, “A comprehensive survey of multiagent reinforcement learning,” IEEE Trans. Systems, Man and Cybernetics-Part C: Applications and Reviews, vol. 38, no. 2, Mar. 2008.

SLIDE 16

SLIDE 17

SLIDE 18

[Figure labels: Temporal-difference RL, Game Theory, Direct Policy Search. Source: Busoniu et al. (2008).]

SLIDE 19

Agent awareness \ Task type    Cooperative              Competitive             Mixed
Independent                    Coordination-free        Opponent-independent    Agent-independent
Tracking                       Coordination-based       -                       Agent-tracking
Aware                          Indirect coordination    Opponent-aware          Agent-aware

Source: Busoniu et al. (2008).

SLIDE 20

SLIDE 21

[Figure: two agents (1 and 2) facing an obstacle, with actions L1, S1, R1 for agent 1 and L2, S2, R2 for agent 2, and a joint-action table Q (rows L1, S1, R1; columns L2, S2, R2). Only two entries, both 10, survive in the transcript: one in row L1 and one in row R1. Source: Busoniu et al. (2008).]

SLIDE 22

C. Guestrin, M. Lagoudakis, and R. Parr, “Coordinated reinforcement learning,” in Proc. Int’l Conf. Machine Learning (ICML-02), Jul. 2002.

[Figure: graph over agents 1 to 4 with local functions Q1, Q2, Q3, Q4 and a function labeled f4.]

SLIDE 23

[Figure: graph over agents 1 to 4 with local functions Q1, Q2, Q3, Q4 and a function labeled f4, with the same labels as on slide 22. Source: Guestrin et al. (2002).]

SLIDE 24

SLIDE 25

[Figure: two agents (1 and 2) with actions L1, R1 and L2, R2, and two tables Q1 and Q2 (rows L1, R1; columns L2, R2). Surviving entries: 1 and 10 in Q1, 10 in Q2; the remaining cells and the exact cell positions are not recoverable from the transcript. Source: Busoniu et al. (2008).]

SLIDE 26

M. L. Littman, “Markov games as a framework for multi-agent reinforcement learning,” in Proc. Int’l Conf. Machine Learning (ICML-94), Jul. 1994.

SLIDE 27

SLIDE 28

[Figure: two agents (1 and 2) with actions L1, R1 and L2, R2, a Left Room and a Right Room, and two tables Q1 and Q2 (rows L1, R1; columns L2, R2). Surviving entries: 3 in row L1 and 2 in row R1 of Q1; 2 in row L1 and 3 in row R1 of Q2; the other cells were lost in extraction. Source: Busoniu et al. (2008).]

SLIDE 29

SLIDE 30

SLIDE 31