Multi-Robot Learning for Continuous Area Sweeping


Multi-Robot Learning for Continuous Area Sweeping Peter Stone Joint work with Mazda Ahmadi Learning Agents Research Group (LARG) Department of Computer Sciences The University of Texas at Austin LAMAS, July 2005 Peter Stone, UT Austin


SLIDE 1

Multi-Robot Learning for Continuous Area Sweeping

Peter Stone

Joint work with Mazda Ahmadi
Learning Agents Research Group (LARG)
Department of Computer Sciences
The University of Texas at Austin

LAMAS, July 2005

Peter Stone, UT Austin Multi-Robot Continuous Area Sweeping

SLIDE 7

Multiagent Learning in LARG

  • Transfer Learning in Keepaway [Taylor, Wed., 10:30]
  • Multiagent Traffic Management [Dresner, 10:45]
  • General Game Playing [Kuhlmann, Dresner]
  • Winner, 2005 RoboCup coach comp. [Kuhlmann, Knox]
  • Learning for Continuous Area Sweeping [Ahmadi, 2005]: mostly single-robot, initial multi-robot results

SLIDE 13

Project Description

Definitions
  • Area sweeping
  • Continuous area sweeping (examples: cleaning robots, surveillance robots)
  • Non-uniform sweeping
  • Multi-robot sweeping

[Figure: floor plan with a closet and a bathroom.]

SLIDE 20

Outline

1. Introduction and Motivation
2. Single Robot Problem Specification
3. Exploration Algorithm
  • Learning Expected Rewards
  • Planning
  • Correctness
4. Results
  • Simulation Results
  • Results on Real Robots
5. Multi-robot Extensions
  • Overview
  • Negotiation Algorithm
  • Results

SLIDE 24

Assumptions

  • The environment is divided into grid cells (G).
  • The robot's possible orientations are east, west, north and south.
  • LV[g]: the last time the robot visited cell g.

[Figure: floor plan (closet, bathroom) with example cells annotated with last-visit times such as 12':50'' and 13':02''.]

SLIDE 25

Assumptions (cont.)

  • Time is treated as a sequence of discrete steps.
  • $imp_e$: the importance of detecting event e.

SLIDE 33

Definitions

Formal Definition
The problem is defined as the tuple $(S, A, T_{sa}, P_{e,g}, CF)$:

  • $S$: set of states, defined over G (grid cells), O (orientations) and LV (last-visit times).
  • $A$: set of possible actions.
  • $T_{sa}$: state transition probabilities.
  • $P_{e,g}$: probability of appearance of event e in cell g; initially unknown and possibly non-stationary.
  • $CF$: cost function of the policy, i.e. the average time between an event's appearance and its detection, weighted by $imp_e$.
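As an illustration only, a state in S could be represented along the following lines, assuming it bundles the robot's current cell, its orientation, and the LV table. The names `Cell`, `Orientation`, `State`, and `visit` are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Tuple

Cell = Tuple[int, int]  # assumed encoding: (row, column) index of a grid cell in G


class Orientation(Enum):
    EAST = "east"
    WEST = "west"
    NORTH = "north"
    SOUTH = "south"


@dataclass
class State:
    """Hypothetical state: current cell, orientation, and last-visit table LV."""
    cell: Cell
    orientation: Orientation
    lv: Dict[Cell, int] = field(default_factory=dict)  # LV[g], in discrete time steps

    def visit(self, cell: Cell, orientation: Orientation, t: int) -> None:
        """Move to `cell` at time step `t` and record the visit in LV."""
        self.cell = cell
        self.orientation = orientation
        self.lv[cell] = t


state = State(cell=(0, 0), orientation=Orientation.EAST, lv={(0, 0): 0})
state.visit((0, 1), Orientation.EAST, t=1)
```

Keeping LV inside the state mirrors the slide's listing of LV as a state component; a real implementation might store it separately for efficiency.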

SLIDE 34

Definitions

The Goal
The goal is to find a policy $\pi : S \to A$ that minimizes the cost function.
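The cost function CF is described only in words (average time between appearance and detection, weighted by the importance of each event). One possible reading is sketched below, assuming an importance-weighted mean over a log of detected events. The function name `cost_function`, the tuple layout, and the normalization by total importance are assumptions, not the authors' definition.

```python
from typing import Iterable, Tuple


def cost_function(events: Iterable[Tuple[float, float, float]]) -> float:
    """Importance-weighted mean detection delay.

    Each event is an (appearance_time, detection_time, importance) triple.
    Normalizing by the total importance is an assumption of this sketch.
    """
    total_weight = 0.0
    weighted_delay = 0.0
    for appeared, detected, importance in events:
        weighted_delay += importance * (detected - appeared)
        total_weight += importance
    if total_weight == 0:
        raise ValueError("need at least one event with non-zero importance")
    return weighted_delay / total_weight


# Two events: a delay of 10 with importance 1 and a delay of 20 with importance 3.
log = [(0, 10, 1.0), (5, 25, 3.0)]
cf = cost_function(log)  # (1*10 + 3*20) / (1 + 3) = 17.5
```

A policy that detects the more important events sooner lowers this value, which is the quantity the goal statement asks to minimize.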

SLIDES 36-59

Algorithm Overview

[Animation: the floor-plan example (closet, bathroom) stepped through 24 frames, with the numeric values shown on the cells changing from frame to frame.]

SLIDE 64

Learning

$$exp\_reward_{g,t} = (t - LV[g]) \times \sum_{\text{all } e} P_{e,g} \times imp_e \qquad (1)$$

$$pot\_reward_{g,t} = \sum_{\text{all } e} P_{e,g} \times imp_e \qquad (2)$$

Approximate pot_reward
  • Compute a new approximation of pot_reward (new_pot).
  • $pot\_reward := \alpha \times new\_pot + (1 - \alpha) \times pot\_reward$
  • No updates to zero; instead, decay over time.
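Equations (1) and (2) and the update rule can be sketched in a few lines. This is a minimal illustration, assuming the weight in the update rule is a learning rate (called `alpha` here) and that per-event probabilities and importances are held in dictionaries; the function names are hypothetical.

```python
from typing import Dict, Hashable


def pot_reward(p_event: Dict[Hashable, float], importance: Dict[Hashable, float]) -> float:
    """Equation (2): sum over events e of P(e appears in g) * imp_e, for one cell g."""
    return sum(p * importance[e] for e, p in p_event.items())


def exp_reward(t: int, last_visit: int, pot: float) -> float:
    """Equation (1): (t - LV[g]) * pot_reward for one cell g."""
    return (t - last_visit) * pot


def update_pot_reward(old_pot: float, new_pot: float, alpha: float) -> float:
    """Blend a fresh estimate into the old one: alpha * new_pot + (1 - alpha) * old_pot."""
    return alpha * new_pot + (1 - alpha) * old_pot


pot = pot_reward({"ball": 0.5}, {"ball": 2.0})            # 0.5 * 2.0 = 1.0
reward = exp_reward(t=30, last_visit=10, pot=pot)          # (30 - 10) * 1.0 = 20.0
decayed = update_pot_reward(pot, new_pot=0.0, alpha=0.25)  # 0.25 * 0.0 + 0.75 * 1.0 = 0.75
```

The last line shows the decay behaviour: an observation of zero pulls the estimate down to 0.75 rather than resetting it to zero.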

SLIDE 68

Planning

One-step greedy action selection
  • Set of actions: going to a different grid cell with one of the four orientations.
  • What to maximize: the sum of collected expected rewards per unit time.
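A rough sketch of one-step greedy selection follows. It makes several simplifying assumptions that are not from the talk: the grid is rectangular and obstacle-free, the robot moves horizontally and then vertically at one cell per time step, orientation is ignored, and the reward collected along a path is the sum of each cell's expected reward at the moment the robot enters it. The names `path_to` and `greedy_target` are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column)


def path_to(start: Cell, goal: Cell) -> List[Cell]:
    """Cells entered when moving horizontally first, then vertically."""
    (r, c), (gr, gc) = start, goal
    path = []
    while c != gc:
        c += 1 if gc > c else -1
        path.append((r, c))
    while r != gr:
        r += 1 if gr > r else -1
        path.append((r, c))
    return path


def greedy_target(start: Cell, t: int, lv: Dict[Cell, int], pot: Dict[Cell, float]) -> Cell:
    """Pick the destination with the highest collected expected reward per time step."""
    best_cell, best_rate = start, float("-inf")
    for goal in pot:
        path = path_to(start, goal)
        if not path:  # skip the cell the robot is already in
            continue
        # Each cell on the path is entered at time t + step.
        collected = sum(
            (t + step - lv[cell]) * pot[cell]
            for step, cell in enumerate(path, start=1)
        )
        rate = collected / len(path)
        if rate > best_rate:
            best_cell, best_rate = goal, rate
    return best_cell


# A 1 x 3 corridor in which the far cell has a much higher potential reward.
pot = {(0, 0): 1.0, (0, 1): 1.0, (0, 2): 5.0}
lv = {cell: 0 for cell in pot}
target = greedy_target(start=(0, 0), t=10, lv=lv, pot=pot)
```

In this example the far cell wins: travelling two steps collects 11 + 60 = 71 (35.5 per step), compared with 11 per step for the adjacent cell.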

SLIDE 70

Correctness Proof

  • With optimal planning, the cost function is minimized.
  • Maximizing exp_reward at individual cells minimizes CF.
  • Formal proof in [Ahmadi & Stone, 2005].

SLIDE 72

Simulation Results

[Figure: robot path over Regions 1-4, with waypoints A, B, C, D.]

The path the robot traverses when the ball's appearance is uniformly distributed. Average detection time: 106 seconds.

SLIDE 73

Simulation Results (cont.)

[Figure: robot path over Regions 1-4, with waypoints A, B, C, D.]

The path the robot traverses when the ball always appears in region 2. Average detection time: 47 seconds.

SLIDE 74

Simulation Results (cont.)

[Figure: robot path over Regions 1-4, with waypoints A, B, C, D.]

Biased distribution: the probability of the ball appearing is 60% in region 2, 30% in region 1, and 5% each in regions 3 and 4. Average detection time: 79 seconds.

SLIDE 75

Simulation Results (cont.)

Changing Distribution
After the switch from the previous (biased) distribution to a uniform distribution, the robot took about 9 loops to adapt to the new distribution.

SLIDE 76

Results from Real Robots

Movies!

SLIDE 81

Problem Overview

  • Multiple robots divide the sweeping area.
  • Goal: minimize the global cost function (fully cooperative). Equalized (weighted) average detection time among robots.
  • Team members change dynamically: robots are regularly added and removed.
  • The $P_{e,g}$ values still change dynamically.

SLIDE 83

Solution Framework

  • Each robot uses the single-agent algorithm in a limited region.
  • Continual negotiation at region boundaries.
  • New robots take a minimal area in their immediate neighborhood.
  • The area of a removed robot is initially taken by a neighbor.

SLIDE 87

Negotiation Algorithm Sketch

1. Periodically communicate visit intervals for boundary cells.
2. Consider “taking over” a neighbor’s worst cell.
  • Compute hypothetical plans, report visit intervals.
3. Single best neighboring offer accepted.
  • Biggest coverage improvement.
4. Repeat next cycle.
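The negotiation cycle can be illustrated with a heavily simplified toy model. The assumptions here are not from the talk: robots are ordered along a one-dimensional corridor, each region is a contiguous list of cell indices (so each pair of neighbors shares a single boundary), a robot's visit interval is modelled as its number of cells divided by its speed, and only one offer is accepted per cycle across the whole team. The names `visit_interval` and `negotiate_once` are hypothetical.

```python
from typing import List, Optional, Tuple


def visit_interval(cells: List[int], speed: float) -> float:
    """Assumed model: time to sweep a region once = number of cells / speed."""
    return len(cells) / speed


def negotiate_once(regions: List[List[int]], speeds: List[float]) -> bool:
    """One negotiation cycle over robots ordered along a corridor.

    Each robot considers taking over the boundary cell of each neighbor. The
    single offer with the largest reduction in the pair's worst visit interval
    is accepted. Returns True if a cell changed hands.
    """
    best: Optional[Tuple[float, int, int]] = None  # (gain, taker, giver)
    for i in range(len(regions) - 1):
        for taker, giver in ((i, i + 1), (i + 1, i)):
            if len(regions[giver]) <= 1:
                continue  # never leave a robot without any cell
            before = max(visit_interval(regions[taker], speeds[taker]),
                         visit_interval(regions[giver], speeds[giver]))
            # Hypothetical intervals if the boundary cell changed hands.
            after = max((len(regions[taker]) + 1) / speeds[taker],
                        (len(regions[giver]) - 1) / speeds[giver])
            gain = before - after
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, taker, giver)
    if best is None:
        return False
    _, taker, giver = best
    if taker < giver:  # taker is on the left: take the giver's leftmost cell
        regions[taker].append(regions[giver].pop(0))
    else:              # taker is on the right: take the giver's rightmost cell
        regions[taker].insert(0, regions[giver].pop())
    return True


regions = [[0, 1, 2, 3], [4, 5, 6, 7]]
speeds = [1.0, 3.0]  # the second robot is three times faster
while negotiate_once(regions, speeds):
    pass
```

Under this model the faster robot ends up with six of the eight cells, so both robots have a visit interval of 2. The real system negotiates over learned visit intervals and hypothetical plans rather than a cell count.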

SLIDE 88

Simulation Configuration I

[Figure: Regions 1-3 divided into Robot 1 and Robot 2 responsibility areas.]

2 homogeneous robots, uniform $P_{e,g}$ values.

SLIDE 89

3 homogeneous robots

[Figure: Regions 1-3 divided into Robot 1, Robot 2 and Robot 3 responsibility areas.]

Uniform $P_{e,g}$ values.

SLIDE 90

3 heterogeneous robots

[Figure: Regions 1-3 divided into Robot 1, Robot 2 and Robot 3 responsibility areas.]

Robot 3 moves at half speed. Time between visits: 54 s before negotiation, 50 s after.

SLIDE 91

3 homogeneous robots, non-uniform $P_{e,g}$ values

[Figure: Robot 1, Robot 2 and Robot 3 responsibility areas, with cell X marked.]

$P_{e,X}$ is 10 times greater. Average detection time: 48 s before negotiation, 32 s after.

SLIDE 92

3 homogeneous robots, non-uniform $P_{e,g}$ values

[Figure: Robot 1, Robot 2 and Robot 3 responsibility areas, with cell X marked.]

$P_{e,X}$ is 1000 times greater. Average detection time: 48 s before negotiation, 1 s after.

SLIDE 93

Simulation Configuration II

SLIDE 94

8 heterogeneous robots

[Figure: map partitioned into responsibility areas labeled by robot number, 1-8.]

Robot speeds range from 10 (robots 1 and 3) to 50 (robot 8).

SLIDE 95

Results from Real Robots

Movie!

SLIDE 96

Related Work

  • Kalra, Stentz, and Ferguson. Hoplites: A market framework for complex tight coordination in multi-agent teams. Robotics Institute, CMU.
  • Kurabayashi and Ota. Cooperative sweeping by multiple mobile robots. ICRA 1996.
  • Choset. Coverage for robotics: a survey of recent results. Annals of Math. and AI, 2001.
  • Parker. Distributed algorithms for multi-robot observation of multiple moving targets. Autonomous Robots, 2002.
  • Koenig, Szymanski, and Liu. Efficient and Inefficient Ant Coverage Methods. Annals of Math. and AI, 2001.

SLIDE 98

Conclusion and Future Work

Conclusion
  • Continuous area sweeping interesting and challenging.
  • Good initial progress.

Future Work
  • Non-greedy planning
  • Continuous representations
  • Better representation and analysis of noise
  • Reasoning about communicative connectivity

SLIDE 100

Acknowledgements

  • Joint work with Mazda Ahmadi
  • Built on UT Austin Villa robot soccer code: Kurt Dresner, Peggy Fidelman, Nate Kohl, Greg Kuhlmann, Mohan Sridharan, Dan Stronger, and others