SLIDE 1

CS344M Autonomous Multiagent Systems

Patrick MacAlpine
Department of Computer Science
The University of Texas at Austin

SLIDE 3

Good Afternoon, Colleagues

Are there any questions?

  • From last week: Difference between open and closed loop?

SLIDE 7

Logistics

  • Thesis defense Monday 11/30 at 10am: GDC 3.516

− Daniel Urieli: Autonomous Trading in Modern Electricity Markets

  • All grades should now be out
  • Extra credit for taking class survey (provide screenshot as proof)
  • Final projects due next week (team on Tuesday, report on Thursday)!

SLIDE 13

Class Tournament Teams TODO

  • Have penalty kick behavior ready
  • No ground truth measurements provided during games
  • 2D: You can create and compile in a custom banner (not required)
  • 3D: Make sure that you’re using a legal set of agent types
  • Include source code with a README
  • Include a log file of your team playing

SLIDE 17

Important Items for Final Reports

  • Have at least 3 citations (2 non-RoboCup)

− Citations include title, author(s), venue of publication, year
− For “RoboCup-X: Robot Soccer World Cup X” RoboCup symposium papers, editors are not authors!

  • Include some statistical significance test – you can run games in parallel on condor
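
One possible way to run such a test on game results, shown as a minimal sketch rather than a required method: a sign-flip permutation test on per-game goal differences (your team's goals minus the opponent's). The function name `sign_flip_p_value` and the sample data are hypothetical, and the test assumes the games are independent.

```python
import random


def sign_flip_p_value(goal_diffs, n_permutations=10000, seed=0):
    """Two-sided p-value for the null hypothesis that the mean
    per-game goal difference is zero (sign-flip permutation test)."""
    rng = random.Random(seed)
    n = len(goal_diffs)
    observed = abs(sum(goal_diffs) / n)
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis each difference is equally likely
        # to have either sign, so flip signs at random.
        flipped = sum(d if rng.random() < 0.5 else -d for d in goal_diffs)
        if abs(flipped / n) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical goal differences from ten games against a baseline team.
    diffs = [2, 1, 3, 2, 1, 2, 3, 1, 2, 2]
    print("p-value:", sign_flip_p_value(diffs))
```

A small p-value suggests the average goal difference is unlikely to be zero by chance. More games give the test more power, which is where running games in parallel helps.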

SLIDE 22

Paper Sections

  • Abstract: brief summary of what the paper is about and the results it will show
  • Introduction/Motivation: briefly discuss problems/ideas that will be addressed and why the topic/focus of the paper is important
  • Background: give technical background information necessary for understanding the paper
  • Methodology/Algorithm Description: explain the new ideas/algorithms that the paper is presenting

SLIDE 26

Paper Sections

  • Experimental Setup: detail the experimental setup used to test the ideas/algorithms/hypotheses in the paper
  • Results/Analysis: results and analysis of experiments
  • Related Work: work related to what has been presented, possibly comparing and contrasting it with the work presented in the paper
  • Summary/Conclusion: short summary of the work presented in the paper, possibly mentioning future work

SLIDE 29

Last week: Trading Agent Competition

  • Put forth as a benchmark problem for e-marketplaces [Wellman, Wurman, et al., 2000]
  • Autonomous agents act as travel agents

− Game: 8 agents, 12 min.
− Agent: simulated travel agent with 8 clients
− Client: TACtown ↔ Tampa within 5-day period

  • Auctions for flights, hotels, entertainment tickets

− Server maintains markets, sends prices to agents
− Agent sends bids to server over network

Goal: analytically calculate optimal bids

SLIDE 33

High-Level Strategy

  • Learn model of expected hotel price distributions
  • For each auction:

– Repeatedly sample price vector from distributions
– Bid avg marginal expected utility
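
The sampling loop above can be illustrated with a toy sketch. This is not the ATTac implementation: the package representation, the `sample_prices` callback, and all function names are assumptions made for the example. The marginal utility of a good is taken here as the best achievable utility when the good is already owned (price 0) minus the best achievable utility when the good is unavailable (infinite price).

```python
import random


def best_utility(packages, prices):
    """Best value-minus-cost over candidate packages, or 0 for buying nothing.
    packages: list of (tuple_of_goods, value) pairs (hypothetical format)."""
    best = 0.0
    for goods, value in packages:
        cost = sum(prices[g] for g in goods)
        best = max(best, value - cost)
    return best


def marginal_utility(packages, prices, good):
    """Utility gained from holding `good`, given one sampled price vector."""
    owned = dict(prices)
    owned[good] = 0.0            # the good is already ours, so it costs nothing
    barred = dict(prices)
    barred[good] = float("inf")  # the good cannot be obtained at all
    return best_utility(packages, owned) - best_utility(packages, barred)


def average_marginal_bid(packages, sample_prices, good, n_samples=1000, seed=0):
    """Average the marginal utility of `good` over sampled price vectors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += marginal_utility(packages, sample_prices(rng), good)
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical client: one night is worth 100, both nights are worth 180.
    packages = [(("night1",), 100.0), (("night1", "night2"), 180.0)]

    def sample_prices(rng):
        # Stand-in for a learned price distribution.
        return {"night1": rng.uniform(40, 60), "night2": rng.uniform(50, 70)}

    print(average_marginal_bid(packages, sample_prices, "night2"))
```

Bidding the average rather than the value under a single predicted price vector is what lets the bid reflect uncertainty in the other goods' prices.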

SLIDE 34

Finals

Team          Avg.    Adj.   Institution
ATTac         3622    4154   AT&T
livingagents  3670    4094   Living Systems (Germ.)
whitebear     3513    3931   Cornell
Urlaub01      3421    3909   Penn State
Retsina       3352    3812   CMU
CaiserSose    3074    3766   Essex (UK)
Southampton   3253∗   3679   Southampton (UK)
TacsMan       2859    3338   Stanford

  • ATTac improves over time
  • livingagents is an open-loop strategy

SLIDE 35

Other TAC competitions

  • Supply Chain Management
  • Ad Auctions
  • Power

SLIDE 39

Reading Overview — Vidal and Durfee

Recursive Modeling Method

  • What should I do?
  • What should I do given what I think you’ll do?
  • What should I think you’ll do given what I think you think I’ll do?
  • etc.
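
The nesting in these questions can be sketched as depth-limited recursion in a two-player matrix game. This is a simplified level-k illustration, not Vidal and Durfee's algorithm: it assumes that at depth 0 the agent has no model and treats the other player as uniformly random, and all names are hypothetical.

```python
def best_response(payoff, other_dist):
    """Index of the action with the highest expected payoff.
    payoff[a][b] is my payoff when I play a and the other player plays b."""
    expected = [
        sum(p * row[b] for b, p in enumerate(other_dist)) for row in payoff
    ]
    return max(range(len(expected)), key=expected.__getitem__)


def recursive_choice(my_payoff, other_payoff, depth):
    """Choose an action by modeling the other player `depth` levels deep.
    other_payoff[b][a] is the other player's payoff, indexed from their side."""
    n_other = len(other_payoff)
    if depth == 0:
        # No model of the other player: assume uniformly random play.
        other_dist = [1.0 / n_other] * n_other
    else:
        # Swap roles and ask what the other player would do one level down.
        predicted = recursive_choice(other_payoff, my_payoff, depth - 1)
        other_dist = [1.0 if b == predicted else 0.0 for b in range(n_other)]
    return best_response(my_payoff, other_dist)


if __name__ == "__main__":
    # Rock (0), paper (1), scissors (2); symmetric, so both players share it.
    rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
    for depth in range(4):
        print(depth, recursive_choice(rps, rps, depth))
```

In this game the chosen action cycles as the depth grows, so deeper modeling costs more computation without settling on a better answer.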

SLIDE 45

Prediction Method

  • Watch for patterns of others

− Might have incorrect expectations, especially if environment changes

  • Use deeper models

− Includes physical and mental states
− Could be computationally expensive

  • Trade-off between time and performance gain
  • When is it worthwhile to model deeper?

SLIDE 46

Lessons

  • Modeling can help
  • There is a lot of useless information in recursive models
  • Approximations (limited rationality) can be useful

SLIDE 50

PLASTIC-policy for Ad Hoc Teamwork

  • Forced to work with a group of unknown teammates on the half field offense (HFO) task
  • Start with learned models of prior teammates - FQI (fitted Q iteration)
  • Select the model that is believed to be closest to the current teammate(s) - polynomial weights algorithm from regret minimization
  • Plan using the selected model to perform well on the task
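
The model-selection step can be sketched as a polynomial weights update over the stored teammate models. This is an illustration under stated assumptions, not the PLASTIC-policy code: each model is assumed to receive a loss in [0, 1] after every observed teammate action (for example, one minus the probability the model assigned to that action), `eta` is an assumed learning rate, and the function names are hypothetical.

```python
def update_beliefs(beliefs, losses, eta=0.1):
    """One polynomial weights step.
    beliefs: dict mapping model name to probability (sums to 1).
    losses: dict mapping model name to a loss in [0, 1] for the last observation."""
    updated = {m: b * (1.0 - eta * losses[m]) for m, b in beliefs.items()}
    total = sum(updated.values())
    # Renormalize so the beliefs remain a probability distribution.
    return {m: b / total for m, b in updated.items()}


def select_model(beliefs):
    """Pick the model currently believed most likely to match the teammates."""
    return max(beliefs, key=beliefs.get)


if __name__ == "__main__":
    beliefs = {"team_a": 0.5, "team_b": 0.5}
    # Hypothetical losses: team_a predicted the observed action well, team_b did not.
    for _ in range(5):
        beliefs = update_beliefs(beliefs, {"team_a": 0.1, "team_b": 0.9})
    print(select_model(beliefs), beliefs)
```

Because each update only shrinks a weight by a bounded factor, a single surprising observation cannot eliminate a model outright.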

SLIDE 53

Where do Models Come From

Observation:

  • Tambe and RMM: use existing model

– No building a model

What if we can’t build a full model in advance?

  • What are some incremental approaches for building a predictive model?

SLIDE 58

Play me at RoShamBo

  • Rock beats scissors
  • Scissors beats paper
  • Paper beats rock
  • What is your strategy before modeling me?
  • What is your strategy after modeling me?
  • Am I modeling you?
  • Would your end strategy change if I can model you?
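
One simple way to make "after modeling me" concrete is a frequency-count model that is built incrementally from observed moves. This is a hypothetical sketch, not a strategy from the readings; the class and method names are invented for the example.

```python
import random
from collections import Counter

# Maps each move to the move that beats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


class FrequencyModel:
    """Predicts the opponent's most frequent past move and counters it."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, opponent_move):
        """Update the model with one more observed move."""
        self.counts[opponent_move] += 1

    def predict(self):
        """Most frequent opponent move so far, or None with no data."""
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def choose(self, rng):
        """Play uniformly at random before modeling, then counter the prediction."""
        predicted = self.predict()
        if predicted is None:
            return rng.choice(list(BEATS))
        return BEATS[predicted]


if __name__ == "__main__":
    rng = random.Random(0)
    model = FrequencyModel()
    for move in ["rock", "rock", "paper", "rock"]:
        print("my move:", model.choose(rng), "| opponent played:", move)
        model.observe(move)
```

Once this player has data it becomes deterministic, so an opponent who is modeling it in turn can predict and beat it. Uniformly random play cannot be exploited that way, although it also never exploits a predictable opponent.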

SLIDE 62

Discussion

  • How do you deal with a teammate/opponent who is adapting to you as well?

  • Applications of ad hoc teamwork?
  • What if there was communication?
  • How would you build an ad hoc teammate?
