CS344M Autonomous Multiagent Systems
Patrick MacAlpine
Department of Computer Science
The University of Texas at Austin
Good Afternoon, Colleagues
Are there any questions?
Logistics
- Questions about the syllabus?
- Class registration
- Problems with the assignment?
- Piazza and Canvas — announcements yesterday
- Last week’s slides are up
- Next week’s readings are up:
− Brooks’ reactive robots
− A more deliberative architecture
− RoboCup challenge paper
- Seating arrangement
Thermostats
- Are they agents or not?
- How does Wooldridge resolve this?
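As a discussion aid, a thermostat can be sketched as a fixed rule from one percept to one action, with no memory, goals, or learning. The function name, the action labels, and the 20 °C setpoint below are illustrative assumptions, not taken from the readings.

```python
def thermostat(temperature_c, setpoint_c=20.0):
    """Map a single percept (temperature) directly to an action.

    Hypothetical rule: heat whenever the room is below the setpoint.
    """
    return "heat_on" if temperature_c < setpoint_c else "heat_off"
```

Whether a rule this simple counts as an agent is exactly the question the slide raises; the sketch only makes the sense-then-act structure concrete.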
Intelligent (autonomous) Agents
- Autonomous robot
- Information gathering agent
− Find me the cheapest?
- E-commerce agents
− Decides what to buy/sell and does it
- Air-traffic controller
- Meeting scheduler
- Computer-game-playing agent
Not Intelligent Agents
- Thermostat
- Telephone
- Answering machine
- Pencil
- Java object
Your Agent Examples
- Simple home alarm; cat food dispenser
- Software: anti-virus/malware agent; spam filter; web crawler; iOS autocorrect daemon
- Automotive: smart keys; digital highway speed sign; traffic light with sensors; autonomous car; cruise control
- Telecom: GPS device; cell phone
- Physical Control: Roomba; lawn watering system
- Health: pacemaker
- Game/Entertainment: chess player; first-person shooter AI
An Example
- You, as a class, act as a learning agent
- Actions: Wave, Stand, Clap
- Observations: colors, reward
- Goal: Find an optimal policy
− Way of selecting actions that gets you the most reward
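One possible way to approach the exercise, sketched under strong assumptions: keep a running average of the reward seen for each (observation, action) pair, usually pick the action with the best average, and occasionally try a random one. The class name `AveragingAgent`, the `HIDDEN_BEST` table, and the 0/1 rewards are hypothetical. The sketch also assumes reward depends only on the current color and action, which need not be true of the in-class world.

```python
import random
from collections import defaultdict

ACTIONS = ["Wave", "Stand", "Clap"]

# Hypothetical hidden rule the agent must discover (not from the slides).
HIDDEN_BEST = {"Blue": "Wave", "Red": "Clap", "Green": "Stand"}


class AveragingAgent:
    """Tracks the average reward per (observation, action) pair."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon            # probability of a random action
        self.rng = random.Random(seed)
        self.total = defaultdict(float)   # summed reward per pair
        self.count = defaultdict(int)     # number of tries per pair

    def value(self, obs, action):
        n = self.count[(obs, action)]
        return self.total[(obs, action)] / n if n else 0.0

    def greedy(self, obs):
        """The learned policy: best-known action for this observation."""
        return max(ACTIONS, key=lambda a: self.value(obs, a))

    def act(self, obs):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)   # explore
        return self.greedy(obs)               # exploit

    def learn(self, obs, action, reward):
        self.total[(obs, action)] += reward
        self.count[(obs, action)] += 1


def run(steps=2000, seed=0):
    """Observe a color, act, receive a reward, update; repeat."""
    rng = random.Random(seed)
    agent = AveragingAgent(epsilon=0.1, seed=seed)
    for _ in range(steps):
        obs = rng.choice(list(HIDDEN_BEST))
        action = agent.act(obs)
        reward = 1.0 if action == HIDDEN_BEST[obs] else 0.0
        agent.learn(obs, action, reward)
    return agent
```

After `agent = run()`, calling `agent.greedy("Red")` returns the action the agent currently believes is best for a red observation. If the world has hidden state, so that the same color can call for different actions, this simple table is not enough.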
How did you do it?
- What is your policy?
- What does the world look like?
Formalizing My Example
Knowns:
- O = {Blue, Red, Green, Yellow, ...}
- Rewards in ℝ
- A = {Wave, Clap, Stand}
- o_0, a_0, r_0, o_1, a_1, r_1, o_2, ...
Unknowns:
- S = 4×3 grid
- R : S × A → ℝ
- P : S → O
- T : S × A → S
- o_i = P(s_i)
- r_i = R(s_i, a_i)
- s_{i+1} = T(s_i, a_i)
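The formal model above can be written out as a small simulator. This is only a sketch: the slides give the types of P, R, and T and the 4×3 state space, but the particular coloring, moves, and reward values below are invented for illustration, as is the helper name `rollout`.

```python
from itertools import product

STATES = list(product(range(4), range(3)))     # S: 4x3 grid of (col, row)
ACTIONS = ["Wave", "Clap", "Stand"]            # A
COLORS = ["Blue", "Red", "Green", "Yellow"]    # subset of O


def P(s):
    """Observation function P : S -> O (hypothetical coloring)."""
    col, row = s
    return COLORS[(col + row) % len(COLORS)]


def T(s, a):
    """Transition function T : S x A -> S (hypothetical moves)."""
    col, row = s
    if a == "Wave":
        col = min(col + 1, 3)      # move right, clipped at the edge
    elif a == "Clap":
        row = min(row + 1, 2)      # move up, clipped at the edge
    else:                          # "Stand" resets to the start cell
        col, row = 0, 0
    return (col, row)


def R(s, a):
    """Reward function R : S x A -> reals (hypothetical values)."""
    return 10.0 if s == (3, 2) and a == "Stand" else -1.0


def rollout(policy, s0=(0, 0), steps=5):
    """Generate o_0, a_0, r_0, o_1, a_1, r_1, ... as (o, a, r) triples."""
    s, history = s0, []
    for _ in range(steps):
        o = P(s)            # o_i = P(s_i)
        a = policy(o)       # the agent sees only the observation
        r = R(s, a)         # r_i = R(s_i, a_i)
        history.append((o, a, r))
        s = T(s, a)         # s_{i+1} = T(s_i, a_i)
    return history
```

For example, `rollout(lambda o: "Wave", steps=3)` yields three (observation, action, reward) triples. In this sketch several grid cells share a color, so the observation alone does not identify the state; that is one reason S, P, R, and T are listed as unknowns from the agent's point of view.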