SLIDE 1

Incorporating Learning in BDI Agents

Stéphane Airiau (1), Lin Padgham (2), Sebastian Sardina (2), Sandip Sen (3)

(1) ILLC - University of Amsterdam
(2) RMIT University, Melbourne, Australia
(3) University of Tulsa, OK, USA

ALAMAS+ALAg 2008
Workshop at AAMAS

Estoril, Portugal, 2008

SLIDE 2

BDI Agents in a nutshell

Belief, Desire, Intentions:
- Beliefs: knowledge about the world and the agent's own internal state
- Desires (or goals): what the agent has decided to work towards achieving
- Intentions: how the agent has decided to tackle these goals

No planning from first principles: agents use a plan library

(library of partially instantiated plans to be used to achieve the goals)

Practical reasoning agents: quickly reason and react to asynchronous events.
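To make the three attitudes concrete, here is a minimal Python sketch (an illustration only, not any particular BDI platform): the agent's state is just its beliefs, its goals, its adopted intentions and a static plan library.

```python
# Minimal sketch of BDI agent state (illustrative, not a real BDI engine).
from dataclasses import dataclass, field

@dataclass
class AgentState:
    beliefs: dict = field(default_factory=dict)        # knowledge about the world / internal state
    goals: list = field(default_factory=list)          # desires the agent works towards
    intentions: list = field(default_factory=list)     # plan instances adopted to tackle the goals
    plan_library: list = field(default_factory=list)   # partially instantiated plans (no planning from scratch)

state = AgentState(beliefs={"battery_low": False}, goals=["deliver_package"])
```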

SLIDE 3

Definition (Plan): e : ψ ← P, where
- e is an event that triggers the plan
- ψ is the context condition under which the plan can be applied
- P is the body of the plan (a succession of actions and/or subgoals)

Goal-plan tree

[Goal-plan tree figure: goal G can be handled by plan P1 or P2 (OR); P1 decomposes into subgoals SG1 and SG2 (AND), achieved by plans P3 and P4; P2 has subgoal SG3, achieved by plan P5. Pi: plans, G: goal, SGi: subgoals.]

Failure recovery: when a step fails, causing a plan to fail, an alternative plan is tried.

e.g. if both P1 and P2 are applicable and P4 fails, P2 can be tried.
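Below is a minimal Python sketch of this plan structure and of failure recovery, under the assumption that a plan is simply a triple (trigger event, context condition, body) and that the body reports success or failure; the names are illustrative, not the authors' implementation.

```python
# Sketch of a plan e : psi <- P and of failure recovery over applicable plans.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Plan:
    event: str                        # e: the event that triggers the plan
    context: Callable[[dict], bool]   # psi: whether the plan is applicable given the beliefs
    body: Callable[[dict], bool]      # P: runs actions/subgoals, returns True on success

def achieve(event: str, beliefs: dict, library: List[Plan]) -> bool:
    """Try applicable plans in turn; if one fails, fall back to an alternative."""
    applicable = [p for p in library if p.event == event and p.context(beliefs)]
    for plan in applicable:           # e.g. if P1 (via P4) fails, P2 is tried next
        if plan.body(beliefs):
            return True
    return False                      # no alternative left: the goal fails
```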


SLIDE 5

BDI execution algorithm

1. Take the next event (internal/external)
2. Modify any goals, beliefs, intentions (a new event may cause an update of the beliefs, causing a modification of the goals and/or intentions)
3. Select an applicable plan to respond to this event
4. Place this plan in the intention base
5. Take the next step on a selected intention (may execute an action or generate a new event); a toy sketch of this cycle is given below

[BDI architecture diagram: input events enter the (dynamic) event queue; the reasoning/deliberation cycle consults the (dynamic) beliefs and the (static) plan library, updates the (dynamic) intentions, and outputs actions.]

BDI agents are well suited for complex applications with soft real-time reasoning and control requirements.
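As a concrete illustration, a toy version of the 5-step cycle is sketched below; the Plan structure and event format are assumptions made for the example, not the authors' implementation (execution is single-intention and non-interleaved to keep the sketch short).

```python
# Toy sketch of the BDI execution cycle (steps 1-5 above), illustrative only.
from collections import deque
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Plan:
    event: str
    context: Callable[[dict], bool]
    body: List[Callable[[dict], Optional[dict]]]        # each step may post a new event

def bdi_cycle(event_queue: deque, beliefs: dict, plan_library: List[Plan]) -> None:
    intentions: List[Plan] = []
    while event_queue:
        event = event_queue.popleft()                    # 1. take the next event
        beliefs.update(event.get("belief_updates", {}))  # 2. revise beliefs (and hence goals/intentions)
        applicable = [p for p in plan_library
                      if p.event == event["type"] and p.context(beliefs)]
        if not applicable:
            continue                                     # no plan can respond to this event
        plan = applicable[0]                             # 3. select an applicable plan
        intentions.append(plan)                          # 4. place it in the intention base
        for step in plan.body:                           # 5. take the next step(s) of the intention:
            new_event = step(beliefs)                    #    execute an action or post a new (sub)goal event
            if new_event is not None:
                event_queue.append(new_event)
        intentions.remove(plan)
```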





SLIDE 9

Issues

- BDI agents lack learning capabilities to modify their behavior (e.g. in case of frequent failures).
- Plans and context conditions are programmed by a user.

In a complex environment, context conditions may be hard to capture precisely:
- too loose: the plan is deemed applicable when it is not → failures
- too tight: the plan is deemed not applicable when it actually is → a goal may not appear achievable when it is

Research goal: add learning capabilities to adapt and refine the context conditions of plans.

A first step: use a decision tree (DT) in addition to the context condition; each plan has a decision tree telling whether it is applicable.
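One way to realise this first step is sketched below: each plan keeps a decision tree over the world attributes, trained on its past successes and failures, and consulted alongside the hand-written context condition. The use of scikit-learn's DecisionTreeClassifier and the accessor names are assumptions for illustration; the slides do not prescribe a particular implementation.

```python
# Sketch: a per-plan decision tree estimating the plan's applicability.
from sklearn.tree import DecisionTreeClassifier

class LearnedApplicability:
    def __init__(self):
        self.tree = DecisionTreeClassifier()
        self.X, self.y = [], []          # past world states and success (1) / failure (0) labels
        self.trained = False

    def record(self, world_state, succeeded: bool) -> None:
        self.X.append(list(world_state))
        self.y.append(1 if succeeded else 0)

    def refit(self) -> None:
        if len(set(self.y)) > 1:         # need at least one success and one failure
            self.tree.fit(self.X, self.y)
            self.trained = True

    def success_probability(self, world_state) -> float:
        if not self.trained:
            return 1.0                   # before learning, rely on the programmed context condition
        # classes_ is [0, 1] here because refit() requires both labels to be present
        return float(self.tree.predict_proba([list(world_state)])[0][1])

def applicable(plan_context, learned: LearnedApplicability, world_state, threshold=0.5) -> bool:
    # hand-written context condition AND learned estimate of success
    return plan_context(world_state) and learned.success_probability(world_state) >= threshold
```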

SLIDE 10

Example of a DT

The environment is described by three boolean attributes: a, b and c.

[Decision-tree figure: internal nodes test the boolean attributes a, b and c (true/false branches); each leaf records the observed counts of successes (+) and failures (-), e.g. 110+/0-, 40+/10-, 20+/5-, 4+/25-, 1+/35-, 1+/50-.]

Context condition converted from the decision tree: (a ∧ b) ∨ (a ∧ ¬b ∧ c) ∨ (a ∧ ¬b ∧ ¬c).
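The conversion can be read directly off the tree: every root-to-leaf path ending in a predominantly positive leaf contributes one conjunct to a DNF formula. The nested-dict tree format and the leaf counts below are illustrative assumptions, not the representation used in the slides.

```python
# Sketch: turning a decision tree into a DNF context condition.
def positive_paths(node, path=()):
    """Yield the attribute literals along each path that ends in a mostly-positive leaf."""
    if "attribute" not in node:                          # leaf: success/failure counts
        if node["pos"] > node["neg"]:
            yield path
        return
    attr = node["attribute"]
    yield from positive_paths(node["true"], path + ((attr, True),))
    yield from positive_paths(node["false"], path + ((attr, False),))

def to_dnf(tree) -> str:
    conjuncts = [" ∧ ".join(a if v else "¬" + a for a, v in p) or "true"
                 for p in positive_paths(tree)]
    return " ∨ ".join("(" + c + ")" for c in conjuncts) if conjuncts else "false"

# Illustrative tree over the attributes a, b, c (counts chosen for the example)
example = {"attribute": "a",
           "true":  {"attribute": "b",
                     "true":  {"pos": 40, "neg": 10},
                     "false": {"attribute": "c",
                               "true":  {"pos": 110, "neg": 0},
                               "false": {"pos": 20, "neg": 5}}},
           "false": {"pos": 1, "neg": 50}}
print(to_dnf(example))   # (a ∧ b) ∨ (a ∧ ¬b ∧ c) ∨ (a ∧ ¬b ∧ ¬c)
```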




SLIDE 13

Learning Issues

When to collect data?

In case of failure:
- did the failure occur because the current plan was not applicable? → Correct data
- did it fail because other plans below it were mistakenly chosen? → Incorrect data
(one possible recording rule is sketched at the end of this slide)

[Goal-plan tree figure: top-level goal G0 handled by plan P01 or P02 (OR); P01 has subgoals SG1, SG2, SG3 (AND), with candidate plans P11/P12, P21/P22 and P31; P02 has subgoals SG4 and SG5, with plans P41 and P51. Pi: plans, Gi: goals, SGi: subgoals.]

When to start using the decision tree?
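For the data-collection question, one possible recording rule is sketched below (an assumption for illustration, not the exact criterion used in the slides): a failure counts as a training example for a plan only if every sub-plan choice it made was already backed by a reasonably trained DT, so the failure is unlikely to come from a mistaken choice lower in the tree.

```python
# Sketch: only record a failure for a plan's DT when the choices below it were reliable.
MIN_INSTANCES = 20   # assumed threshold for considering a sub-plan's DT "well trained"

def should_record_failure(executed_subplans) -> bool:
    return all(sub.num_training_instances >= MIN_INSTANCES for sub in executed_subplans)

def on_plan_outcome(world_state, succeeded, executed_subplans, dataset) -> None:
    if succeeded or should_record_failure(executed_subplans):
        dataset.append((world_state, succeeded))   # usable (likely correct) example for this plan
    # otherwise discard: the failure may stem from mistaken plan choices below
```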

SLIDE 14

Initial Experiments

Three mechanisms for plan selection:
- CL: all trees are learnt at the same time; all data is used
- BU (bottom-up learning): DTs higher in the hierarchy wait for the DTs below to be formed
- PS (probabilistic selection): plans are selected according to the frequency of success provided by the decision tree (a sketch follows below)

When to start using the DT:

- after k instances have been observed, for CL and BU (k large)
- after a few instances, for PS (5-10, to have an initial DT)
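A minimal sketch of the PS mechanism, assuming each plan exposes a success_probability() estimate from its decision tree (an assumed interface, not an API from the slides):

```python
# Sketch: probabilistic selection among applicable plans.
import random

def probabilistic_select(applicable_plans, world_state):
    # weight each plan by its DT-estimated success frequency; keep a small floor for exploration
    weights = [max(p.success_probability(world_state), 1e-3) for p in applicable_plans]
    return random.choices(applicable_plans, weights=weights, k=1)[0]
```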

SLIDE 15

Initial Results

Setup: 17 plans; the world state is defined by six boolean attributes; the depth of the goal-plan tree is 4. All context conditions are set to true. k = 100.

[Plot: frequency of success vs. number of instances (200 to 1400) for probabilistic selection (PS), BU and CL, in a non-deterministic world where an action may fail with probability 0.1.]

SLIDE 16

Conclusion
- Though in theory one needs to wait for the DTs below to be accurate before collecting data for a DT higher up, in practice the DTs handle the spurious data as noise.
- Using PS, the context conditions are learnt faster and are accurate.

Future Work
- Test on larger goal-plan trees.
- Try better criteria for deciding when to start using the DTs.

SLIDE 17

Contacts

Stéphane Airiau: stephane@illc.uva.nl
Lin Padgham: lin.padgham@rmit.edu.au
Sebastian Sardina: sebastian.sardina@rmit.edu.au
Sandip Sen: sandip@utulsa.edu
