CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017
“Artificial Intelligence: A Modern Approach”, 3rd Edition, Chapter 4
Soleymani
Searching in non-deterministic, partially observable and unknown environments
Fully observable & deterministic: the agent knows exactly its state even after a sequence of actions. Solution is a sequence of actions.
Sensorless (conformant): the agent's percepts provide no information at all. Solution is still a sequence of actions.
Partially observable and/or non-deterministic: percepts provide new information about the current state. Solution can be a contingency plan (tree or strategy) and not a sequence. Often interleave search and execution.
Perception becomes useful:
In partially observable environments, to narrow down the set of possible states the agent may be in.
In non-deterministic environments, to show which outcome of the action has occurred.
Future percepts cannot be determined in advance, so the solution is a contingency plan: a tree composed of nested if-then-else statements specifying what to do depending on which percepts are received.
Now, we focus on an agent design that finds a guaranteed plan
Future percepts can specify which outcome has occurred.
RESULTS: S × A → 2^S instead of RESULTS: S × A → S
Solution will be a sub-tree of the AND-OR search tree, i.e., a contingency plan.
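For example, in the textbook's erratic vacuum world (where Suck may also clean the adjacent square, or deposit dirt on an already clean one), a solution starting from state 1 is the contingency plan [Suck, if State = 5 then [Right, Suck] else []].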
States: {1, 2, …, 8}
Actions: {Left, Right, Suck}
Goal: {7} or {8} (no dirt left)
A solution for an AND-OR search problem is a subtree that: specifies one action at each OR node; includes every outcome at each AND node; has a goal node at every leaf.
The AND-OR graph can be explored depth-first, breadth-first, best-first, with A*, … (a depth-first sketch follows below).
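A minimal sketch of depth-first AND-OR graph search, in the spirit of the algorithm in AIMA Chapter 4; the problem interface (initial_state, is_goal, actions, results returning a set of outcome states) is an assumption made for illustration.

def and_or_graph_search(problem):
    # Returns a conditional plan, or None on failure.
    return or_search(problem, problem.initial_state, [])

def or_search(problem, state, path):
    if problem.is_goal(state):
        return []                          # empty plan: already at a goal
    if state in path:
        return None                        # repeated state on this path -> failure
    for action in problem.actions(state):
        plan = and_search(problem, problem.results(state, action), [state] + path)
        if plan is not None:
            return [action, plan]          # plan tells what to do for each outcome
    return None

def and_search(problem, states, path):
    # A sub-plan is required for EVERY possible outcome state.
    plans = {}
    for s in states:
        plan = or_search(problem, s, path)
        if plan is None:
            return None
        plans[s] = plan                    # read as: "if outcome is s then plans[s]"
    return plans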
The algorithm returns with failure when the current state is identical to a state on the path from the root.
This does not lose completeness: if a non-cyclic solution exists, it is still reachable from the earlier occurrence of that state, so the repeated node can be discarded.
Termination is guaranteed in finite state spaces: every path reaches a goal, a dead-end, or a repeated state.
Some problems have no acyclic solution, e.g., the slippery vacuum world, where movement actions sometimes fail.
Cyclic plan: keep on trying an action until it works.
[Suck, L1: Right, if State = 5 then L1 else Suck]
Or equivalently: [Suck, while State = 5 do Right, Suck]
The agent is in one of several possible states, and thus an action may lead to one of several possible outcome states.
Initial belief state: {1, 2, 3, 4, 5, 6, 7, 8}
Solution: [Right, Suck, Left, Suck]
The belief-state space is fully observable: the agent always knows its own belief state.
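A minimal sketch that checks this conformant plan by simulating belief states in a deterministic vacuum world; the tuple encoding (agent location, dirt at A, dirt at B) is a hypothetical choice made for illustration rather than the 1-8 numbering above.

from itertools import product

def result(state, action):
    # Deterministic physical transition model of the two-square vacuum world.
    loc, dirt_a, dirt_b = state
    if action == 'Left':
        return ('A', dirt_a, dirt_b)
    if action == 'Right':
        return ('B', dirt_a, dirt_b)
    if action == 'Suck':
        return (loc, False, dirt_b) if loc == 'A' else (loc, dirt_a, False)

def predict(belief, action):
    # Belief-state transition: apply the action to every state in the belief.
    return {result(s, action) for s in belief}

belief = set(product(['A', 'B'], [True, False], [True, False]))  # all 8 states
for a in ['Right', 'Suck', 'Left', 'Suck']:
    belief = predict(belief, a)
print(belief)  # {('A', False, False)}: both squares clean, so the plan is safe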
Illegal actions?! E.g., b = {s1, s2} with ACTIONS_P(s1) ≠ ACTIONS_P(s2)
If illegal actions have no effect on the environment: take the union of the physical actions. If illegal actions are not legal at all: take the intersection of the physical actions.
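A minimal sketch of both choices, assuming the underlying physical problem exposes a function actions_p(s) (the name is an assumption for illustration).

def actions_union(belief, actions_p):
    # Illegal actions are harmless no-ops: any action legal somewhere is allowed.
    acts = set()
    for s in belief:
        acts |= set(actions_p(s))
    return acts

def actions_intersection(belief, actions_p):
    # Illegal actions are forbidden: only actions legal in every state are allowed.
    states = list(belief)
    acts = set(actions_p(states[0]))
    for s in states[1:]:
        acts &= set(actions_p(s))
    return acts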
[Figure: predicting the next belief state. With deterministic actions, b = {s1, s2, s3} maps to b' = {s'1, s'2, s'3}; with nondeterministic actions it maps to a larger set, e.g. {s'1, s'2, s'3, s'4, s'5}.]
Deterministic actions: b' = {s' : s' = RESULTS_P(s, a) and s ∈ b}
Nondeterministic actions: b' = ⋃_{s ∈ b} RESULTS_P(s, a)
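A minimal sketch of this prediction step; result_p and results_p stand for the physical problem's deterministic or nondeterministic transition model and are assumptions for illustration.

def predict_deterministic(belief, action, result_p):
    # Each state has exactly one successor.
    return {result_p(s, action) for s in belief}

def predict_nondeterministic(belief, action, results_p):
    # Union of all possible outcomes of the action over the belief state.
    new_belief = set()
    for s in belief:
        new_belief |= set(results_p(s, action))
    return new_belief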
Examples: the initial state; it is on B & A is clean; it is on A; it is on A & A is clean.
Total number of possible belief states? 2^8 = 256. Number of reachable belief states? 12.
Partition the belief state according to the possible perceptions
E.g., local sensing vacuum world
After each perception, the belief state can contain at most two states (only the dirt status of the other square remains unknown).
Prediction stage: how does the belief state change after doing action a?
Deterministic actions: b̂ = PREDICT(b, a) = {RESULTS_P(s, a) : s ∈ b}
Nondeterministic actions: b̂ = PREDICT(b, a) = ⋃_{s ∈ b} RESULTS_P(s, a)
Possible percepts: what are the possible percepts in a belief state?
Update stage: how is the belief state updated after a perception?
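A minimal sketch of the three stages, assuming results_p(s, a) gives the set of physical outcomes and percept(s) gives the (deterministic) percept received in state s; both names are assumptions for illustration.

def predict(belief, action, results_p):
    # Prediction stage: possible states after doing the action.
    b_hat = set()
    for s in belief:
        b_hat |= set(results_p(s, action))
    return b_hat

def possible_percepts(b_hat, percept):
    # Possible percepts: percepts that could be observed in the predicted belief.
    return {percept(s) for s in b_hat}

def update(b_hat, observation, percept):
    # Update stage: keep only the states consistent with the observed percept.
    return {s for s in b_hat if percept(s) == observation}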
Based on the received percept, either the then-part or the else-part of the plan is executed.
The agent's belief state is updated when performing actions and when receiving percepts.
Maintaining the belief state is a core function of any intelligent system
E.g., percepts=NW means there are obstacles to the north and west
Move action randomly chooses among {Right, Left, Up, Down}
b0: all squares are possible. Percept: NSW → b1 = UPDATE(b0, NSW)
Execute action a = Move → b̂1 = PREDICT(b1, a)
Percept: NS → b2 = UPDATE(b̂1, NS)
In this example, we had only one action and so we did not need to plan
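A self-contained toy version of this localization example; the grid, the NSWE obstacle sensor, and the nondeterministic Move semantics below are illustrative assumptions, not the exact maze on the slide.

FREE = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 0)}   # hypothetical maze cells

def percept(s):
    # Report obstacles to the North/South/East/West of square s.
    x, y = s
    p = ''
    if (x, y + 1) not in FREE: p += 'N'
    if (x, y - 1) not in FREE: p += 'S'
    if (x + 1, y) not in FREE: p += 'E'
    if (x - 1, y) not in FREE: p += 'W'
    return p

def results_move(s):
    # Move goes to some adjacent free square, chosen nondeterministically.
    x, y = s
    nbrs = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return {n for n in nbrs if n in FREE} or {s}

def predict(b):
    return set().union(*(results_move(s) for s in b))

def update(b, o):
    return {s for s in b if percept(s) == o}

b0 = set(FREE)               # the robot could be anywhere
b1 = update(b0, 'NSW')       # first percept narrows the belief
b_hat1 = predict(b1)         # predict the effect of one Move
b2 = update(b_hat1, 'NS')    # second percept narrows it again
print(b1, b2)                # each belief collapses to a single square here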
Interleaving search and execution (online search) is necessary in unknown environments, useful in dynamic and semi-dynamic environments, and saves computational resources in non-deterministic domains (by focusing only on the contingencies that actually arise).
There is a tradeoff between finding a guaranteed plan in advance (so as not to get stuck in an undesirable state) and planning only for what actually happens.
Examples: a robot in a new environment must explore to produce a map; a new-born baby; autonomous vehicles.
RESULTS(s, a) is found by actually being in s and doing a. By filling in the RESULTS map table, the map of the environment is found.
Also, we assume the agent knows ACTIONS(s) and the step cost c(s, a, s').
Competitive ratio (the cost of the path the agent actually travels, compared to the path it would travel if it knew the space in advance): smaller values are more desirable.
Dead-end state: no goal state is reachable from it
Irreversible actions can lead to a dead-end state.
Safely explorable state space: a goal state is achievable from every reachable state.
An offline algorithm can expand a node anywhere in the state space; an online agent can only expand a node at its current physical location.
Online search therefore expands nodes in a local order, interleaving search and execution (e.g., a depth-first exploration, as sketched below).
Physical backtracking: the agent goes back to the state from which it most recently entered the current state.
This works only for state spaces with reversible actions.
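A minimal sketch of an online depth-first exploration agent with physical backtracking, in the spirit of AIMA's ONLINE-DFS-AGENT; it assumes deterministic actions, a safely explorable space with reversible actions, and a caller-supplied actions(s) and is_goal(s) interface (these names are assumptions for illustration).

class OnlineDFSAgent:
    def __init__(self, actions, is_goal):
        self.actions = actions
        self.is_goal = is_goal
        self.result = {}          # result[(s, a)]: successor observed after doing a in s
        self.untried = {}         # untried[s]: actions not yet tried in s
        self.unbacktracked = {}   # unbacktracked[s]: predecessors to backtrack to
        self.s, self.a = None, None   # previous state and action

    def __call__(self, state):
        # Called once per step with the state the agent currently perceives.
        if self.is_goal(state):
            return None                                   # stop: goal reached
        if state not in self.untried:
            self.untried[state] = list(self.actions(state))
        if self.s is not None and self.result.get((self.s, self.a)) != state:
            self.result[(self.s, self.a)] = state
            self.unbacktracked.setdefault(state, []).append(self.s)
        if not self.untried[state]:
            if not self.unbacktracked.get(state):
                return None                               # nowhere left to explore
            back = self.unbacktracked[state].pop()
            # Physical backtrack: pick the action known to lead back to `back`.
            self.a = next(a for (s, a), s2 in self.result.items()
                          if s == state and s2 == back)
        else:
            self.a = self.untried[state].pop()
        self.s = state
        return self.a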