Searching in non-deterministic, partially observable and unknown environments - PowerPoint PPT Presentation



slide-1
SLIDE 1

CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017

“Artificial Intelligence: A Modern Approach”, 3rd Edition, Chapter 4

Soleymani

Searching in non-deterministic, partially observable and unknown environments
slide-2
SLIDE 2

Problem types

 Deterministic and fully observable (single-state problem)

 Agent knows exactly its state even after a sequence of actions
 Solution is a sequence

 Non-observable or sensor-less (conformant problem)

 Agent’s percepts provide no information at all
 Solution is a sequence

 Nondeterministic and/or partially observable (contingency problem)

 Percepts provide new information about current state
 Solution can be a contingency plan (tree or strategy) and not a sequence
 Often interleave search and execution

 Unknown state space (exploration problem)


slide-3
SLIDE 3

More complex than single-state problem


 Searching with nondeterministic actions
 Searching with partial observations
 Online search & unknown environment

slide-4
SLIDE 4

Non-deterministic or partially observable env.

 Perceptions become useful

 Partially observable

 To narrow down the set of possible states for the agent

 Non-deterministic

 To show which outcome of the action has occurred

 Future percepts cannot be determined in advance
 Solution is a contingency plan

 A tree composed of nested if-then-else statements
 What to do depending on what percepts are received

 Now, we focus on an agent design that finds a guaranteed plan before execution (not online search)

slide-5
SLIDE 5

Searching with non-deterministic actions

 In non-deterministic environments, the result of an action can vary.

 Future percepts can specify which outcome has occurred.

 Generalizing the transition function

 RESULTS: S × A → 2^S instead of RESULT: S × A → S

 Search tree will be an AND-OR tree.

 Solution will be a sub-tree containing a contingency plan (nested if-then-else statements)

slide-6
SLIDE 6

Erratic vacuum world


 States

 {1, 2, …, 8}

 Actions

 {Left, Right, Suck}

 Goal

 {7} or {8}

 Non-deterministic:

When sucking a dirty square, it cleans it and sometimes cleans up dirt in an adjacent square.

When sucking a clean square, it sometimes deposits dirt on the carpet.
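The erratic Suck action can be captured by a transition function that returns a set of possible outcomes. A minimal Python sketch; the (location, dirt-set) state encoding is an assumption of this sketch (the slides use the numbering 1..8 instead):

```python
# Non-deterministic transition model of the erratic vacuum world.
# A state is (loc, dirt): loc in {'A', 'B'}, dirt a frozenset of dirty squares.

def erratic_results(state, action):
    """RESULTS(s, a) -> set of possible successor states."""
    loc, dirt = state
    if action == 'Left':
        return {('A', dirt)}
    if action == 'Right':
        return {('B', dirt)}
    if action == 'Suck':
        other = 'B' if loc == 'A' else 'A'
        if loc in dirt:
            # Cleans the current square; sometimes also cleans the adjacent one.
            return {(loc, dirt - {loc}), (loc, dirt - {loc, other})}
        # Sucking a clean square sometimes deposits dirt on it.
        return {(loc, dirt), (loc, dirt | {loc})}
    raise ValueError(action)
```

Because `RESULTS` returns a set rather than a single state, a solution must branch on which outcome actually occurred, which is exactly what the AND-OR tree on the next slides represents.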

slide-7
SLIDE 7

AND-OR search tree

AND node: environment’s choice of outcome
OR node: agent’s choice of actions

[Suck, if State=5 then [Right, Suck] else []]

slide-8
SLIDE 8

Solution to AND-OR search tree

 A solution for an AND-OR search problem is a sub-tree that:

 specifies one action at each OR node
 includes every outcome at each AND node
 has a goal node at every leaf

 Algorithms for searching AND-OR graphs

 Depth first
 BFS, best first, A*, …

slide-9
SLIDE 9

function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan, or failure
   return OR-SEARCH(problem.INITIAL-STATE, problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan, or failure
   if problem.GOAL-TEST(state) then return the empty plan
   if state is on path then return failure
   for each action in problem.ACTIONS(state) do
      plan ← AND-SEARCH(RESULTS(state, action), problem, [state | path])
      if plan ≠ failure then return [action | plan]
   return failure

function AND-SEARCH(states, problem, path) returns a conditional plan, or failure
   for each s_i in states do
      plan_i ← OR-SEARCH(s_i, problem, path)
      if plan_i = failure then return failure
   return [if s_1 then plan_1 else if s_2 then plan_2 else … if s_{n-1} then plan_{n-1} else plan_n]
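The algorithm above can be sketched in Python. The problem interface (plain actions/results/goal_test functions) and the plan representation (an [action, {outcome: sub-plan}] nesting instead of nested if-then-else) are choices of this sketch, not from the slides; failure is represented by None:

```python
# AND-OR graph search: OR nodes pick one action, AND nodes must cover
# every possible outcome; a branch fails if it revisits a state on its path.

def and_or_graph_search(initial, actions, results, goal_test):
    return or_search(initial, actions, results, goal_test, [])

def or_search(state, actions, results, goal_test, path):
    if goal_test(state):
        return []                      # empty plan: already at a goal
    if state in path:
        return None                    # repeated state on this path -> failure
    for action in actions(state):
        plan = and_search(results(state, action), actions, results,
                          goal_test, [state] + path)
        if plan is not None:
            return [action, plan]
    return None

def and_search(states, actions, results, goal_test, path):
    branch = {}
    for s in states:
        plan = or_search(s, actions, results, goal_test, path)
        if plan is None:
            return None                # one unsolvable outcome fails the AND node
        branch[s] = plan               # conditional branch: "if s then plan"
    return branch
```

Note that `or_search` returns `[]` (a valid, empty plan) at goal states, so failure must be represented by `None` rather than by any falsy value.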

slide-10
SLIDE 10

AND-OR-GRAPH-SEARCH

 Cycles arise often in non-deterministic problems

 The algorithm returns with failure when the current state is identical to one of its ancestors

 If there is a non-cyclic path, the earlier consideration of the state is sufficient

 Termination is guaranteed in finite state spaces

 Every path reaches a goal, a dead-end, or a repeated state

slide-11
SLIDE 11

Cycles

 Slippery vacuum world: Left and Right actions sometimes fail (leaving the agent in the same location)

 No acyclic solution

slide-12
SLIDE 12

Cycles solution


 Solution?

 Cyclic plan: keep on trying an action until it works.

 [Suck, L1: Right, if State = 5 then L1 else Suck]

 Or equivalently [Suck, while state = 5 do Right, Suck]

 What changes are required in the algorithm to find cyclic solutions?
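Executing such a cyclic plan amounts to a retry loop. A sketch under assumed dynamics (the 0.7 success probability for the slippery Right action is illustrative, as is the state representation):

```python
import random

def execute_cyclic_plan(seed=None):
    """Run [Suck, while state = 5 do Right, Suck] in a slippery vacuum world."""
    rng = random.Random(seed)
    loc, dirt = 'A', {'A', 'B'}   # state 1: agent on A, both squares dirty
    dirt.discard(loc)             # Suck: now state 5 (A clean, B dirty)
    while loc == 'A':             # "while state = 5 do Right"
        if rng.random() < 0.7:    # Right succeeds; otherwise the agent slips
            loc = 'B'
    dirt.discard(loc)             # Suck on B: both squares clean
    return loc, dirt
```

The loop terminates with probability 1, so the cyclic plan achieves the goal even though no bounded action sequence is guaranteed to.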

slide-13
SLIDE 13

Searching with partial observations

 The agent does not always know its exact state.

 Agent is in one of several possible states and thus an action may lead to one of several possible outcomes

 Belief state: agent’s current belief about the possible states, given the sequence of actions and observations up to that point.


slide-14
SLIDE 14

Searching with unobservable states (Sensor-less or conformant problem)


 Initial state:

 belief = {1, 2, 3, 4, 5, 6, 7, 8}

 Action sequence (conformant plan)

 [Right, Suck, Left, Suck]

slide-15
SLIDE 15

Belief State

 Belief state space (instead of physical state space)

 It is fully observable

 Physical problem: N states; ACTIONS_P, RESULTS_P, GOAL_TEST_P, STEP_COST_P are defined on physical states

 Sensor-less problem: up to 2^N belief states; ACTIONS, RESULTS, GOAL_TEST, STEP_COST are defined on belief states

slide-16
SLIDE 16

Sensor-less problem formulation (Belief-state space)

 States: every possible set of physical states, 2^N
 Initial state: usually the set of all physical states
 Actions: ACTIONS(b) = ⋃_{s∈b} ACTIONS_P(s)

 Illegal actions?! i.e., b = {s1, s2}, ACTIONS_P(s1) ≠ ACTIONS_P(s2)

 If illegal actions have no effect on the environment: take the union of the physical actions
 If illegal actions are not legal at all: take the intersection of the physical actions

 Solution is a sequence of actions (even in a non-deterministic environment)

slide-17
SLIDE 17

Sensor-less problem formulation (Belief-state space)

[Figure: a belief state b and the successor belief states obtained by applying an action, for deterministic and non-deterministic actions]

 Transition model (b′ = PREDICT(b, a))

 Deterministic actions: b′ = {s′ : s′ = RESULT_P(s, a) and s ∈ b}
 Non-deterministic actions: b′ = ⋃_{s∈b} RESULTS_P(s, a)

slide-18
SLIDE 18

Sensor-less problem formulation (Belief-state space)

 Transition model (b′ = PREDICT(b, a))

 Deterministic actions: b′ = {s′ : s′ = RESULT_P(s, a) and s ∈ b}
 Non-deterministic actions: b′ = ⋃_{s∈b} RESULTS_P(s, a)

 Goal test: the goal is satisfied when all the physical states in the belief state satisfy GOAL_TEST_P.

 Step cost: STEP_COST_P, if the cost of an action is the same in all states
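This belief-state formulation can be sketched by lifting the physical-level functions (passed in as parameters; the `_p` names are assumptions of this sketch) to frozensets of states:

```python
# Sensor-less (conformant) belief-state operations: a belief state is a
# frozenset of physical states, and all operations are derived from the
# physical-level functions actions_p, results_p, goal_test_p.

def belief_actions(b, actions_p, legal_everywhere=False):
    """ACTIONS(b): union of the physical actions, or their intersection
    if an action must be legal in every possible state."""
    sets = [set(actions_p(s)) for s in b]
    return set.intersection(*sets) if legal_everywhere else set.union(*sets)

def predict(b, a, results_p):
    """PREDICT(b, a): union of all possible outcomes. Covers both the
    deterministic and the non-deterministic case when results_p returns a set."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def belief_goal_test(b, goal_test_p):
    """The goal is satisfied only when every possible state is a goal state."""
    return all(goal_test_p(s) for s in b)
```

Applying the conformant plan [Right, Suck, Left, Suck] from the earlier slide to the full 8-state belief state of the deterministic vacuum world shrinks it, step by step, down to a single goal state.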

slide-19
SLIDE 19

Belief-state space for sensor-less deterministic vacuum world

[Figure: reachable belief-state space of the sensor-less deterministic vacuum world, with belief states labeled such as "initial state", "it is on A", "it is on A & A is clean", "it is on B & A is clean"]

 Total number of possible belief states? 2^8 = 256
 Number of reachable belief states? 12

slide-20
SLIDE 20

Searching with partial observations

 Similar to the sensor-less case, after each action the new belief state must be predicted

 We must plan for different possible perceptions

 Partition the belief state according to the possible perceptions

 After each perception the belief state is updated

 E.g., local sensing vacuum world

 After each perception, the belief state can contain at most two physical states.

slide-21
SLIDE 21

Searching with partial observations

Deterministic world, percept [A, Dirty]

A position sensor & local dirt sensor

slide-22
SLIDE 22

Searching with partial observations

Deterministic world, percept [A, Dirty]

A position sensor & local dirt sensor

Stochastic world, percept [A, Dirty]

slide-23
SLIDE 23

Transition model (partially observable env.)

 Prediction stage: How does the belief state change after doing an action?

b̂ = PREDICT(b, a)

 Deterministic actions:

b̂ = {s′ : s′ = RESULT_P(s, a) and s ∈ b}

 Nondeterministic actions:

b̂ = ⋃_{s∈b} RESULTS_P(s, a)

 Possible Perceptions: What are the possible perceptions in a belief state?

POSSIBLE_PERCEPTS(b̂) = {o : o = PERCEPT(s) and s ∈ b̂}

 Update stage: How is the belief state updated after a perception?

UPDATE(b̂, o) = {s : o = PERCEPT(s) and s ∈ b̂}
RESULTS(b, a) = {b_o : b_o = UPDATE(PREDICT(b, a), o) and o ∈ POSSIBLE_PERCEPTS(PREDICT(b, a))}

Results function returns a set of belief states (each corresponding to a possible perception)
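The predict / possible-percepts / update pipeline can be sketched as follows; `results_p` and `percept_p` stand for the physical-level transition and sensor models (names are mine), with the local-sensing vacuum world as the running example in the test:

```python
# Three-stage transition model for partially observable search.
# A belief state is a frozenset of physical states.

def predict(b, a, results_p):
    """Prediction stage: union of all possible outcomes of action a."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def possible_percepts(b, percept_p):
    """All percepts the agent might receive in belief state b."""
    return {percept_p(s) for s in b}

def update(b, o, percept_p):
    """Keep only the states consistent with the observed percept o."""
    return frozenset(s for s in b if percept_p(s) == o)

def belief_results(b, a, results_p, percept_p):
    """RESULTS(b, a): one successor belief state per possible percept."""
    b_hat = predict(b, a, results_p)
    return {o: update(b_hat, o, percept_p)
            for o in possible_percepts(b_hat, percept_p)}
```

Because `belief_results` returns one successor belief state per percept, the belief-state search tree has AND nodes over percepts, mirroring the AND-OR tree for non-deterministic actions.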

slide-24
SLIDE 24

AND-OR search tree local sensing vacuum world

 AND-OR search tree on belief states
 First level
 Complete plan:

[Suck, Right, if Bstate = {6} then Suck else []]
PERCEPT = [A, Dirty]

slide-25
SLIDE 25

Solving partially observable problems

 AND-OR graph search
 Execute the obtained contingency plan

 Based on the received percept, either the then-part or the else-part of a condition is executed

 Agent’s belief state is updated when performing actions and receiving percepts

 Maintaining the belief state is a core function of any intelligent system

b′ = UPDATE(PREDICT(b, a), o)

slide-26
SLIDE 26

Kindergarten vacuum world example Belief state maintenance

 Local sensing
 Any square may be dirty at any time (unless the agent is now cleaning it)

An example of acting in the world without planning for all contingencies

PERCEPT(s) = [A, Dirty]   PERCEPT(s) = [A, Clean]   PERCEPT(s) = [B, Dirty]

slide-27
SLIDE 27

Robot localization example

 Determining current location, given a map of the world and a sequence of percepts and actions

 Perception: one sonar sensor in each direction (telling whether an obstacle exists there)

 E.g., percept = NW means there are obstacles to the north and west

 Broken navigational system

 Move action randomly chooses among {Right, Left, Up, Down}

slide-28
SLIDE 28

 b₀: all n squares
 Percept: NSW
 b₁ = UPDATE(b₀, NSW) (red circles)

 Execute action a = Move
 b̂₁ = PREDICT(b₁, Move) (red circles)
slide-29
SLIDE 29

Robot localization example (Cont.)

 Percept: NS
 b₂ = UPDATE(b̂₁, NS)

 In this example, we had only one action and so we did not need to plan before entering the world (and doing actions)

slide-30
SLIDE 30

Online search

 Off-line search: solution is found before the agent starts acting in the real world

 On-line search: interleaves search and acting

 Necessary in unknown environments
 Useful in dynamic and semi-dynamic environments
 Saves computational resources in non-deterministic domains (focusing only on the contingencies arising during execution)

 Tradeoff between finding a guaranteed plan (to not get stuck in an undesirable state during execution) and the time required for complete planning ahead

 Examples

 A robot in a new environment must explore to produce a map
 A newborn baby
 Autonomous vehicles

slide-31
SLIDE 31

Online search problems

 Agent must perform an action to determine its outcome

 RESULTS(s, a) is found by actually being in s and doing a
 By filling in the RESULTS table, the map of the environment is found.

 We assume a deterministic & fully observable environment here

 Also, we assume the agent knows ACTIONS(s) and the step cost c(s, a, s′), which can be used after knowing s′ as the outcome, and also GOAL_TEST(s)

slide-32
SLIDE 32

Competitive ratio

 Online path cost: total cost of the path that the agent actually travels

 Best cost: cost of the shortest path “if it knew the search space in advance”

 Competitive ratio = Online cost / Best cost

 Smaller values are more desirable

 Competitive ratio may be infinite

 Dead-end state: no goal state is reachable from it

 Irreversible actions can lead to a dead-end state

slide-33
SLIDE 33

Dead-end

 No algorithm can avoid dead-ends in all state spaces
 Simplifying assumption: safely explorable state space

 A goal state is achievable from every reachable state

slide-34
SLIDE 34

Online search vs. offline search

 Offline search: node expansion is a simulated process rather than executing a real action

 Can expand a node somewhere in the state space and immediately expand a node elsewhere

 Online search: can discover successors only for the current physical node

 Expand nodes in a local order
 Interleaving search & execution

slide-35
SLIDE 35

Online search agents

 Online DFS

 Physical backtracking: goes back to the state from which the agent most recently entered the current state
 Works only for state spaces with reversible actions

slide-36
SLIDE 36

function ONLINE-DFS(s′) returns an action
   inputs: s′, a percept that identifies the current state
   persistent: result, a table indexed by state and action, initially empty
               untried, a table that lists, for each state, the actions not yet tried
               unbacktracked, a table that lists, for each state, the backtracks not yet tried
               s, a, the previous state and action, initially null
   if GOAL-TEST(s′) then return stop
   if s′ is a new state (not in untried) then untried[s′] ← ACTIONS(s′)
   if s is not null then
      result[s, a] ← s′
      add s to the front of unbacktracked[s′]
   if untried[s′] is empty then
      if unbacktracked[s′] is empty then return stop
      else a ← an action b such that result[s′, b] = POP(unbacktracked[s′])
   else a ← POP(untried[s′])
   s ← s′
   return a
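The agent above can be sketched as a stateful Python class, assuming a deterministic, fully observable environment with reversible actions; the small grid in the test is invented for illustration and is not from the slides:

```python
class OnlineDFSAgent:
    """Online depth-first exploration: interleaves search and execution,
    backtracking physically when all actions at a state have been tried."""

    def __init__(self, actions_fn, goal_test):
        self.actions_fn = actions_fn
        self.goal_test = goal_test
        self.result = {}          # (state, action) -> observed successor
        self.untried = {}         # state -> actions not yet tried there
        self.unbacktracked = {}   # state -> states to physically back up to
        self.s = None             # previous state
        self.a = None             # previous action

    def __call__(self, s1):
        """Receive the current state (percept); return an action or None (stop)."""
        if self.goal_test(s1):
            return None
        if s1 not in self.untried:
            self.untried[s1] = list(self.actions_fn(s1))
            self.unbacktracked[s1] = []
        if self.s is not None:
            self.result[(self.s, self.a)] = s1    # learn the map as we go
            self.unbacktracked[s1].insert(0, self.s)
        if not self.untried[s1]:
            if not self.unbacktracked[s1]:
                return None                       # exploration exhausted
            target = self.unbacktracked[s1].pop(0)
            # All actions from s1 have been tried, so their results are
            # known; pick the action (assumed reversible) leading back.
            self.a = next(b for b in self.actions_fn(s1)
                          if self.result.get((s1, b)) == target)
        else:
            self.a = self.untried[s1].pop(0)
        self.s = s1
        return self.a
```

In the execution loop the environment feeds the current state to the agent and applies the returned action; `None` means stop. Backtracking only occurs once every action at the current state has been tried, so the required `result[s′, b]` entries are already filled in.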