EECS 3401: AI and Logic Prog., Lecture 20
Adapted from official slides for Russell & Norvig, 3rd ed. (Ch. 17)
Vitaliy Batusov, vbatusov@cse.yorku.ca
York University, November 30, 2020

Today: Sequential Decision-Making
Required reading: Russell & Norvig Ch. 17.1–17.3

Context
Covered to date: Search; Belief Networks
Today: Markov Decision Processes

Basic Idea behind MDP
Goal: decision making under uncertainty, with a notion of utility.
Random variables describe the world (as in Belief Networks), but now the world is again dynamical.
Transition model: specifies the probability distribution over the latest state variables, given the previous values.
Markov assumption: the current state depends on only a finite, fixed number of previous states.
First-order Markov process: the current state depends only on the last state.
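The Markov assumption can be written out as a conditional-independence statement. The following is a sketch in assumed notation (the symbols $X_t$ for the state at time $t$, $A_t$ for the action at time $t$, and $k$ for the order of the process do not appear on the slide):

```latex
% Markov assumption of order k: only the last k states matter
P(X_t \mid X_{0:t-1}) = P(X_t \mid X_{t-k:t-1})

% First-order special case (k = 1): only the last state matters
P(X_t \mid X_{0:t-1}) = P(X_t \mid X_{t-1})

% With actions, as in an MDP: the next state depends only on
% the current state and the current action
P(X_{t+1} \mid X_{0:t}, A_{0:t}) = P(X_{t+1} \mid X_t, A_t)
```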

Sequential Decision Problems
[Diagram relating Search, Planning, MDP, Decision-theoretic Planning, and POMDP; arrow labels: "explicit actions and subgoals", "uncertainty and utility", "uncertain sensing"]

Example MDP
States: $s \in S$; actions: $a \in A$.
Transition model: $T(s, a, s') \equiv P(s' \mid s, a)$, the probability that action $a$ in state $s$ leads to $s'$.
Reward function:
$$R(s) = \begin{cases} -0.04 & \text{for non-terminal states (small penalty)} \\ \pm 1 & \text{for terminal states} \end{cases}$$
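To make the two ingredients concrete, here is a minimal Python sketch of a transition model and a reward function. The three-state world ("start", "goal", "pit"), the action names, and the probabilities are hypothetical and chosen only for illustration; they are not the lecture's example. Only the reward values (-0.04 for non-terminal states, +1 or -1 for terminal states) follow the slide.

```python
from typing import Dict, Tuple

State = str
Action = str

# Hypothetical toy world (an assumption, not the lecture's example).
STATES = ["start", "goal", "pit"]
ACTIONS = ["safe", "risky"]

# Terminal states and their rewards (+1 / -1, as on the slide).
TERMINAL_REWARD: Dict[State, float] = {"goal": 1.0, "pit": -1.0}

# T[(s, a)] maps each successor s' to P(s' | s, a).
# The probabilities are made up for illustration.
T: Dict[Tuple[State, Action], Dict[State, float]] = {
    ("start", "safe"): {"start": 0.5, "goal": 0.5},
    ("start", "risky"): {"goal": 0.8, "pit": 0.2},
}


def transition(s: State, a: Action, s_next: State) -> float:
    """T(s, a, s'): probability that action a in state s leads to s'."""
    return T.get((s, a), {}).get(s_next, 0.0)


def reward(s: State) -> float:
    """R(s): -0.04 for non-terminal states, +1 or -1 for terminal states."""
    return TERMINAL_REWARD.get(s, -0.04)
```

For every non-terminal state and action, the values `transition(s, a, s2)` summed over all successors `s2` should equal 1, since they form a probability distribution.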

Solving MDPs
In search problems, the aim is to find an optimal sequence of actions.
In MDPs, the aim is to find an optimal policy $\pi(s)$, i.e., the best action for every possible state $s$.
The optimal policy maximizes the expected sum of rewards.
Suppose $R(s) = -0.04$. Optimal policy: [figure]
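The idea that a policy is scored by its expected sum of rewards can be sketched in Python. The toy world below (states "start", "goal", "pit", actions "safe" and "risky", and all probabilities) is hypothetical and is not the lecture's example. The code evaluates each candidate policy by repeated sweeps and picks the one with the highest value at the start state; this brute-force comparison is only an illustration, not one of the algorithms covered later in the lecture.

```python
from itertools import product

# Hypothetical toy world (an assumption, not the lecture's example).
STATES = ["start", "goal", "pit"]
ACTIONS = ["safe", "risky"]
TERMINAL_REWARD = {"goal": 1.0, "pit": -1.0}
STEP_REWARD = -0.04  # R(s) for non-terminal states

# T[(s, a)] maps each successor s' to P(s' | s, a); made-up numbers.
T = {
    ("start", "safe"): {"start": 0.5, "goal": 0.5},
    ("start", "risky"): {"goal": 0.8, "pit": 0.2},
}


def evaluate(policy, sweeps=200):
    """Expected sum of rewards from each state when following `policy`.

    The undiscounted sum converges here only because every policy in
    this toy world reaches a terminal state with probability 1.
    """
    U = {s: TERMINAL_REWARD.get(s, 0.0) for s in STATES}
    for _ in range(sweeps):
        for s, a in policy.items():  # non-terminal states only
            U[s] = STEP_REWARD + sum(p * U[s2] for s2, p in T[(s, a)].items())
    return U


def best_policy():
    """Enumerate all policies and keep the best one at the start state."""
    non_terminal = [s for s in STATES if s not in TERMINAL_REWARD]
    candidates = [
        dict(zip(non_terminal, choice))
        for choice in product(ACTIONS, repeat=len(non_terminal))
    ]
    return max(candidates, key=lambda pi: evaluate(pi)["start"])
```

In this toy world the "safe" action is worth about 0.92 from the start state (the solution of U = -0.04 + 0.5 U + 0.5) and the "risky" action about 0.56, so `best_policy()` selects "safe". Enumerating policies grows exponentially with the number of states, so it is practical only for tiny examples like this one.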

Risk and Reward

Utility of State Sequences

Utility of States

Utility of States (continued)

Dynamic Programming

Value Iteration Algorithm

Convergence

Policy Iteration

Modified Policy Iteration

Partial Observability

Partial Observability (continued)
