CS325 Artificial Intelligence, Ch. 17: Planning Under Uncertainty



SLIDE 1

CS325 Artificial Intelligence
Ch. 17: Planning Under Uncertainty

Cengiz Günay, Emory Univ., Spring 2013

SLIDES 2-5

Is This AI Course a Bit Schizo?

Classical AI vs. Machine Learning:

  • Classical AI: symbolic logic (propositional, first-order); algorithms; thinking and programming
  • Machine Learning: probabilities; math; automated methods, the power of math

SLIDES 6-7

Planning Under Uncertainty

Into Thrun territory. The aim is to use more math and probabilities to achieve learnability for hard-to-program scenarios (that is, real life).

[Diagram: Planning, Uncertainty, and Learning as overlapping areas; plan + execution, MDPs, and RL placed by which areas they combine]

SLIDE 8

Entry/Exit Surveys

Exit survey: Planning

  • Why do we need to alternate between planning and execution?
  • Why do we need a belief state?

Entry survey: Planning Under Uncertainty (0.25 points of final grade)

  • What algorithm would you use to plan under uncertain conditions?
  • How do you think machine learning can be used in planning?

SLIDES 9-10

So What's Wrong with Classical Planning?

Grid World (S: start, G: goal, x: blocked cell):

      1    2    3    4
  a   .    .    .    G
  b   .    x    .    .
  c   S    .    .    .

It's too slow:

  • Branching factor can get large
  • Search tree gets too deep (may have loops)
  • Same states can be repeated multiple times (although this can be avoided with dynamic programming)

SLIDES 11-14

Start with Certainty: Deterministic Grid World

      1    2    3    4
  a   .    .    →   +1
  b   .    x    ↑   −1
  c   S    .    ↑    ←

Reward function: R(s) = +1 @ a4, −1 @ b4

  • Remember utility values?
  • State s, action a
  • Optimal policy π(s) → a? @ a3? @ b3? @ c4? (answers shown as arrows above; a small Python encoding follows)
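The arrows answer the policy questions directly. As a minimal Python encoding (the dict form and the (row, column) cell naming are ours, not the slides'):

```python
# Deterministic optimal policy read off the slide's arrows: state -> action.
policy = {
    ("a", 3): "right",  # a3: step right into the +1 at a4
    ("b", 3): "up",     # b3: head up toward a3
    ("c", 3): "up",
    ("c", 4): "left",   # c4: move away from the -1 at b4
}
```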

SLIDES 15-20

Value Iteration: Movement Cost

      1    2    3    4
  a   .    .   0.9  +1
  b   .    x   0.8  −1
  c   S    .   0.7  0.6

Reward function: R(s) = +1 @ a4; −1 @ b4; −0.1 everywhere else

Optimal policy π(s) → a? @ a3? @ b3? @ c4?

Value function:

  V(s) ← max_a V(s′) + R(s),

where s′ is the neighboring state reached by action a.
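A minimal sketch of this deterministic value iteration in Python. The grid, rewards, wall, and terminal values follow the slides; the helper names, the bump-stays-in-place rule, and the stopping threshold are our assumptions:

```python
# Deterministic value iteration on the 3x4 grid: V(s) <- max_a V(s') + R(s).
ROWS, COLS = "abc", (1, 2, 3, 4)
WALL = {("b", 2)}                        # blocked cell b2
TERMINALS = {("a", 4): +1.0, ("b", 4): -1.0}
STEP_REWARD = -0.1                       # movement cost everywhere else
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def neighbor(s, move):
    """Deterministic successor; bumping into an edge or the wall stays put."""
    r, c = ROWS.index(s[0]) + MOVES[move][0], s[1] + MOVES[move][1]
    s2 = (ROWS[r], c) if 0 <= r < 3 and 1 <= c <= 4 else s
    return s if s2 in WALL else s2

states = [(r, c) for r in ROWS for c in COLS if (r, c) not in WALL]
V = {s: TERMINALS.get(s, 0.0) for s in states}

while True:                              # sweep until values stop changing
    delta = 0.0
    for s in states:
        if s in TERMINALS:
            continue
        new = max(V[neighbor(s, a)] for a in MOVES) + STEP_REWARD
        delta = max(delta, abs(new - V[s]))
        V[s] = new
    if delta < 1e-9:
        break

print({s: round(V[s], 2) for s in [("a", 3), ("b", 3), ("c", 3), ("c", 4)]})
```

This reproduces the slide's values: 0.9 at a3, 0.8 at b3, 0.7 at c3, and 0.6 at c4.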

SLIDE 21

Value Iteration Video

SLIDES 22-23

Value Iteration: Discount Factor

      1    2    3    4
  a   .    .   0.9  +1
  b   .    x   0.8  −1
  c   S    .   0.7  0.6

Reward function: R(s) = +1 @ a4; −1 @ b4; 0 everywhere else

The recursive definition

  V(s) ← max_a V(s′) + R(s)

can also be written as an expected reward:

  V(s) = max_π E[ Σ_{t=0}^∞ γ^t R_t | s₀ = s ].

Instead of a movement cost, this uses a discount factor, γ, to decay future reward. It also keeps the values bounded: V(s) ≤ (1 / (1 − γ)) · |R_max|.
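A quick sanity check of that bound (our worked example, not from the slides): the discounted return can never exceed the geometric series Σ_{t=0}^∞ γ^t |R_max| = |R_max| / (1 − γ), so with γ = 0.9 and |R_max| = 1, every value satisfies V(s) ≤ 10.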

SLIDES 24-25

Value Iteration: Bellman Equation

The general case (Bellman, 1957) is stochastic:

  V(s) ← max_a [ γ Σ_{s′} P(s′ | s, a) V(s′) ] + R(s)

  • Recursive
  • Used iteratively
  • Converges to the solution

Why stochastic? Remember, we want to plan under uncertainty.
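To make one update concrete, here is a single Bellman backup in Python; the two-action toy transition table is an illustrative assumption, not from the slides:

```python
# One Bellman backup: V(s) <- max_a [ gamma * sum_s' P(s'|s,a) V(s') ] + R(s)
gamma = 0.9
R_s = -0.1                        # reward at the state s being backed up
V = {"s1": 0.5, "s2": 1.0}        # current value estimates of the successors
P = {                             # toy transition model P(s'|s,a)
    "go":   {"s1": 0.2, "s2": 0.8},
    "stay": {"s1": 0.9, "s2": 0.1},
}

expected = {a: sum(p * V[s2] for s2, p in P[a].items()) for a in P}
V_s = gamma * max(expected.values()) + R_s
print(round(V_s, 2))  # "go" has expected value 0.9, so 0.9 * 0.9 - 0.1 = 0.71
```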

SLIDES 26-28

Markov Decision Processes

  • Andrey Andreyevich Markov (1856–1922): Russian mathematician, stochastic processes
  • Markov Decision Processes (MDPs): value iteration with stochasticity (Bellman, 1957)
  • Later, Q-learning (1989) → (next class)

SLIDE 29

Robots in Real Life

Video: Robots gone wild

SLIDES 30-34

Uncertain Movement in Grid World

Movement is now noisy: each action goes in the intended direction 80% of the time and slips to each perpendicular side 10% of the time.

Reward function: R(s) = +1 @ a4; −1 @ b4

Optimal policy π(s) → a? @ a3? @ b3? @ c4? The final build shows the full policy:

      1    2    3    4
  a   →    →    →   +1
  b   ↑    x    ←   −1
  c   ↑    ←    ←    ↓

Note b3 and c4: the policy detours away from the −1, even bumping into the bottom edge at c4 rather than risking a slip into b4.
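The 80/10/10 slip model is easy to write down explicitly. A small sketch (the direction encoding and helper names are ours):

```python
# 80/10/10 movement noise: the intended direction with probability 0.8,
# each perpendicular direction with probability 0.1.
LEFT_OF = {"up": "left", "left": "down", "down": "right", "right": "up"}
RIGHT_OF = {v: k for k, v in LEFT_OF.items()}

def outcome_distribution(action):
    """Map an intended action to {actual movement direction: probability}."""
    return {action: 0.8, LEFT_OF[action]: 0.1, RIGHT_OF[action]: 0.1}

print(outcome_distribution("up"))  # {'up': 0.8, 'left': 0.1, 'right': 0.1}
```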

SLIDES 35-38

Stochastic Value Iteration

      1    2    3    4
  a   .    .   .77  +1
  b   .    x   .48  −1
  c   .    .    .    .

Reward function: R(s) = +1 @ a4; −1 @ b4; −0.03 everywhere else

With the same 80/10/10 movement model, apply

  V(s) ← max_a [ γ Σ_{s′} P(s′ | s, a) V(s′) ] + R(s),  with γ = 1.

Optimal policy π(s) → a? @ a3? @ b3?
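A sketch of stochastic value iteration on this grid. The rewards, γ = 1, and the 80/10/10 slips follow the slides; the bump-stays-in-place rule and the in-place sweep order are our assumptions, under which the first backups of a3 and b3 give 0.77 and ≈0.48, matching the values shown:

```python
# One in-place sweep of stochastic value iteration with 80/10/10 slips.
ROWS, COLS = "abc", (1, 2, 3, 4)
WALL = {("b", 2)}
TERMINALS = {("a", 4): +1.0, ("b", 4): -1.0}
STEP_REWARD, GAMMA = -0.03, 1.0
DELTAS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SIDES = {"up": ("left", "right"), "down": ("left", "right"),
         "left": ("up", "down"), "right": ("up", "down")}

def step(s, d):
    """Cell reached by actually moving in direction d; edges and wall bounce back."""
    r, c = ROWS.index(s[0]) + DELTAS[d][0], s[1] + DELTAS[d][1]
    s2 = (ROWS[r], c) if 0 <= r < 3 and 1 <= c <= 4 else s
    return s if s2 in WALL else s2

def q_value(V, s, a):
    """Expected successor value: 0.8 intended + 0.1 for each perpendicular slip."""
    left, right = SIDES[a]
    return 0.8 * V[step(s, a)] + 0.1 * V[step(s, left)] + 0.1 * V[step(s, right)]

states = [(row, c) for row in ROWS for c in COLS if (row, c) not in WALL]
V = {s: TERMINALS.get(s, 0.0) for s in states}
for s in states:                      # a single sweep, row a first
    if s not in TERMINALS:
        V[s] = GAMMA * max(q_value(V, s, a) for a in DELTAS) + STEP_REWARD

print(round(V[("a", 3)], 2), round(V[("b", 3)], 2))  # 0.77 0.49 (slide: .77, .48)
```

Repeating the sweep to convergence fills in the remaining cells; the max over actions is what steers the policy around the −1.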

SLIDES 39-45

Values and Policy Examples

[Figures: example value functions and the resulting policies]

SLIDE 46

Markov Decision Processes Summary

  • Fully observable states: s₁, …, sₙ
  • Actions: a₁, …, aₘ
  • Stochastic transitions: P(s′ | s, a)
  • Reward: R(s)
  • Objective: max_π E[ Σ_{t=0}^∞ γ^t R_t | s₀ = s ]
  • Value iteration: V(s)
  • Converges to the optimal policy: π(s) = argmax_a Σ_{s′} P(s′ | s, a) V(s′)
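Once the values have converged, the optimal policy falls out greedily, as in the last bullet above. A minimal sketch, reusing a q_value(V, s, a) helper like the one in the sweep earlier (the signature is ours):

```python
# Greedy policy extraction from a converged value function V.
def greedy_policy(V, states, actions, q_value, terminals):
    """pi(s) = argmax_a of the expected successor value under P(s'|s,a)."""
    return {s: max(actions, key=lambda a: q_value(V, s, a))
            for s in states if s not in terminals}
```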

SLIDE 47

Partially Observable MDPs