

SLIDE 1

Statistical Filtering and Control for AI and Robotics

Alessandro Farinelli

Planning and Control: Markov Decision Processes

SLIDE 2

Outline

  • Uncertainty: localization for mobile robots

– State estimation based on Bayesian filters [recall]

  • Acting Under Uncertainty

– Markov Decision Problem
– Solution approaches

  • Motion planning

– Markov Decision Processes for path planning

  • Acknowledgment: material based on

– Russell and Norvig; Artificial Intelligence: a Modern Approach
– Thrun, Burgard, Fox; Probabilistic Robotics

SLIDE 3

Mobile robots

SLIDE 4

Sensors

SLIDE 5

Uncertainty

  • Action open = open the door

Will open actually open the door? Problems:

  • 1) partial observability and noisy sensors
  • 2) uncertainty in action outcomes
  • 3) immense complexity of modelling and predicting the environment

SLIDE 6

Probability

Probabilistic assertions summarize effects of

  • laziness (too much work to enumerate all relevant facts),
  • ignorance (lack of relevant facts)

Subjective or Bayesian probability:

  • Probabilities relate propositions to one's own state of knowledge

– P(open | I am in front of the door) = 0.6
– P(open | I am in front of the door, door is not locked) = 0.8

SLIDE 7

Simple Example of State Estimation

Suppose a robot obtains a measurement z. What is P(open | z)?

SLIDE 8

Causal vs. Diagnostic Reasoning

P(open|z) is diagnostic. P(z|open) is causal. Causal knowledge is often easier to obtain. Bayes rule allows us to use causal knowledge:

P(open | z) = P(z | open) P(open) / P(z)

(the causal term P(z | open) can be obtained by counting frequencies!)

SLIDE 9

Example

P(z | open) = 0.6    P(z | ¬open) = 0.3    P(open) = P(¬open) = 0.5

P(open | z) = P(z | open) P(open) / (P(z | open) P(open) + P(z | ¬open) P(¬open))
            = (0.6 · 0.5) / (0.6 · 0.5 + 0.3 · 0.5) = 2/3 ≈ 0.67

z raises the probability that the door is open.

SLIDE 10

Combining Evidence

Suppose our robot obtains another observation z_2. How can we integrate this new information? More generally, how can we estimate P(x | z_1, …, z_n)?

SLIDE 11

Recursive Bayesian Updating

P(x | z_1, …, z_n) = P(z_n | x, z_1, …, z_{n-1}) P(x | z_1, …, z_{n-1}) / P(z_n | z_1, …, z_{n-1})

Markov assumption: z_n is independent of z_1, …, z_{n-1} if we know x:

P(x | z_1, …, z_n) = P(z_n | x) P(x | z_1, …, z_{n-1}) / P(z_n | z_1, …, z_{n-1})
                   = η P(z_n | x) P(x | z_1, …, z_{n-1})
                   = η_{1…n} [ Π_{i=1…n} P(z_i | x) ] P(x)

SLIDE 12

Example: Second Measurement

P(z_2 | open) = 0.5    P(z_2 | ¬open) = 0.6    P(open | z_1) = 2/3

P(open | z_2, z_1) = P(z_2 | open) P(open | z_1) / (P(z_2 | open) P(open | z_1) + P(z_2 | ¬open) P(¬open | z_1))
                   = (1/2 · 2/3) / (1/2 · 2/3 + 3/5 · 1/3) = 5/8 = 0.625

z2 lowers the probability that the door is open.
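
Both updates are easy to check numerically. Below is a minimal Python sketch of the normalized Bayes update, using the likelihoods from these slides (the function name and structure are illustrative, not from the original deck):

```python
def bayes_update(prior_open, p_z_open, p_z_not_open):
    """Measurement update: P(open | z) with explicit normalization."""
    num = p_z_open * prior_open                    # P(z|open) P(open)
    den = num + p_z_not_open * (1.0 - prior_open)  # + P(z|not open) P(not open)
    return num / den

belief = 0.5                              # prior P(open)
belief = bayes_update(belief, 0.6, 0.3)   # after z1: 2/3
belief = bayes_update(belief, 0.5, 0.6)   # after z2: 5/8 = 0.625
print(belief)
```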

SLIDE 13

Actions

Often the world is dynamic

– actions carried out by the robot,
– actions carried out by other agents,
– time passing by

How can we incorporate such actions?

SLIDE 14

Typical Actions

The robot moves
The robot moves objects
People move around the robot

Actions are never carried out with absolute certainty. In contrast to measurements, actions generally increase the uncertainty.

SLIDE 15

Modeling Actions

To incorporate the outcome of an action u into the current “belief”, we use the conditional pdf P(x’|u,x). This term specifies the probability that executing u changes the state from x to x’.


SLIDE 16

Example: Closing the door

SLIDE 17

State Transitions

  • P(x’|u,x) for u = “close door”:
  • If the door is open, the action “close door” succeeds in 90% of all cases.

(State transition diagram: open → closed with probability 0.9, open → open with 0.1, closed → closed with 1.0)

SLIDE 18

Integrating the Outcome of Actions

Continuous case:  P(x’ | u) = ∫ P(x’ | u, x) P(x) dx

Discrete case:    P(x’ | u) = Σ_x P(x’ | u, x) P(x)

SLIDE 19

Example: The Resulting Belief

P(closed | u) = Σ_x P(closed | u, x) P(x)
              = P(closed | u, open) P(open) + P(closed | u, closed) P(closed)
              = 9/10 · 5/8 + 1 · 3/8 = 15/16

P(open | u)   = Σ_x P(open | u, x) P(x)
              = P(open | u, open) P(open) + P(open | u, closed) P(closed)
              = 1/10 · 5/8 + 0 · 3/8 = 1/16 = 1 − P(closed | u)
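
As a quick numerical check of this prediction step, here is a sketch continuing the door example (variable names and dictionary layout are illustrative):

```python
# Prediction step P(x'|u) = sum_x P(x'|u,x) P(x) for u = "close door".
prior = {"open": 5/8, "closed": 3/8}      # belief after z1 and z2
p_close = {("closed", "open"): 0.9, ("open", "open"): 0.1,
           ("closed", "closed"): 1.0, ("open", "closed"): 0.0}

posterior = {x2: sum(p_close[(x2, x)] * prior[x] for x in prior)
             for x2 in ("open", "closed")}
print(posterior)                          # open: 1/16, closed: 15/16
```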

SLIDE 20

Bayes Filters: Framework

  • Given:

– Stream of observation data z and action data u: d_t = {u_1, z_1, …, u_t, z_t}
– Sensor model P(z|x)
– Action model P(x’|u,x)
– Prior probability of the system state P(x)

  • Compute:

– Estimate of the state x of the dynamical system
– The posterior of the state, also called the Belief:

Bel(x_t) = P(x_t | u_1, z_1, …, u_t, z_t)

SLIDE 21

Markov Assumption

Underlying Assumptions

  • Static world (no one else changes the world)
  • Independent noise (over time)
  • Perfect model, no approximation errors

p(x_t | x_{1:t-1}, z_{1:t-1}, u_{1:t}) = p(x_t | x_{t-1}, u_t)

p(z_t | x_{1:t}, z_{1:t-1}, u_{1:t}) = p(z_t | x_t)

SLIDE 22

Bayes Filters

Notation: z = observation, u = action, x = state

Bel(x_t) = P(x_t | u_1, z_1, …, u_t, z_t)

(Bayes)       = η P(z_t | x_t, u_1, z_1, …, u_t) P(x_t | u_1, z_1, …, u_t)

(Markov)      = η P(z_t | x_t) P(x_t | u_1, z_1, …, u_t)

(Total prob.) = η P(z_t | x_t) ∫ P(x_t | u_1, z_1, …, u_t, x_{t-1}) P(x_{t-1} | u_1, z_1, …, u_t) dx_{t-1}

(Markov)      = η P(z_t | x_t) ∫ P(x_t | u_t, x_{t-1}) P(x_{t-1} | u_1, z_1, …, u_t) dx_{t-1}

(Markov)      = η P(z_t | x_t) ∫ P(x_t | u_t, x_{t-1}) P(x_{t-1} | u_1, z_1, …, z_{t-1}) dx_{t-1}

              = η P(z_t | x_t) ∫ P(x_t | u_t, x_{t-1}) Bel(x_{t-1}) dx_{t-1}

SLIDE 23

Bayes Filter Algorithm

1. Algorithm Bayes_filter(Bel(x), d):
2.   η = 0
3.   If d is a perceptual data item z then
4.     For all x do
5.       Bel’(x) = P(z | x) Bel(x)
6.       η = η + Bel’(x)
7.     For all x do
8.       Bel’(x) = η⁻¹ Bel’(x)
9.   Else if d is an action data item u then
10.    For all x’ do
11.      Bel’(x’) = ∫ P(x’ | u, x) Bel(x) dx
12.  Return Bel’(x)

Compact form:

Bel(x_t) = η P(z_t | x_t) ∫ P(x_t | u_t, x_{t-1}) Bel(x_{t-1}) dx_{t-1}
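
The algorithm above maps directly onto a few lines of Python for the discrete case. The sketch below replays the whole door example (z1, z2, then u = close door); the function and dictionary layout are illustrative choices, not part of the original slides:

```python
# Discrete Bayes filter following the pseudocode above.
def bayes_filter(bel, d, sensor_model, action_model):
    """bel: dict state -> prob; d: ('z', name) for a percept, ('u', name) for an action."""
    kind, name = d
    if kind == 'z':                                    # lines 3-8: measurement update
        new_bel = {x: sensor_model[name][x] * bel[x] for x in bel}
        eta = sum(new_bel.values())                    # normalizer
        return {x: p / eta for x, p in new_bel.items()}
    else:                                              # lines 9-11: action update
        return {x2: sum(action_model[name][(x2, x)] * bel[x] for x in bel)
                for x2 in bel}

sensor_model = {'z1': {'open': 0.6, 'closed': 0.3},    # P(z|x)
                'z2': {'open': 0.5, 'closed': 0.6}}
action_model = {'close': {('closed', 'open'): 0.9, ('open', 'open'): 0.1,
                          ('closed', 'closed'): 1.0, ('open', 'closed'): 0.0}}

bel = {'open': 0.5, 'closed': 0.5}                     # prior
for d in [('z', 'z1'), ('z', 'z2'), ('u', 'close')]:
    bel = bayes_filter(bel, d, sensor_model, action_model)
print(bel)                                             # open: 1/16, closed: 15/16
```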

SLIDE 24

Bayes Filters are Familiar!

Kalman filters
Particle filters
Hidden Markov models
Dynamic Bayesian networks
Partially Observable Markov Decision Processes (POMDPs)

Bel(x_t) = η P(z_t | x_t) ∫ P(x_t | u_t, x_{t-1}) Bel(x_{t-1}) dx_{t-1}

SLIDE 25

Bayesian filters for localization

How do I know whether I am in front of the door? Localization as a state estimation process (filtering).

(Figure: filtering loop alternating state update and sensor reading)

SLIDE 26

Kalman Filter for Localization

Gaussian pdf for belief

  • Pros: closed-form representation, very fast update
  • Cons:

– Works only for linear action and sensor models (an EKF can overcome this)
– Works well only for unimodal beliefs
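
For intuition, a one-dimensional Kalman filter is only a few lines. The sketch below assumes a linear motion model x’ = x + u and a direct position measurement, with all numeric values invented for illustration:

```python
# 1-D Kalman filter sketch: the belief is a Gaussian (mean mu, variance var).
def kf_predict(mu, var, u, motion_var):
    """Prediction with linear motion model x' = x + u plus Gaussian noise."""
    return mu + u, var + motion_var

def kf_update(mu, var, z, sensor_var):
    """Update with a direct, noisy position measurement z."""
    k = var / (var + sensor_var)          # Kalman gain
    return mu + k * (z - mu), (1.0 - k) * var

mu, var = 0.0, 10.0                       # vague Gaussian prior
mu, var = kf_predict(mu, var, u=1.0, motion_var=0.5)
mu, var = kf_update(mu, var, z=1.3, sensor_var=0.4)
print(mu, var)                            # posterior mean and variance
```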

SLIDE 27

Particle filters

Particles represent the belief
Pros: no parametric assumptions on the belief, action, or sensor models
Cons: the update can be computationally demanding
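
A minimal bootstrap particle filter for a 1-D localization toy problem might look as follows (a sketch assuming Gaussian motion and sensor noise; nothing here is prescribed by the slides):

```python
import math
import random

def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def pf_step(particles, u, z, motion_sigma=0.1, sensor_sigma=0.5):
    """One predict / weight / resample cycle over 1-D position particles."""
    particles = [x + u + random.gauss(0, motion_sigma) for x in particles]  # predict
    weights = [gaussian(z, x, sensor_sigma) for x in particles]             # weight
    return random.choices(particles, weights=weights, k=len(particles))     # resample

particles = [random.uniform(0, 10) for _ in range(1000)]   # uniform prior belief
particles = pf_step(particles, u=1.0, z=3.2)               # move 1 unit, observe z
print(sum(particles) / len(particles))                     # posterior mean estimate
```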

SLIDE 28

Particle Filters: prior

SLIDE 29

Particle Filters: bimodal belief

SLIDE 30

Particle Filters: unimodal beliefs

SLIDE 31

Mapping and SLAM

Localization: given the map and observations, update the pose estimate
Mapping: given the pose and observations, update the map
SLAM: given observations, update both the map and the pose

New observations increase uncertainty; loop closures reduce uncertainty

SLIDE 32

SLAM in action

Courtesy of Sebastian Thrun and Dirk Haehnel

SLIDE 33

Markov Decision Process

  • Mathematical model to plan sequences of actions in the face of uncertainty

SLIDE 34

Example MDP

SLIDE 35

Solving MDPs

SLIDE 36

Risk and Reward

SLIDE 37

Utility of State Sequences

SLIDE 38

Utility of States

SLIDE 39

MDPs for mobile robots

Optimal path (shortest) if actions are deterministic
Optimal path (safer) if actions are NOT deterministic

SLIDE 40

MDPs for mobile robots: formalization

Input:

  • States x (assume the state is known)
  • Actions u
  • Transition probabilities p(x‘|u,x)
  • Reward / payoff function r(x,u)
  • Note: now the reward depends on state and action. This is a different notation, but the core concepts do not change.

Output:

  • Policy π(x) that maximizes the future expected reward

SLIDE 41

Rewards and Policies

  • Policy (general case):           π : z_{1:t-1}, u_{1:t-1} → u_t
  • Policy (fully observable case):  π : x_t → u_t
  • Expected cumulative payoff:      R_T = E[ Σ_{τ=1}^{T} γ^τ r_{t+τ} ]

– T = 1: greedy policy
– T > 1: finite-horizon case, typically no discount
– T = ∞: infinite-horizon case, finite reward if discount γ < 1

SLIDE 42

Main concepts for Policies

  • Expected cumulative payoff of a policy:

R_T^π(x_t) = E[ Σ_{τ=1}^{T} γ^τ r_{t+τ} | u_{t+τ} = π(z_{1:t+τ-1}, u_{1:t+τ-1}) ]

  • Optimal policy:

π* = argmax_π R_T^π(x_t)

  • 1-step optimal policy:

π_1(x) = argmax_u r(x, u)

  • Value function of the 1-step optimal policy:

V_1(x) = γ max_u r(x, u)

SLIDE 43

2-step policies

  • Optimal policy:

π_2(x) = argmax_u [ r(x, u) + ∫ V_1(x’) p(x’ | u, x) dx’ ]

  • Value function:

V_2(x) = γ max_u [ r(x, u) + ∫ V_1(x’) p(x’ | u, x) dx’ ]

SLIDE 44

T-step policies

  • Optimal policy:

π_T(x) = argmax_u [ r(x, u) + ∫ V_{T-1}(x’) p(x’ | u, x) dx’ ]

  • Value function:

V_T(x) = γ max_u [ r(x, u) + ∫ V_{T-1}(x’) p(x’ | u, x) dx’ ]

SLIDE 45

Infinite Horizon

  • Optimal value function (Bellman equation):

V(x) = γ max_u [ r(x, u) + ∫ V(x’) p(x’ | u, x) dx’ ]

  • It can be used to compute the optimal policy:

π*(x) = argmax_u [ r(x, u) + ∫ V(x’) p(x’ | u, x) dx’ ]

SLIDE 46

Value Iteration: idea

  • Initialize V with random values
  • Until no change:

– For all states x:

  • Update V(x) to make it locally consistent
SLIDE 47

Value Iteration

  • For all x:

V̂(x) = r_min

  • Repeat until convergence:

– For all x, compute:

V̂(x) ← γ max_u [ r(x, u) + ∫ V̂(x’) p(x’ | u, x) dx’ ]

  • Finally:

π*(x) = argmax_u [ r(x, u) + ∫ V̂(x’) p(x’ | u, x) dx’ ]
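
The loop above is easy to realize for a discrete MDP. The sketch below runs value iteration on a tiny invented corridor world, keeping the slides' form of the update with γ outside the max (all names and numbers are illustrative, not from the deck):

```python
# Value iteration on a tiny corridor MDP: states 0..3 on a line, state 3 is
# an absorbing goal, actions move left/right and succeed with probability
# 0.8 (otherwise the robot stays put).

GAMMA = 0.9
STATES = [0, 1, 2, 3]
ACTIONS = [-1, +1]                         # left, right

def transition(x, u):
    """Return [(x', p(x'|u,x)), ...] for the corridor motion model."""
    if x == 3:
        return [(3, 1.0)]                  # goal is absorbing
    target = min(max(x + u, 0), 3)         # walls at both ends
    return [(target, 0.8), (x, 0.2)]

def reward(x, u):
    return 0.0 if x == 3 else -1.0         # unit step cost until the goal

def backup(V, x, u):
    """r(x,u) + sum_x' p(x'|u,x) V(x')."""
    return reward(x, u) + sum(p * V[x2] for x2, p in transition(x, u))

V = {x: -1.0 for x in STATES}              # initialize to r_min, as above
while True:
    V_new = {x: GAMMA * max(backup(V, x, u) for u in ACTIONS) for x in STATES}
    if max(abs(V_new[x] - V[x]) for x in STATES) < 1e-9:
        break                              # converged
    V = V_new

policy = {x: max(ACTIONS, key=lambda u: backup(V, x, u)) for x in STATES}
print(V)                                   # values grow toward the goal
print(policy)                              # move right in every non-goal state
```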

SLIDE 48

Value function and policy iteration

  • Often the optimal policy is reached long before the value function has converged.
  • Policy iteration calculates a new policy based on the current value function, and then calculates a new value function based on this policy.
  • This process often converges faster to the optimal policy.

V̂_π(x) = γ [ r(x, π(x)) + ∫ V̂_π(x’) p(x’ | π(x), x) dx’ ]
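
A matching sketch of policy iteration, reusing GAMMA, STATES, ACTIONS, transition(), reward(), and backup() from the value iteration example above (again illustrative, not from the slides):

```python
def evaluate(policy, sweeps=200):
    """Iterative policy evaluation with the slides' form of the update."""
    V = {x: 0.0 for x in STATES}
    for _ in range(sweeps):                # V_pi(x) = gamma [ r(x,pi(x)) + ... ]
        V = {x: GAMMA * backup(V, x, policy[x]) for x in STATES}
    return V

policy = {x: -1 for x in STATES}           # arbitrary start: always go left
while True:
    V = evaluate(policy)                   # policy evaluation
    improved = {x: max(ACTIONS, key=lambda u: backup(V, x, u)) for x in STATES}
    if improved == policy:                 # stable policy: optimal, stop
        break
    policy = improved                      # policy improvement
print(policy)
```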

SLIDE 49

Motion Planning for Mobile Robots

Plan for motion in the free configuration space (not the workspace)

(Figure: workspace vs. configuration space)

SLIDE 50

Configuration Space Planning

Convert the free configuration space into a finite state space:
– Cell decomposition
– Skeletonization (PRM)

SLIDE 51

Planning the motion

Given a finite state space representing the free configuration space, find a sequence of states from start to goal. Several approaches:
– Rapidly-exploring Random Trees (RRT)
– Potential Fields
– Markov Decision Processes (i.e. building a navigation function)

SLIDE 52

MDP for robot navigation

NOTE: the pose (i.e., the state) is unknown → strictly speaking, not an MDP!

  • Assume localization works (decently)
  • State is the most probable pose (mode of the posterior)
SLIDE 53

Summary

  • Robots must consider uncertainty when planning
  • Markov Decision Processes

– Powerful model to plan a sequence of actions under uncertainty
– Key point: define the value of states considering the expected cumulative reward
– Value (policy) iteration to solve the model

  • Motion Planning:

– Planning problem in a finite state space (C-free)
– MDPs are powerful techniques to build navigation functions (for low-dimensional spaces)

SLIDE 54

References and Further Readings

Material for the slides

  • Russell and Norvig; Artificial Intelligence: a Modern Approach (Chapter 25)
  • Thrun, Burgard, Fox; Probabilistic Robotics (Chapters 2 and 14)

Further readings

  • Latombe; Robot Motion Planning
  • LaValle, Kuffner; Randomized Kinodynamic Planning
  • Thrun, Fox, Burgard; A Probabilistic Approach to Concurrent Mapping and Localization for Mobile Robots