De Decisions cisions Computer ter Sc Scienc nce e cpsc sc32 - PowerPoint PPT Presentation

Dec ecision ision Th Theo eory: ry: Seq equential uential De Decisions cisions Computer ter Sc Scienc nce e cpsc sc32 322, 2, Lect ctur ure e 34 (Te Textb xtbook ok Ch Chpt 9.3) No Nov, , 29, 2013

“Single” Action vs. Sequence of Actions Set of primitive decisions that can be treated as a single macro decision to be made before acting • Agent makes observations • Decides on an action • Carries out the action

Lecture cture Ov Overview rview • Sequential Decisions • Representation • Policies • Finding Optimal Policies

Sequential decision problems • A sequential decision problem consists of a sequence of decision variables D 1 ,….., D n . • Each D i has an information set of variables pD i , whose value will be known at the time decision D i is made.

Sequential decisions : Simplest possible • Only one decision! (but different from one-off decisions) • Early in the morning. I listen to the we weather er fore recas cast, shall I take my umbre rella today? (I’ll have to go for a long walk at noon noon) • What is a reasonable decision network ? Weather@12 Weather@12 Morning Morning Forecast Forecast U U Take Umbrella Take Umbrella A. C . . B. D . . None of these

Sequential decisions : Simplest possible • Only one decision! (but different from one-off decisions) • Early in the morning. Shall I take my umbrell ella today? (I’ll have to go for a long walk at noon) • Relevant Random Variables?

Policies for Sequential Decision Problem: Intro • A policy specifies what an agent should do under each circumstance (for each decision, consider the parents of the decision node) In the Umbrella “degenerate” case: D 1 One possible Policy pD 1 How many policies?

Sequential decision problems: “complete” Example • A sequential decision problem consists of a sequence of decision variables D 1 ,….., D n . • Each D i has an information set of variables pD i , whose value will be known at the time decision D i is made. No-forgetting decision network: • decisions are totally ordered • if a decision D b comes before D a ,then • D b is a parent of D a • any parent of D b is a parent of D a

Policies for Sequential Decision Problems • A policy is a sequence of δ 1 ,….., δ n decision on funct ction ons δ i : dom( pD i ) → dom( D i ) • This policy means that when the agent has observed O  dom( pD i ) , it will do δ i ( O ) Example: Report port Check Smoke Repor port Check eckSm Smoke SeeSm eSmoke Call true true true true true true false false true false true true How many policies? true false false false false true true true false true false false false false true false false false false false

Lecture cture Ov Overview rview • Recap • Sequential Decisions • Finding Optimal Policies

When does a possible world satisfy a policy? • A possible world specifies a value for each random variable and each decision variable. • Possible ble wo world w satisfies isfies policy δ , written w ╞ δ if the value of each decision variable is the value selected by its decision function in the policy (when applied in w ). Decision function for… Report port Chec eck Smok oke true true VARs false false Fire true Decision function for… Tampering false Report t CheckS ckSmoke ke SeeSmoke ke Call Alarm true Leaving true true true true true Report false true true false false true false true true Smoke true true false false false true SeeSmoke false true true true true Chec eckSm Smok oke false true false false true Call false false true false false false false false

When does a possible world satisfy a policy? • Possible world w satisfies policy δ , written w ╞ δ if the value of each decision variable is the value selected by its decision function in the policy (when applied in w ). Decision function for… Report port Chec eck Smok oke VARs true true Fire true false false Decision function for… Tampering false Alarm true Report t CheckS ckSmoke ke SeeSmoke ke Call Leaving true Report true true true true true Smoke true true true false false true SeeSmoke true false true true true true false false false Chec eckSm Smok oke false true true true true Call false true false false false false true false false false false false A. A. B. C . C . Ca Cannot t tell

Expected Value of a Policy • Each possible world w has a probability P( w ) and a utility U( w ) • The expected utility of policy δ is • The optimal policy is one with the expected utility.

Lecture cture Ov Overview rview • Recap • Sequential Decisions • Finding Optimal Policies (Efficiently)

Complexity of finding the optimal policy: how many policies? • How many assignments to parents? • How many decision functions? (binary decisions) • How many policies? • If a decision D has k binary parents, how many assignments of values to the parents are there? • If there are b possible actions (possible values for D), how many different decision functions are there? • If there are d decisions, each with k binary parents and b possible actions, how many policies are there?

Finding the optimal policy more efficiently: VE 1. Create a factor for each conditional probability table and a factor for the utility. 2. 2. Su Sum out random variables that are not parents of a decision node. 3. 3. Eliminat ate (aka sum out) the decision variables 4. 4. Su Sum out the remaining random variables. 5. 5. Multi tiply ply the factor ctors: this is the expected utility of the optimal policy.

Eliminate the decision Variables: step3 details • Select a variable D that corresponds to the latest decision to be made • this variable will appear in only one factor with its parents • Eliminate D by maximizing. This returns: • A new factor to use in VE, max D f • The optimal decision function for D , arg max D f • Repeat till there are no more decision nodes. Report New factor Value lue Examp Ex mple: le: El Elim iminate inate Ch CheckS kSmo moke ke true false Report t CheckSmok oke Value lue Decision Function true true -5.0 true false -5.6 Report Check eckSm Smok oke false true -23.7 true false false -17.5 false

VE elimination reduces complexity of finding the optimal policy • We have seen that, if a decision D has k binary parents, there are b possible actions, If there are d decisions, • Then there are: ( b 2 k ) d policies • Doing variable elimination lets us find the optimal policy after considering only d . b 2 k policies (we eliminate one decision at a time) • VE is much h more re effi fici cient ent than searching through policy space. • However, this complexity is still l doubly-exp expon onenti ential al we'll only be able to handle relatively small problems.

Learning Goals for today’s class Yo You u can an: • Represent seque uenti ntial al decis isio ion n proble lems ms as decision networks. And explain the non forget gettin ing g proper erty ty • Verify whether a possi sibl ble e world d satis tisfie fies s a polic icy y and define the expec ected d value e of a policy icy • Compute the number r of polic icie ies s for a decision problem • Compute te the optim imal al polic icy y by Variable Elimination CPSC 322, Lecture 4 Slide 20

Big g Picture: cture: Planning anning under der Uncertainty certainty Probability Theory Decision Theory One-Off Decisions/ Markov Decision Processes (MDPs) Sequential Decisions Partially Fully Observable Observable MDPs MDPs (POMDPs) Decision Support Systems (medicine, business, …) Economics Control Robotics Systems 21

Cpsc sc 322 2 Big g Picture cture En Enviro ronm nmen ent Stochastic Deterministic Problem Arc Consistency Search Constraint Vars + Satisfaction Constraints SLS Static Belief Nets Logics Query Var. Elimination Search Markov Chains Sequential STRIPS Decision Nets Planning Var. Elimination Search Representation Reasoning CPSC 322, Lecture 2 Slide 22 Technique

De Decisions cisions Computer ter Sc Scienc nce e cpsc sc32 - PowerPoint PPT Presentation

Dec ecision ision Th Theo eory: ry: Seq equential uential De Decisions cisions Computer ter Sc Scienc nce e cpsc sc32 322, 2, Lect ctur ure e 34 (Te Textb xtbook ok Ch Chpt 9.3) No Nov, , 29, 2013 Single Action vs.

De Decisions cisions Computer ter Sc Science ce cpsc3 c322 22, , Lectur ture e 33 (Te

De Decision cision Th Theo eory: ry: Se Sequ quential ential De Decisions cisions Co

GCSE or Equivalent Options Decisions! Decisions! Decisions! An important time for our Year 10

Doing Your Taxes Decisions Decisions Decisions How do I get ready? Should I

Dysphagia: decisions, decisions, decisions Sean White Home Enteral Feed Dietitian Sheffield

Today Making Simple Decisions Making Decisions Making Sequential Decisions Planning

Making Simple Decisions Chapter 16 Ch. 16 p.1/33 Outline Rational preferences Utilities

De Decision cision Th Theo eory: ry: Si Singl ngle e St Stag age e De Decisions

Making Simple Decisions Chapter 16 Ch. 16 p.1/25 Outline Rational preferences Utilities

Decisions Matter: Understanding How and Why We Make Decisions About the Environment Elke U.

Dynamic Programming Sequence Of Decisions Sequence of decisions. As in the greedy

Dynamic Programming Sequence Of Decisions Sequence of decisions. As in the greedy

Crop I nsurance Decisions in 2 0 1 1 Crop I nsurance Decisions in 2 0 1 1 Bruce J. Sherrick

Web 2.0: Hiring & Firing Web 2.0: Hiring & Firing Decisions Based on Social Decisions

Derek Smithee, WQP Division Chief Decisions dont require data But GOOD decisions do! Quality

PROJECT BOUNDARY Nag Murty Nayan Jain Erik Bjontegard ? ? ?

CS440/ECE448 Lecture 18: Hidden Markov Models Mark Hasegawa-Johnson, 3/2020 Including slides by

CSEP 573: Artificial Intelligence Spring 2014 Hidden Markov Models & Exact Inference Ali

Preserving Scientific Codes with Umbrella Haiyan Meng The Cooperative Computing Lab University

Game Theory: Lecture #8 Outline: Individual Optimization Security Strategies Optimization

Reasoning Under Uncertainty Over Time Alice Gao Lecture 16 Readings: R & N 15.1 to 15.3

Where are we? Informatics 2D Reasoning and Agents Semester 2, 20192020 Last time . . .

2017 Partners in Giving Campaign UW Chair & Coordinator Training Fall 2017 Agenda

Constraint Handling Rules - Getting started Prof. Dr. Thom Fr uhwirth | 2009 | University of

Sambuz

Useful Links

Newsletter

Mail Us

De Decisions cisions Computer ter Sc Scienc nce e cpsc sc32 - PowerPoint PPT Presentation

Dec ecision ision Th Theo eory: ry: Seq equential uential De Decisions cisions Computer ter Sc Scienc nce e cpsc sc32 322, 2, Lect ctur ure e 34 (Te Textb xtbook ok Ch Chpt 9.3) No Nov, , 29, 2013 Single Action vs.

De Decisions cisions Computer ter Sc Science ce cpsc3 c322 22, , Lectur ture e 33 (Te

De Decision cision Th Theo eory: ry: Se Sequ quential ential De Decisions cisions Co

GCSE or Equivalent Options Decisions! Decisions! Decisions! An important time for our Year 10

Doing Your Taxes Decisions Decisions Decisions How do I get ready? Should I

Dysphagia: decisions, decisions, decisions Sean White Home Enteral Feed Dietitian Sheffield

Today Making Simple Decisions Making Decisions Making Sequential Decisions Planning

Making Simple Decisions Chapter 16 Ch. 16 p.1/33 Outline Rational preferences Utilities

De Decision cision Th Theo eory: ry: Si Singl ngle e St Stag age e De Decisions

Making Simple Decisions Chapter 16 Ch. 16 p.1/25 Outline Rational preferences Utilities

Decisions Matter: Understanding How and Why We Make Decisions About the Environment Elke U.

Dynamic Programming Sequence Of Decisions Sequence of decisions. As in the greedy

Dynamic Programming Sequence Of Decisions Sequence of decisions. As in the greedy

Crop I nsurance Decisions in 2 0 1 1 Crop I nsurance Decisions in 2 0 1 1 Bruce J. Sherrick

Web 2.0: Hiring &amp; Firing Web 2.0: Hiring &amp; Firing Decisions Based on Social Decisions

Derek Smithee, WQP Division Chief Decisions dont require data But GOOD decisions do! Quality

PROJECT BOUNDARY Nag Murty Nayan Jain Erik Bjontegard ? ? ?

CS440/ECE448 Lecture 18: Hidden Markov Models Mark Hasegawa-Johnson, 3/2020 Including slides by

CSEP 573: Artificial Intelligence Spring 2014 Hidden Markov Models &amp; Exact Inference Ali

Preserving Scientific Codes with Umbrella Haiyan Meng The Cooperative Computing Lab University

Game Theory: Lecture #8 Outline: Individual Optimization Security Strategies Optimization

Reasoning Under Uncertainty Over Time Alice Gao Lecture 16 Readings: R &amp; N 15.1 to 15.3

Where are we? Informatics 2D Reasoning and Agents Semester 2, 20192020 Last time . . .

2017 Partners in Giving Campaign UW Chair &amp; Coordinator Training Fall 2017 Agenda

Constraint Handling Rules - Getting started Prof. Dr. Thom Fr uhwirth | 2009 | University of

Sambuz

Useful Links

Newsletter

Mail Us

Web 2.0: Hiring & Firing Web 2.0: Hiring & Firing Decisions Based on Social Decisions

CSEP 573: Artificial Intelligence Spring 2014 Hidden Markov Models & Exact Inference Ali

Reasoning Under Uncertainty Over Time Alice Gao Lecture 16 Readings: R & N 15.1 to 15.3

2017 Partners in Giving Campaign UW Chair & Coordinator Training Fall 2017 Agenda