Lecture 24 Decision Networks and Sequential Decision Problems - PowerPoint PPT Presentation

Computer Science CPSC 322, Lecture 24: Decision Networks and Sequential Decision Problems. Lecture Overview: Recap; Computing single-stage optimal decision; Sequential Decision


  1. Computer Science CPSC 322, Lecture 24: Decision Networks and Sequential Decision Problems

  2. Lecture Overview
     • Recap
     • Computing single-stage optimal decision
     • Sequential Decision Problems
     • Finding Optimal Policies with VE

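  3. Expected utility of a decision
     • The expected utility of decision D = d is

     $$E(U \mid D = d) = \sum_{w \models (D = d)} P(w)\, U(w) = P(w_1) \times U(w_1) + \dots + P(w_n) \times U(w_n)$$

     | Wear Pads | Which Way | Accident | Probability | Utility | $E[U \mid D]$ |
     | --- | --- | --- | --- | --- | --- |
     | true | short | true | 0.2 | 35 | 83 |
     | true | short | false | 0.8 | 95 | |
     | true | long | true | 0.01 | 30 | 74.55 |
     | true | long | false | 0.99 | 75 | |
     | false | short | true | 0.2 | 3 | 80.6 |
     | false | short | false | 0.8 | 100 | |
     | false | long | true | 0.01 | 0 | 79.2 |
     | false | long | false | 0.99 | 80 | |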
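
As a quick check of the formula above, the sum can be evaluated directly. This is a minimal Python sketch, not part of the lecture: the function name `expected_utility` and the list-of-pairs representation are illustrative assumptions, while the probabilities and utilities are the ones shown on the slide.

```python
def expected_utility(outcomes):
    """Sum P(w) * U(w) over the worlds w that are consistent with D = d.

    `outcomes` is a list of (probability, utility) pairs, one per world.
    """
    return sum(p * u for p, u in outcomes)


# (probability, utility) pairs read off the slide, one list per decision.
decisions = [
    [(0.2, 35), (0.8, 95)],
    [(0.01, 30), (0.99, 75)],
    [(0.2, 3), (0.8, 100)],
    [(0.01, 0), (0.99, 80)],
]

for outcomes in decisions:
    # Rounded only for display; floating-point sums may differ in the last digits.
    print(round(expected_utility(outcomes), 2))
```

The four values should agree with the E[U|D] figures on the slide (83, 74.55, 80.6, 79.2), up to floating-point rounding.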

  4. Optimal single-stage decision
     • Single Stage (aka One-Off) Decisions
       • One or more primitive decisions that can be treated as a single macro decision to be made before acting
     • Given a single (macro) decision variable D
       • the agent can choose $D = d_i$ for any value $d_i \in dom(D)$

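  5. Optimal decision in robot delivery example
     Best decision: (wear pads, short way)

     | Wear Pads | Which Way | Accident | Conditional probability | Utility | $E[U \mid D]$ |
     | --- | --- | --- | --- | --- | --- |
     | true | short | true | 0.2 | 35 | 83 |
     | true | short | false | 0.8 | 95 | |
     | true | long | true | 0.01 | 30 | 74.55 |
     | true | long | false | 0.99 | 75 | |
     | false | short | true | 0.2 | 3 | 80.6 |
     | false | short | false | 0.8 | 100 | |
     | false | long | true | 0.01 | 0 | 79.2 |
     | false | long | false | 0.99 | 80 | |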
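
Choosing the best single-stage decision amounts to computing the expected utility of every value of the macro decision and taking the maximum. The sketch below assumes the robot delivery numbers from this lecture; the names `P_ACCIDENT`, `UTILITY`, `expected_utility` and `best_decision` are hypothetical and chosen only for illustration.

```python
from itertools import product

# P(Accident = true | Which Way), from the robot delivery example.
P_ACCIDENT = {"short": 0.2, "long": 0.01}

# Utility indexed by (which_way, accident, wear_pads), from the same example.
UTILITY = {
    ("long", True, True): 30, ("long", True, False): 0,
    ("long", False, True): 75, ("long", False, False): 80,
    ("short", True, True): 35, ("short", True, False): 3,
    ("short", False, True): 95, ("short", False, False): 100,
}


def expected_utility(which_way, wear_pads):
    """E[U | Which Way, Wear Pads]: weight each outcome by its probability."""
    p = P_ACCIDENT[which_way]
    return (p * UTILITY[(which_way, True, wear_pads)]
            + (1 - p) * UTILITY[(which_way, False, wear_pads)])


def best_decision():
    """Enumerate all (which_way, wear_pads) pairs and keep the maximiser."""
    candidates = product(["short", "long"], [True, False])
    return max(candidates, key=lambda d: expected_utility(*d))


print(best_decision())
```

With these numbers the maximiser is the short way with pads, matching the best decision stated on the slide.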

  6. Single-stage decision networks
     Extend belief networks
     • Random variables: same as in Bayesian networks
       • Drawn as an ellipse
       • Arcs into the node represent probabilistic dependence
       • A random variable is conditionally independent of its non-descendants given its parents
     • Decision nodes, which the agent chooses the value for
       • Parents: only other decision nodes allowed; they represent information available when the decision is made
       • Domain is the set of possible actions
       • Drawn as a rectangle
     • Exactly one utility node
       • Parents: all random and decision variables on which the utility depends
       • Specifies a utility for each instantiation of its parents
       • Drawn as a diamond

  7. Example Decision Network
     • Explicitly shows dependencies. E.g., which variables affect the probability of an accident and the agent’s utility?
     • Decision nodes simply list the available decisions.

     | Which Way W | Accident A | $P(A \mid W)$ |
     | --- | --- | --- |
     | long | true | 0.01 |
     | long | false | 0.99 |
     | short | true | 0.2 |
     | short | false | 0.8 |

     | Which Way | Accident | Wear Pads | Utility |
     | --- | --- | --- | --- |
     | long | true | true | 30 |
     | long | true | false | 0 |
     | long | false | true | 75 |
     | long | false | false | 80 |
     | short | true | true | 35 |
     | short | true | false | 3 |
     | short | false | true | 95 |
     | short | false | false | 100 |

  8. Applet for Bayesian and Decision Networks
     The Belief and Decision Networks applet we have seen previously allows you to load predefined decision networks for various domains and run queries on them.
     • Select one of the available examples via “File -> Load Sample Problem”
     • For decision networks, choose any of the examples below the blue line in the list that appears
     • Right click on a node to perform any of these operations:
       • View the CPT/decision table/utility table for a chance/decision/utility node
       • Make an observation for a chance variable (i.e., set it to one of its values)
       • Query the current probability distribution for a chance node given the observations made
         • A dialogue box will appear the first time you do this. Select “Always brief” at the bottom, and then click “Brief”.
     • To compute the optimal decision (policy), click on the “Optimize Decision” button in the toolbar and select “Brief” in the dialogue box that appears
     • To see the actual policy, view the decision table for each decision node in the network
     See the available help pages and video tutorials for more details on how to use the Bayes applet (http://www.aispace.org/bayes/)

  9. Lecture Overview
     • Recap
     • Computing single-stage optimal decision
     • Sequential Decision Problems
     • Finding Optimal Policies with VE

  10. Computing the optimal decision: we can use VE
      Denote
      • the random variables as $X_1, \dots, X_n$
      • the decision variables as D
      • the parents of node N as pa(N)

      $$E(U) = \sum_{X_1, \dots, X_n} P(X_1, \dots, X_n \mid D)\, U(pa(U)) = \sum_{X_1, \dots, X_n} \prod_{i=1}^{n} P(X_i \mid pa(X_i))\, U(pa(U))$$

      where the parent sets $pa(X_i)$ include decision variables.
      • To find the optimal decision we can use VE:
        1. Create a factor for each conditional probability and for the utility
        2. Sum out all random variables, one at a time
           • This creates a factor on D that gives the expected utility for each $d_i$
        3. Choose the $d_i$ with the maximum value in the factor
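
The three VE steps can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the applet's implementation: a factor is represented here as a `(variables, table)` pair, and the helper names `multiply`, `sum_out` and `optimal_single_stage` are hypothetical. The example data are the accident probabilities and utilities from the robot delivery network.

```python
from itertools import product

# Assumed representation: a factor is (variables, table), where `variables` is
# a tuple of names and `table` maps a tuple of values (same order) to a number.


def multiply(f, g, domains):
    """Pointwise product; the result ranges over the union of both variable sets."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = tuple(dict.fromkeys(f_vars + g_vars))  # ordered union
    out_tab = {}
    for values in product(*(domains[v] for v in out_vars)):
        row = dict(zip(out_vars, values))
        out_tab[values] = (f_tab[tuple(row[v] for v in f_vars)]
                           * g_tab[tuple(row[v] for v in g_vars)])
    return out_vars, out_tab


def sum_out(var, f):
    """Eliminate `var` by adding up the rows that agree on all other variables."""
    f_vars, f_tab = f
    idx = f_vars.index(var)
    out_tab = {}
    for values, x in f_tab.items():
        key = values[:idx] + values[idx + 1:]
        out_tab[key] = out_tab.get(key, 0.0) + x
    return f_vars[:idx] + f_vars[idx + 1:], out_tab


def optimal_single_stage(factors, random_vars, domains):
    """Step 2: sum out each random variable. Step 3: maximise over decisions."""
    for var in random_vars:
        relevant = [f for f in factors if var in f[0]]
        if not relevant:
            continue
        rest = [f for f in factors if var not in f[0]]
        prod = relevant[0]
        for f in relevant[1:]:
            prod = multiply(prod, f, domains)
        factors = rest + [sum_out(var, prod)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, domains)
    d_vars, d_tab = result
    best = max(d_tab, key=d_tab.get)
    return dict(zip(d_vars, best)), d_tab[best]


# Step 1: one factor for P(A | W) and one for the utility U(A, W, P).
domains = {"A": (True, False), "W": ("long", "short"), "P": (True, False)}
f1 = (("A", "W"), {(True, "long"): 0.01, (False, "long"): 0.99,
                   (True, "short"): 0.2, (False, "short"): 0.8})
f2 = (("A", "W", "P"), {
    (True, "long", True): 30, (True, "long", False): 0,
    (False, "long", True): 75, (False, "long", False): 80,
    (True, "short", True): 35, (True, "short", False): 3,
    (False, "short", True): 95, (False, "short", False): 100,
})

decision, value = optimal_single_stage([f1, f2], ["A"], domains)
print(decision, round(value, 2))
```

Here the only random variable is A, so a single elimination leaves a factor on the decision variables W and P, and the maximum entry of that factor identifies the optimal decision.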

  11. VE example: Step 1, create initial factors
      Abbreviations: W = Which Way, P = Wear Pads, A = Accident

      $f_1(A, W)$:

      | Which Way W | Accident A | $P(A \mid W)$ |
      | --- | --- | --- |
      | long | true | 0.01 |
      | long | false | 0.99 |
      | short | true | 0.2 |
      | short | false | 0.8 |

      $f_2(A, W, P)$:

      | Which Way W | Accident A | Pads P | Utility |
      | --- | --- | --- | --- |
      | long | true | true | 30 |
      | long | true | false | 0 |
      | long | false | true | 75 |
      | long | false | false | 80 |
      | short | true | true | 35 |
      | short | true | false | 3 |
      | short | false | true | 95 |
      | short | false | false | 100 |

      $$E(U) = \sum_{A} P(A \mid W)\, U(A, W, P) = \sum_{A} f_1(A, W)\, f_2(A, W, P)$$

  12. VE example: step 2, sum out A
      Step 2a: compute the product $f_1(A, W) \times f_2(A, W, P)$
      What is the right form for the product $f_1(A, W) \times f_2(A, W, P)$? It is f(A,P,W):
      • the domain of the product is the union of the multiplicands’ domains
      • $f(A=a, P=p, W=w) = f_1(A=a, W=w) \times f_2(A=a, W=w, P=p)$

  13. VE example: step 2, sum out A
      Step 2a: compute the product $f_1(A, W) \times f_2(A, W, P)$

      $f(A=a, P=p, W=w) = f_1(A=a, W=w) \times f_2(A=a, W=w, P=p)$

      | Which Way W | Accident A | $f_1(A, W)$ |
      | --- | --- | --- |
      | long | true | 0.01 |
      | long | false | 0.99 |
      | short | true | 0.2 |
      | short | false | 0.8 |

      | Which Way W | Accident A | Pads P | $f_2(A, W, P)$ | $f(A, W, P)$ |
      | --- | --- | --- | --- | --- |
      | long | true | true | 30 | ???? |
      | long | true | false | 0 | ???? |
      | long | false | true | 75 | ???? |
      | long | false | false | 80 | ???? |
      | short | true | true | 35 | ???? |
      | short | true | false | 3 | ???? |
      | short | false | true | 95 | ???? |
      | short | false | false | 100 | ???? |
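
One way to work through this step is to compute the product row by row and then sum out A. The following is a small Python sketch under the assumption that factors are stored as plain dictionaries keyed by value tuples; the variable names `f1`, `f2`, `f` and `summed` simply mirror the slide notation.

```python
# f1(A, W) = P(A | W), keyed by (W, A).
f1 = {
    ("long", True): 0.01, ("long", False): 0.99,
    ("short", True): 0.2, ("short", False): 0.8,
}

# f2(A, W, P) = utility, keyed by (W, A, P).
f2 = {
    ("long", True, True): 30, ("long", True, False): 0,
    ("long", False, True): 75, ("long", False, False): 80,
    ("short", True, True): 35, ("short", True, False): 3,
    ("short", False, True): 95, ("short", False, False): 100,
}

# Step 2a: each row of the product multiplies the rows of f1 and f2 that
# agree on the shared variables A and W.
f = {(w, a, p): f1[(w, a)] * u for (w, a, p), u in f2.items()}

# Step 2b: sum out A, leaving a factor on the decision variables (W, P).
summed = {}
for (w, a, p), value in f.items():
    summed[(w, p)] = summed.get((w, p), 0.0) + value

for key, value in f.items():
    print(key, round(value, 2))
for key, value in summed.items():
    print(key, round(value, 2))
```

The product has one row per assignment to (W, A, P), i.e. eight rows, and summing out A leaves four entries, one per combination of Which Way and Wear Pads.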

  14. Getting the outcome with the applet
      Select “optimize decision” in the menu bar

  15. Lecture Overview
      • Recap
      • Computing single-stage optimal decision
      • Sequential Decision Problems
      • Finding Optimal Policies with VE

  16. Sequential Decision Problems
      • Under uncertainty, a typical scenario is that an agent observes, acts, observes, acts, …
      • New observations are taken into account for acting
      • Subsequent actions can depend on what is observed
      • What is observed often depends on previous actions
      • Often the sole reason for carrying out an action is to provide information for future actions
        • For example: diagnostic tests
      • General decision networks:
        • Just like single-stage decision networks, with one exception: the parents of decision nodes can include random variables

  17. Sequential decisions: simplest possible
      • Only one decision! (but different from one-off decisions)
      • Early in the morning. Shall I take my umbrella today, based on the weather forecast? (I’ll have to go for a long walk at noon)
      • Relevant random variables?

  18. Sequential Decision Problems: Example
      In our Fire Alarm domain:
      • If there is a report you can decide to call the fire department
      • Before doing that, you can decide to check if you can see smoke, but this takes time and will delay calling
      • A decision (e.g. Call) can depend on a random variable (e.g. SeeSmoke)
      Decision node: agent decides. Chance node: chance decides.
      Each decision $D_i$ has an information set of variables $pa(D_i)$, whose values will be known at the time decision $D_i$ is made
      • pa(CheckSmoke) = {Report}
      • pa(Call) = {Report, CheckSmoke, SeeSmoke}
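
To make the idea of an information set concrete, a decision can be viewed as a rule that reads only the variables in pa(D). The sketch below is illustrative only: it assumes every variable is Boolean, and the `POLICY` rules are a made-up example rather than the optimal policy for the fire alarm network. The names `INFO_SETS`, `contexts` and `POLICY` are hypothetical.

```python
from itertools import product

# Information sets pa(D) from the fire alarm example.
INFO_SETS = {
    "CheckSmoke": ("Report",),
    "Call": ("Report", "CheckSmoke", "SeeSmoke"),
}


def contexts(decision):
    """All assignments to pa(decision), assuming every variable is Boolean."""
    names = INFO_SETS[decision]
    return [dict(zip(names, values))
            for values in product([True, False], repeat=len(names))]


# A hypothetical policy (not claimed to be optimal): one rule per decision,
# each reading only the variables in that decision's information set.
POLICY = {
    "CheckSmoke": lambda ctx: ctx["Report"],
    "Call": lambda ctx: ctx["Report"] and ctx["CheckSmoke"] and ctx["SeeSmoke"],
}

for decision in INFO_SETS:
    for ctx in contexts(decision):
        print(decision, ctx, "->", POLICY[decision](ctx))
```

Under the Boolean assumption, CheckSmoke has two contexts to decide in and Call has eight, one per assignment to its three parents.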

  19. The no-forgetting property
      • A decision network has the no-forgetting property if
        • Decision variables are totally ordered: $D_1, \dots, D_m$
        • If a decision $D_i$ comes before $D_j$, then
          • $D_i$ is a parent of $D_j$
          • any parent of $D_i$ is a parent of $D_j$
      pa(CheckSmoke) = {Report}
      pa(Call) = {Report, CheckSmoke, SeeSmoke}
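
The two conditions can be checked mechanically once the decision order and the parent sets are written down. This is a minimal sketch assuming parent sets are stored as Python sets; the function name `has_no_forgetting` and the `FORGETFUL` counter-example are hypothetical and not from the lecture.

```python
def has_no_forgetting(order, parents):
    """Check the no-forgetting property for decisions listed in `order`.

    `parents` maps each decision to the set of its parents (its information set).
    """
    for i, earlier in enumerate(order):
        for later in order[i + 1:]:
            # The earlier decision must itself be a parent of the later one.
            if earlier not in parents[later]:
                return False
            # Everything known at the earlier decision must still be known later.
            if not parents[earlier] <= parents[later]:
                return False
    return True


ORDER = ["CheckSmoke", "Call"]

# Parent sets from the fire alarm example.
PARENTS = {
    "CheckSmoke": {"Report"},
    "Call": {"Report", "CheckSmoke", "SeeSmoke"},
}

# Hypothetical variant in which Call no longer has Report as a parent.
FORGETFUL = {
    "CheckSmoke": {"Report"},
    "Call": {"CheckSmoke", "SeeSmoke"},
}

print(has_no_forgetting(ORDER, PARENTS))    # True
print(has_no_forgetting(ORDER, FORGETFUL))  # False
```

The fire alarm network passes because CheckSmoke and its only parent, Report, both appear among the parents of Call; the variant fails because Report is dropped.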
