

SLIDE 1

Making Decisions

10

SLIDE 2

10 Making Decisions
10.1 Decision-making agent
10.2 Preferences
10.3 Utilities
10.4 Decision networks
  • Decision networks
  • Value of information
  • Sequential decision problem∗
10.5 Game theory∗

SLIDE 3

Decision making agent

function Decision-Theoretic-Agent(percept) returns action
    update the decision-theoretic policy for the current state, based on available information including the current percept and the previous action
    calculate outcome probabilities for actions, given action descriptions and the utility of current states
    select the action with highest expected utility, given outcomes and utility information
    return action

Decision theories: theories of an agent's choices
  • Utility theory: worth or value
    – utility function: a preference ordering over a choice set
  • Game theory: strategic interaction between rational decision-makers
Hint: AI → economics → computational economics

SLIDE 4

Making decisions under uncertainty

Suppose I believe the following:
  P(A25 gets me there on time | . . .) = 0.04
  P(A90 gets me there on time | . . .) = 0.70
  P(A120 gets me there on time | . . .) = 0.95
  P(A1440 gets me there on time | . . .) = 0.9999
Which action to choose?
Depends on my preferences for missing the flight vs. airport cuisine, etc.
Utility theory is used to represent and infer preferences
Decision theory = probability theory + utility theory
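A minimal sketch of this tradeoff in Python; the probabilities come from the slide, but the utilities for catching or missing the flight and the per-minute waiting cost are made-up assumptions, purely for illustration:

```python
# Choosing when to leave for the airport by maximizing expected utility.
P_ON_TIME = {"A25": 0.04, "A90": 0.70, "A120": 0.95, "A1440": 0.9999}

U_CATCH = 100.0          # assumed utility of making the flight
U_MISS = -500.0          # assumed utility of missing it
WAIT_COST_PER_MIN = 0.1  # assumed disutility of waiting at the airport

def expected_utility(action):
    p = P_ON_TIME[action]
    minutes_early = int(action[1:])  # "A90" means leaving 90 min early
    return p * U_CATCH + (1 - p) * U_MISS - WAIT_COST_PER_MIN * minutes_early

for a in P_ON_TIME:
    print(a, round(expected_utility(a), 2))
print("MEU action:", max(P_ON_TIME, key=expected_utility))
```

Under these assumed numbers the best action is A120: leaving very early buys little extra probability but a lot of waiting.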

SLIDE 5

Preferences

An agent chooses among prizes (A, B, etc.) and lotteries, i.e., situations with uncertain prizes
Lottery L = [p, A; (1 − p), B]

[Figure: lottery L branching to A with probability p and to B with probability 1 − p]

In general, a lottery L with possible outcomes S1, · · · , Sn that occur with probabilities p1, · · · , pn is written
  L = [p1, S1; · · · ; pn, Sn]
Each outcome Si of a lottery can be either an atomic state or another lottery

SLIDE 6

Preferences

Notation:
  A ≻ B    A preferred to B
  A ∼ B    indifference between A and B
  A ≿ B    B not preferred to A
Rational preferences: the preferences of a rational agent must obey constraints
⇒ behavior describable as maximization of expected utility

SLIDE 7

Axioms of preferences

Orderability: (A ≻ B) ∨ (B ≻ A) ∨ (A ∼ B)
Transitivity: (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)
Continuity: A ≻ B ≻ C ⇒ ∃p [p, A; 1 − p, C] ∼ B
Substitutability: A ∼ B ⇒ [p, A; 1 − p, C] ∼ [p, B; 1 − p, C]
  (likewise for ≻: A ≻ B ⇒ [p, A; 1 − p, C] ≻ [p, B; 1 − p, C])
Monotonicity: A ≻ B ⇒ (p ≥ q ⇔ [p, A; 1 − p, B] ≻ [q, A; 1 − q, B])
Decomposability: [p, A; 1 − p, [q, B; 1 − q, C]] ∼ [p, A; (1 − p)q, B; (1 − p)(1 − q), C]

SLIDE 8

Rational preferences

Violating the constraints leads to self-evident irrationality
For example: an agent with intransitive preferences can be induced to give away all its money
  If B ≻ C, then an agent who has C would pay (say) 1 cent to get B
  If A ≻ B, then an agent who has B would pay (say) 1 cent to get A
  If C ≻ A, then an agent who has A would pay (say) 1 cent to get C

[Figure: money-pump cycle through A, B, C, paying 1c at each step]

SLIDE 9

Utilities

Preferences are captured by a utility function, U(s), which assigns a single number to express the desirability of a state
The expected utility of an action given the evidence, EU(a|e), is the average utility value of the outcomes, weighted by the probability that each outcome occurs:
  EU(a|e) = Σs′ P(Result(a) = s′ | a, e) U(s′)
Theorem (Ramsey, 1931; von Neumann and Morgenstern, 1944):
Given preferences satisfying the axioms, there exists a real-valued function U s.t.
  U(A) ≥ U(B) ⇔ A ≿ B
  U([p1, S1; . . . ; pn, Sn]) = Σi pi U(Si)
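A small sketch of the second property, computing the utility of a (possibly nested) lottery recursively; the state names and utilities are illustrative:

```python
# Utility of a lottery [p1, S1; ...; pn, Sn] = sum_i pi * U(Si).
# An outcome may be an atomic state (looked up in U) or a nested lottery.

U = {"A": 10.0, "B": 4.0, "C": 0.0}  # illustrative state utilities

def lottery_utility(lottery, U):
    """lottery: list of (probability, outcome) pairs;
    outcome is a state name or another such list."""
    total = 0.0
    for p, outcome in lottery:
        if isinstance(outcome, str):
            total += p * U[outcome]
        else:
            total += p * lottery_utility(outcome, U)  # nested lottery
    return total

# Decomposability axiom check: [p, A; 1-p, [q, B; 1-q, C]]
p, q = 0.3, 0.6
nested = [(p, "A"), (1 - p, [(q, "B"), (1 - q, "C")])]
flat = [(p, "A"), ((1 - p) * q, "B"), ((1 - p) * (1 - q), "C")]
assert abs(lottery_utility(nested, U) - lottery_utility(flat, U)) < 1e-12
print(lottery_utility(nested, U))
```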

SLIDE 10

Maximizing expected utility

MEU principle: choose the action that maximizes expected utility
  a∗ = argmaxa EU(a|e)
Note: an agent can be entirely rational (consistent with MEU) without ever representing or manipulating utilities and probabilities
E.g., a lookup table for perfect tic-tac-toe

SLIDE 11

Utility function

Utilities map states (lotteries) to real numbers. Which numbers?
Standard approach to assessment of human utilities:
  compare a given state A to a standard lottery Lp that has
    "best possible prize" u⊤ with probability p
    "worst possible catastrophe" u⊥ with probability (1 − p)
  adjust the lottery probability p until A ∼ Lp

[Figure: pay $30 ∼ L(0.999999, continue as before; 0.000001, instant death)]

That is, placing a monetary value on (a micro-risk to) life

SLIDE 12

Utility scales

Normalized utilities: u⊤ = 1.0, u⊥ = 0.0
Micromorts (micro-mortality): a one-millionth chance of death
  useful for Russian roulette, paying to reduce product risks, etc.
QALYs: quality-adjusted life years
  useful for medical decisions involving substantial risk
Note: behavior is invariant w.r.t. positive linear transformation
  U′(x) = k1U(x) + k2   where k1 > 0

SLIDE 13

Money

Money does not behave as a utility function
Given a lottery L with expected monetary value EMV(L), usually U(L) < U(EMV(L)), i.e., people are risk-averse
Utility curve: for what probability p am I indifferent between a prize x and a lottery [p, $M; (1 − p), $0] for large M?
Typical empirical data, extrapolated with risk-prone behavior:

[Figure: empirical utility curve U vs. money $, over roughly −$150,000 to $800,000, concave (risk-averse) for gains]
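A quick numerical illustration of U(L) < U(EMV(L)) under a concave utility; the logarithmic curve and the dollar amounts are assumptions for the sketch, not the empirical curve in the figure:

```python
import math

# Concave utility of total wealth (an illustrative choice of curve).
def u(wealth):
    return math.log(wealth)

wealth = 10_000.0
# Lottery: 50% win $1,000, 50% win nothing; EMV = $500.
u_lottery = 0.5 * u(wealth + 1_000) + 0.5 * u(wealth)
u_emv = u(wealth + 500)

print(f"U(lottery) = {u_lottery:.6f}")
print(f"U(EMV)     = {u_emv:.6f}")
assert u_lottery < u_emv  # risk aversion: the sure $500 is preferred
```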

SLIDE 14

Multiattribute utility

How can we handle utility functions of many variables X1 . . . Xn?
E.g., what is U(Deaths, Noise, Cost)?
How can complex utility functions be assessed from preference behavior?
Idea 1: identify conditions under which decisions can be made without complete identification of U(x1, . . . , xn)
Idea 2: identify various types of independence in preferences and derive consequent canonical forms for U(x1, . . . , xn)

SLIDE 15

Strict dominance

Typically define attributes such that U is monotonic in each
Strict dominance: choice B strictly dominates choice A iff
  ∀i Xi(B) ≥ Xi(A) (and hence U(B) ≥ U(A))

[Figure: attribute space (X1, X2); left, deterministic attributes: the region above and to the right of A dominates A; right, uncertain attributes]

Strict dominance seldom holds in practice

SLIDE 16

Stochastic dominance

[Figure: probability densities (left) and cumulative distributions (right) of negative cost for sites S1 and S2]

Distribution p1 stochastically dominates distribution p2 iff
  ∀t  ∫_{−∞}^{t} p1(x) dx ≤ ∫_{−∞}^{t} p2(x) dx
If U is monotonic in x, then A1 with outcome distribution p1 stochastically dominates A2 with outcome distribution p2:
  ∫_{−∞}^{∞} p1(x)U(x) dx ≥ ∫_{−∞}^{∞} p2(x)U(x) dx
Multiattribute: stochastic dominance on all attributes ⇒ optimal
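A sketch of the first condition for discrete distributions, comparing CDFs pointwise; the two site cost distributions are made up for illustration:

```python
# p1 stochastically dominates p2 iff the CDF of p1 lies at or below
# the CDF of p2 everywhere (for outcomes where bigger is better).

def cdf_at(dist, x):
    """dist: {value: probability}; returns P(X <= x)."""
    return sum(p for v, p in dist.items() if v <= x)

def stochastically_dominates(p1, p2):
    points = sorted(set(p1) | set(p2))
    return all(cdf_at(p1, x) <= cdf_at(p2, x) for x in points)

# Illustrative negative-cost distributions for two sites (higher = better).
s1 = {-3.0: 0.2, -2.5: 0.5, -2.0: 0.3}
s2 = {-4.0: 0.4, -3.0: 0.4, -2.5: 0.2}
print(stochastically_dominates(s1, s2))  # True: S1 dominates S2
```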

SLIDE 17

Stochastic dominance

Stochastic dominance can often be determined without exact distributions, using qualitative reasoning
E.g., construction cost increases with distance from the city:
  S1 is closer to the city than S2 ⇒ S1 stochastically dominates S2 on cost
E.g., injury increases with collision speed
Can annotate belief networks with stochastic dominance information:
  an arc X → Y labeled + (X positively influences Y) means that, for every value z of Y's other parents Z,
  ∀x1, x2  x1 ≥ x2 ⇒ P(Y|x1, z) stochastically dominates P(Y|x2, z)

SLIDE 18

Label the arcs + or –

[Figure: car insurance belief network with nodes SocioEcon, Age, GoodStudent, ExtraCar, Mileage, VehicleYear, RiskAversion, SeniorTrain, DrivingSkill, MakeModel, DrivingHist, DrivQuality, Antilock, Airbag, CarValue, HomeBase, AntiTheft, Theft, OwnDamage, PropertyCost, LiabilityCost, MedicalCost, Cushioning, Ruggedness, Accident, OtherCost, OwnCost; slides 18–23 repeat this network, incrementally labeling arcs + or −]

SLIDE 24

Preference structure: deterministic

X1 and X2 are preferentially independent of X3 iff preference between ⟨x1, x2, x3⟩ and ⟨x′1, x′2, x3⟩ does not depend on x3
E.g., ⟨Noise, Cost, Safety⟩:
  ⟨20,000 suffer, $4.6 billion, 0.06 deaths/mpm⟩ vs. ⟨70,000 suffer, $4.2 billion, 0.06 deaths/mpm⟩
Theorem (Leontief, 1947): if every pair of attributes is P.I. of its complement, then every subset of attributes is P.I. of its complement: mutual P.I.
Theorem (Debreu, 1960): mutual P.I. ⇒ ∃ additive value function
  V(S) = Σi Vi(Xi(S))
Hence assess n single-attribute functions; often a good approximation

SLIDE 25

Preference structure: stochastic

Need to consider preferences over lotteries
X is utility-independent of Y iff preferences over lotteries in X do not depend on y
Mutual U.I.: each subset is U.I. of its complement
⇒ ∃ multiplicative utility function; for three attributes:
  U = k1U1 + k2U2 + k3U3 + k1k2U1U2 + k2k3U2U3 + k3k1U3U1 + k1k2k3U1U2U3
Routine procedures and software packages exist for generating preference tests to identify various canonical families of utility functions

SLIDE 26

Decision networks

Add action nodes (rectangles) and utility nodes to belief networks to enable rational decision making

[Figure: airport-siting decision network; decision node Airport Site; chance nodes Air Traffic, Litigation, Construction, Deaths, Noise, Cost; utility node U]

SLIDE 27

Decision networks algorithm

  • 1. Set the evidence variables for the current state
  • 2. For each possible value of the decision node:
      (a) Set the decision node to that value
      (b) Calculate the posterior probabilities for the parent nodes of the utility node, using a standard probabilistic inference algorithm
      (c) Calculate the resulting utility for the action
  • 3. Return the action with the highest expected utility (the MEU action); see the sketch below
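A minimal sketch of this loop on a toy network with one decision node and one chance parent of the utility node; the fixed posterior here stands in for the inference of step 2(b), and all names and numbers are illustrative:

```python
# Evaluate a tiny decision network by enumerating decision values.
P_weather = {"rain": 0.3, "sun": 0.7}          # assumed posterior given evidence
actions = ["take_umbrella", "leave_umbrella"]   # values of the decision node

utility = {  # U(weather, action), illustrative numbers
    ("rain", "take_umbrella"): 70, ("rain", "leave_umbrella"): 0,
    ("sun", "take_umbrella"): 80, ("sun", "leave_umbrella"): 100,
}

def expected_utility(action):
    return sum(p * utility[(w, action)] for w, p in P_weather.items())

for a in actions:
    print(a, expected_utility(a))
print("MEU action:", max(actions, key=expected_utility))  # take_umbrella
```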

SLIDE 28

Value of information

Idea: compute the value of acquiring each possible piece of evidence
Can be done directly from the decision network
Example: buying oil drilling rights
  Two blocks A and B, exactly one has oil, worth k
  Prior probabilities 0.5 each, mutually exclusive
  Current price of each block is k/2
  "Consultant" offers an accurate survey of A. Fair price?
Solution: compute the expected value of information
  = expected value of best action given the information
    minus expected value of best action without information
Survey may say "oil in A" or "no oil in A", prob. 0.5 each (given!)
  = [0.5 × value of "buy A" given "oil in A" + 0.5 × value of "buy B" given "no oil in A"] − 0
  = (0.5 × k/2) + (0.5 × k/2) − 0 = k/2
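The same computation as a few lines of Python (k is set arbitrarily for the sketch):

```python
# Value of information for the oil-drilling example.
k = 1000.0     # worth of the block that contains oil (arbitrary)
price = k / 2  # current price of each block

# With the survey: buy whichever block the survey indicates; profit
# k - k/2 in both survey outcomes, each with probability 0.5.
value_with_info = 0.5 * (k - price) + 0.5 * (k - price)

# Without the survey: buying either block has expected profit
# 0.5*k - k/2 = 0, so the best action is worth 0.
value_without_info = max(0.5 * k - price, 0.0)

print(value_with_info - value_without_info)  # k/2 = 500.0
```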

SLIDE 29

General formula

Current evidence E, current best action α
Possible action outcomes Si, potential new evidence Ej
  EU(α|E) = maxa Σi U(Si) P(Si|E, a)
Suppose we knew Ej = ejk; then we would choose αejk s.t.
  EU(αejk|E, Ej = ejk) = maxa Σi U(Si) P(Si|E, a, Ej = ejk)
Ej is a random variable whose value is currently unknown
⇒ must compute the expected gain over all possible values:
  VPIE(Ej) = (Σk P(Ej = ejk|E) EU(αejk|E, Ej = ejk)) − EU(α|E)
(VPI = value of perfect information)
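A generic sketch of this formula; the outcome distributions are supplied as functions, and the tiny test reproduces the oil example with k = 1 (all names and conventions here are illustrative scaffolding, not code from the course):

```python
# VPI_E(Ej) = sum_k P(Ej=ejk | E) EU(a_ejk | E, Ej=ejk) - EU(a | E)

def best_eu(actions, dist_for, U):
    """dist_for(a) -> {outcome: prob}; returns max_a sum_s P(s) U(s)."""
    return max(sum(p * U[s] for s, p in dist_for(a).items()) for a in actions)

def vpi(actions, U, p_ej, dist_given):
    """p_ej: {ejk: P(Ej = ejk | E)};
    dist_given(ejk): outcome distributions after observing Ej = ejk;
    dist_given(None): outcome distributions under current evidence E only."""
    eu_now = best_eu(actions, dist_given(None), U)
    eu_informed = sum(p * best_eu(actions, dist_given(e), U)
                      for e, p in p_ej.items())
    return eu_informed - eu_now

# Check against the oil example with k = 1: outcomes are net profits,
# and utility is taken to be money.
U = {0.5: 0.5, -0.5: -0.5}
def dist_given(e):
    def dist_for(action):
        if e is None:  # no survey: either block is a 50/50 gamble
            return {0.5: 0.5, -0.5: 0.5}
        correct = (action == "buy_A") == (e == "oil_in_A")
        return {0.5: 1.0} if correct else {-0.5: 1.0}
    return dist_for

print(vpi(["buy_A", "buy_B"], U,
          {"oil_in_A": 0.5, "no_oil_in_A": 0.5}, dist_given))  # 0.5 = k/2
```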

SLIDE 30

Properties of VPI

Nonnegative (in expectation, not post hoc):
  ∀j, E  VPIE(Ej) ≥ 0
Nonadditive (consider, e.g., obtaining Ej twice):
  VPIE(Ej, Ek) ≠ VPIE(Ej) + VPIE(Ek)
Order-independent:
  VPIE(Ej, Ek) = VPIE(Ej) + VPIE,Ej(Ek) = VPIE(Ek) + VPIE,Ek(Ej)
Note: when more than one piece of evidence can be gathered, maximizing VPI for each to select one is not always optimal
⇒ evidence gathering becomes a sequential decision problem

SLIDE 31

Qualitative behaviors

(a) Choice is obvious, information worth little
(b) Choice is nonobvious, information worth a lot
(c) Choice is nonobvious, information worth little

[Figure: three sketches of P(U|Ej) over the utilities U1, U2 of two actions, illustrating cases (a), (b), and (c)]

SLIDE 32

Information-gathering agent

function Information-Gathering-Agent(percept) returns an action
    persistent: D, a decision network
    integrate percept into D
    j ← the value that maximizes VPI(Ej) − Cost(Ej)
    if VPI(Ej) > Cost(Ej) then return Request(Ej)
    else return the best action from D
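A Python rendering of the same myopic loop, assuming a network object that exposes integrate, vpi, cost, and best_action (a hypothetical interface for the sketch):

```python
# Myopic information-gathering agent: request the single observation with
# the best VPI-minus-cost margin, if that margin is positive.

def information_gathering_action(network, percept, candidates):
    """network: assumed to expose integrate / vpi / cost / best_action
    (hypothetical interface); candidates: observable evidence variables Ej."""
    network.integrate(percept)
    ej = max(candidates, key=lambda e: network.vpi(e) - network.cost(e))
    if network.vpi(ej) > network.cost(ej):
        return ("request", ej)     # worth paying for the observation
    return network.best_action()   # otherwise act on current information
```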

SLIDE 33

Sequential decision problems

Utilities depend on a sequence of decisions; sequential decision problems incorporate utilities, uncertainty, and sensing, and include search and planning problems as special cases

[Figure: search (explicit actions and subgoals) and planning; adding uncertainty and utility yields Markov decision problems (MDPs) and decision-theoretic planning; adding uncertain sensing (belief states) yields partially observable MDPs (POMDPs)]

MDP (Markov decision process): observable, stochastic environment with a Markovian transition model and additive rewards

SLIDE 34

Example MDP

[Figure: 4×3 grid world; START at (1,1), obstacle at (2,2); terminal states +1 at (4,3) and −1 at (4,2); each move succeeds with probability 0.8 and slips perpendicular with probability 0.1 each way]

Say, [Up, Up, Right, Right, Right] reaches +1 with probability 0.8^5 = 0.32768 (if no move slips)
States s ∈ S, actions a ∈ A
Model T(s, a, s′) ≡ P(s′|s, a) = probability that a in s leads to s′
Reward function R(s) (or R(s, a), R(s, a, s′)):
  R(s) = −0.04 (small penalty) for nonterminal states
  R(s) = ±1 for terminal states
SLIDE 35

Solving MDPs

In search problems, the solution is an optimal action sequence
In MDPs, the solution is an optimal policy π(s), i.e., the best action for every possible state s (because the agent can't predict where it will end up)
The optimal policy maximizes (say) the expected sum of rewards
Optimal policy when the state penalty R(s) (r in the figure) is −0.04:

[Figure: optimal policy arrows for the 4×3 grid world with R(s) = −0.04]
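The slides don't spell out a solution method at this point; below is a minimal value-iteration sketch for the 4×3 world described above (a standard MDP algorithm, with no discounting and the grid layout assumed from the example):

```python
# Value iteration for the 4x3 grid world (R = -0.04, no discounting).
ROWS, COLS = 3, 4
WALL = {(2, 2)}
TERMINAL = {(4, 3): 1.0, (4, 2): -1.0}
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
PERP = {"U": "LR", "D": "LR", "L": "UD", "R": "UD"}
STATES = [(x, y) for x in range(1, COLS + 1) for y in range(1, ROWS + 1)
          if (x, y) not in WALL]

def step(s, d):
    """Deterministic effect of heading d from s (blocked moves stay put)."""
    s2 = (s[0] + MOVES[d][0], s[1] + MOVES[d][1])
    return s2 if s2 in STATES else s

def q(s, a, V):
    """Expected value of intending a in s: 0.8 intended, 0.1 each perpendicular."""
    return (0.8 * V[step(s, a)]
            + 0.1 * V[step(s, PERP[a][0])]
            + 0.1 * V[step(s, PERP[a][1])])

V = {s: TERMINAL.get(s, 0.0) for s in STATES}
for _ in range(100):  # enough iterations to converge for this tiny world
    V = {s: V[s] if s in TERMINAL
         else -0.04 + max(q(s, a, V) for a in MOVES) for s in STATES}

policy = {s: max(MOVES, key=lambda a: q(s, a, V))
          for s in STATES if s not in TERMINAL}
print(policy[(1, 1)])  # "U": head up from START, around the obstacle
```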

SLIDE 36

Risk and reward

[Figure: optimal policies for the 4×3 world under different nonterminal rewards r:
  r = [−0.4278 : −0.0850]
  r = [−0.0480 : −0.0274]
  r = [−0.0218 : 0.0000]
  r = [−∞ : −1.6284]]

SLIDE 37

Decision theoretic planning

Planners designed in terms of probabilities and utilities in decision networks
  – support computationally tractable inference about plans and partial plans
  – attach numeric values to individual goals, but the measures lack any precise meaning
Using decision-theoretic planning allows designers to judge the effectiveness of the planning system
  – specify a utility function over the entire domain and rank plan results by desirability
  – modular representations that separately specify preference information, so as to allow dynamic combination of relevant factors

SLIDE 38

Game theory

Recall: games as adversarial search
  the solution is a strategy specifying a move for every opponent reply, with limited resources
Game theory: decision making with multiple agents in uncertain environments
  the solution is a policy (strategy profile) in which each player adopts a rational strategy

                          deterministic                   chance
  perfect information     chess, checkers, go, othello    backgammon, monopoly
  imperfect information                                   bridge, poker, scrabble, nuclear war

SLIDE 39

A brief history of game theory

  • Competitive and cooperative human interactions (Huygens, Leibniz, 17th century)
  • Equilibrium (Cournot, 1838)
  • Perfect play (Zermelo, 1913)
  • Zero-sum games (von Neumann, 1928)
  • Theory of Games and Economic Behavior (von Neumann, 1944)
  • Nash equilibrium (non-zero-sum games) (Nash, 1950; the 1994 Nobel Memorial Prize in Economics)
  • Mechanism design theory (auctions) (Hurwicz, 1973, along with Maskin and Myerson; the 2007 Nobel Memorial Prize in Economics)
  • Trading Agent Competition (TAC) (since 2001)

SLIDE 40

Prisoner’s dilemma

Two burglars, Alice and Bob, are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. A prosecutor lacks sufficient evidence to convict the pair on the principal charge, and offers each a deal: if you testify against your partner as the leader of a burglary ring, you'll go free for being the cooperative one, while your partner will serve 10 years in prison. However, if you both testify against each other, you'll both get 5 years. Alice and Bob also know that if both refuse to testify they will serve only 1 year each for the lesser charge of possessing stolen property.
Should they testify or refuse?

SLIDE 41

Prisoner’s dilemma

Single-move game
  • players: A, B
  • actions: testify, refuse
  • payoff (function): utility to each player for each combination of actions by all the players
    – for single-move games: a payoff matrix (strategic form)
    – a strategy profile is an assignment of a strategy to each player
    – pure strategy: deterministic
Should they testify or refuse?

SLIDE 42

Dominant strategy

A dominant strategy is a strategy that dominates all others
  A strategy s for player p strongly dominates strategy s′ if the outcome for s is better for p than the outcome for s′, for every choice of strategies by the other player(s)
  A strategy s weakly dominates s′ if s is better than s′ on at least one strategy profile and no worse on any other
Note: it is irrational to play a dominated strategy, and irrational not to play a dominant strategy if one exists
  – being rational, Alice chooses the dominant strategy (testify)
  – being clever and rational, Alice knows: Bob's dominant strategy is also to testify
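A small sketch checking for strongly dominant strategies in the prisoner's dilemma, with payoffs written as negated prison years taken from the story two slides back:

```python
# Strongly dominant strategies in the prisoner's dilemma.
# Payoffs are (Alice, Bob) as negated years in prison.
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"):  (0, -10),
    ("refuse",  "testify"): (-10, 0),
    ("refuse",  "refuse"):  (-1, -1),
}
ACTIONS = ("testify", "refuse")

def strongly_dominant(player):
    """player 0 = Alice (row), player 1 = Bob (column)."""
    def payoff(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return PAYOFF[profile][player]
    for s in ACTIONS:
        others = [s2 for s2 in ACTIONS if s2 != s]
        if all(payoff(s, o) > payoff(s2, o) for o in ACTIONS for s2 in others):
            return s
    return None

print(strongly_dominant(0), strongly_dominant(1))  # testify testify
```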

SLIDE 43

Equilibrium

An outcome is Pareto optimal if there is no other outcome that all players would prefer
An outcome is Pareto dominated by another outcome if all players would prefer the other outcome
  e.g., (testify, testify) is Pareto dominated by the (−1, −1) outcome of (refuse, refuse)
A strategy profile forms an equilibrium if no player can benefit by switching strategies, given that every other player sticks with the same strategy
  – a local optimum in the policy space
Dominant strategy equilibrium: the combination of those strategies, when each player has a dominant strategy

SLIDE 44

Nash equilibrium

Nash equilibrium theorem: every (finite) game has at least one equilibrium, possibly in mixed strategies
E.g., a dominant strategy equilibrium is a Nash equilibrium (a special case; the converse does not hold – why??)
Nash equilibrium is a necessary condition for being a solution
  – it is not always a sufficient condition

SLIDE 45

Zero-sum games

A two-player general-sum game is represented by two payoff matrices A = [aij] and B = [bij]
  If aij = −bij, it is called a zero-sum game (a game in which the sum of the payoffs is always zero)
Mixed strategy: a randomized policy that selects actions according to a probability distribution
Maximin algorithm: a method for finding the optimal mixed strategy for two-player zero-sum games
  – apply the standard minimax algorithm
This yields the maximin equilibrium of the game, and it is a Nash equilibrium
von Neumann's zero-sum theorem: every two-player zero-sum game has a maximin equilibrium when mixed strategies are allowed
A Nash equilibrium in a zero-sum game is maximin for both players
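The maximin mixed strategy can be found with the textbook linear-programming formulation; a sketch using scipy.optimize.linprog, with matching pennies as the illustrative game:

```python
import numpy as np
from scipy.optimize import linprog

def maximin_mixed_strategy(A):
    """Row player's maximin mixed strategy for a zero-sum game with payoff
    matrix A (A[i][j] = row player's payoff). LP: maximize the game value v
    subject to  sum_i x_i A[i][j] >= v  for every column j,  sum_i x_i = 1."""
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    c = np.zeros(n + 1)
    c[-1] = -1.0                                  # minimize -v
    A_ub = np.hstack([-A.T, np.ones((m, 1))])     # v - sum_i x_i A[i][j] <= 0
    b_ub = np.zeros(m)
    A_eq = np.ones((1, n + 1))
    A_eq[0, -1] = 0.0                             # x sums to 1; v excluded
    bounds = [(0, None)] * n + [(None, None)]     # x_i >= 0, v unrestricted
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    return res.x[:n], res.x[-1]

# Matching pennies: optimal mixed strategy (0.5, 0.5), game value 0.
strategy, value = maximin_mixed_strategy([[1, -1], [-1, 1]])
print(strategy, value)
```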

SLIDE 46

Algorithms for finding Nash Equilibria

  • 1. Input: a support profile
  • 2. Enumerate all possible subsets of actions that might form mixed strategies
  • 3. For each strategy profile enumerated in (2), check whether it is an equilibrium
      – by solving a set of equations and inequalities: for two players these equations are linear (and can be solved with basic linear programming); for n players they are nonlinear
  • 4. Output: NE (the pure-strategy special case of the check is sketched below)
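For intuition, step 3 restricted to pure strategies is just a best-response check over every profile; a sketch on the prisoner's dilemma payoffs from earlier:

```python
from itertools import product

# Pure-strategy Nash equilibria by enumeration: a profile is an
# equilibrium if neither player gains by unilaterally deviating.
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"):  (0, -10),
    ("refuse",  "testify"): (-10, 0),
    ("refuse",  "refuse"):  (-1, -1),
}
ACTIONS = ("testify", "refuse")

def is_equilibrium(a1, a2):
    u1, u2 = PAYOFF[(a1, a2)]
    return (all(PAYOFF[(d, a2)][0] <= u1 for d in ACTIONS) and
            all(PAYOFF[(a1, d)][1] <= u2 for d in ACTIONS))

print([p for p in product(ACTIONS, repeat=2) if is_equilibrium(*p)])
# [('testify', 'testify')] -- the dominant strategy equilibrium
```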

SLIDE 47

Example: Libratus

Recall: imperfect-information games involve obstacles not present in classic board games like go, but which are present in many real-world applications, such as negotiation, auctions, security, weather prediction, etc.
Poker: surpass human experts in the game of heads-up no-limit Texas hold'em, which has over 10^160 decision points
Libratus: two-time champion of the Annual Computer Poker Competition in heads-up no-limit, and defeated a team of top heads-up no-limit specialist pros in 2017
  • Depended on game theory (algorithms for Nash equilibria)
  • Did not depend on deep learning
    – outperformed the deep-learning-based DeepStack
    – AlphaZero cannot win at Texas hold'em
SLIDE 48

Open problem: BetaOne algorithm

Beta1 is intended to be an all-in-one (GGP) program for all games
The skeleton of the Beta1 algorithm:
  • 1. Combine Nash equilibria (NE) in an MCTS algorithm
      – a single NE for both move candidates (policy, for breadth reduction) and position lookahead (value, for depth reduction)
  • 2. In each position, an MCTS search is executed, guided by the NE
      – self-play by the NE, without human knowledge beyond the game rules
  • 3. Asynchronous multi-threaded search that executes simulations on parallel CPUs, but does not depend on GPUs
Key point: find a fast NE algorithm ⇐ GGP

SLIDE 49

Other games

Repeated games: players face the same choice repeatedly, but each time with knowledge of the history of all players' previous choices
  E.g., the repeated version of the prisoner's dilemma
Sequential games: a game consists of a sequence of turns that need not be all the same
  – can be represented by a game tree (extensive form)
  – add a distinguished player, chance, to represent stochastic games, specified as a probability distribution over actions
Now, the most complete representations: partially observable, multi-agent, stochastic, sequential, dynamic environments
Bayes–Nash equilibrium: an equilibrium w.r.t. a player's prior probability distribution over the other players' strategies
  – considers the possibility that the other players are less than fully rational

SLIDE 50

Auctions

Auction: a mechanism design for selling goods to members of a pool of bidders
  – inverse game theory: given that agents pick rational strategies, what game should we design? (e.g., cheap airline tickets)
Ascending-bid (English) auction:
  • 1. The center starts by asking for a minimum (or reserve) bid
  • 2. If some bidder is willing to pay that amount, the center asks for some increment and continues up from there
  • 3. The auction ends when nobody is willing to bid anymore; the last bidder wins the item
Auction design (e.g., for efficiency) and implementation (algorithms)
Inverse auction: given that the center picks a rational strategy, what game should we design?
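A toy simulation of the ascending-bid protocol; bidders with private values stay in while the price is within their value (the values, reserve, and increment are made-up numbers, and ties are not handled carefully):

```python
# English (ascending-bid) auction with a fixed increment: bidders stay in
# while the price is at most their private value; last one standing wins.

def english_auction(values, reserve, increment):
    price = reserve
    active = {name for name, v in values.items() if v >= price}
    while len(active) > 1:
        price += increment
        active = {name for name in active if values[name] >= price}
    winner = next(iter(active), None)
    return winner, price

values = {"alice": 120, "bob": 90, "carol": 150}  # assumed private values
print(english_auction(values, reserve=50, increment=10))  # ('carol', 130)
```

Note the familiar mechanism-design property visible even in this toy: the winner pays roughly the second-highest value plus one increment, not her own value.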
