Rational decisions
Chapter 16

Outline
♦ Rational preferences
♦ Utilities
♦ Money
♦ Value of information

Preferences
An agent chooses among prizes (A, B, etc.) and lotteries, i.e., situations with uncertain prizes
Lottery L = [p, A; (1 − p), B]: prize A with probability p, prize B with probability 1 − p
Notation:
A ≻ B: A preferred to B
A ∼ B: indifference between A and B
A ≿ B: B not preferred to A

Rational preferences
Idea: preferences of a rational agent must obey constraints.
Rational preferences ⇒ behavior describable as maximization of expected utility
Constraints:
Orderability: (A ≻ B) ∨ (B ≻ A) ∨ (A ∼ B)
Transitivity: (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)
Continuity: A ≻ B ≻ C ⇒ ∃p [p, A; 1 − p, C] ∼ B
Substitutability: A ∼ B ⇒ [p, A; 1 − p, C] ∼ [p, B; 1 − p, C]
Monotonicity: A ≻ B ⇒ (p ≥ q ⇔ [p, A; 1 − p, B] ≿ [q, A; 1 − q, B])

Rational preferences contd.
Violating the constraints leads to self-evident irrationality
For example: an agent with intransitive preferences can be induced to give away all its money
If B ≻ C, then an agent who has C would pay (say) 1 cent to get B
If A ≻ B, then an agent who has B would pay (say) 1 cent to get A
If C ≻ A, then an agent who has A would pay (say) 1 cent to get C
[Figure: cycle of exchanges among A, B, and C, each costing 1c]
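The money pump can be simulated directly. The following is a minimal sketch, assuming a hypothetical agent that always pays the fee to swap its current prize for one it strictly prefers; the function name `run_money_pump` and the starting wealth are illustrative and not part of the slides.

```python
# Intransitive strict preferences from the slide: A ≻ B, B ≻ C, C ≻ A.
# Each pair (x, y) means "x is strictly preferred to y".
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}


def run_money_pump(holding: str, cents: int, fee: int = 1) -> tuple[str, int, int]:
    """Offer trades until the agent cannot pay; return (holding, cents, trades)."""
    trades = 0
    while cents >= fee:
        # Find a prize the agent strictly prefers to the one it holds now.
        better = next((x for (x, y) in sorted(PREFERS) if y == holding), None)
        if better is None:
            break  # no preferred prize exists, so the pump stops
        holding, cents, trades = better, cents - fee, trades + 1
    return holding, cents, trades


if __name__ == "__main__":
    # Start with prize C and 300 cents (both values are arbitrary).
    print(run_money_pump("C", cents=300))
```

Because the preferences form a cycle, there is always a trade the agent accepts, so the loop ends only when its money is gone.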

Maximizing expected utility
Theorem (Ramsey, 1931; von Neumann and Morgenstern, 1944): Given preferences satisfying the constraints, there exists a real-valued function U such that
\[ U(A) \ge U(B) \iff A \succsim B \]
\[ U([p_1, S_1; \ldots; p_n, S_n]) = \sum_i p_i \, U(S_i) \]
MEU principle: Choose the action that maximizes expected utility
Note: an agent can be entirely rational (consistent with MEU) without ever representing or manipulating utilities and probabilities
E.g., a lookup table for perfect tic-tac-toe
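The lottery formula and the MEU principle translate into a few lines of code. This is a sketch only: the utilities, outcomes, and action names below are hypothetical values chosen for illustration, and each action is modeled as the lottery over outcomes it induces.

```python
from typing import Dict, List, Tuple

Lottery = List[Tuple[float, str]]  # [(probability, outcome), ...]


def expected_utility(lottery: Lottery, utility: Dict[str, float]) -> float:
    """U([p1, S1; ...; pn, Sn]) = sum_i p_i * U(S_i)."""
    total_p = sum(p for p, _ in lottery)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("lottery probabilities must sum to 1")
    return sum(p * utility[s] for p, s in lottery)


def meu_action(actions: Dict[str, Lottery], utility: Dict[str, float]) -> str:
    """Return the action whose outcome lottery has maximum expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a], utility))


# Hypothetical utilities and actions, chosen only for illustration.
UTILITY = {"win": 1.0, "draw": 0.5, "lose": 0.0}
ACTIONS = {
    "safe": [(1.0, "draw")],
    "risky": [(0.6, "win"), (0.4, "lose")],
}

if __name__ == "__main__":
    for name, lottery in ACTIONS.items():
        print(name, expected_utility(lottery, UTILITY))
    print("MEU choice:", meu_action(ACTIONS, UTILITY))
```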

Utilities
Utilities map states to real numbers. Which numbers?
Standard approach to assessment of human utilities: compare a given state A to a standard lottery L_p that has
“best possible prize” u⊤ with probability p
“worst possible catastrophe” u⊥ with probability (1 − p)
adjust lottery probability p until A ∼ L_p
Example: pay $30 ∼ L = [0.999999, continue as before; 0.000001, instant death]

Utility scales
Normalized utilities: u⊤ = 1.0, u⊥ = 0.0
Micromorts: one-millionth chance of death; useful for Russian roulette, paying to reduce product risks, etc.
QALYs: quality-adjusted life years; useful for medical decisions involving substantial risk
Note: behavior is invariant w.r.t. positive linear transformation
U′(x) = k₁ U(x) + k₂, where k₁ > 0
With deterministic prizes only (no lottery choices), only ordinal utility can be determined, i.e., total order on prizes
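The invariance claim can be checked numerically. The sketch below uses a hypothetical three-prize utility and arbitrary constants k1 = 250 and k2 = −40; it also includes a monotone but nonlinear transformation (cubing) to show that preserving the order of prizes alone is not enough to preserve choices among lotteries.

```python
def expected_utility(lottery, u):
    """Expected utility of a lottery given as [(probability, prize), ...]."""
    return sum(p * u(prize) for p, prize in lottery)


def best_lottery(lotteries, u):
    """Name of the lottery with the highest expected utility under u."""
    return max(lotteries, key=lambda name: expected_utility(lotteries[name], u))


# Hypothetical normalized utilities: u_top = 1.0, u_bottom = 0.0.
BASE = {"best": 1.0, "middle": 0.7, "worst": 0.0}


def u(prize):
    return BASE[prize]


def u_prime(prize, k1=250.0, k2=-40.0):
    """Positive linear transformation U'(x) = k1 * U(x) + k2 with k1 > 0."""
    return k1 * u(prize) + k2


def u_cubed(prize):
    """Monotone but nonlinear: same ordering of prizes, different cardinal scale."""
    return u(prize) ** 3


LOTTERIES = {
    "sure_middle": [(1.0, "middle")],
    "gamble": [(0.6, "best"), (0.4, "worst")],
}

if __name__ == "__main__":
    for fn in (u, u_prime, u_cubed):
        print(fn.__name__, "->", best_lottery(LOTTERIES, fn))
```

Under `u` and `u_prime` the same lottery wins; under `u_cubed` the ranking of prizes is unchanged but the preferred lottery flips.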

Money
Money does not behave as a utility function
Given a lottery L with expected monetary value EMV(L), usually U(L) < U(EMV(L)), i.e., people are risk-averse
Utility curve: for what probability p am I indifferent between a prize x and a lottery [p, $M; (1 − p), $0] for large M?
Define U(M) = 1.0 and set U(x) = pU(M) = p
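To make the risk-aversion inequality concrete, here is a small sketch that assumes a square-root utility of money. That functional form is purely a hypothetical stand-in for a concave curve and is not taken from the slides; it is normalized so that U($0) = 0 and U($M) = 1.

```python
import math

M = 10_000.0  # top prize in dollars


def utility(x: float) -> float:
    """Hypothetical concave (risk-averse) utility with U(0) = 0 and U(M) = 1."""
    return math.sqrt(x / M)


def lottery_utility(p: float) -> float:
    """U([p, $M; (1 - p), $0]) = p * U(M) + (1 - p) * U(0) = p."""
    return p * utility(M) + (1 - p) * utility(0.0)


def emv(p: float) -> float:
    """Expected monetary value of the lottery [p, $M; (1 - p), $0]."""
    return p * M


def indifference_probability(x: float) -> float:
    """Probability p at which a sure $x is exactly as good as the lottery."""
    return utility(x)


if __name__ == "__main__":
    p = 0.5
    print("U(L)      =", lottery_utility(p))
    print("U(EMV(L)) =", utility(emv(p)))
    print("p for a sure $2500:", indifference_probability(2500.0))
```

With this curve the sure amount EMV(L) is always worth at least as much as the lottery itself, which is the pattern the slide describes as risk-averse.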

Student group utility
For each x, adjust p until half the class votes for lottery (M = 10,000)
[Figure: indifference probability p (0.0 to 1.0) plotted against $x (0 to 10,000)]

Money
Typical empirical data, extrapolated with risk-prone behavior:
[Figure: utility U plotted against money, from −$150,000 to $800,000]

Value of information
Idea: compute value of acquiring each possible piece of evidence
Example: buying oil drilling rights
Two blocks A and B, exactly one has oil, worth k
Prior probabilities 0.5 each, mutually exclusive
Current price of each block is k/2
“Consultant” offers accurate survey of A. Fair price?
Solution: compute expected value of information
= expected value of best action given the information
minus expected value of best action without information
Survey may say “oil in A” or “no oil in A”, prob. 0.5 each (given!)
Without information, buying either block has expected value 0.5 × k − k/2 = 0
= [0.5 × value of “buy A” given “oil in A” + 0.5 × value of “buy B” given “no oil in A”] − 0
= (0.5 × k/2) + (0.5 × k/2) − 0 = k/2
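The oil example can be worked through in code. This is a sketch under two assumptions that the slide leaves implicit: the value of buying a block is its worth minus the k/2 purchase price, and declining to buy is worth 0. The function name `oil_survey_value` is illustrative.

```python
from fractions import Fraction


def oil_survey_value(k: Fraction) -> Fraction:
    """Expected value of an accurate survey of block A, in the same units as k."""
    price = k / 2                   # current price of each block
    p_oil_in_a = Fraction(1, 2)     # prior; exactly one block has oil

    # Without the survey: buying either block yields k with probability 1/2.
    buy_blind = p_oil_in_a * k - price
    best_without = max(buy_blind, Fraction(0))  # assumed: not buying is worth 0

    # With the survey: buy A if it reports oil, otherwise buy B.
    value_if_oil_in_a = k - price
    value_if_no_oil_in_a = k - price
    best_with = (p_oil_in_a * value_if_oil_in_a
                 + (1 - p_oil_in_a) * value_if_no_oil_in_a)

    return best_with - best_without


if __name__ == "__main__":
    # Express the result as a fraction of k by setting k = 1.
    print(oil_survey_value(Fraction(1)))
```

Exact fractions are used so the result can be read as a multiple of k rather than as a rounded float.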

General formula
Current evidence E, current best action α
Possible action outcomes S_i, potential new evidence E_j
\[ EU(\alpha \mid E) = \max_a \sum_i U(S_i) \, P(S_i \mid E, a) \]
Suppose we knew E_j = e_jk, then we would choose α_{e_jk} s.t.
\[ EU(\alpha_{e_{jk}} \mid E, E_j = e_{jk}) = \max_a \sum_i U(S_i) \, P(S_i \mid E, a, E_j = e_{jk}) \]
E_j is a random variable whose value is currently unknown
⇒ must compute expected gain over all possible values:
\[ \mathrm{VPI}_E(E_j) = \left( \sum_k P(E_j = e_{jk} \mid E) \, EU(\alpha_{e_{jk}} \mid E, E_j = e_{jk}) \right) - EU(\alpha \mid E) \]
(VPI = value of perfect information)
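The general formula can be sketched as a small function. Everything in the example below (the umbrella decision, its probabilities, and its utilities) is hypothetical and serves only to exercise the formula; the distributions are supplied as plain dictionaries and callables rather than through any particular probabilistic library.

```python
from typing import Callable, Dict, Hashable, Iterable

Dist = Dict[Hashable, float]  # outcome -> probability


def best_eu(actions: Iterable[Hashable],
            outcome_dist: Callable[[Hashable], Dist],
            utility: Callable[[Hashable], float]) -> float:
    """max_a sum_i U(S_i) * P(S_i | a) for the supplied conditional model."""
    return max(
        sum(p * utility(s) for s, p in outcome_dist(a).items())
        for a in actions
    )


def vpi(actions, utility, prior_dist, evidence_dist, posterior_dist) -> float:
    """VPI_E(E_j) = sum_k P(e_jk | E) * EU(best | E, e_jk) - EU(best | E).

    prior_dist(a)        -> P(S | E, a)
    evidence_dist        -> {e_jk: P(E_j = e_jk | E)}
    posterior_dist(a, e) -> P(S | E, a, E_j = e)
    """
    actions = list(actions)
    eu_now = best_eu(actions, prior_dist, utility)
    eu_informed = sum(
        p_e * best_eu(actions, lambda a, e=e: posterior_dist(a, e), utility)
        for e, p_e in evidence_dist.items()
    )
    return eu_informed - eu_now


# Hypothetical example: take an umbrella or not, with a perfect rain forecast.
UTILITY = {"dry_free": 1.0, "dry_carrying": 0.8, "wet": 0.0}
P_RAIN = 0.3


def prior(action):
    if action == "umbrella":
        return {"dry_carrying": 1.0}
    return {"wet": P_RAIN, "dry_free": 1.0 - P_RAIN}


def posterior(action, forecast):
    if action == "umbrella":
        return {"dry_carrying": 1.0}
    return {"wet": 1.0} if forecast == "rain" else {"dry_free": 1.0}


if __name__ == "__main__":
    evidence = {"rain": P_RAIN, "sun": 1.0 - P_RAIN}
    print(vpi(["umbrella", "none"], UTILITY.get, prior, evidence, posterior))
```

Passing the model as callables keeps `vpi` independent of how the conditional probabilities are produced, so the same function could sit on top of a table or an inference routine.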

Properties of VPI
Nonnegative (in expectation, not post hoc):
\[ \forall j, E \quad \mathrm{VPI}_E(E_j) \ge 0 \]
Nonadditive (consider, e.g., obtaining E_j twice):
\[ \mathrm{VPI}_E(E_j, E_k) \ne \mathrm{VPI}_E(E_j) + \mathrm{VPI}_E(E_k) \]
Order-independent:
\[ \mathrm{VPI}_E(E_j, E_k) = \mathrm{VPI}_E(E_j) + \mathrm{VPI}_{E,E_j}(E_k) = \mathrm{VPI}_E(E_k) + \mathrm{VPI}_{E,E_k}(E_j) \]
Note: when more than one piece of evidence can be gathered, maximizing VPI for each to select one is not always optimal
⇒ evidence-gathering becomes a sequential decision problem

Example Problem (from 16.11 in text)
A used-car buyer is deciding whether to buy car c₁. There is time to carry out at most one test, t₁, which is the test of car c₁. The buyer’s estimate is that c₁ has a 70% chance of being in good shape. A car can be in good shape (quality q⁺) or bad shape (quality q⁻), and the tests might help to indicate what shape the car is in. Car c₁ costs $1500, and its market value is $2000 if it is in good shape; if not, $700 in repairs will be needed to put it in good shape.
Assume:
P(q⁺ | pass(c₁, t₁)) = 0.8, P(q⁻ | pass(c₁, t₁)) = 0.2
P(q⁺ | fail(c₁, t₁)) = 0.4, P(q⁻ | fail(c₁, t₁)) = 0.6
P(pass(c₁, t₁)) = 0.75, P(fail(c₁, t₁)) = 0.25
Q1: Calculate the optimal decisions (a) before any test, and (b) given either a pass or a fail, and their expected utilities.
Q2: Calculate the value of information of the test.
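One way to approach Q1 and Q2 is sketched below. It rests on assumptions the slide does not state: utility equals net dollars, a repaired car is worth the full $2000, not buying is worth $0, and the test itself is free (no test cost is given here). The function names are illustrative.

```python
PRICE, MARKET_VALUE, REPAIR_COST = 1500, 2000, 700

# Net gain from buying, by true quality (assumes a repaired car is worth $2000).
U_BUY = {"good": MARKET_VALUE - PRICE, "bad": MARKET_VALUE - PRICE - REPAIR_COST}
U_DONT_BUY = 0  # assumed baseline for walking away

P_GOOD_PRIOR = 0.7
P_GOOD_GIVEN = {"pass": 0.8, "fail": 0.4}
P_RESULT = {"pass": 0.75, "fail": 0.25}


def decide(p_good: float) -> tuple[str, float]:
    """Return (best action, expected utility) when P(q+) = p_good."""
    eu_buy = p_good * U_BUY["good"] + (1 - p_good) * U_BUY["bad"]
    if eu_buy > U_DONT_BUY:
        return "buy", eu_buy
    return "don't buy", float(U_DONT_BUY)


def value_of_test() -> float:
    """Expected utility with the test result minus expected utility without it."""
    eu_with = sum(P_RESULT[r] * decide(P_GOOD_GIVEN[r])[1] for r in P_RESULT)
    return eu_with - decide(P_GOOD_PRIOR)[1]


if __name__ == "__main__":
    print("before test:", decide(P_GOOD_PRIOR))
    for result in P_RESULT:
        print(f"given {result}:", decide(P_GOOD_GIVEN[result]))
    print("value of test:", round(value_of_test(), 6))
```

Under these assumptions buying has positive expected utility before the test and after either result, so the test never changes the decision.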
