dealing with uncertainty

  1. CS 331: Artificial Intelligence
Probability I
Thanks to Andrew Moore for some course material

Dealing with Uncertainty
• We want to get to the point where we can reason with uncertainty
• This will require using probability, e.g. the probability that it will rain today is 0.99
• We will review the fundamentals of probability

Outline
1. Random variables
2. Probability

Random Variables
• The basic element of probability is the random variable
• Think of the random variable as an event with some degree of uncertainty as to whether that event occurs
• Random variables have a domain of values they can take on

Random Variables
Example:
• ProfLate is a random variable for whether your prof will be late to class or not
• The domain of ProfLate is {true, false}
– ProfLate = true: proposition that prof will be late to class. You can assign some degree of belief to this proposition, e.g. P(ProfLate = true) = 0.9
– ProfLate = false: proposition that prof will not be late to class

  2. Random Variables
Example (continued):
• You can likewise assign a degree of belief to the proposition ProfLate = false, e.g. P(ProfLate = false) = 0.1

Random Variables
• We will refer to random variables with capitalized names, e.g. X, Y, ProfLate
• We will refer to names of values with lowercase names, e.g. x, y, proflate
• This means you may see a statement like ProfLate = proflate
– This means the random variable ProfLate takes the value proflate (which can be true or false)
• Shorthand notation: ProfLate = true is the same as proflate, and ProfLate = false is the same as ¬proflate

Random Variables
3 types of random variables:
1. Boolean random variables
2. Discrete random variables
3. Continuous random variables

Boolean Random Variables
• Take the values true or false
• E.g. let A be a Boolean random variable
– P(A = false) = 0.9
– P(A = true) = 0.1

Discrete Random Variables
Allowed to take on a finite number of values, e.g.
• P(DrinkSize = small) = 0.1
• P(DrinkSize = medium) = 0.2
• P(DrinkSize = large) = 0.7

Discrete Random Variables
Values of the domain must be:
• Mutually exclusive, i.e. P(A = v_i AND A = v_j) = 0 if i ≠ j
  This means, for instance, that you can't have a drink that is both small and medium
• Exhaustive, i.e. P(A = v_1 OR A = v_2 OR ... OR A = v_k) = 1
  This means that a drink can only be either small, medium or large. There isn't an extra large.
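
The two conditions on a discrete domain can be checked mechanically. The sketch below is only an illustration: it stores the DrinkSize probabilities from the slide as exact fractions, and `is_valid_distribution` is a hypothetical helper name, not something from the course material. Mutual exclusivity is assumed to hold by construction, because each dictionary key is a distinct value; exhaustiveness then shows up as the probabilities summing to 1.

```python
from fractions import Fraction

# Distribution from the DrinkSize slide, stored as exact fractions.
drink_size = {
    "small": Fraction(1, 10),
    "medium": Fraction(2, 10),
    "large": Fraction(7, 10),
}


def is_valid_distribution(dist):
    """Hypothetical helper: every probability is in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    return in_range and sum(dist.values()) == 1


print(is_valid_distribution(drink_size))  # True
```

Exact fractions are used instead of floats so that the sum-to-1 check does not depend on rounding.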

  3. Discrete Random Variables
Values of the domain must be:
• Mutually exclusive, i.e. P(A = v_i AND A = v_j) = 0 if i ≠ j
  The AND here means intersection, i.e. (A = v_i) ∩ (A = v_j)
• Exhaustive, i.e. P(A = v_1 OR A = v_2 OR ... OR A = v_k) = 1
  The OR here means union, i.e. (A = v_1) ∪ (A = v_2) ∪ ... ∪ (A = v_k)

Discrete Random Variables
• Since we now have multi-valued discrete random variables, we can't write P(a) or P(¬a) anymore
• We have to write P(A = v_i), where v_i is a value in {v_1, v_2, …, v_k}

Continuous Random Variables
• Can take values from the real numbers
• E.g. they can take values from [0, 1]
• Note: We will primarily be dealing with discrete random variables
• (The next slide is just to provide a little bit of information about continuous random variables)

Probability Density Functions
Discrete random variables have probability distributions:
[Figure: bar chart of P(A) over the values a and ¬a, vertical axis up to 1.0]
Continuous random variables have probability density functions, e.g.:
[Figure: two density curves, each plotting P(X) against X]

Probabilities
• We will write P(A=true) as "the fraction of possible worlds in which A is true"
• We can debate the philosophical implications of this for the next 4 hours
• But we won't

Probabilities
• We will sometimes talk about the probabilities of all possible values of a random variable
• Instead of writing
– P(A=false) = 0.25
– P(A=true) = 0.75
• We will write P(A) = (0.25, 0.75) (note the boldface P on the slide)
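
The "fraction of possible worlds" reading can be made concrete with a toy example. The sketch below assumes four hypothetical, equally likely worlds, chosen so that the result matches the slide's P(A) = (0.25, 0.75). The `worlds` list and the `prob` helper are illustrative names, not part of the course material.

```python
from fractions import Fraction

# Four hypothetical, equally likely worlds; each records the value of A.
worlds = [{"A": False}, {"A": True}, {"A": True}, {"A": True}]


def prob(worlds, var, value):
    """Fraction of worlds in which `var` takes the given value."""
    matching = sum(1 for w in worlds if w[var] == value)
    return Fraction(matching, len(worlds))


# The whole distribution, ordered (false, true) as on the slide.
p_A = (prob(worlds, "A", False), prob(worlds, "A", True))
print(p_A)  # (Fraction(1, 4), Fraction(3, 4))
```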

  4. Visualizing A
[Figure: the event space of all possible worlds, with area 1. A reddish oval marks the worlds in which A is true; the area outside the oval is the worlds in which A is false.]
P(a) = area of the reddish oval

The Axioms of Probability
• 0 ≤ P(a) ≤ 1
• P(true) = 1
• P(false) = 0
• P(a OR b) = P(a) + P(b) - P(a AND b)
The logical OR is equivalent to set union (∪). The logical AND is equivalent to set intersection (∩). Sometimes, I'll write it as P(a, b).
These axioms are often called Kolmogorov's axioms in honor of the Russian mathematician Andrei Kolmogorov.

Interpreting the axioms
• 0 ≤ P(a): the area for a can't get any smaller than 0. A zero area would mean that there is no world in which a is true.
• P(a) ≤ 1: the area for a can't get any bigger than 1. An area of 1 would mean that a is true in all worlds.
• P(true) = 1
• P(false) = 0
• P(a OR b) = P(a) + P(b) - P(a, b)
[Figure: two overlapping circles a and b. P(a, b) is the purple overlap; P(a OR b) is the area covered by both circles together.]
Adding P(a) and P(b) counts the overlap twice, so P(a, b) is subtracted once.

Prior Probability
• We can consider P(A) as the unconditional or prior probability
– E.g. P(ProfLate = true) = 1.0
• It is the probability of event A in the absence of any other information
• If we get new information that affects A, we can reason with the conditional probability of A given the new information.
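
The axioms on this page can be checked on a small "area" model. This is a minimal sketch that assumes ten hypothetical, equally likely worlds labelled 0 to 9, with events represented as sets of worlds. The particular sets `a` and `b` are made up for illustration. Set union plays the role of OR and set intersection the role of AND, as the slide describes.

```python
from fractions import Fraction

# Ten hypothetical, equally likely worlds labelled 0..9.
all_worlds = set(range(10))
a = {0, 1, 2, 3}  # worlds in which a is true (made-up example)
b = {3, 4, 5}     # worlds in which b is true (made-up example)


def p(event):
    """Probability of an event = its share of the event space."""
    return Fraction(len(event), len(all_worlds))


assert 0 <= p(a) <= 1
assert p(all_worlds) == 1  # P(true) = 1
assert p(set()) == 0       # P(false) = 0
# OR is set union, AND is set intersection.
assert p(a | b) == p(a) + p(b) - p(a & b)
print(p(a | b))  # 3/5
```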

  5. Conditional Probability
• P(A | B) = Fraction of worlds in which B is true that also have A true
• Read this as: "Probability of A conditioned on B"
• Prior probability P(A) is a special case of the conditional probability P(A | ) conditioned on no evidence

Conditional Probability Example
H = "Have a headache"
F = "Coming down with Flu"
P(H) = 1/10
P(F) = 1/40
P(H | F) = 1/2
"Headaches are rare and flu is rarer, but if you're coming down with 'flu there's a 50-50 chance you'll have a headache."
[Figure: Venn diagram with regions H and F]

Conditional Probability
P(H | F) = Fraction of flu-inflicted worlds in which you have a headache
         = (# worlds with flu and headache) / (# worlds with flu)
         = (Area of "H and F" region) / (Area of "F" region)
         = P(H, F) / P(F)

Definition of Conditional Probability
P(A | B) = P(A, B) / P(B)
Corollary: The Chain Rule (aka The Product Rule)
P(A, B) = P(A | B) P(B)

Important Note
P(A | B) + P(¬A | B) = 1
But:
P(A | B) + P(A | ¬B) does not always = 1

The Joint Probability Distribution
• P(A, B) is called the joint probability distribution of A and B
• It captures the probabilities of all combinations of the values of a set of random variables
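
The headache and flu numbers from the example can be pushed through the chain rule as a quick arithmetic check. The sketch below uses only the three values given on the slide. The query P(F | H) is an extra illustration of the definition and does not appear in this part of the slides.

```python
from fractions import Fraction

p_h = Fraction(1, 10)         # P(H), from the slide
p_f = Fraction(1, 40)         # P(F), from the slide
p_h_given_f = Fraction(1, 2)  # P(H | F), from the slide

# Chain rule: P(H, F) = P(H | F) P(F)
p_h_and_f = p_h_given_f * p_f
print(p_h_and_f)  # 1/80

# Definition of conditional probability, applied the other way round
# (an extra query, not on the slide): P(F | H) = P(H, F) / P(H)
p_f_given_h = p_h_and_f / p_h
print(p_f_given_h)  # 1/8

# Important Note: P(H | F) + P(not H | F) = 1,
# using P(not H, F) = P(F) - P(H, F).
p_not_h_given_f = (p_f - p_h_and_f) / p_f
assert p_h_given_f + p_not_h_given_f == 1
```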

  6. The Joint Probability Distribution
• For example, if A and B are Boolean random variables, then P(A, B) could be specified as:

P(A = false, B = false)   0.25
P(A = false, B = true)    0.25
P(A = true, B = false)    0.25
P(A = true, B = true)     0.25

The Joint Probability Distribution
• Now suppose we have the random variables:
– Drink = {coke, sprite}
– Size = {small, medium, large}
• The joint probability distribution for P(Drink, Size) could look like:

P(Drink = coke, Size = small)      0.1
P(Drink = coke, Size = medium)     0.1
P(Drink = coke, Size = large)      0.3
P(Drink = sprite, Size = small)    0.1
P(Drink = sprite, Size = medium)   0.2
P(Drink = sprite, Size = large)    0.2

Full Joint Probability Distribution
• Suppose you have the complete set of random variables used to describe the world
• A joint probability distribution that covers this complete set is called the full joint probability distribution
• Is a complete specification of one's uncertainty about the world in question
• Very powerful: Can be used to answer any probabilistic query
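
As a rough illustration of answering queries from a joint distribution, the sketch below stores the P(Drink, Size) table as a dictionary of exact fractions. It answers queries by summing out the other variable, a step the slides have not introduced at this point, so treat it as an assumption about how the table would be used. The helper names `marginal_drink`, `marginal_size` and `drink_given_size` are hypothetical.

```python
from fractions import Fraction

F = Fraction
# Joint distribution P(Drink, Size), copied from the table on the slide.
joint = {
    ("coke", "small"): F(1, 10),
    ("coke", "medium"): F(1, 10),
    ("coke", "large"): F(3, 10),
    ("sprite", "small"): F(1, 10),
    ("sprite", "medium"): F(2, 10),
    ("sprite", "large"): F(2, 10),
}
assert sum(joint.values()) == 1  # the entries cover all combinations


def marginal_drink(drink):
    """P(Drink = drink), summing over every size."""
    return sum(p for (d, _), p in joint.items() if d == drink)


def marginal_size(size):
    """P(Size = size), summing over every drink."""
    return sum(p for (_, s), p in joint.items() if s == size)


def drink_given_size(drink, size):
    """P(Drink = drink | Size = size) = P(drink, size) / P(size)."""
    return joint[(drink, size)] / marginal_size(size)


print(marginal_drink("coke"))             # 1/2
print(marginal_size("large"))             # 1/2
print(drink_given_size("coke", "large"))  # 3/5
```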
