Artificial Intelligence: Methods and Applications

Lecture 8: Review of Probability Theory Juan Carlos Nieves Sánchez November 28, 2014


Review of Probability Theory 3

Outline

  • Probability Axioms
  • Independence
  • Bayes’ rule
  • Inference Using Full Joint Distributions


What is probability theory?

Probability theory deals with mathematical models of random phenomena. We often use models of randomness to model uncertainty. Uncertainty can have different causes:

  • Laziness: it is too difficult or computationally expensive to get to a certain answer.
  • Theoretical ignorance: we don’t know all the rules that influence the processes we are studying.
  • Practical ignorance: we know the rules in principle, but we don’t have all the data to apply them.



Random experiments

Mathematical models of randomness are based on the concept of random experiments. Such experiments should have two important properties:

  • 1. The experiment must be repeatable.
  • 2. Future outcomes cannot be exactly predicted based on previous outcomes, even if we can control all aspects of the experiment.

Examples:

  • Coin tossing
  • Genetics



Deterministic vs. random models

Deterministic models often give a macroscopic view of random phenomena. They describe an average behavior but ignore local random variations. Examples:

  • Water molecules in a river.
  • Gas molecules in a heated container.

Lesson to be learned: Model on the right level of detail!



Random Variables

The basic element of probability is the random variable. We can think of a random variable as an event with some degree of uncertainty as to whether it occurs. Each random variable has a domain of values it can take on. There are two types of random variables:

  • 1. Discrete random variables.
  • 2. Continuous random variables.



Examples of Random Variables

A discrete random variable takes values from a finite set of values. For example:

  • P(DrinkSize=Small) = 0.1
  • P(DrinkSize=Medium) = 0.2
  • P(DrinkSize=Large) = 0.7


Note: We will mainly be dealing with discrete random variables. Continuous random variables can take values from the real numbers, e.g., from the interval [0, 1].


Probability

Given a random variable A, P(A) denotes the fraction of possible worlds in which A is true.


[Figure: the event space of all possible worlds, divided into the worlds in which A is true (a region of area P(A)) and the worlds in which A is false.]


Key observation

Consider a random experiment for which outcome A sometimes occurs and sometimes doesn’t occur.

  • Repeat the experiment a large number of times and note, for each repetition, whether A occurs or not.
  • Let f_n(A) be the number of times A occurred in the first n experiments.
  • Let r_n(A) = f_n(A) / n be the relative frequency of A in the first n experiments.

Key observation: As n → ∞, the relative frequency r_n(A) converges to a real number.
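This key observation can be illustrated with a small simulation. The sketch below is not from the lecture; it assumes a fair coin, with A being the event “the coin shows heads”:

```python
import random

def relative_frequency(n, seed=0):
    """r_n(A) for the event A = "the coin shows heads" in n fair-coin tosses."""
    rng = random.Random(seed)
    f_n = sum(rng.random() < 0.5 for _ in range(n))  # f_n(A): times A occurred
    return f_n / n                                   # r_n(A) = f_n(A) / n

# As n grows, r_n(A) settles near P(A) = 0.5.
for n in (10, 1000, 100_000):
    print(n, relative_frequency(n))
```

With a fixed seed the run is reproducible; the exact values printed depend on the random number generator, but the drift toward 0.5 is the point.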



Intuitions about probability

  • I. Since 0 ≤ f_n(A) ≤ n, we have 0 ≤ r_n(A) ≤ 1. Thus the probability of A should be in [0, 1].
  • II. f_n(∅) = 0 and f_n(Everything) = n. Thus the probability of ∅ should be 0 and the probability of Everything should be 1.
  • III. Let B be Everything except A. Then f_n(A) + f_n(B) = n and r_n(A) + r_n(B) = 1. Thus the probability of A plus the probability of B should be 1.
  • IV. Let A ⊆ B. Then r_n(A) ≤ r_n(B), and thus the probability of A should be no bigger than that of B.
  • V. Let A ∩ B = ∅ and C = A ∪ B. Then r_n(C) = r_n(A) + r_n(B). Thus the probability of C should be the probability of A plus the probability of B.
  • VI. Let C = A ∪ B. Then f_n(C) ≤ f_n(A) + f_n(B) and r_n(C) ≤ r_n(A) + r_n(B). Thus the probability of C should be at most the sum of the probabilities of A and B.
  • VII. Let C = A ∪ B and D = A ∩ B. Then f_n(C) = f_n(A) + f_n(B) − f_n(D), and thus the probability of C should be the probability of A plus the probability of B minus the probability of D.



The probability space

A probability space is a tuple (Ω, F, P) where:

  • Ω is the sample space, or set of all elementary events
  • F is the set of events (for our purposes, we can consider F = 2^Ω, the set of all subsets of Ω)
  • P is the probability function

Note: We often use logical formulas to describe events: Sunny ∧ ¬Freezing.


Kolmogorov’s axioms

Kolmogorov formulated three axioms that the probability function P must satisfy. The rest of probability theory can be built from these axioms.

  • 1. A1: For any event A, P(A) is a nonnegative real number: P(A) ≥ 0.
  • 2. A2: P(Ω) = 1.
  • 3. A3: Let A₁, A₂, … be a collection of pairwise disjoint events and let A be their union. Then P(A) = Σᵢ P(Aᵢ).

These axioms are often called Kolmogorov’s axioms in honor of the Russian mathematician Andrei Kolmogorov.
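For a finite sample space the axioms can be checked mechanically. A minimal sketch, reusing the DrinkSize distribution from the earlier example:

```python
# P over the sample space of drink sizes (values from the earlier slide).
P = {"Small": 0.1, "Medium": 0.2, "Large": 0.7}

# A1: every event has a nonnegative probability.
assert all(p >= 0 for p in P.values())

# A2: the probability of the whole sample space is 1.
assert abs(sum(P.values()) - 1.0) < 1e-9

# A3 (finite case): for disjoint events, the probability of the union
# is the sum of the probabilities.
p_not_small = P["Medium"] + P["Large"]   # union of two disjoint events
assert abs(p_not_small - (1 - P["Small"])) < 1e-9
print(round(p_not_small, 1))  # 0.9
```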


Kolmogorov’s axioms express which properties a probability function has to satisfy; however, they do not say how to calculate the probabilities of events.


Flipping coins


Consider the random experiment of flipping a coin two times, one after the other.
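As a sketch, the sample space of this experiment can be enumerated directly (assuming a fair coin, so all four outcomes are equally likely):

```python
from itertools import product

# Sample space of two successive flips: HH, HT, TH, TT.
omega = list(product("HT", repeat=2))
P = {outcome: 1 / len(omega) for outcome in omega}  # equally likely outcomes

# Probability of the elementary event "heads on both flips".
p_hh = P[("H", "H")]
print(p_hh)  # 0.25
```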


Drawing from an urn

Consider the random experiment of drawing two balls, one after the other, from an urn that contains a red (R), a blue (B), and a green (G) ball.



Independent events

The difference between the two examples is that in the first one, the two events are independent while in the second they are not.
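The difference can be verified numerically by checking whether P(A ∩ B) = P(A) · P(B) in each experiment. A sketch (the particular events chosen are illustrative):

```python
from itertools import product, permutations

def prob(event, omega):
    """Probability of an event (a predicate on outcomes) when all
    outcomes in the sample space omega are equally likely."""
    return sum(1 for o in omega if event(o)) / len(omega)

# Two coin flips: the events "first is heads" and "second is heads".
coins = list(product("HT", repeat=2))
first_h = lambda o: o[0] == "H"
second_h = lambda o: o[1] == "H"
both = lambda o: first_h(o) and second_h(o)
print(prob(both, coins) == prob(first_h, coins) * prob(second_h, coins))  # True

# Two draws without replacement from an urn {R, B, G}:
# the events "first is red" and "second is blue".
urn = list(permutations("RBG", 2))   # 6 equally likely ordered draws
first_r = lambda o: o[0] == "R"
second_b = lambda o: o[1] == "B"
both2 = lambda o: first_r(o) and second_b(o)
print(prob(both2, urn) == prob(first_r, urn) * prob(second_b, urn))  # False
```

The product rule holds with equality for the coins (independent) but fails for the urn (dependent), matching the slide's claim.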



Conditional probability

The conditional probability of A given B is defined as P(A | B) = P(A ∧ B) / P(B), provided that P(B) > 0.


Flipping coins

What is the probability of the second throw resulting in a head given that the first one results in a head?



Drawing from an urn

What is the probability of the second ball being blue given that the first one is red?
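Applying the definition of conditional probability, P(A | B) = P(A ∧ B) / P(B), to the three-ball urn; a sketch:

```python
from itertools import permutations

urn = list(permutations("RBG", 2))   # 6 equally likely ordered draws

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return sum(1 for o in urn if event(o)) / len(urn)

p_first_red = prob(lambda o: o[0] == "R")             # P(first is red) = 1/3
p_both = prob(lambda o: o[0] == "R" and o[1] == "B")  # P(red then blue) = 1/6
p_second_blue_given_first_red = p_both / p_first_red
print(p_second_blue_given_first_red)  # 0.5
```

After the red ball is removed, only blue and green remain, so the answer 1/2 also matches direct counting.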



The product rule

If we rewrite the definition of conditional probability, we get the product rule.


Conditional probability: P(A | B) = P(A ∧ B) / P(B). Product rule: P(A ∧ B) = P(A | B) P(B) = P(B | A) P(A).


Bayes’ rule

Combining the two forms of the product rule gives Bayes’ rule: P(A | B) = P(B | A) P(A) / P(B).


Bayes’ Rule Example

Meningitis causes stiff necks with probability 0.5. The prior probability of having meningitis is 0.00002. The prior probability of having a stiff neck is 0.05. What is the probability of having meningitis given that you have a stiff neck?
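The answer follows directly from Bayes’ rule, P(m | s) = P(s | m) P(m) / P(s), with the numbers given above:

```python
p_s_given_m = 0.5      # P(s | m): stiff neck given meningitis
p_m = 0.00002          # P(m): prior probability of meningitis
p_s = 0.05             # P(s): prior probability of a stiff neck

p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)  # ≈ 0.0002, i.e. about 1 in 5000
```

Even given a stiff neck, meningitis remains very unlikely, because its prior probability is so small.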



When is Bayes’ Rule Useful?

  • Sometimes it’s easier to get P(X | Y) than P(Y | X).
  • Information is typically available in the form P(effect | cause) rather than P(cause | effect).
  • P(effect | cause) quantifies the relationship in the causal direction, whereas P(cause | effect) describes the diagnostic direction.
  • For example, P(symptom | disease) is easy to measure empirically, but obtaining P(disease | symptom) is harder.



How is Bayes’ Rule Used

In machine learning, we use Bayes’ rule in the following way:

P(hypothesis | data) = P(data | hypothesis) P(hypothesis) / P(data)

where P(hypothesis | data) is the posterior probability, P(data | hypothesis) is the likelihood of the data, and P(hypothesis) is the prior probability.


Probability distributions

For random variables with finite domains, the probability distribution simply defines the probability of the variable taking on each of its different values. For instance, P(Weather) denotes the vector of probabilities of the values of the random variable Weather.

  • The bold P indicates that the result is a vector of numbers representing the probabilities of each individual state of the weather, where we assume a predefined ordering of the domain.
  • Because a probability distribution represents a normalized frequency distribution, the probabilities must sum to 1.
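As a sketch, with invented illustrative values (the actual numbers on the slide are not in the text) and an assumed ordering (sunny, rain, cloudy, snow):

```python
# P(Weather): a vector of probabilities, one entry per value of the
# variable, under a predefined ordering of its domain.
weather_domain = ("sunny", "rain", "cloudy", "snow")
P_weather = (0.6, 0.1, 0.29, 0.01)   # illustrative values, not from the slide

# A probability distribution is a normalized frequency distribution:
assert abs(sum(P_weather) - 1.0) < 1e-9
print(dict(zip(weather_domain, P_weather)))
```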


P notation and Conditional Distributions

The notation P(X | Y) denotes the table of conditional distributions P(X = x | Y = y): one distribution over X for each possible value y of Y.


Possible worlds and full joint distributions

A possible world is an assignment of values to all of the random variables under consideration. The full joint probability distribution assigns a probability to every possible world.


Full Joint Probability Distributions

  Toothache | Cavity | Catch | Probability
  ----------+--------+-------+------------
  false     | false  | false | 0.576
  false     | false  | true  | 0.144
  false     | true   | false | 0.008
  false     | true   | true  | 0.072
  true      | false  | false | 0.064
  true      | false  | true  | 0.016
  true      | true   | false | 0.012
  true      | true   | true  | 0.108

The last cell means P(Toothache = true, Cavity = true, Catch = true) = 0.108.


Joint Probability Distribution


Full joint probability distributions are very powerful: they can be used to answer any probabilistic query involving the three random variables.
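As a sketch, the full joint table can be stored as a dictionary over possible worlds and queried by summing the probabilities of the worlds where the event holds (here, P(cavity ∨ toothache)):

```python
from itertools import product

# Full joint distribution P(Toothache, Cavity, Catch), in the row order
# of the table above; keys are (toothache, cavity, catch).
probs = [0.576, 0.144, 0.008, 0.072, 0.064, 0.016, 0.012, 0.108]
joint = dict(zip(product([False, True], repeat=3), probs))
assert abs(sum(joint.values()) - 1.0) < 1e-9  # the worlds exhaust the space

# Any query is a sum over the possible worlds where the event holds.
p_cav_or_tooth = sum(p for (tooth, cav, catch), p in joint.items() if cav or tooth)
print(round(p_cav_or_tooth, 2))  # 0.28
```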


Marginalization

We can even calculate marginal probabilities (the probability distribution over a subset of the variables)


The general marginalization rule for any sets of variables Y and Z: P(Y) = Σ_z P(Y, z), where the sum is over all possible combinations of values z of Z.
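The rule can be applied directly to the full joint table from the earlier slide; a sketch computing the marginal P(Cavity) by summing out Toothache and Catch:

```python
from itertools import product

# Full joint P(Toothache, Cavity, Catch); keys are (toothache, cavity, catch).
probs = [0.576, 0.144, 0.008, 0.072, 0.064, 0.016, 0.012, 0.108]
joint = dict(zip(product([False, True], repeat=3), probs))

# Marginalization: P(Cavity = c) = sum over all values z of the
# remaining variables of P(c, z).
P_cavity = {c: sum(p for (t, cav, ca), p in joint.items() if cav == c)
            for c in (False, True)}
print(round(P_cavity[True], 2), round(P_cavity[False], 2))  # 0.2 0.8
```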

Normalization


Note that 1/P(Toothache = true) remains constant in the two equations. In fact, 1/P(Toothache = true) can be viewed as a normalization constant for P(Cavity | toothache), ensuring that it adds up to 1.


General Inference Procedure

  • Let X be the query variable (Cavity in our example).
  • Let E be the set of evidence variables (just Toothache in this case).
  • Let e be the observed values for them.
  • Let Y be the remaining unobserved variables (just Catch in our example).

Then the query P(X | e) can be evaluated as P(X | e) = α P(X, e) = α Σ_y P(X, e, y), where the summation is over all possible combinations of values y of the unobserved variables Y, and α is a normalization constant.
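This procedure can be sketched over the toothache example's full joint table (the helper function and variable names below are my own, not from the lecture):

```python
from itertools import product

VARS = ("Toothache", "Cavity", "Catch")
probs = [0.576, 0.144, 0.008, 0.072, 0.064, 0.016, 0.012, 0.108]
joint = dict(zip(product([False, True], repeat=3), probs))  # full joint table

def query(X, evidence):
    """P(X | evidence): sum out the unobserved variables, then normalize."""
    dist = {}
    for x in (False, True):
        total = 0.0
        for world, p in joint.items():
            w = dict(zip(VARS, world))
            if w[X] == x and all(w[v] == val for v, val in evidence.items()):
                total += p   # the sum over the unobserved variables Y
        dist[x] = total
    alpha = 1 / sum(dist.values())          # normalization constant
    return {x: alpha * q for x, q in dist.items()}

result = query("Cavity", {"Toothache": True})
print({x: round(q, 2) for x, q in result.items()})  # {False: 0.4, True: 0.6}
```

Looping over every world is exponential in the number of variables, which is why the full-joint approach does not scale; it is, however, the conceptual baseline for the inference methods that follow in the course.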




Sources of this Lecture

  • S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Third Edition.