
15-381/781 Bayesian Nets & Probabilistic Inference
Emma Brunskill (this time), Ariel Procaccia
With thanks to Dan Klein (Berkeley), Percy Liang (Stanford), and past 15-381 instructors for some slide content, and to Russell & Norvig

What You Should Know
• Define probabilistic inference
• How to define a Bayes net given a real example
• How the joint can be used to answer any query
• Complexity of exact inference
• Approximate inference (direct, likelihood weighting, Gibbs)
  o Be able to implement and run each algorithm
  o Compare benefits and limitations of each

Bayesian Network
• Compact representation of the joint distribution
• Conditional independence relationships explicit
  o Each variable is conditionally independent of all its non-descendants in the graph given the values of its parents

Joint Distribution Example
• Variables: Cloudy, Sprinkler, Rain, Wet Grass
• Domain of each variable: 2 (true or false)
• Joint encodes probability of all combos of variables & values
• Example entry: P(Cloudy = false & Sprinkler = true & Rain = false & WetGrass = true)
Joint table (# marks an entry whose value is not filled in on the slide):
  Cloudy  Sprinkler  Rain  WetGrass  P
  +c      +s         +r    +w        .01
  +c      +s         +r    -w        .01
  +c      +s         -r    +w        .05
  +c      +s         -r    -w        .1
  +c      -s         +r    +w        #
  +c      -s         +r    -w        #
  +c      -s         -r    +w        #
  +c      -s         -r    -w        #
  -c      +s         +r    +w        #
  -c      +s         +r    -w        #
  -c      +s         -r    +w        #
  -c      +s         -r    -w        #
  -c      -s         +r    +w        #
  -c      -s         +r    -w        #
  -c      -s         -r    +w        #
  -c      -s         -r    -w        #

Joint as Product of Conditionals (Chain rule)
P(WetGrass | Cloudy, Sprinkler, Rain) * P(Rain | Cloudy, Sprinkler) * P(Sprinkler | Cloudy) * P(Cloudy)
= P(Cloudy, Sprinkler, Rain, WetGrass), i.e. the 16-entry joint table over (Cloudy, Sprinkler, Rain, WetGrass)

Joint as Product of Conditionals
P(WetGrass | Cloudy, Sprinkler, Rain) * P(Rain | Cloudy, Sprinkler) * P(Sprinkler | Cloudy) * P(Cloudy)
= P(Cloudy, Sprinkler, Rain, WetGrass), i.e. the 16-entry joint table over (Cloudy, Sprinkler, Rain, WetGrass)
[Figure: network over Cloudy, Sprinkler, Rain, Wet Grass corresponding to this factorization]
… but there may be additional conditional independencies

What if some variables are conditionally independent?
[Figure: two networks over Cloudy, Rain, Sprinkler, Wet Grass, shown side by side]
Explicitly shows any conditional independencies

Conditional Independencies
Full joint table over (Cloudy, Sprinkler, Rain, WetGrass), 16 entries → conditional probability tables of the network Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → Wet Grass, Rain → Wet Grass:
P(Cloudy)
  +c   0.5
  -c   0.5
P(Sprinkler | Cloudy)
  +c  +s  .1
  +c  -s  .9
  -c  +s  .5
  -c  -s  .5
P(Rain | Cloudy)
  +c  +r  .8
  +c  -r  .2
  -c  +r  .2
  -c  -r  .8
P(WetGrass | Sprinkler, Rain)
  +s  +r  +w  .99
  +s  +r  -w  .01
  +s  -r  +w  .90
  +s  -r  -w  .10
  -s  +r  +w  .90
  -s  +r  -w  .10
  -s  -r  +w  0
  -s  -r  -w  1.0
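To make the factorization concrete, here is a minimal Python sketch that rebuilds a joint table from the four CPTs. The dictionary layout and the helper names `bern` and `joint` are assumptions made for illustration, not part of the slides. The entries it produces follow from the CPTs, so they need not match the illustrative values (.01, .01, .05, .1) shown in the joint table.

```python
from itertools import product

# CPTs transcribed from the slide; True stands for "+", False for "-".
P_C = {True: 0.5, False: 0.5}                      # P(C)
P_S = {True: 0.1, False: 0.5}                      # P(+s | C)
P_R = {True: 0.8, False: 0.2}                      # P(+r | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}   # P(+w | S, R)

def bern(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    """P(c, s, r, w) = P(c) * P(s | c) * P(r | c) * P(w | s, r)."""
    return (P_C[c] * bern(P_S[c], s) * bern(P_R[c], r)
            * bern(P_W[(s, r)], w))

# All 16 joint entries, keyed by (c, s, r, w).
table = {v: joint(*v) for v in product([True, False], repeat=4)}
total = sum(table.values())  # should be 1 up to rounding
```

Only 9 numbers are stored (one per CPT row), yet all 16 joint entries can be recovered on demand.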

Bayesian Network
• Compact representation of the joint distribution
• Conditional independence relationships explicit
• Still represents the joint, so it can be used to answer any probabilistic query
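As a rough illustration of "compact", the sketch below counts independent parameters for binary variables: a full joint needs 2^n - 1 numbers, while a Bayes net needs one number per parent configuration per node. The `parents` dictionary encodes the sprinkler network, and the function names are hypothetical.

```python
# Parents of each node in the sprinkler network.
parents = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "WetGrass": ["Sprinkler", "Rain"],
}

def full_joint_params(n_vars):
    """Free parameters of a joint over n binary variables."""
    return 2 ** n_vars - 1

def bayes_net_params(parents):
    """One free parameter per parent configuration of each binary node."""
    return sum(2 ** len(ps) for ps in parents.values())

joint_count = full_joint_params(len(parents))   # 2^4 - 1
net_count = bayes_net_params(parents)           # 1 + 2 + 2 + 4
```

With four variables the gap is small (15 versus 9), but the full joint grows exponentially in the number of variables while the network grows with the size of the largest parent set.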

Probabilistic Inference
• Compute the probability of a query variable (or variables) taking on a value (or set of values) given some evidence
• Pr[Q | E_1 = e_1, ..., E_k = e_k]

Using the Joint To Answer Queries
• Joint distribution is sufficient to answer any probabilistic inference question involving variables described in the joint
• Can take a Bayes net, construct the full joint, and then look up entries where evidence variables take on specified values
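The lookup described above can be sketched as follows: sum the joint entries that agree with the evidence, and divide the part where the query variable is true by the whole. This is a minimal sketch assuming the sprinkler CPTs; the function `query` and the variable names "C", "S", "R", "W" are choices made here for illustration.

```python
from itertools import product

# Sprinkler-network CPTs; True stands for "+", False for "-".
P_C = {True: 0.5, False: 0.5}
P_S = {True: 0.1, False: 0.5}                      # P(+s | C)
P_R = {True: 0.8, False: 0.2}                      # P(+r | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}   # P(+w | S, R)

def bern(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    return (P_C[c] * bern(P_S[c], s) * bern(P_R[c], r)
            * bern(P_W[(s, r)], w))

def query(target, evidence):
    """P(target = True | evidence) by summing matching joint entries.

    Assumes the evidence has nonzero probability; otherwise the
    denominator is zero.
    """
    names = ["C", "S", "R", "W"]
    num = den = 0.0
    for values in product([True, False], repeat=4):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # entry contradicts the evidence
        p = joint(*values)
        den += p
        if world[target]:
            num += p
    return num / den

p_sprinkler = query("S", {"C": True, "R": True})   # P(+s | +c, +r)
p_rain = query("R", {"S": True, "W": True})        # P(+r | +s, +w)
```

The loop touches every joint entry, which is exactly the exponential cost that motivates approximate inference.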

But Constructing the Joint Is Expensive (2^n entries for n binary variables) & Exact Inference Is NP-Hard

Solution: Approximate Inference
• Use samples to approximate the posterior distribution Pr[Q | E_1 = e_1, ..., E_k = e_k]
• Last time
  o Direct sampling
  o Likelihood weighting
• Today
  o Gibbs

Poll: Which Algorithm?
• Evidence: Cloudy=+c, Rain=+r
• Query variable: Sprinkler
• P(Sprinkler | Cloudy=+c, Rain=+r)
• Samples
  o +c,+s,+r,-w
  o +c,-s,-r,-w
  o +c,+s,-r,+w
  o +c,-s,+r,-w
• What algorithm could've generated these samples?
  1) Direct sampling
  2) Likelihood weighting
  3) Both
  4) No clue
[Figure: sprinkler network over Cloudy, Sprinkler, Rain, Wet Grass]

Direct Sampling Recap
Algorithm:
1. Create a topological order of the variables in the Bayes net

Topological Order
• Any ordering of a directed acyclic graph where a node can only appear after all of its ancestors in the graph
• E.g.
  o Cloudy, Sprinkler, Rain, WetGrass
  o Cloudy, Rain, Sprinkler, WetGrass
[Figure: sprinkler network over Cloudy, Sprinkler, Rain, Wet Grass]
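One way to obtain such an ordering in Python is the standard-library `graphlib.TopologicalSorter` (Python 3.9+), which takes a mapping from each node to its predecessors. The `parents` dictionary below is an assumed encoding of the sprinkler network; which of the valid orders comes back is up to the library.

```python
from graphlib import TopologicalSorter

# Each node maps to its parents (predecessors) in the sprinkler network.
parents = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "WetGrass": ["Sprinkler", "Rain"],
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(parents).static_order())

def respects_parents(order, parents):
    """Check that every node appears after all of its parents."""
    position = {node: i for i, node in enumerate(order)}
    return all(position[p] < position[node]
               for node, ps in parents.items() for p in ps)
```

Both orderings listed on the slide pass `respects_parents`, as does whatever order the sorter returns.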

Direct Sampling Recap
Algorithm:
1. Create a topological order of the variables in the Bayes net
2. Sample each variable conditioned on the values of its parents
3. Use samples which match evidence variable values to estimate the probability of the query variable, e.g.
   P(Sprinkler=+s | Cloudy=+c, Rain=+r) ~ (# samples with +s, +c, +r) / (# samples with +c, +r)
• Consistent in the limit of infinite samples
• Inefficient (why?)
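The three steps can be sketched in Python as below, assuming the sprinkler CPTs and the topological order C, S, R, W. The function names, the seed, and the returned pair (estimate, number of samples kept) are choices made for this illustration.

```python
import random

P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}   # P(+w | S, R)

def sample_world(rng):
    """Sample C, S, R, W in topological order, each given its parents."""
    c = rng.random() < 0.5
    s = rng.random() < (0.1 if c else 0.5)
    r = rng.random() < (0.8 if c else 0.2)
    w = rng.random() < P_W[(s, r)]
    return {"C": c, "S": s, "R": r, "W": w}

def direct_estimate(target, evidence, n, seed=0):
    """Estimate P(target = True | evidence) from samples matching the evidence.

    Returns (estimate, kept), where kept is how many of the n samples
    agreed with the evidence; the rest are thrown away.
    """
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n):
        world = sample_world(rng)
        if all(world[k] == v for k, v in evidence.items()):
            kept += 1
            hits += world[target]
    estimate = hits / kept if kept else float("nan")
    return estimate, kept

estimate, kept = direct_estimate("S", {"C": True, "R": True}, n=20000)
```

In this run the evidence +c, +r has probability 0.5 * 0.8 = 0.4, so roughly 60% of the samples do not match the evidence and contribute nothing; rarer evidence wastes even more.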

Consistency
• In the limit of infinite samples, the estimated Pr[Q | E_1 = e_1, ..., E_k = e_k] will converge to the true posterior probability
• Desirable property (otherwise always have some error)

Likelihood Weighting Recap
1. Create array TotalWeights
   1. Initialize value of each array element to 0
2. For j = 1:N
   1. w_tmp = 1
   2. Set evidence variables in sample z = <z_1, ..., z_n> to observed values
   3. For each variable z_i in topological order
      1. If z_i is an evidence variable
         1. w_tmp = w_tmp * P(Z_i = e_i | Parents(Z_i) = z(Parents(Z_i)))
      2. Else
         1. Sample z_i conditioned on the values of its parents
   4. Update weight of resulting sample
      1. TotalWeights[z] = TotalWeights[z] + w_tmp
3. Use weights to compute probability of query variable
   P(Sprinkler=+s | Cloudy=+c, Rain=+r) ~ Sum_{c,r,w} TotalWeights(+s,c,r,w) / Sum_{s,c,r,w} TotalWeights(s,c,r,w)
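A runnable Python sketch of the recap above, assuming the sprinkler CPTs. The data layout (`ORDER`, `PARENTS`, `CPT`) and the helper `prob` are illustrative choices; `total_weights` plays the role of TotalWeights and `w_tmp` matches the pseudocode.

```python
import random
from collections import defaultdict

ORDER = ["C", "S", "R", "W"]                       # a topological order
PARENTS = {"C": (), "S": ("C",), "R": ("C",), "W": ("S", "R")}
CPT = {                                            # P(var = True | parents)
    "C": {(): 0.5},
    "S": {(True,): 0.1, (False,): 0.5},
    "R": {(True,): 0.8, (False,): 0.2},
    "W": {(True, True): 0.99, (True, False): 0.90,
          (False, True): 0.90, (False, False): 0.0},
}

def prob(var, value, world):
    """P(var = value | parents(var)), parent values read from world."""
    p_true = CPT[var][tuple(world[p] for p in PARENTS[var])]
    return p_true if value else 1.0 - p_true

def likelihood_weighting(target, evidence, n, seed=0):
    """Estimate P(target = True | evidence) from n weighted samples."""
    rng = random.Random(seed)
    total_weights = defaultdict(float)
    for _ in range(n):
        w_tmp = 1.0
        world = dict(evidence)                     # evidence is fixed, never sampled
        for var in ORDER:
            if var in evidence:
                w_tmp *= prob(var, evidence[var], world)
            else:
                world[var] = rng.random() < prob(var, True, world)
        total_weights[tuple(world[v] for v in ORDER)] += w_tmp
    idx = ORDER.index(target)
    num = sum(w for z, w in total_weights.items() if z[idx])
    den = sum(total_weights.values())              # zero only if every weight is zero
    return num / den

estimate = likelihood_weighting("R", {"S": True, "W": True}, n=20000)
```

Unlike direct sampling, every sample is consistent with the evidence by construction, so none is discarded; the weights carry the information about how likely the evidence was.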

LW Consistency
• Probability of getting a sample (z, e), where z is a set of values for the non-evidence variables and e is the values of the evidence variables: the sampling distribution for a weighted sample (WS), S_WS(z, e)
• Is this the true posterior distribution P(z | e)?
  o No, why?
  o Doesn't consider evidence that is not an ancestor …
  o Weights fix this!

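Weighted Probability
• Samples each non-evidence variable z according to
  $$S_{WS}(\mathbf{z}, \mathbf{e}) = \prod_{i=1}^{l} P(z_i \mid \mathrm{parents}(Z_i))$$
• Weight of a sample is
  $$w(\mathbf{z}, \mathbf{e}) = \prod_{i=1}^{m} P(e_i \mid \mathrm{parents}(E_i))$$
• Weighted probability of a sample is
  $$S_{WS}(\mathbf{z}, \mathbf{e})\, w(\mathbf{z}, \mathbf{e}) = \prod_{i=1}^{l} P(z_i \mid \mathrm{parents}(Z_i)) \prod_{i=1}^{m} P(e_i \mid \mathrm{parents}(E_i)) = P(\mathbf{z}, \mathbf{e})$$
  from the chain rule & conditional independence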

Does Likelihood Weighting Produce Consistent Estimates? Yes
• X is query var(s), E is evidence var(s), Y is non-query vars
• By the definition of conditional probability:
  $$P(X = x \mid e) = \frac{P(X = x, e)}{P(e)} \propto P(X = x, e)$$
• The likelihood-weighting estimate, where $N_{WS}(x, y, e)$ is the # of samples where query variables = x, non-query = y, evidence = e:
  $$\hat{P}(X = x \mid e) \propto \sum_{y} N_{WS}(x, y, e)\, w(x, y, e)$$
  $$\approx \sum_{y} n\, S_{WS}(x, y, e)\, w(x, y, e) \quad \text{as \# samples } n \to \infty$$
  $$= n \sum_{y} P(x, y, e) = n\, P(x, e) \propto P(x, e)$$

Example
• When sampling S and R the evidence W=t is ignored
  o Samples with S=f and R=f are generated although the evidence rules this out
• Weight makes up for this difference
  o The weight of the sample above would be 0
• If we have 100 samples with R=t and total weight 1, and 400 samples with R=f and total weight 2, what is the estimate of R=t?
  o = 1 / (1 + 2) = 1/3
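The arithmetic in the last bullet, written out as a small Python sketch. The function name is hypothetical; the point is that the estimate normalizes total weights, not sample counts.

```python
def weighted_estimate(weight_true, weight_false):
    """Estimate P(R = t | e) by normalizing the total weights."""
    return weight_true / (weight_true + weight_false)

# Numbers from the slide: 100 samples with R=t (total weight 1),
# 400 samples with R=f (total weight 2).
by_weight = weighted_estimate(1.0, 2.0)   # 1 / (1 + 2)
by_count = 100 / (100 + 400)              # what ignoring the weights would give
```

Counting samples alone would give 0.2, so the weights change the answer materially here.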

Limitations of Likelihood Weighting
• Poor performance if evidence vars occur later in the ordering
• Why?
  o Evidence is not being used to influence the samples!
  o Yields samples with low weights

Markov Chain Monte Carlo Methods
• Prior methods generate each new sample from scratch
• MCMC generates each new sample by making a random change to the preceding sample
• Can view the algorithm as being in a particular state (assignment of values to each variable)

Review: Markov Blanket
• Markov blanket
  o Parents
  o Children
  o Children's parents
• A variable is conditionally independent of all other nodes given its Markov blanket
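The three parts of the definition translate directly into a short Python sketch. The `parents` dictionary is an assumed encoding of the sprinkler network, and `markov_blanket` is a name chosen here for illustration.

```python
parents = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "WetGrass": ["Sprinkler", "Rain"],
}

def markov_blanket(node, parents):
    """Parents, children, and children's other parents of `node`."""
    children = [v for v, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])   # co-parents of each child
    blanket.discard(node)                # a node is not in its own blanket
    return blanket

rain_blanket = markov_blanket("Rain", parents)
```

For Rain the blanket is its parent Cloudy, its child WetGrass, and Sprinkler, which is the other parent of WetGrass.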

Gibbs Sampling: Compute P(X | e)
[Figure: Gibbs sampling algorithm, from Russell & Norvig]
• mb(Z_i) = Markov blanket of Z_i
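For reference, the distribution a Gibbs step samples from can be written in terms of the CPTs alone. This is the standard identity for Bayes nets, stated here as a reminder and not taken from the missing figure; $Y_j$ ranges over the children of $Z_i$. The worked numbers use the sprinkler CPTs with current values S = +s and R = -r.

```latex
P(z_i \mid mb(Z_i)) \;\propto\; P(z_i \mid \mathrm{parents}(Z_i))
    \prod_{Y_j \in \mathrm{children}(Z_i)} P(y_j \mid \mathrm{parents}(Y_j))

% Worked instance for Z_i = Cloudy, with S = +s and R = -r:
P(+c \mid +s, -r) \;\propto\; P(+c)\, P(+s \mid +c)\, P(-r \mid +c) = 0.5 \cdot 0.1 \cdot 0.2 = 0.01
P(-c \mid +s, -r) \;\propto\; P(-c)\, P(+s \mid -c)\, P(-r \mid -c) = 0.5 \cdot 0.5 \cdot 0.8 = 0.20

% Normalizing:
P(+c \mid +s, -r) = \frac{0.01}{0.21} \approx 0.048, \qquad
P(-c \mid +s, -r) = \frac{0.20}{0.21} \approx 0.952
```

Only the node's own CPT and its children's CPTs appear, which is why the Markov blanket is all that needs to be consulted.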

Gibbs Sampling Example
• Want Pr(R | S=t, W=t)
• Non-evidence variables are C & R
• Initialize randomly: C=t and R=f
• Initial state (C,S,R,W) = [t,t,f,t]
• Sample C given current values of its Markov blanket
[Figure: sprinkler network over Cloudy, Sprinkler, Rain, Wet Grass, with its CPTs]
P(Cloudy)
  +c   0.5
  -c   0.5
P(Sprinkler | Cloudy)
  +c  +s  .1
  +c  -s  .9
  -c  +s  .5
  -c  -s  .5
P(Rain | Cloudy)
  +c  +r  .8
  +c  -r  .2
  -c  +r  .2
  -c  -r  .8
P(WetGrass | Sprinkler, Rain)
  +s  +r  +w  .99
  +s  +r  -w  .01
  +s  -r  +w  .90
  +s  -r  -w  .10
  -s  +r  +w  .90
  -s  +r  -w  .10
  -s  -r  +w  0
  -s  -r  -w  1.0
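The example can be run end to end with the Python sketch below. It is one possible reading of Gibbs sampling under stated assumptions, not the pseudocode from the missing figure: it resamples every non-evidence variable once per sweep, counts the query variable after each sweep, and uses no burn-in. The names `ORDER`, `PARENTS`, `CPT`, `prob`, `conditional_true`, and `gibbs` are illustrative.

```python
import random

ORDER = ["C", "S", "R", "W"]
PARENTS = {"C": (), "S": ("C",), "R": ("C",), "W": ("S", "R")}
CPT = {                                            # P(var = True | parents)
    "C": {(): 0.5},
    "S": {(True,): 0.1, (False,): 0.5},
    "R": {(True,): 0.8, (False,): 0.2},
    "W": {(True, True): 0.99, (True, False): 0.90,
          (False, True): 0.90, (False, False): 0.0},
}

def prob(var, value, world):
    """P(var = value | parents(var)), parent values read from world."""
    p_true = CPT[var][tuple(world[p] for p in PARENTS[var])]
    return p_true if value else 1.0 - p_true

def conditional_true(var, world):
    """P(var = True | Markov blanket of var), using the current world.

    Multiplies var's own CPT entry by the CPT entries of its children.
    Assumes the current state has nonzero probability.
    """
    scores = {}
    for value in (True, False):
        trial = dict(world, **{var: value})
        score = prob(var, value, trial)
        for child, ps in PARENTS.items():
            if var in ps:
                score *= prob(child, trial[child], trial)
        scores[value] = score
    return scores[True] / (scores[True] + scores[False])

def gibbs(target, evidence, init, n, seed=0):
    """Estimate P(target = True | evidence) from n Gibbs sweeps."""
    rng = random.Random(seed)
    world = dict(init, **evidence)
    hidden = [v for v in ORDER if v not in evidence]
    count = 0
    for _ in range(n):
        for var in hidden:
            world[var] = rng.random() < conditional_true(var, world)
        count += world[target]
    return count / n

start = {"C": True, "S": True, "R": False, "W": True}   # [t, t, f, t]
p_c_first = conditional_true("C", start)                # first step of the example
estimate = gibbs("R", {"S": True, "W": True}, {"C": True, "R": False}, n=20000)
```

For the first step, `p_c_first` works out to 0.01 / 0.21, so from the initial state C is very likely to flip to false. Summing the joint directly gives Pr(R=t | S=t, W=t) = 0.0891 / 0.2781, about 0.32, which the estimate should approach as the number of sweeps grows.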
