
Introduction to Artificial Intelligence
V22.0472-001 Fall 2009
Lecture 14: Bayes' Nets 2
Rob Fergus, Dept of Computer Science, Courant Institute, NYU
Slides from Karen Livescu, Jeff Bilmes, Dan Klein, Stuart Russell or Andrew Moore

Announcements
• How was the mid-term?
• Will grade mid-term / assignment 2 this weekend
• Assignment 3 due this time next week
• Office hours today after class

Example Bayes' Net
[Figure: example Bayes' net]

Bayes' Nets
• A Bayes' net is an efficient encoding of a probabilistic model of a domain
• Questions we can ask:
  • Inference: given a fixed BN, what is P(X | e)?
  • Representation: given a fixed BN, what kinds of distributions can it encode?
  • Modeling: what BN is most appropriate for a given domain?

Example: Traffic
• Variables
  • T: Traffic
  • R: It rains
  • L: Low pressure
  • D: Roof drips
  • B: Ballgame
[Figure: network over the nodes L, R, B, D, T]

Bayes' Net Semantics
• A Bayes' net:
  • A set of nodes, one per variable X
  • A directed, acyclic graph
  • A conditional distribution of each variable conditioned on its parents (the parameters θ): $P(X \mid A_1, \ldots, A_n)$
• Semantics:
  • A BN defines a joint probability distribution over its variables: $P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{parents}(X_i))$
[Figure: node X with parents A_1, ..., A_n]
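As a minimal sketch of these semantics, the Python below multiplies local conditional probabilities to obtain a single entry of the joint. The edge set (L to R, R to D, R to T, B to T) is an assumption read off the traffic example, and every number in `P_TRUE` is made up for illustration; none of them comes from the slides.

```python
from itertools import product

# Assumed structure for the traffic example: L -> R, R -> D, R -> T, B -> T.
PARENTS = {"L": (), "B": (), "R": ("L",), "D": ("R",), "T": ("R", "B")}

# Hypothetical CPTs: P(var = True | parent values), keyed by the tuple of
# parent values in the order given in PARENTS. The numbers are illustrative.
P_TRUE = {
    "L": {(): 0.3},
    "B": {(): 0.1},
    "R": {(True,): 0.6, (False,): 0.1},
    "D": {(True,): 0.7, (False,): 0.05},
    "T": {(True, True): 0.95, (True, False): 0.7,
          (False, True): 0.6, (False, False): 0.2},
}


def joint(assignment):
    """Return P(assignment) as the product of P(x_i | parents(X_i))."""
    p = 1.0
    for var, parents in PARENTS.items():
        key = tuple(assignment[q] for q in parents)
        p_true = P_TRUE[var][key]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


# One entry of the joint, built on the fly from five local probabilities.
entry = joint({"L": True, "R": True, "B": False, "D": True, "T": True})

# Sanity check: the 2^5 entries of the implied joint sum to 1.
total = sum(
    joint(dict(zip(PARENTS, values)))
    for values in product([True, False], repeat=len(PARENTS))
)
print(entry, total)
```

Only the entries that a query needs have to be computed; the full table is enumerated here purely as a check.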

Building the (Entire) Joint
• We can take a Bayes' net and build any entry from the full joint distribution it encodes
• Typically, there's no reason to build ALL of it
• We build what we need on the fly
• To emphasize: every BN over a domain implicitly defines a joint distribution over that domain, specified by local probabilities and graph structure

Example: Alarm Network
[Figure: alarm network]

Size of a Bayes' Net
• How big is a joint distribution over N Boolean variables? $2^N$
• How big is an N-node net if nodes have up to k parents? $O(N \cdot 2^{k+1})$
• Both give you the power to calculate $P(X_1, X_2, \ldots, X_n)$
• BNs: Huge space savings!
• Also easier to elicit local CPTs
• Also turns out to be faster to answer queries (coming)

Bayes' Nets
• So far:
  • What is a Bayes' net?
  • What joint distribution does it encode?
• Next: how to answer queries about that distribution
  • Key idea: conditional independence
  • Last class: assembled BNs using an intuitive notion of conditional independence as causality
  • Today: formalize these ideas
  • Main goal: answer queries about conditional independence and influence
• After that: how to answer numerical queries (inference)

Conditional Independence
• Reminder: independence
  • X and Y are independent if $\forall x, y:\ P(x, y) = P(x)\,P(y)$, written $X \perp Y$
  • X and Y are conditionally independent given Z if $\forall x, y, z:\ P(x, y \mid z) = P(x \mid z)\,P(y \mid z)$, written $X \perp Y \mid Z$
• (Conditional) independence is a property of a distribution

Example: Independence
• For this graph, you can fiddle with θ (the CPTs) all you want, but you won't be able to represent any distribution in which the flips are dependent!
[Figure: two unconnected nodes X_1 and X_2. P(X_1): h 0.5, t 0.5. P(X_2): h 0.5, t 0.5. The distributions this graph can encode are shown as a subset of all distributions.]
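The space comparison on the "Size of a Bayes' Net" slide can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the bound counts each CPT as $2^k$ parent configurations times 2 values of the child.

```python
def full_joint_size(n):
    """Number of entries in a full joint table over n Boolean variables."""
    return 2 ** n


def bayes_net_size_bound(n, k):
    """Upper bound on CPT entries for n Boolean nodes with at most k parents.

    Each node stores at most 2^k parent configurations times 2 child values.
    """
    return n * 2 ** (k + 1)


# Illustrative values: 30 variables, at most 3 parents per node.
print(full_joint_size(30))          # 1073741824
print(bayes_net_size_bound(30, 3))  # 480
```

The gap grows exponentially in N as long as k stays small, which is the sense in which the slide calls the savings huge.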

Topology Limits Distributions
• Given some graph topology G, only certain joint distributions can be encoded
• The graph structure guarantees certain (conditional) independences
• (There might be more independence)
• Adding arcs increases the set of distributions, but has several costs
[Figure: three graph topologies over X, Y, Z]

Independence in a BN
• Important question about a BN:
  • Are two nodes independent given certain evidence?
  • If yes, can prove it using algebra (really tedious)
  • If no, can prove with a counter example
• Example: X → Y → Z
  • Question: are X and Z independent?
  • Answer: not necessarily, we've seen examples otherwise: low pressure causes rain which causes traffic.
  • X can influence Z, Z can influence X (via Y)
  • Addendum: they could be independent: how?

1. Causal Chains
• This configuration is a "causal chain": X → Y → Z
  • X: Low pressure
  • Y: Rain
  • Z: Traffic
• Is X independent of Z given Y? Yes!
• Evidence along the chain "blocks" the influence

2. Common Cause
• Another basic configuration: two effects of the same cause: X ← Y → Z
  • Y: Midterm exam
  • X: Email list busy
  • Z: Library full
• Are X and Z independent?
• Are X and Z independent given Y? Yes!
• Observing the cause blocks influence between effects.

Common Cause Example: Is height independent of hair length?
[Figure: scatter plot of hair length L (short, mid, long) against height H (5', 6', 7')]
Slide credit: Karen Livescu

Is height independent of hair length? (2)
[Figure: the same scatter plot of hair length L against height H]
Slide credit: Karen Livescu
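The two "Yes!" answers on the causal chain and common cause slides follow directly from the factorization each graph defines. The derivation below is a sketch that uses lower-case letters for values of X, Y, Z and assumes the conditioning events have non-zero probability.

```latex
% Causal chain X -> Y -> Z: P(x, y, z) = P(x) P(y | x) P(z | y)
\begin{align*}
P(z \mid x, y) &= \frac{P(x, y, z)}{P(x, y)}
               = \frac{P(x)\,P(y \mid x)\,P(z \mid y)}{P(x)\,P(y \mid x)}
               = P(z \mid y)
\end{align*}

% Common cause X <- Y -> Z: P(x, y, z) = P(y) P(x | y) P(z | y)
\begin{align*}
P(z \mid x, y) &= \frac{P(x, y, z)}{P(x, y)}
               = \frac{P(y)\,P(x \mid y)\,P(z \mid y)}{P(y)\,P(x \mid y)}
               = P(z \mid y)
\end{align*}
```

In both cases, once y is known, x carries no further information about z, which is exactly $X \perp Z \mid Y$.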

Is height independent of hair length? (3)
• Generally, no
• If gender known, yes
• This is the "common cause" scenario
[Figure: G (gender) is the parent of L (hair length) and H (height)]
• $H \not\perp L$: $p(h \mid l) \neq p(h)$
• $H \perp L \mid G$: $p(h \mid l, g) = p(h \mid g)$
Slide credit: Karen Livescu

3. Common Effect
• Last configuration: two causes of one effect (v-structures): X → Y ← Z
  • X: Raining
  • Z: Ballgame
  • Y: Traffic
• Are X and Z independent?
  • Yes: remember the ballgame and the rain causing traffic, no correlation?
  • Still need to prove they must be (try it!)
• Are X and Z independent given Y?
  • No: remember that seeing traffic put the rain and the ballgame in competition?
• This is backwards from the other cases
  • Observing the effect enables influence between causes.

More explaining away... / Common Effect Example
Coin tosses:
• Let X, Z be two i.i.d. coin tosses with values in {0, 1}
• Let Y = X + Z
• If we observe Y then X and Z become coupled

X  Z  Y
0  0  0
1  0  1
0  1  1
1  1  2

• For fair coins, P(X=1 | Z=1) = P(X=1) = 0.5 (the joint P(X=1, Z=1) is 0.25), but P(X=1 | Z=1, Y=2) = 1

Leaking pipes:
[Figure: causes C_1 (pipes), C_2 (faucet), C_3 (caulking), C_4 (drain), C_5 (upstairs), all parents of L (leak)]
• $C_i \perp C_j \;\; \forall i \neq j$: $p(c_i \mid c_j) = p(c_i)$
• $C_i \not\perp C_j \mid L \;\; \forall i \neq j$: $p(c_i \mid c_j, l) \neq p(c_i \mid l)$
Slide credit: Karen Livescu

Examples of the three cases
[Figure: causal chain over SUVs, Greenhouse Gasses, Global Warming; common cause Smoking with effects Lung Cancer and Bad Breath; common effect Cancer with causes Genetics and Smoking]
Slide: J. Bilmes

The General Case
• Any complex example can be analyzed using these three canonical cases
• General question: in a given BN, are two variables independent (given evidence)?
• Solution: analyze the graph
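The coin-toss example above is small enough to verify by brute-force enumeration. The sketch below assumes fair coins (the slide only says i.i.d.) and uses a hypothetical helper `prob` to compute conditional probabilities from the joint over (X, Z, Y).

```python
from fractions import Fraction
from itertools import product

# Joint over (x, z, y) for two fair, independent coins with Y = X + Z.
# Fairness is an assumption made for this sketch.
joint = {(x, z, x + z): Fraction(1, 4) for x, z in product((0, 1), repeat=2)}


def prob(event, given=lambda x, z, y: True):
    """P(event | given), computed by summing matching entries of the joint."""
    den = sum(p for (x, z, y), p in joint.items() if given(x, z, y))
    num = sum(p for (x, z, y), p in joint.items()
              if given(x, z, y) and event(x, z, y))
    return num / den


def x_is_1(x, z, y):
    return x == 1


# Without observing Y, learning Z tells us nothing about X.
p_x1 = prob(x_is_1)
p_x1_given_z1 = prob(x_is_1, given=lambda x, z, y: z == 1)

# Once Y is observed, Z becomes informative about X (explaining away).
p_x1_given_y1 = prob(x_is_1, given=lambda x, z, y: y == 1)
p_x1_given_z1_y1 = prob(x_is_1, given=lambda x, z, y: z == 1 and y == 1)
p_x1_given_z1_y2 = prob(x_is_1, given=lambda x, z, y: z == 1 and y == 2)

print(p_x1, p_x1_given_z1)              # 1/2 1/2
print(p_x1_given_y1, p_x1_given_z1_y1)  # 1/2 0
print(p_x1_given_z1_y2)                 # 1
```

Given Y = 1, seeing Z = 1 drives the probability of X = 1 from 1/2 to 0, which is the coupling the slide describes.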

Reachability
• Recipe: shade evidence nodes
• Attempt 1: if two nodes are connected by an undirected path not blocked by a shaded node, they are conditionally dependent
• Almost works, but not quite
  • Where does it break?
  • Answer: the v-structure at T doesn't count as a link in a path unless "active"
[Figure: network over the nodes L, R, B, D, T, T']

Reachability (the Bayes' Ball)
Correct algorithm:
• Shade in evidence
• Start at source node
• Try to reach target by search
• States: pair of (node X, previous state S)
• Successor function:
  • X unobserved:
    • To any child
    • To any parent if coming from a child
  • X observed:
    • From parent to parent
• If you can't reach a node, it's conditionally independent of the start node given evidence

Reachability (D-Separation)
• Question: Are X and Y conditionally independent given evidence variables {Z}?
  • Look for "active paths" from X to Y
  • No active paths = independence!
• A path is active if each triple is either a:
  • Causal chain A → B → C where B is unobserved (either direction)
  • Common cause A ← B → C where B is unobserved
  • Common effect (aka v-structure) A → B ← C where B or one of its descendants is observed
• Also known as Bayes Ball
[Figure: active triples and inactive triples]

Example
[Figure: example network; the queries did not survive extraction, the surviving answer is "Yes"]

Example
[Figure: network over L, R, B, D, T, T'; the queries did not survive extraction, the surviving answers are "Yes"]

Example
• Variables:
  • R: Raining
  • T: Traffic
  • D: Roof drips
  • S: I'm sad
• Questions:
[Figure: network over R, T, D, S; the queries did not survive extraction, the surviving answers are "Yes"]
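The reachability idea can be sketched in Python as a search over (node, direction) states. This is not the slides' exact successor function: it is the common textbook variant that first collects the evidence nodes and their ancestors, so that a v-structure is treated as active when the middle node or one of its descendants is observed. The function names and the edge set of the test network (L to R, R to D, R to T, B to T, T to T') are assumptions made for illustration.

```python
from collections import deque


def reachable(parents, source, evidence):
    """Unobserved nodes connected to `source` by an active path given `evidence`.

    `parents` maps every node to a list of its parents.
    """
    children = {n: [] for n in parents}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)

    # Evidence nodes plus all their ancestors: a v-structure at B is active
    # exactly when B is in this set.
    activating = set()
    stack = list(evidence)
    while stack:
        n = stack.pop()
        if n not in activating:
            activating.add(n)
            stack.extend(parents[n])

    # "up" means we arrived from a child, "down" means we arrived from a parent.
    visited, found = set(), set()
    queue = deque([(source, "up")])
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        observed = node in evidence
        if not observed:
            found.add(node)
        if direction == "up" and not observed:
            # Chain (backwards) or common cause: continue to parents and children.
            queue.extend((p, "up") for p in parents[node])
            queue.extend((c, "down") for c in children[node])
        elif direction == "down":
            if not observed:
                # Chain through an unobserved node: continue to children.
                queue.extend((c, "down") for c in children[node])
            if node in activating:
                # Active v-structure: bounce back up to the other parents.
                queue.extend((p, "up") for p in parents[node])
    return found


def d_separated(parents, x, y, evidence=()):
    """True if the graph guarantees x and y are independent given evidence."""
    return y not in reachable(parents, x, set(evidence))


# Assumed edges for the traffic network: L -> R, R -> D, R -> T, B -> T, T -> T'.
net = {"L": [], "B": [], "R": ["L"], "D": ["R"], "T": ["R", "B"], "T'": ["T"]}
print(d_separated(net, "L", "T'", ["T"]))      # True
print(d_separated(net, "L", "B"))              # True
print(d_separated(net, "L", "B", ["T"]))       # False
print(d_separated(net, "L", "B", ["T'"]))      # False
print(d_separated(net, "L", "B", ["T", "R"]))  # True
```

A `True` result is a guarantee that holds for every choice of CPTs; a `False` result only means the graph does not guarantee independence.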

Causality?
• When Bayes' nets reflect the true causal patterns:
  • Often simpler (nodes have fewer parents)
  • Often easier to think about
  • Often easier to elicit from experts
• BNs need not actually be causal
  • Sometimes no causal net exists over the domain
  • E.g. consider the variables Traffic and Drips
  • End up with arrows that reflect correlation, not causation
• What do the arrows really mean?
  • Topology may happen to encode causal structure
  • Topology only guaranteed to encode conditional independence

Example: Coins
• Extra arcs don't prevent representing independence, just allow non-independence
[Figure: two nets over X_1 and X_2. Without an arc: P(X_1): h 0.5, t 0.5; P(X_2): h 0.5, t 0.5. With the arc X_1 → X_2: P(X_1): h 0.5, t 0.5; P(X_2 | X_1): h | h 0.5, t | h 0.5, h | t 0.5, t | t 0.5.]

Changing Bayes' Net Structure
• The same joint distribution can be encoded in many different Bayes' nets
  • Causal structure tends to be the simplest
• Analysis question: given some edges, what other edges do you need to add?
  • One answer: fully connect the graph
  • Better answer: don't make any false conditional independence assumptions

Example: Alternate Alarm
• If we reverse the edges, we make different conditional independence assumptions
• To capture the same joint distribution, we have to add more edges to the graph
[Figure: two networks over Burglary, Earthquake, Alarm, John calls, Mary calls: the original alarm network and a version with the edges reversed]

Summary
• Bayes nets compactly encode joint distributions
• Guaranteed independencies of distributions can be deduced from BN graph structure
• The Bayes' ball algorithm (aka d-separation)
• A Bayes' net may have other independencies that are not detectable until you inspect its specific distribution
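The claim that one joint distribution can be encoded by differently directed nets can be illustrated with the smallest possible case. In this sketch the CPT values for X_1 → X_2 are hypothetical (chosen so the coins are dependent); the reversed net X_2 → X_1 is obtained with Bayes' rule and reproduces the same joint.

```python
from itertools import product

VALUES = "ht"

# Hypothetical parameters for the net X1 -> X2 (not from the slides).
p_x1 = {"h": 0.5, "t": 0.5}
p_x2_given_x1 = {"h": {"h": 0.9, "t": 0.1}, "t": {"h": 0.2, "t": 0.8}}

# Joint encoded by X1 -> X2: P(x1, x2) = P(x1) P(x2 | x1).
joint = {
    (a, b): p_x1[a] * p_x2_given_x1[a][b]
    for a, b in product(VALUES, repeat=2)
}

# Parameters of the reversed net X2 -> X1, derived from the joint.
p_x2 = {b: sum(joint[(a, b)] for a in VALUES) for b in VALUES}
p_x1_given_x2 = {
    b: {a: joint[(a, b)] / p_x2[b] for a in VALUES} for b in VALUES
}

# Joint encoded by X2 -> X1: P(x1, x2) = P(x2) P(x1 | x2).
joint_reversed = {
    (a, b): p_x2[b] * p_x1_given_x2[b][a]
    for a, b in product(VALUES, repeat=2)
}

max_gap = max(abs(joint[k] - joint_reversed[k]) for k in joint)
print(max_gap)
```

With only two nodes no extra edges are needed after the reversal; in larger nets such as the alarm example, reversing edges generally requires adding arcs to avoid false independence assumptions.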
