
Bayesian networks

Chapter 14.1–3


Outline

♦ Syntax
♦ Semantics
♦ Parameterized distributions


Bayesian networks

A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions

Syntax:
♦ a set of nodes, one per variable
♦ a directed, acyclic graph (link ≈ “directly influences”)
♦ a conditional distribution for each node given its parents: P(Xi | Parents(Xi))

In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values


Example

Topology of network encodes conditional independence assertions:

[Figure: Cavity is the parent of Toothache and Catch; Weather is a disconnected node.]

Weather is independent of the other variables
Toothache and Catch are conditionally independent given Cavity


Example

I’m at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn’t call. Sometimes it’s set off by minor earthquakes. Is there a burglar?

Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls

Network topology reflects “causal” knowledge:
– A burglar can set the alarm off
– An earthquake can set the alarm off
– The alarm can cause Mary to call
– The alarm can cause John to call


Example contd.

[Figure: Burglary and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.]

P(B) = .001    P(E) = .002

P(A|B,E):
  B  E    P(A)
  T  T    .95
  T  F    .94
  F  T    .29
  F  F    .001

P(J|A):
  A    P(J)
  T    .90
  F    .05

P(M|A):
  A    P(M)
  T    .70
  F    .01
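Below is a minimal sketch, assuming a plain Python dict encoding (the names P_B, P_A, etc. are ours, not the slides'), of how these CPTs can be stored; only the probabilities for the true case need storing:

```python
# Hypothetical encoding of the burglary network's CPTs as plain dicts.
P_B = 0.001                                   # P(Burglary = true)
P_E = 0.002                                   # P(Earthquake = true)

# P(Alarm = true | Burglary, Earthquake), keyed by (b, e)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}

P_J = {True: 0.90, False: 0.05}               # P(JohnCalls = true | Alarm)
P_M = {True: 0.70, False: 0.01}               # P(MaryCalls = true | Alarm)
```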


Compactness

A CPT for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values

Each row requires one number p for Xi = true (the number for Xi = false is just 1 − p)

If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers

I.e., grows linearly with n, vs. O(2^n) for the full joint distribution

For burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 − 1 = 31)


Global semantics

“Global” semantics defines the full joint distribution as the product of the local conditional distributions:

P(x1, . . . , xn) = Π_{i=1}^{n} P(xi | parents(Xi))

e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e)
  = P(j|a) P(m|a) P(a|¬b, ¬e) P(¬b) P(¬e)
  = 0.9 × 0.7 × 0.001 × 0.999 × 0.998
  ≈ 0.00063
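The arithmetic can be checked in a few lines, reusing the hypothetical P_B, P_E, P_A, P_J, P_M dicts sketched earlier:

```python
# Evaluate P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) as a product of local conditionals.
b, e, a = False, False, True

p = (P_J[a] * P_M[a]              # P(j|a) · P(m|a)
     * P_A[(b, e)]                # P(a|¬b,¬e)
     * (1 - P_B) * (1 - P_E))     # P(¬b) · P(¬e)

print(p)                          # ≈ 0.000628
```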


Constructing Bayesian networks

Need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics

1. Choose an ordering of variables X1, . . . , Xn
2. For i = 1 to n:
   add Xi to the network
   select parents from X1, . . . , Xi−1 such that P(Xi | Parents(Xi)) = P(Xi | X1, . . . , Xi−1)

This choice of parents guarantees the global semantics (a code sketch of the loop follows below):

P(X1, . . . , Xn) = Π_{i=1}^{n} P(Xi | X1, . . . , Xi−1)   (chain rule)
                  = Π_{i=1}^{n} P(Xi | Parents(Xi))        (by construction)
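A rough Python sketch of this procedure, assuming a hypothetical oracle `independent(x, parents, preds)` that tests whether P(x | parents) = P(x | preds):

```python
from itertools import combinations

def construct_network(variables, independent):
    """Sketch of the slide's construction loop: add variables in order,
    giving each a minimal parent set drawn from its predecessors."""
    parent_sets = {}
    for i, x in enumerate(variables):
        preds = variables[:i]
        # Try candidate parent sets smallest-first; the full predecessor
        # set always satisfies the test, so the search terminates.
        for size in range(len(preds) + 1):
            chosen = next((c for c in combinations(preds, size)
                           if independent(x, c, preds)), None)
            if chosen is not None:
                parent_sets[x] = set(chosen)
                break
    return parent_sets
```

The subset search is exponential in general; it is shown only to mirror the slide's procedure, not as a practical structure-learning algorithm.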



Example

Suppose we choose the ordering M, J, A, B, E

[Figure: network built with this ordering: J has parent M; A has parents J and M; B has parent A; E has parents A and B.]

P(J|M) = P(J)? No
P(A|J, M) = P(A|J)? No
P(A|J, M) = P(A)? No
P(B|A, J, M) = P(B|A)? Yes
P(B|A, J, M) = P(B)? No
P(E|B, A, J, M) = P(E|A)? No
P(E|B, A, J, M) = P(E|A, B)? Yes


Example contd.

[Figure: the same five-variable network built with ordering M, J, A, B, E.]

Deciding conditional independence is hard in noncausal directions
(Causal models and conditional independence seem hardwired for humans!)
Assessing conditional probabilities is hard in noncausal directions
Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed


Example: Car diagnosis

Initial evidence: car won’t start
Testable variables (green), “broken, so fix it” variables (orange)
Hidden variables (gray) ensure sparse structure, reduce parameters

[Figure: diagnosis network over battery age, alternator broken, fanbelt broken, battery dead, no charging, battery flat, battery meter, lights, oil light, gas gauge, dipstick, no oil, no gas, fuel line blocked, starter broken, and car won’t start.]


Example: Car insurance

[Figure: insurance network over Age, SocioEcon, GoodStudent, ExtraCar, Mileage, VehicleYear, RiskAversion, SeniorTrain, DrivingSkill, MakeModel, DrivingHist, DrivQuality, Antilock, Airbag, CarValue, HomeBase, AntiTheft, Theft, Accident, Cushioning, Ruggedness, OwnDamage, OwnCost, OtherCost, PropertyCost, LiabilityCost, and MedicalCost.]


Compact conditional distributions

CPT grows exponentially with number of parents
CPT becomes infinite with continuous-valued parent or child

Solution: canonical distributions that are defined compactly

Deterministic nodes are the simplest case: X = f(Parents(X)) for some function f
E.g., Boolean functions: NorthAmerican ⇔ Canadian ∨ US ∨ Mexican
E.g., numerical relationships among continuous variables:
∂Level/∂t = inflow + precipitation − outflow − evaporation


Compact conditional distributions contd.

Noisy-OR distributions model multiple noninteracting causes:
1) Parents U1 . . . Uk include all causes (can add leak node)
2) Independent failure probability qi for each cause alone

⇒ P(X | U1, . . . , Uj, ¬Uj+1, . . . , ¬Uk) = 1 − Π_{i=1}^{j} qi

Cold  Flu  Malaria   P(Fever)   P(¬Fever)
 F     F     F        0.0        1.0
 F     F     T        0.9        0.1
 F     T     F        0.8        0.2
 F     T     T        0.98       0.02 = 0.2 × 0.1
 T     F     F        0.4        0.6
 T     F     T        0.94       0.06 = 0.6 × 0.1
 T     T     F        0.88       0.12 = 0.6 × 0.2
 T     T     T        0.988      0.012 = 0.6 × 0.2 × 0.1

Number of parameters linear in number of parents
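A short Python sketch generating the table rows from the per-cause failure probabilities (the q values are read off the table; the code layout is ours):

```python
from itertools import product
from math import prod

# Failure probabilities q_i: the chance each cause, acting alone,
# fails to produce Fever.
q = {"Cold": 0.6, "Flu": 0.2, "Malaria": 0.1}

for bits in product([False, True], repeat=len(q)):
    # P(¬Fever) is the product of q_i over the causes present; the
    # empty product (no causes, no leak node) gives P(Fever) = 0.
    p_not = prod(qi for qi, on in zip(q.values(), bits) if on)
    print(dict(zip(q, bits)), f"P(Fever) = {1 - p_not:.3f}")
```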


Hybrid (discrete+continuous) networks

Discrete (Subsidy? and Buys?); continuous (Harvest and Cost)

[Figure: Subsidy? and Harvest are parents of Cost; Cost is the parent of Buys?.]

Option 1: discretization—possibly large errors, large CPTs
Option 2: finitely parameterized canonical families
1) Continuous variable, discrete+continuous parents (e.g., Cost)
2) Discrete variable, continuous parents (e.g., Buys?)


Continuous variables

Gaussian density:
P(x) = (1 / (σ√(2π))) e^(−(x−µ)² / (2σ²))

Uniform density:
P(X = x) = U[18, 26](x) = uniform density between 18 and 26, i.e., 0.125 for 18 ≤ x ≤ 26
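For concreteness, both densities written out as Python functions (a sketch; the uniform bounds match the slide, the Gaussian parameters are whatever the caller supplies):

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, sigma):
    # N(mu, sigma): (1 / (sigma * sqrt(2*pi))) * exp(-(x-mu)^2 / (2*sigma^2))
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def uniform_pdf(x, lo=18.0, hi=26.0):
    # U[lo, hi]: constant 1/(hi - lo) inside the interval, 0 outside
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0

print(uniform_pdf(20.0))   # 0.125
```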


Continuous child variables

Need one conditional density function for child variable given continuous parents, for each possible assignment to discrete parents

Most common is the linear Gaussian model, e.g.,:

P(Cost = c | Harvest = h, Subsidy? = true)
  = N(a_t h + b_t, σ_t)(c)
  = (1 / (σ_t√(2π))) exp( −(1/2) ((c − (a_t h + b_t)) / σ_t)² )

Mean Cost varies linearly with Harvest; variance is fixed
Linear variation is unreasonable over the full range but works OK if the likely range of Harvest is narrow
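As a sketch, the linear Gaussian model in Python; the parameter values a_t, b_t, σ_t are made-up placeholders, not from the slides:

```python
from math import exp, pi, sqrt

def cost_density(c, h, a_t=-1.0, b_t=10.0, sigma_t=1.0):
    """P(Cost = c | Harvest = h, Subsidy? = true): a Gaussian over c whose
    mean a_t*h + b_t shifts linearly with h and whose standard deviation
    sigma_t is fixed. Parameter values here are illustrative only."""
    mu = a_t * h + b_t
    return exp(-0.5 * ((c - mu) / sigma_t) ** 2) / (sigma_t * sqrt(2 * pi))
```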


Continuous child variables

[Figure: surface plot of P(Cost | Harvest, Subsidy? = true) as a function of Cost and Harvest.]

All-continuous network with LG distributions ⇒ full joint distribution is a multivariate Gaussian

Discrete+continuous LG network is a conditional Gaussian network, i.e., a multivariate Gaussian over all continuous variables for each combination of discrete variable values


Discrete variable w/ continuous parents

Probability of Buys? given Cost should be a “soft” threshold:

[Figure: plot of P(Buys? = false | Cost = c) rising as a soft threshold in Cost c.]

Probit distribution uses integral of Gaussian:

Φ(x) = ∫_{−∞}^{x} N(0, 1)(t) dt

P(Buys? = true | Cost = c) = Φ((−c + µ)/σ)
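A small Python sketch of the probit model, using the identity Φ(x) = ½(1 + erf(x/√2)); the values of µ and σ are illustrative:

```python
from math import erf, sqrt

def p_buys_probit(c, mu=5.0, sigma=1.0):
    """P(Buys? = true | Cost = c) = Phi((-c + mu) / sigma), where
    Phi(x) = 0.5 * (1 + erf(x / sqrt(2))) is the standard-normal CDF."""
    return 0.5 * (1.0 + erf(((-c + mu) / sigma) / sqrt(2.0)))
```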


Why the probit?

1. It’s sort of the right shape
2. Can view as hard threshold whose location is subject to noise

[Figure: Buys? as a hard threshold on Cost whose location is perturbed by Noise.]


Discrete variable contd.

Sigmoid (or logit) distribution also used in neural networks:

P(Buys? = true | Cost = c) = 1 / (1 + exp(−2(−c + µ)/σ))

Sigmoid has similar shape to probit but much longer tails:

[Figure: sigmoid curve of P(Buys? = false | Cost = c) vs. Cost c.]
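For comparison with the probit sketch above, the sigmoid model as a Python function (µ and σ again illustrative):

```python
from math import exp

def p_buys_sigmoid(c, mu=5.0, sigma=1.0):
    """P(Buys? = true | Cost = c) = 1 / (1 + exp(-2 * (-c + mu) / sigma)).
    Same soft-threshold shape as the probit, but with heavier tails."""
    return 1.0 / (1.0 + exp(-2.0 * (-c + mu) / sigma))
```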


Summary

♦ Bayes nets provide a natural representation for (causally induced) conditional independence
♦ Topology + CPTs = compact representation of joint distribution
♦ Generally easy for (non)experts to construct
♦ Canonical distributions (e.g., noisy-OR) = compact representation of CPTs
♦ Continuous variables ⇒ parameterized distributions (e.g., linear Gaussian)
