

SLIDE 1

ECE 6504: Advanced Topics in Machine Learning

Probabilistic Graphical Models and Large-Scale Learning

Dhruv Batra, Virginia Tech

Topics:

– Bayes Nets: Representation/Semantics
– d-separation, Local Markov Assumption
– Markov Blanket
– I-equivalence, (Minimal) I-Maps, P-Maps

Readings: KF 3.2, 3.4

SLIDE 2

Recap of Last Time


SLIDE 3

A general Bayes net

  • Set of random variables
  • Directed acyclic graph

– Encodes independence assumptions

  • CPTs

– Conditional Probability Tables

  • Joint distribution: P(X1, …, Xn) = ∏i P(Xi | PaXi)

[Figure: example Bayes net over Flu, Allergy, Sinus, Headache, Nose]
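To make the factorization concrete, here is a minimal Python sketch (not from the slides): it stores the example net as CPTs and evaluates the joint as a product of local conditionals. The edge structure (Flu → Sinus ← Allergy, with Sinus → Headache and Sinus → Nose) and all probability values are illustrative assumptions.

```python
# Minimal sketch: the example net as CPTs, with the joint evaluated via
# P(F,A,S,H,N) = P(F) P(A) P(S|F,A) P(H|S) P(N|S).
# Edge structure and all probability values are made up for illustration.

parents = {"Flu": [], "Allergy": [], "Sinus": ["Flu", "Allergy"],
           "Headache": ["Sinus"], "Nose": ["Sinus"]}

# CPTs: parent assignment tuple -> P(var = 1 | parents).
cpt = {
    "Flu":      {(): 0.1},
    "Allergy":  {(): 0.2},
    "Sinus":    {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9},
    "Headache": {(0,): 0.1, (1,): 0.7},
    "Nose":     {(0,): 0.2, (1,): 0.8},
}

def joint(assignment):
    """Product of local conditionals, one factor per variable."""
    p = 1.0
    for var, pa in parents.items():
        p1 = cpt[var][tuple(assignment[u] for u in pa)]
        p *= p1 if assignment[var] == 1 else 1.0 - p1
    return p

print(joint({"Flu": 1, "Allergy": 0, "Sinus": 1, "Headache": 1, "Nose": 0}))
```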

SLIDE 4

Independencies in a Problem

  • BN: graph G encodes local independence assumptions
  • World, data, reality: the true distribution P contains independence assertions

Slide Credit: Carlos Guestrin

SLIDE 5

Bayes Nets

  • BNs encode (conditional) independence assumptions.

– I(G) = {(X ⊥ Y | Z)}: the set of independence assertions encoded by G

  • Which ones?
  • And how can we easily read them off the graph?


SLIDE 6

Local Structures

  • What’s the smallest Bayes Net?


SLIDE 7

Local Structures

(C) Dhruv Batra 7

– Indirect causal effect: X → Z → Y
– Indirect evidential effect: X ← Z ← Y
– Common cause: X ← Z → Y
– Common effect (v-structure): X → Z ← Y

The first three structures are blocked when Z is observed; the v-structure behaves the opposite way, becoming active when Z (or one of its descendants) is observed.

SLIDE 8

Bayes Ball Rules

  • Flow of information

– on board


SLIDE 9

Plan for today

  • Bayesian Networks: Semantics

– d-separation
– General (conditional) independence assumptions in a BN
– Markov Blanket
– (Minimal) I-map, P-map


SLIDE 10

Active trails formalized

  • Let variables O ⊆ {X1,…,Xn} be observed
  • A path X1 – X2 – · · · – Xk is an active trail if, for each consecutive triplet:

– Xi-1 → Xi → Xi+1, and Xi is not observed (Xi ∉ O)
– Xi-1 ← Xi ← Xi+1, and Xi is not observed (Xi ∉ O)
– Xi-1 ← Xi → Xi+1, and Xi is not observed (Xi ∉ O)
– Xi-1 → Xi ← Xi+1, and Xi is observed (Xi ∈ O), or one of its descendants is observed

Slide Credit: Carlos Guestrin
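The four cases translate directly into code. Below is a minimal Python sketch (illustrative, not from the slides) that checks whether a given path is an active trail in a DAG stored as a {node: parent-list} dict; it reuses the example net from the factorization sketch above.

```python
# Minimal sketch of the active-trail test (illustrative assumptions).
# The DAG is stored as {node: list of parents}.

def descendants(graph, x):
    """All nodes reachable from x along parent -> child edges."""
    children = {v: [c for c, ps in graph.items() if v in ps] for v in graph}
    stack, seen = [x], set()
    while stack:
        for c in children[stack.pop()]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def is_active_trail(graph, path, observed):
    """Check the four triplet rules along a path (a list of node names)."""
    for a, b, c in zip(path, path[1:], path[2:]):
        if a in graph[b] and c in graph[b]:       # v-structure a -> b <- c
            if b not in observed and not (descendants(graph, b) & observed):
                return False                      # blocked: nothing observed
        elif b in observed:                       # chain or common cause
            return False                          # blocked: middle observed
    return True

g = {"Flu": [], "Allergy": [], "Sinus": ["Flu", "Allergy"],
     "Headache": ["Sinus"], "Nose": ["Sinus"]}
print(is_active_trail(g, ["Flu", "Sinus", "Allergy"], set()))         # False
print(is_active_trail(g, ["Flu", "Sinus", "Allergy"], {"Headache"}))  # True
```

Observing Headache, a descendant of Sinus, activates the v-structure Flu → Sinus ← Allergy, matching the fourth rule.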

SLIDE 11

An active trail: Example

[Figure: example graph with nodes A, B, C, D, E, F, F′, F″, G, H]

When are A and H independent?

SLIDE 12

d-Separation

  • Definition: Variables X and Y are d-separated given Z if

– there is no active trail between X and Y when the variables Z ⊆ {X1, …, Xn} are observed

[Figure: example graph with nodes A, B, C, D, E, F, G, H, I, J, K]

Slide Credit: Carlos Guestrin
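With the active-trail test above, d-separation can be checked by brute force over all simple undirected paths. A minimal sketch (illustrative only; exponential in graph size):

```python
# Brute-force d-separation (illustrative): X and Y are d-separated
# given Z iff no simple undirected path between them is active.

def d_separated(graph, x, y, z):
    children = {v: [c for c, ps in graph.items() if v in ps] for v in graph}
    def paths(path):
        if path[-1] == y:
            yield path
            return
        for nbr in graph[path[-1]] + children[path[-1]]:
            if nbr not in path:                   # keep the path simple
                yield from paths(path + [nbr])
    return not any(is_active_trail(graph, p, set(z)) for p in paths([x]))

print(d_separated(g, "Headache", "Nose", ["Sinus"]))  # True
print(d_separated(g, "Flu", "Allergy", ["Sinus"]))    # False (v-structure)
```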

SLIDE 13

d-Separation

  • So what if X and Y are d-separated given Z?


SLIDE 14

Factorization + d-sep ⇒ Independence

  • Theorem:

– If

  • P factorizes over G
  • d-sepG(X, Y | Z)

– Then

  • P ⊨ (X ⊥ Y | Z)

– Corollary:

  • I(G) ⊆ I(P)
  • All independence assertions read from G are correct!


SLIDE 15

More generally: Completeness of d-separation

  • Theorem: Completeness of d-separation

– For “almost all” distributions P that factorize over G
– we have that I(G) = I(P)

  • “almost all” distributions: except for a set of CPTs of measure zero
  • Means that if X and Y are not d-separated given Z, then P ⊭ (X ⊥ Y | Z)

Slide Credit: Carlos Guestrin

SLIDE 16

Local Markov Assumption

A variable X is independent of its non-descendants given its parents, and only its parents:

(Xi ⊥ NonDescendantsXi | PaXi)

For example, in the Flu net above: Headache ⊥ {Flu, Allergy, Nose} | Sinus.

[Figure: example Bayes net over Flu, Allergy, Sinus, Headache, Nose]
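The local Markov assumption is exactly what turns the chain rule into the factorization on slide 3. A standard one-step derivation (not spelled out on the slide), ordering the variables topologically:

```latex
% Chain rule over a topological ordering; the local Markov assumption
% then drops every non-parent from each conditioning set:
P(X_1,\dots,X_n)
  = \prod_{i=1}^{n} P(X_i \mid X_1,\dots,X_{i-1})
  = \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}_{X_i})
```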

SLIDE 17

Markov Blanket

  • The Markov blanket of a variable = its parents, its children, and the parents of its children (co-parents)

[Figure: graph highlighting the Markov blanket of variable x8]

Slide Credit: Simon J.D. Prince
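Reading the blanket off the {node: parent-list} representation used in the earlier sketches (again, illustrative):

```python
# Minimal sketch: Markov blanket = parents + children + co-parents.

def markov_blanket(graph, x):
    children = [c for c, ps in graph.items() if x in ps]
    co_parents = {p for c in children for p in graph[c]}
    return (set(graph[x]) | set(children) | co_parents) - {x}

print(markov_blanket(g, "Sinus"))  # {'Flu', 'Allergy', 'Headache', 'Nose'}
```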

SLIDE 18

Example

A variable is conditionally independent of all other variables, given its Markov blanket.

Slide Credit: Simon J.D. Prince

SLIDE 19

I-map

  • Independency map
  • Definition:

– If I(G) ⊆ I(P), then G is an I-map of P


SLIDE 20

Factorization + d-sep ⇒ Independence

  • Theorem:

– If

  • P factorizes over G
  • d-sepG(X, Y | Z)

– Then

  • P ⊨ (X ⊥ Y | Z)

– Corollary:

  • I(G) ⊆ I(P)
  • G is an I-map of P
  • All independence assertions read from G are correct!


SLIDE 21

The BN Representation Theorem

  • If G is an I-map of P, then P factorizes according to G.

– Important because: every P has at least one BN structure G

  • If P factorizes according to G, then G is an I-map of P.

– Important because: we can read independencies of P directly from the BN structure G

Homework 1!!!! ☺☺

Slide Credit: Carlos Guestrin
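Stated symbolically, the two directions combine into an equivalence. A LaTeX rendering of the statement above:

```latex
% The BN representation theorem: I-map <=> factorization.
G \text{ is an I-map of } P
  \;\Longleftrightarrow\;
P(X_1,\dots,X_n) = \prod_{i=1}^{n} P\!\left(X_i \mid \mathrm{Pa}_{X_i}\right)
```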

SLIDE 22

I-Equivalence

  • Two graphs G1 and G2 are I-equivalent if

– I(G1) = I(G2)

  • Equivalence classes of BN structures

– I-equivalence induces a mutually exclusive and exhaustive partition of the set of graphs
– e.g., X → Y → Z, X ← Y ← Z, and X ← Y → Z are I-equivalent (each encodes exactly X ⊥ Z | Y); the v-structure X → Y ← Z is not

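I-equivalence can be checked by brute force with the d_separated sketch from the d-separation slide, enumerating every (X, Y | Z) triple; feasible only for tiny graphs, and illustrative only:

```python
# Brute-force enumeration of I(G) over all (X, Y | Z) triples (illustrative).
from itertools import combinations

def independencies(graph):
    vs = sorted(graph)
    out = set()
    for x, y in combinations(vs, 2):
        rest = [v for v in vs if v not in (x, y)]
        for r in range(len(rest) + 1):
            for z in combinations(rest, r):
                if d_separated(graph, x, y, z):
                    out.add((x, y, frozenset(z)))
    return out

chain = {"X": [], "Y": ["X"], "Z": ["Y"]}      # X -> Y -> Z
cause = {"Y": [], "X": ["Y"], "Z": ["Y"]}      # X <- Y -> Z
vstruct = {"X": [], "Z": [], "Y": ["X", "Z"]}  # X -> Y <- Z
print(independencies(chain) == independencies(cause))    # True
print(independencies(chain) == independencies(vstruct))  # False
```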

SLIDE 23

Minimal I-maps & P-maps

  • Many possible I-maps
  • Is there a “simplest” I-map?
  • Yes, in two senses:

– Minimal I-maps
– P-maps


SLIDE 24

Minimal I-map

  • G is a minimal I-map for P if

– deleting any single edge from G makes it no longer an I-map of P


SLIDE 25

P-map

  • Perfect map
  • G is a P-map for P if

– I(P) = I(G)

  • Question: Does every distribution P have a P-map?


SLIDE 26

BN: Representation: What you need to know

  • Bayesian networks

– A compact representation for large probability distributions
– Not an algorithm

  • Representation

– BNs represent (conditional) independence assumptions
– BN structure = a family of distributions
– BN structure + CPTs = a single distribution
– Concepts:

  • Active Trails (flow of information); d-separation;
  • Local Markov Assumptions, Markov Blanket
  • I-map, P-map
  • BN Representation Theorem (I-map ⇔ Factorization)


SLIDE 27

Main Issues in PGMs

  • Representation

– How do we store P(X1, X2, …, Xn)?
– What does my model mean/imply/assume? (Semantics)

  • Learning

– How do we learn the parameters and structure of P(X1, X2, …, Xn) from data?
– Which model is right for my data?

  • Inference

– How do I answer questions/queries with my model? For example:
– Marginal estimation: P(X5 | X1, X4)
– Most Probable Explanation (MPE): argmax P(X1, X2, …, Xn)

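To make the inference queries concrete, here is a brute-force conditional query over the slide-3 sketch (illustrative only; enumeration is exponential in the number of variables, which is why dedicated inference algorithms exist):

```python
# Brute-force P(var = 1 | evidence) by summing the joint from the
# earlier sketch over all binary assignments (illustrative only).
from itertools import product

def query(var, evidence):
    names = list(parents)
    num = den = 0.0
    for vals in product([0, 1], repeat=len(names)):
        a = dict(zip(names, vals))
        if all(a[k] == v for k, v in evidence.items()):
            p = joint(a)
            den += p
            num += p * a[var]
    return num / den

print(query("Flu", {"Headache": 1, "Nose": 1}))
```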

SLIDE 28

Learning Bayes nets

[Figure: data x(1), …, x(m) is used to learn the structure and the parameters (the CPTs P(Xi | PaXi))]

– Known structure, fully observable data: very easy
– Known structure, missing data: somewhat easy (EM)
– Unknown structure, fully observable data: hard
– Unknown structure, missing data: very, very hard

Slide Credit: Carlos Guestrin
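A minimal sketch of the "very easy" cell: with known structure and fully observed binary data, maximum-likelihood CPT estimation is just counting. The toy data below is made up for illustration.

```python
# MLE for CPTs with known structure and fully observed binary data:
# P(Xi = 1 | pa) = count(Xi = 1, pa) / count(pa). Toy data, illustrative.
from collections import Counter

def fit_cpts(parents, data):
    cpts = {}
    for var, pa in parents.items():
        num, den = Counter(), Counter()
        for row in data:                       # row: {name: 0 or 1}
            key = tuple(row[p] for p in pa)
            den[key] += 1
            num[key] += row[var]
        cpts[var] = {k: num[k] / den[k] for k in den}
    return cpts

data = [{"Flu": 1, "Sinus": 1}, {"Flu": 0, "Sinus": 0},
        {"Flu": 1, "Sinus": 0}, {"Flu": 0, "Sinus": 0}]
print(fit_cpts({"Flu": [], "Sinus": ["Flu"]}, data))
# {'Flu': {(): 0.5}, 'Sinus': {(1,): 0.5, (0,): 0.0}}
```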