Problems with high-dim. distributions

Introduction to Artificial Intelligence
Lecture 12: Bayesian Network Inference
CS/CNS/EE 154
Andreas Krause

Problems with high-dim. distributions
- Suppose we have n propositional symbols
- How many parameters do we need to specify $P(X_1 = x_1, \dots, X_n = x_n)$?

$X_1$  $X_2$  …  $X_{n-1}$  $X_n$  $P(X)$
0      0      …  0          0      .01
0      0      …  1          0      .001
0      0      …  1          1      .213
…      …      …  …          …      …
1      1      …  1          1      .0003

$2^n - 1$ parameters!

Marginal distributions
- Suppose we have joint distribution $P(X_1, \dots, X_n)$
- Then $P(X_i = x_i) = \sum_{x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n} P(x_1, \dots, x_n)$
- If all $X_i$ binary: How many terms?
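A minimal sketch of marginalization by brute force, assuming a joint distribution stored as a table keyed by full assignments. The uniform joint and the names `joint` and `marginal` are illustrative only and do not come from the lecture. For n binary variables, fixing one of them leaves $2^{n-1}$ terms in the sum.

```python
from itertools import product

def marginal(joint, i, value):
    """Sum the joint over all assignments in which variable i equals value."""
    return sum(p for assignment, p in joint.items() if assignment[i] == value)

# Hypothetical joint over n = 3 binary variables (uniform, for illustration only).
n = 3
joint = {x: 1.0 / 2**n for x in product([0, 1], repeat=n)}

# Number of terms that enter the sum for P(X_1 = 1).
terms = sum(1 for x in joint if x[0] == 1)
p_x1 = marginal(joint, 0, 1)
```

The number of terms doubles with every additional variable, which is the cost the following slides try to avoid.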

Independent RVs
- What if RVs are independent? $P(X_1 = x_1, \dots, X_n = x_n) = P(x_1) P(x_2) \cdots P(x_n)$
- How many parameters are needed in this case?
- How about computing $P(x_i)$?
- Independence is too strong an assumption… Is there something weaker?
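The two parameter counts can be compared with a small sketch. The function names are hypothetical, and `k` is the number of states per variable (2 for propositional symbols). A full joint table needs $k^n - 1$ numbers, while fully independent variables need $k - 1$ each.

```python
def joint_params(n, k=2):
    """Free parameters of a full joint table over n variables with k states each."""
    return k**n - 1

def independent_params(n, k=2):
    """Free parameters when all n variables are mutually independent."""
    return n * (k - 1)

for n in (5, 10, 20):
    print(n, joint_params(n), independent_params(n))
```

Under independence, each $P(x_i)$ is read directly from its own table rather than computed by summation.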

Key concept: Conditional independence
- Random variables X and Y are conditionally independent given Z if for all x, y, z:
  $P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z) \, P(Y = y \mid Z = z)$
- If $P(Y = y \mid Z = z) > 0$, that's equivalent to
  $P(X = x \mid Z = z, Y = y) = P(X = x \mid Z = z)$
- Similarly for sets of random variables $\mathbf{X}$, $\mathbf{Y}$, $\mathbf{Z}$
- We write: $\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z}$

Bayesian networks
- Compact representation of distributions over a large number of variables
- (Often) allows efficient exact inference (computing marginals, etc.)

Example: HailFinder, 56 vars, ~3 states each → ~$10^{26}$ terms, > 10,000 years on top supercomputers
[Figure: JavaBayes applet]

Bayesian networks
- A Bayesian network structure is a directed, acyclic graph G, where each vertex s of G is interpreted as a random variable $X_s$ (with unspecified distribution)
- A Bayesian network (G, P) consists of
  - a BN structure G, and
  - a set of conditional probability distributions (CPTs) $P(X_s \mid \mathrm{Pa}_{X_s})$, where $\mathrm{Pa}_{X_s}$ are the parents of node $X_s$, such that
  - (G, P) defines the joint distribution $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}_{X_i})$
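A minimal sketch of how a structure plus CPTs defines a joint distribution, assuming a hypothetical three-node chain X → Y → Z with made-up probabilities (neither the network nor the numbers come from the lecture). The joint probability of a full assignment is the product of one CPT entry per node.

```python
from itertools import product

# Hypothetical structure: X -> Y -> Z, all binary.
parents = {"X": [], "Y": ["X"], "Z": ["Y"]}

# cpt[var][parent_values] = P(var = 1 | parents = parent_values); numbers are made up.
cpt = {
    "X": {(): 0.3},
    "Y": {(0,): 0.2, (1,): 0.9},
    "Z": {(0,): 0.1, (1,): 0.6},
}

def joint_prob(assignment):
    """Product over all nodes of P(node value | parent values)."""
    prob = 1.0
    for var, pa in parents.items():
        p_one = cpt[var][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[var] == 1 else 1.0 - p_one
    return prob

# Summing the product over all assignments should give 1.
total = sum(
    joint_prob(dict(zip(parents, values)))
    for values in product([0, 1], repeat=len(parents))
)
```

This network needs 1 + 2 + 2 = 5 numbers, compared with $2^3 - 1 = 7$ for the full joint table.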

Representing the world using BNs
[Figure: a true distribution P' with conditional independences I(P') is represented by a Bayes net (G, P) with conditional independences I(P)]
- Want to make sure that I(P) is a subset of I(P')
- Need to understand conditional independence properties of BN (G, P)

BNs with 3 nodes
[Figure: four network structures over the three nodes X, Y, Z]

Active trails
- When are A and I independent?
[Figure: Bayesian network over nodes A, B, C, D, E, F, G, H, I]

Active trails
- An undirected path in BN structure G is called an active trail for observed variables $\mathbf{O} \subseteq \{X_1, \dots, X_n\}$ if, for every consecutive triple of vars X, Y, Z on the path, one of the following holds:
  - $X \rightarrow Y \rightarrow Z$ and Y is unobserved ($Y \notin \mathbf{O}$)
  - $X \leftarrow Y \leftarrow Z$ and Y is unobserved ($Y \notin \mathbf{O}$)
  - $X \leftarrow Y \rightarrow Z$ and Y is unobserved ($Y \notin \mathbf{O}$)
  - $X \rightarrow Y \leftarrow Z$ and Y or any of Y's descendants is observed
- Any variables $X_i$ and $X_j$ for which there is no active trail for observations $\mathbf{O}$ are called d-separated by $\mathbf{O}$. We write d-sep($X_i$; $X_j$ | $\mathbf{O}$)
- Sets $\mathbf{A}$ and $\mathbf{B}$ are d-separated given $\mathbf{O}$ if d-sep(X; Y | $\mathbf{O}$) for all X in $\mathbf{A}$, Y in $\mathbf{B}$. Write d-sep($\mathbf{A}$; $\mathbf{B}$ | $\mathbf{O}$)

d-separation and independence
Theorem: d-sep(X; Y | $\mathbf{Z}$) $\Rightarrow$ $X \perp Y \mid \mathbf{Z}$
i.e., X is conditionally independent of Y given $\mathbf{Z}$ if there does not exist any active trail between X and Y for observations $\mathbf{Z}$
[Figure: example Bayesian network]
- Converse does not hold in general!
- But it holds for "almost" all distributions (except a set of measure 0)

Examples
[Figure: Bayesian network over nodes A, B, C, D, E, F, G, H, I, J]

More examples
[Figure: Bayesian network over nodes A, B, C, D, E, F, G, H, I, J]

Algorithm for d-separation
- How can we check if d-sep(X; Y | $\mathbf{Z}$)?
- Idea: Check every possible path connecting X and Y and verify conditions
  - Exponentially many paths!!!
- Linear time algorithm: Find all nodes reachable from X
  1. Mark $\mathbf{Z}$ and its ancestors
  2. Do breadth-first search starting from X; stop if path is blocked
- Have to be careful with implementation details (see reading)
[Figure: example Bayesian network]
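The two steps above can be sketched as follows. This is one common way to handle the implementation details (tracking the direction from which each node is entered); it is not necessarily the version in the assigned reading. The graph encoding (`parents` maps every node to its list of parents) and the five-node example B → A ← E, A → J, A → M are assumptions made for illustration.

```python
from collections import deque

def reachable(parents, source, observed):
    """Nodes connected to source by an active trail given the observed set."""
    observed = set(observed)
    children = {v: [] for v in parents}
    for child, pas in parents.items():
        for p in pas:
            children[p].append(child)

    # Step 1: mark the observed nodes and all of their ancestors.
    marked, stack = set(), list(observed)
    while stack:
        node = stack.pop()
        if node not in marked:
            marked.add(node)
            stack.extend(parents[node])

    # Step 2: breadth-first search over (node, direction) pairs.
    # "up" = node entered from one of its children, "down" = from a parent.
    queue = deque([(source, "up")])
    visited, result = set(), set()
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in observed:
            result.add(node)
        if direction == "up" and node not in observed:
            queue.extend((p, "up") for p in parents[node])
            queue.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in observed:
                # Chain continues downward through an unobserved node.
                queue.extend((c, "down") for c in children[node])
            if node in marked:
                # v-structure is active: node or a descendant is observed.
                queue.extend((p, "up") for p in parents[node])
    return result

def d_separated(parents, x, y, observed):
    return y not in reachable(parents, x, observed)

# Hypothetical example network: B -> A <- E, A -> J, A -> M.
alarm = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}
```

Each (node, direction) pair is processed at most once, so the search is linear in the size of the graph.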

Typical queries: Conditional distribution
- Compute distribution of some variables given values for others
[Figure: Bayesian network over nodes E, B, A, J, M]

Typical queries: Maximization
- MPE (Most probable explanation): Given values for some vars, compute most likely assignment to all remaining vars
- MAP (Maximum a posteriori): Compute most likely assignment to some variables
[Figure: Bayesian network over nodes E, B, A, J, M]

Hardness of inference for general BNs
- Computing conditional distributions:
  - Exact solution: #P-complete
  - NP-hard to obtain any nontrivial approximation
- Maximization:
  - MPE: NP-complete
  - MAP: $\mathrm{NP}^{\mathrm{PP}}$-complete
- Inference in general BNs is really hard
- Is all hope lost?

Inference
- Can exploit structure (conditional independence) to efficiently perform exact inference in many practical situations
- For BNs where exact inference is not possible, can use algorithms for approximate inference (later)

Potential for savings: Variable elimination!
[Figure: network over $X_1$, $X_2$, $X_3$, $X_4$, $X_5$]
- Intermediate solutions are distributions on fewer variables!
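A sketch of the saving, assuming the five variables form a chain $X_1 \rightarrow X_2 \rightarrow \dots \rightarrow X_5$ (the figure itself is not recoverable here) and using made-up probabilities shared by every link. Summing out one variable at a time keeps only a distribution over a single variable, while the brute-force sum touches all $2^5$ joint assignments.

```python
from itertools import product

# Hypothetical numbers: P(X_1) and a transition table reused for every link.
prior = [0.6, 0.4]                   # prior[a] = P(X_1 = a)
trans = [[0.7, 0.3], [0.2, 0.8]]     # trans[a][b] = P(X_{k+1} = b | X_k = a)
n = 5

def marginal_last_by_elimination():
    """Eliminate X_1, then X_2, ...; each step keeps a table over one variable."""
    message = prior[:]
    for _ in range(n - 1):
        message = [sum(message[a] * trans[a][b] for a in (0, 1)) for b in (0, 1)]
    return message

def marginal_last_brute_force():
    """Sum the full joint over all 2^n assignments."""
    result = [0.0, 0.0]
    for xs in product((0, 1), repeat=n):
        p = prior[xs[0]]
        for a, b in zip(xs, xs[1:]):
            p *= trans[a][b]
        result[xs[-1]] += p
    return result
```

Both functions return $P(X_5)$. The first does work proportional to n, the second work proportional to $2^n$.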

Variable elimination in general graphs
- Push sums through product as far as possible
- Create new factor by summing out variables
[Figure: Bayesian network over nodes E, B, A, J, M]

Variable elimination algorithm
- Given BN and query $P(X \mid \mathbf{E} = \mathbf{e})$
- Choose an ordering of $X_1, \dots, X_n$
- Set up initial factors: $f_i = P(X_i \mid \mathrm{Pa}_i)$
- For i = 1:n, $X_i \notin \{X, \mathbf{E}\}$
  - Collect all factors f that include $X_i$
  - Generate new factor g by marginalizing out $X_i$: $g = \sum_{x_i} \prod_j f_j$
  - Add g to set of factors
- Renormalize $P(x, \mathbf{e})$ to get $P(x \mid \mathbf{e})$
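The loop above can be sketched for binary variables as follows. The factor encoding (a tuple of variable names plus a dict from value tuples to numbers), the helper names, and the step of restricting factors to the evidence before eliminating are assumptions of this sketch, not details given on the slide. The example network X → Y → Z and its numbers are made up.

```python
from itertools import product

def multiply(f, g):
    """Pointwise product of two factors over binary variables."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out_tab = {}
    for values in product((0, 1), repeat=len(out_vars)):
        a = dict(zip(out_vars, values))
        out_tab[values] = (f_tab[tuple(a[v] for v in f_vars)]
                           * g_tab[tuple(a[v] for v in g_vars)])
    return out_vars, out_tab

def sum_out(f, var):
    """Marginalize var out of factor f."""
    f_vars, f_tab = f
    keep = tuple(v for v in f_vars if v != var)
    out_tab = {}
    for values, p in f_tab.items():
        key = tuple(x for v, x in zip(f_vars, values) if v != var)
        out_tab[key] = out_tab.get(key, 0.0) + p
    return keep, out_tab

def restrict(f, evidence):
    """Keep rows consistent with the evidence and drop the evidence variables."""
    f_vars, f_tab = f
    keep = tuple(v for v in f_vars if v not in evidence)
    out_tab = {}
    for values, p in f_tab.items():
        a = dict(zip(f_vars, values))
        if all(a[v] == x for v, x in evidence.items() if v in a):
            out_tab[tuple(a[v] for v in keep)] = p
    return keep, out_tab

def eliminate(factors, query, evidence, order):
    """Return P(query | evidence) as a dict from query value to probability."""
    factors = [restrict(f, evidence) for f in factors]
    for var in order:
        if var == query or var in evidence:
            continue
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        factors = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors.append(sum_out(prod, var))   # the new factor g
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    total = sum(result[1].values())          # renormalize P(x, e)
    return {values[0]: p / total for values, p in result[1].items()}

# Hypothetical network X -> Y -> Z with made-up CPTs.
f_x = (("X",), {(0,): 0.7, (1,): 0.3})
f_y = (("X", "Y"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.9})
f_z = (("Y", "Z"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6})
posterior = eliminate([f_x, f_y, f_z], "X", {"Z": 1}, ["X", "Y", "Z"])
```

The result does not depend on the elimination ordering, but the size of the intermediate factors, and therefore the running time, can.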

Multiplying factors

A  B  $f_1(A,B)$      B  C  $f_2(B,C)$
0  0  .1              0  0  .4
0  1  .3              0  1  .2
1  0  .7              1  0  .5
1  1  .01             1  1  0
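A short sketch of the product of the two factors above, using the table values from the slide. The dict encoding and the name `g` are illustrative. The product is a factor over A, B, C with $g(a, b, c) = f_1(a, b) \cdot f_2(b, c)$, so rows are matched on the shared variable B.

```python
from itertools import product

# Factor tables from the slide, keyed by (A, B) and (B, C).
f1 = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.7, (1, 1): 0.01}
f2 = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.5, (1, 1): 0.0}

# g(a, b, c) = f1(a, b) * f2(b, c) for every joint assignment of A, B, C.
g = {(a, b, c): f1[(a, b)] * f2[(b, c)] for a, b, c in product((0, 1), repeat=3)}

for assignment in sorted(g):
    print(assignment, round(g[assignment], 4))
```

The resulting table has $2^3 = 8$ rows, one per assignment of the union of the two factors' variables.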
