Being Bayesian About Network Structure
A Bayesian Approach to Structure Discovery in Bayesian Networks
Nir Friedman and Daphne Koller
(Presented in CS673, 04/21/2005)
Roadmap

- Bayesian learning of Bayesian networks
  – Exact vs. approximate learning
- Markov Chain Monte Carlo methods
  – MCMC over structures
  – MCMC over orderings
- Experimental results
- Conclusions
Bayesian Networks

- Compact representation of probability distributions via conditional independence

Qualitative part: a directed acyclic graph (DAG)
- Nodes: random variables
- Edges: direct influence

Quantitative part: a set of conditional probability distributions, e.g. for the example network over E, B, R, A, C:

P(A | E, B):
          P(a)   P(!a)
  e  b    0.9    0.1
  e  !b   0.2    0.8
  !e b    0.9    0.1
  !e !b   0.01   0.99

Together they define a unique distribution in factored form:

P(B,E,A,C,R) = P(B) P(E) P(A|B,E) P(R|E) P(C|A)
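To make the factored form concrete, here is a minimal sketch (not from the slides) that evaluates the joint distribution from its factors; the table above supplies P(A | E, B), while the remaining CPT values are purely illustrative.

```python
# Minimal sketch (not from the slides): evaluating the factored joint
# P(B,E,A,C,R) = P(B) P(E) P(A|B,E) P(R|E) P(C|A).
# P(A|E,B) comes from the table above; the other numbers are illustrative.
P_B, P_E = 0.01, 0.02                              # illustrative P(b), P(e)
P_A = {(True, True): 0.9, (True, False): 0.2,      # P(a | E, B), keyed (e, b)
       (False, True): 0.9, (False, False): 0.01}
P_R = {True: 0.95, False: 0.001}                   # illustrative P(r | E)
P_C = {True: 0.7, False: 0.05}                     # illustrative P(c | A)

def bern(p_true, value):
    """Probability of a binary outcome given P(X = true)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, c, r):
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(e, b)], a)
            * bern(P_R[e], r) * bern(P_C[a], c))

print(joint(b=True, e=False, a=True, c=True, r=False))
```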
Why Learn Bayesian Networks?

- Conditional independencies and the graphical representation capture the structure of many real-world distributions
- Provides insight into the domain
- The graph structure allows "knowledge discovery":
  – Is there a direct connection between X and Y?
  – Does X separate two "subsystems"?
  – Does X causally affect Y?
- Bayesian networks can be used for many tasks
  – Inference, causality, etc.
- Examples: scientific data mining
  – Disease properties and symptoms
  – Interactions between the expression of genes
Learning Bayesian Networks

[Diagram: data plus prior information go into an inducer, which outputs a Bayesian network (structure and CPDs) over E, B, R, A, C.]

- The inducer needs a prior probability distribution P(B) over networks
- Using Bayesian conditioning, update the prior P(B) to the posterior P(B | D)
Why Struggle for Accurate Structure?

[Diagrams: the "true" structure over A, E, B, S, a variant with an added arc, and a variant with a missing arc.]

Adding an arc:
- Increases the number of parameters to be fitted
- Encodes wrong assumptions about causality and domain structure

Missing an arc:
- Cannot be compensated for by accurate parameter fitting
- Also misses the causality and domain structure
Score-based Learning

- Define a scoring function that evaluates how well a structure matches the data

[Figure: candidate structures over E, B, A scored against data samples <E,B,A>: <Y,N,N>, <Y,Y,Y>, <N,Y,Y>, ..., <N,N,N>.]

- Search for the structure that maximizes the score
Bayesian Score of a Model

$$P(G \mid D) = \frac{P(D \mid G)\, P(G)}{P(D)}$$

where the marginal likelihood is

$$P(D \mid G) = \int P(D \mid G, \theta)\, P(\theta \mid G)\, d\theta$$

(the integrand is the likelihood times the prior over parameters).
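As a concrete illustration of the marginal likelihood (not from the slides): for a single binary variable with no parents and a Beta parameter prior, the integral has a closed form, P(D | G) = B(alpha + n1, beta + n0) / B(alpha, beta). A minimal sketch:

```python
# Hedged sketch: closed-form log marginal likelihood for one binary variable
# with no parents, integrating the Bernoulli parameter against a
# Beta(alpha, beta) prior: P(D | G) = B(alpha + n1, beta + n0) / B(alpha, beta).
from math import lgamma

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_likelihood(n1, n0, alpha=1.0, beta=1.0):
    return log_beta(alpha + n1, beta + n0) - log_beta(alpha, beta)

print(log_marginal_likelihood(7, 3))  # e.g. 7 true and 3 false observations
```

The full Bayesian score multiplies such local terms over all variables and parent configurations; this single-variable case just shows the integral collapsing to a ratio of Beta functions.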
Discovering Structure – Model Selection

[Figure: the posterior P(G | D) over structures, with a single high-scoring network over E, B, R, A, C picked out.]

- Current practice: model selection
  – Pick a single high-scoring model
  – Use that model to infer the domain structure
Discovering Structure – Model Averaging

[Figure: the posterior P(G | D) spread across many structures over E, B, R, A, C.]

- Problem:
  – Small sample size ⇒ many high-scoring models
  – An answer based on one model is often useless
  – We want features common to many models
Bayesian Approach

- Estimate the probability of features:
  – Edge X → Y
  – Markov edge X -- Y
  – Path X ... Y
  – ...

$$P(f \mid D) = \sum_{G} f(G)\, P(G \mid D)$$

where f(G) is the indicator function for the feature (e.g., the edge X → Y) and P(G | D) is the Bayesian score of G (a small enumerated example follows).

- Huge (super-exponential, $2^{\Theta(n^2)}$) number of networks G
- Exact learning is intractable
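For intuition only, a minimal sketch (graphs and numbers hypothetical) of exact feature averaging when the posterior can be enumerated over a handful of graphs:

```python
# Hypothetical example: P(f | D) = sum_G f(G) P(G | D) with an enumerated
# posterior. Graphs are represented as frozensets of directed edges.
def feature_probability(posterior, feature):
    return sum(p for g, p in posterior.items() if feature(g))

posterior = {                                      # made-up P(G | D) values
    frozenset({("E", "A"), ("B", "A")}): 0.6,
    frozenset({("E", "A")}): 0.3,
    frozenset({("B", "A")}): 0.1,
}
has_edge_B_to_A = lambda g: ("B", "A") in g        # indicator f(G)
print(feature_probability(posterior, has_edge_B_to_A))  # 0.7
```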
Approximate Bayesian Learning

- Restrict the search space to G_k, the set of graphs with indegree bounded by k
  – the space is still super-exponential
- Find a set G of high-scoring structures and estimate (see the sketch below)

$$P(f \mid D) \approx \frac{\sum_{G \in \mathcal{G}} f(G)\, P(G \mid D)}{\sum_{G \in \mathcal{G}} P(G \mid D)}$$

- Hill climbing yields only a biased sample of structures
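A minimal sketch of this renormalized average, assuming a hypothetical log_scores mapping from each found structure to its log Bayesian score log P(D, G):

```python
# Hedged sketch: averaging a feature over a found set of high-scoring
# structures, renormalizing the scores within the set.
from math import exp

def averaged_feature_prob(log_scores, feature):
    m = max(log_scores.values())          # subtract max for numerical stability
    weights = {g: exp(s - m) for g, s in log_scores.items()}
    z = sum(weights.values())
    return sum(w for g, w in weights.items() if feature(g)) / z
```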
Markov Chain Monte Carlo over Networks

- MCMC sampling:
  – Define a Markov chain over Bayesian networks
  – Perform a walk through the chain to obtain samples G whose distribution converges to the posterior P(G | D) (a minimal sketch follows this list)
- Possible pitfalls:
  – Still a super-exponential number of networks
  – The time for the chain to converge to the posterior is unknown
  – Islands of high posterior connected by low "bridges"
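A minimal sketch of structure-MCMC (Metropolis-Hastings over DAGs), assuming user-supplied helpers: log_score(G) proportional to log P(D, G), and propose(G) returning an acyclic single-edge modification; for simplicity the sketch assumes a symmetric proposal:

```python
# Hedged sketch of structure-MCMC; log_score and propose are assumed helpers.
import math
import random

def structure_mcmc(g0, log_score, propose, steps):
    g, cur, samples = g0, log_score(g0), []
    for _ in range(steps):
        g_new = propose(g)                # e.g. add/delete/reverse one edge
        new = log_score(g_new)
        # accept with probability min(1, P(D, G') / P(D, G))
        if math.log(random.random()) < new - cur:
            g, cur = g_new, new
        samples.append(g)
    return samples
```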
Better Approach to Approximate Learning

- Put further constraints on the search space:
  – Perform model averaging over the structures consistent with some known (fixed) total ordering ≺
- Ordering of variables: X1 ≺ X2 ≺ ... ≺ Xn means the parents of Xi must come from {X1, X2, ..., Xi-1}
- Intuition: the order decouples the choices of parents
  – The choice of Pa(X7) does not restrict the choice of Pa(X12)
- Given an ordering, we can compute efficiently in closed form (spelled out below):
  – the likelihood P(D | ≺)
  – feature probabilities P(f | D, ≺)
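The decoupling can be written out explicitly (following the decomposition in the Friedman-Koller paper, with notation lightly simplified):

$$P(D \mid \prec) \;=\; \prod_{i=1}^{n} \sum_{U \in \mathcal{U}_{i,\prec}} \mathrm{score}(X_i, U; D)$$

where $\mathcal{U}_{i,\prec}$ is the set of candidate parent sets for $X_i$ that precede it in $\prec$ (with indegree at most $k$), and $\mathrm{score}(X_i, U; D)$ is the local Bayesian score. Because the inner sums range over each variable's parent sets independently, the whole product is computable in closed form.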
Sample Orderings

We can write

$$P(f \mid D) = \sum_{\prec} P(f \mid D, \prec)\, P(\prec \mid D)$$

Sample orderings ≺1, ..., ≺n and approximate

$$P(f \mid D) \approx \frac{1}{n} \sum_{i=1}^{n} P(f \mid D, \prec_i)$$

MCMC sampling (a sketch follows):
- Define a Markov chain over orderings
- Run the chain to get samples from the posterior P(≺ | D)
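A minimal sketch of order-MCMC, using a "flip two positions" proposal (which is symmetric, since undoing a flip is the same move) and an assumed helper log_score_order(order) proportional to log P(D | ≺) + log P(≺):

```python
# Hedged sketch of order-MCMC; log_score_order is an assumed helper.
import math
import random

def order_mcmc(order, log_score_order, steps):
    order = list(order)
    cur, samples = log_score_order(order), []
    for _ in range(steps):
        i, j = random.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]      # propose: flip two slots
        new = log_score_order(order)
        if math.log(random.random()) < new - cur:
            cur = new                                # accept the flip
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the flip
        samples.append(tuple(order))
    return samples
```

Flips alone already connect the space of orderings, since transpositions generate all permutations of the variables.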
Experiments: Exact Posterior over Orders versus Order-MCMC

[Figure: comparison plots; not recoverable from the extracted text.]
Experiments: Convergence

[Figure: convergence plots; not recoverable from the extracted text.]
Experiments: Structure-MCMC – Posterior Correlation for Two Different Runs

[Figure: scatter plot of posterior estimates from two runs; not recoverable from the extracted text.]
Experiments: Order-MCMC – Posterior Correlation for Two Different Runs

[Figure: scatter plot of posterior estimates from two runs; not recoverable from the extracted text.]
Conclusion

- Order-MCMC performs better than structure-MCMC
References

- N. Friedman and D. Koller. Being Bayesian about Network Structure: A Bayesian Approach to Structure Discovery in Bayesian Networks. Machine Learning Journal, 2002.
- N. Friedman and D. Koller. NIPS 2001 tutorial on learning Bayesian networks from data.
- N. Friedman and M. Goldszmidt. AAAI-98 tutorial on learning Bayesian networks from data.
- D. Heckerman. A Tutorial on Learning with Bayesian Networks. In Learning in Graphical Models, M. Jordan, ed., MIT Press, Cambridge, MA, 1999. Also appears as Technical Report MSR-TR-95-06, Microsoft Research, March 1995; an earlier version appears as Bayesian Networks for Data Mining, Data Mining and Knowledge Discovery, 1:79-119, 1997.
- C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan. An Introduction to MCMC for Machine Learning. Machine Learning, 2002.
- S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach.