Introduction to Bayesian methods, Lecture 14, David Sontag

Introduction to Bayesian methods
Lecture 14
David Sontag
New York University
Slides adapted from Luke Zettlemoyer, Carlos Guestrin, Dan Klein, and Vibhav Gogate

Bayesian learning
• Bayesian learning uses probability to model data and quantify uncertainty of predictions
  – Facilitates incorporation of prior knowledge
  – Gives optimal predictions
• Allows for decision-theoretic reasoning

Your first consulting job
• A billionaire from the suburbs of Manhattan asks you a question:
  – He says: I have a thumbtack. If I flip it, what's the probability it will fall with the nail up?
  – You say: Please flip it a few times:
  – You say: The probability is:
    • P(heads) = 3/5
  – He says: Why???
  – You say: Because…
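
One possible way to complete the "Because…" is maximum likelihood estimation, which the outline lists as a later topic. The sketch below assumes the five flips came out as 3 nail-up ("heads") and 2 nail-down ("tails"); that is consistent with the 3/5 on the slide but is not stated there. The function names are illustrative, not taken from the lecture.

```python
def bernoulli_likelihood(theta, heads, tails):
    """Likelihood of independent flips when P(heads) = theta."""
    return theta ** heads * (1 - theta) ** tails


def mle_closed_form(heads, tails):
    """Maximizer of the Bernoulli likelihood: the observed fraction of heads."""
    return heads / (heads + tails)


def mle_grid_search(heads, tails, steps=100):
    """Check the closed form numerically by scanning theta over a grid on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda theta: bernoulli_likelihood(theta, heads, tails))


# Assumed counts (not given on the slide): 3 heads and 2 tails out of 5 flips.
theta_hat = mle_closed_form(3, 2)
theta_grid = mle_grid_search(3, 2)
```

Both estimates agree on 3/5 under these assumed counts, since the likelihood $\theta^3 (1 - \theta)^2$ peaks at the observed fraction of heads.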

Outline of lectures
• Review of probability
(After midterm)
• Maximum likelihood estimation
• 2 examples of Bayesian classifiers:
  – Naïve Bayes
  – Logistic regression

Random Variables
• A random variable is some aspect of the world about which we (may) have uncertainty
  – R = Is it raining?
  – D = How long will it take to drive to work?
  – L = Where am I?
• We denote random variables with capital letters
• Random variables have domains
  – R in {true, false} (sometimes written as {+r, ¬r})
  – D in [0, ∞)
  – L in possible locations, maybe {(0,0), (0,1), …}

Probability Distributions
• Discrete random variables have distributions

  P(T):
  T       P
  warm    0.5
  cold    0.5

  P(W):
  W       P
  sun     0.6
  rain    0.1
  fog     0.3
  meteor  0.0

• A discrete distribution is a TABLE of probabilities of values
• The probability of a state (lower case) is a single number
• Must have: $P(x) \ge 0$ for all $x$, and $\sum_x P(x) = 1$

Joint Distributions
• A joint distribution over a set of random variables $X_1, \ldots, X_n$ specifies a real number for each assignment: $P(X_1 = x_1, \ldots, X_n = x_n)$
  – How many assignments if n variables with domain sizes d?
  – Must obey: $P(x_1, \ldots, x_n) \ge 0$ and $\sum_{(x_1, \ldots, x_n)} P(x_1, \ldots, x_n) = 1$

  P(T, W):
  T     W     P
  hot   sun   0.4
  hot   rain  0.1
  cold  sun   0.2
  cold  rain  0.3

• For all but the smallest distributions, impractical to write out or estimate
  – Instead, we make additional assumptions about the distribution
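
To make the counting question concrete: with n variables that each take d values, there are d^n joint assignments, so the table grows exponentially in n. The following is a minimal sketch using the P(T, W) table from the slide; the helper names `num_assignments` and `is_valid_distribution` are hypothetical.

```python
from itertools import product

# Joint distribution P(T, W) from the slide, keyed by (t, w) assignments.
joint = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def num_assignments(domain_sizes):
    """Number of joint assignments: the product of the domain sizes."""
    count = 1
    for d in domain_sizes:
        count *= d
    return count


def is_valid_distribution(table, tol=1e-9):
    """Check that every entry is non-negative and the entries sum to one."""
    non_negative = all(p >= 0 for p in table.values())
    sums_to_one = abs(sum(table.values()) - 1.0) < tol
    return non_negative and sums_to_one


# Enumerate every assignment of (T, W) explicitly.
domains = {"T": ["hot", "cold"], "W": ["sun", "rain"]}
assignments = list(product(*domains.values()))
```

Here `num_assignments([2, 2])` matches the four rows of the table, while ten binary variables would already need `num_assignments([2] * 10)`, i.e. 1024 rows.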

Marginal Distributions
• Marginal distributions are sub-tables which eliminate variables
• Marginalization (summing out): Combine collapsed rows by adding

  P(T, W):
  T     W     P
  hot   sun   0.4
  hot   rain  0.1
  cold  sun   0.2
  cold  rain  0.3

  $P(t) = \sum_w P(t, w)$
  T     P
  hot   0.5
  cold  0.5

  $P(w) = \sum_t P(t, w)$
  W     P
  sun   0.6
  rain  0.4
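
Summing out can be sketched in a few lines. This is an illustrative implementation over the slide's P(T, W) table, not code from the lecture; `marginalize` and its `keep_index` parameter are assumed names.

```python
from collections import defaultdict

# Joint distribution P(T, W) from the slide, keyed by (t, w) assignments.
joint = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def marginalize(joint_table, keep_index):
    """Sum out every variable except the one at position keep_index."""
    marginal = defaultdict(float)
    for assignment, p in joint_table.items():
        # Rows that agree on the kept variable collapse into one entry.
        marginal[assignment[keep_index]] += p
    return dict(marginal)


p_t = marginalize(joint, 0)  # P(T): sums over w
p_w = marginalize(joint, 1)  # P(W): sums over t
```

Up to floating-point rounding, `p_t` and `p_w` reproduce the two marginal tables on the slide.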

Conditional Probabilities
• A simple relation between joint and conditional probabilities: $P(a \mid b) = \frac{P(a, b)}{P(b)}$
  – In fact, this is taken as the definition of a conditional probability

  P(T, W):
  T     W     P
  hot   sun   0.4
  hot   rain  0.1
  cold  sun   0.2
  cold  rain  0.3
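
As a worked instance of the definition, one conditional probability can be read off the joint table on this slide. The choice of query (rain given cold) is only an illustration:

```latex
P(W = \text{rain} \mid T = \text{cold})
  = \frac{P(T = \text{cold},\, W = \text{rain})}{P(T = \text{cold})}
  = \frac{0.3}{0.2 + 0.3}
  = 0.6
```

The denominator is the marginal $P(T = \text{cold})$, obtained by adding the two rows of the table with T = cold.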

Conditional Distributions
• Conditional distributions are probability distributions over some variables given fixed values of others

  Joint Distribution P(T, W):
  T     W     P
  hot   sun   0.4
  hot   rain  0.1
  cold  sun   0.2
  cold  rain  0.3

  Conditional Distributions

  P(W | T = hot):
  W     P
  sun   0.8
  rain  0.2

  P(W | T = cold):
  W     P
  sun   0.4
  rain  0.6
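
A conditional table can be obtained from the joint by selecting the rows that match the fixed value and renormalizing them. The sketch below applies this to the slide's P(T, W); `condition_on_t` is a hypothetical helper name.

```python
# Joint distribution P(T, W) from the slide, keyed by (t, w) assignments.
joint = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def condition_on_t(joint_table, t_value):
    """Return P(W | T = t_value) by selecting matching rows and normalizing."""
    selected = {w: p for (t, w), p in joint_table.items() if t == t_value}
    total = sum(selected.values())  # this is the marginal P(T = t_value)
    if total == 0:
        raise ValueError("cannot condition on an event with probability zero")
    return {w: p / total for w, p in selected.items()}


p_w_given_hot = condition_on_t(joint, "hot")
p_w_given_cold = condition_on_t(joint, "cold")
```

Each returned table sums to one on its own, which is what distinguishes a conditional distribution from a slice of the joint.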

The Product Rule
• Sometimes have conditional distributions but want the joint: $P(x, y) = P(x \mid y)\, P(y)$
• Example: $P(D, W) = P(D \mid W)\, P(W)$

  P(W):
  W     P
  sun   0.8
  rain  0.2

  P(D | W):
  D     W     P
  wet   sun   0.1
  dry   sun   0.9
  wet   rain  0.7
  dry   rain  0.3

  P(D, W):
  D     W     P
  wet   sun   0.08
  dry   sun   0.72
  wet   rain  0.14
  dry   rain  0.06
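
The example on this slide can be checked mechanically: multiplying each conditional entry by the matching prior entry gives the joint. This is a minimal sketch with the slide's numbers; the function name `product_rule` is an assumption.

```python
# P(W) and P(D | W) as given on the slide.
p_w = {"sun": 0.8, "rain": 0.2}
p_d_given_w = {
    ("wet", "sun"): 0.1,
    ("dry", "sun"): 0.9,
    ("wet", "rain"): 0.7,
    ("dry", "rain"): 0.3,
}


def product_rule(conditional, prior):
    """Build the joint P(d, w) = P(d | w) * P(w) for every (d, w) pair."""
    return {(d, w): p * prior[w] for (d, w), p in conditional.items()}


p_dw = product_rule(p_d_given_w, p_w)
```

Up to floating-point rounding, `p_dw` matches the P(D, W) column of the slide (0.08, 0.72, 0.14, 0.06).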

Bayes' Rule
• Two ways to factor a joint distribution over two variables: $P(x, y) = P(x \mid y)\, P(y) = P(y \mid x)\, P(x)$
• Dividing, we get: $P(x \mid y) = \frac{P(y \mid x)}{P(y)}\, P(x)$
• Why is this at all helpful?
  – Lets us build one conditional from its reverse
  – Often one conditional is tricky but the other one is simple
  – Foundation of many practical systems (e.g. ASR, MT)
• In the running for most important ML equation!
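
To illustrate building one conditional from its reverse, the sketch below reuses P(W) and P(D | W) from the product rule slide and computes P(W | D = wet). The query and the name `bayes_rule` are illustrative choices, not part of the lecture.

```python
# P(W) and P(D | W) as given on the product rule slide.
p_w = {"sun": 0.8, "rain": 0.2}
p_d_given_w = {
    ("wet", "sun"): 0.1,
    ("dry", "sun"): 0.9,
    ("wet", "rain"): 0.7,
    ("dry", "rain"): 0.3,
}


def bayes_rule(likelihood, prior, evidence_value):
    """Return P(W | D = evidence_value) from P(D | W) and P(W)."""
    # Numerator of Bayes' rule for each w: P(d | w) * P(w).
    unnormalized = {w: likelihood[(evidence_value, w)] * prior[w] for w in prior}
    # Denominator P(d), obtained by summing the numerators over w.
    evidence = sum(unnormalized.values())
    return {w: p / evidence for w, p in unnormalized.items()}


p_w_given_wet = bayes_rule(p_d_given_w, p_w, "wet")
```

Although sun is the more likely weather a priori (0.8), observing wet shifts the posterior toward rain: P(rain | wet) = 0.14 / 0.22, roughly 0.64.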
