Statistical Geometry Processing
Winter Semester 2011/2012
Bayesian Statistics
Bayesian Statistics
Summary
- Importance
- The only sound tool to handle uncertainty
- Many applications, from web search to self-driving cars
- Structure
- Probability: a non-negative, additive, normed measure
- Learning is density estimation
- Large dimensions are the source of (almost) all evil
- No free lunch: There is no universal learning strategy
Motivation
Modern AI
Classic artificial intelligence:
- Write a complex program with enough rules to understand the world
- This has been perceived as not very successful
Modern artificial intelligence:
- Machine learning
- Learn structure from data
- Minimal amount of “hardwired” rules
- “Data driven approach”
- Mimics human development (training, early childhood)
Data Driven Computer Science
Statistical data analysis is everywhere:
- Cell phones (transmission, error correction)
- Structural biology
- Web search
- Credit card fraud detection
- Face recognition in point-and-shoot cameras
- ...
Probability Theory
(a very brief summary) Part I: Philosophy
What is Probability?
Question:
- What is probability?
Example:
- A bin with 50 red and 50 blue balls
- Person A takes a ball
- Question to Person B: What is the probability that the ball is red?
What happened:
- Person A took a blue ball
- Not visible to person B
Philosophical Debate…
An old philosophical debate:
- What does “probability” actually mean?
- Can we assign probabilities to events whose outcome is already fixed (but not known to us for sure)?
“Fixed outcome” examples:
- Probability of life on Mars
- Probability of J.F. Kennedy having been assassinated by an intra-government conspiracy
- Probability that the code you wrote is correct
Two Camps
Frequentists’ (traditional) view:
- Well-defined experiment
- Probability is the relative number of positive outcomes
- Only meaningful as an average over many experiments
Bayesian view:
- Probability expresses a degree of belief
- Mathematical model of uncertainty
- Can be subjective
Mathematical Point of View
Mathematics:
- Math does not tell you what is true
- It only tells you the consequences if you accept certain assumptions (axioms) as true
- Mathematicians don’t do philosophy.
Mathematical definition of probability:
- Properties of probability measures
- Consistent with both views
- Defines rules for computing with probabilities
- Setting up probabilities is not a math problem
Probability Theory
(a very brief summary) Part II: Probability Measures
Kolmogorov’s Axioms
Discrete probability space:
- Elementary events: Ω = {ω₁, …, ωₙ}
- General events: subsets A ⊆ Ω
- Probability measure: Pr: 𝒫(Ω) → ℝ
A valid probability measure must ensure:
- Positive: Pr(A) ≥ 0
- Additive: [A ∩ B = ∅] ⟹ [Pr(A) + Pr(B) = Pr(A ∪ B)]
- Normed: Pr(Ω) = 1
Other Properties Follow
Properties derived from Kolmogorov’s Axioms:
- Pr(A) ∈ [0, 1]
- Pr(Ā) = Pr(Ω \ A) = 1 − Pr(A)
- Pr(∅) = 0
- Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
(without the last term, the intersection A ∩ B would be counted twice)
- …
In other words
Mathematical probability is a
- non-negative, normed, additive measure.
- Always ≥ 0
- Sums to 1
- Disjoint pieces add up
In other words
Mathematical probability is a
- non-negative, normed, additive measure.
- Think of a density on some domain
[Figure: an 8×8 grid of elementary events ω₁, …, ω₆₄ with ∑ᵢ Pr(ωᵢ) = 1; darker cells are more likely, e.g. Pr(ω₂₁) > Pr(ω₆₄).]
In other words
Mathematical probability is a
- non-negative, normed, additive measure.
- Think of a density on some domain
An event A is a set of elementary events; its probability is the sum
Pr(A) = ∑_{i: ωᵢ ∈ A} Pr(ωᵢ) = Pr(ω₂₁) + Pr(ω₂₂) + Pr(ω₂₃) + Pr(ω₂₉) + Pr(ω₃₀) + Pr(ω₃₁) + Pr(ω₃₆) + Pr(ω₃₇) + Pr(ω₃₈)
[Figure: the same grid with the event A = {ω₂₁, ω₂₂, ω₂₃, ω₂₉, ω₃₀, ω₃₁, ω₃₆, ω₃₇, ω₃₈} highlighted.]
In other words
Mathematical probability is a
- non-negative, normed, additive measure.
- Always ≥ 0
- Sums to 1
- Disjoint pieces add up
What does this model?
- You can always think of an area with density.
- All pieces are positive.
- Sum of densities is 1.
Discrete Models
Discrete probability space:
- Elementary events: Ω = {ω₁, …, ωₙ}
- General events: subsets A ⊆ Ω
- Probability measure: Pr: 𝒫(Ω) → ℝ
Probability measures:
- Sum of elementary probabilities (a minimal code sketch follows below):
Pr(A) = ∑_{ωᵢ ∈ A} Pr(ωᵢ)
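A minimal Python sketch of this definition (the slides themselves contain no code; the fair-die distribution is an assumed example): elementary probabilities stored in a table, events as subsets.

```python
# Sketch: a discrete probability measure as a table of elementary probabilities.
from fractions import Fraction

pr = {w: Fraction(1, 6) for w in range(1, 7)}   # Omega = {1, ..., 6}, fair die

assert sum(pr.values()) == 1                    # normed: Pr(Omega) = 1

def prob(event):
    """Pr(A) = sum of Pr(w) for w in A (additivity over disjoint singletons)."""
    return sum(pr[w] for w in event)

print(prob({2, 4, 6}))                          # Pr("even") = 1/2
```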
Continuous Probability Measures
Continuous probability space:
- Elementary events: Ω ⊆ ℝᵈ
- General events: “reasonable”*) subsets A ⊆ Ω
- Probability measure: Pr: σ(Ω) → ℝ assigns a probability to subsets*) of Ω
*) not “all” subsets: Borel sigma algebra (details omitted)
The same axioms:
- Positive: Pr(A) ≥ 0
- Additive: [A ∩ B = ∅] ⟹ [Pr(A) + Pr(B) = Pr(A ∪ B)]
- Normed: Pr(Ω) = 1
Continuous Density
Density model
- No elementary probabilities
- Instead: a density p: ℝᵈ → ℝ₀⁺
- For an event A: Pr(A) = ∫_A p(x) dx
- Density p(x) with p(x) ≥ 0 and ∫_Ω p(x) dx = 1
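As a numerical sketch (not from the slides), Pr(A) can be approximated by a Riemann sum over a grid; the uniform density on [0, 1] is an assumed example.

```python
# Sketch: approximate Pr(A) = ∫_A p(x) dx by a Riemann sum on a grid.
import numpy as np

def p(x):
    # Uniform density on [0, 1]: p(x) >= 0 everywhere
    return np.where((x >= 0.0) & (x <= 1.0), 1.0, 0.0)

x = np.linspace(-0.5, 1.5, 200_001)
dx = x[1] - x[0]

print(np.sum(p(x)) * dx)                # ≈ 1: normed, ∫ p(x) dx = 1
print(np.sum(p(x) * (x < 0.25)) * dx)   # ≈ 0.25 = Pr(A) for A = [0, 0.25)
```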
Random Variables
- Assign numbers or vectors from ℝᵈ to outcomes
- Notation:
- random variable X
- density p(x) = Pr(X = x)
- Usually, the variable name indicates the domain of the density
[Figure: a density p(x) over the values x of the random variable X.]
Unified View
Discrete models as a special case:
- Discrete model: p(ωᵢ), ωᵢ ∈ {1, …, 9}
- Continuous model: p(x), x ∈ ℝ
- Idealization via Dirac delta pulses: p(x) = ∑ᵢ p(ωᵢ) δ(x − xᵢ),
with ∫ℝᵈ δ(x) dx = 1, δ(0) “very large”, and δ(x) = 0 everywhere else
[Figure: discrete probabilities as bars and the corresponding delta pulses on the real line.]
Probability Theory
(a very brief summary) Part III: Statistical Dependence
Conditional Probability
Conditional Probability:
- Pr(A | B) = probability of A given B [is true]
- Easy to show: Pr(A ∩ B) = Pr(A | B) · Pr(B)
Statistical Independence
- A and B independent :⟺ Pr(A ∩ B) = Pr(A) · Pr(B)
- Knowing the value of A does not yield information about B (and vice versa)
Factorization
Independence = density factorization:
p(x₁, x₂) = p(x₁) · p(x₂)
[Figure: a 2D joint density p(x₁, x₂) decomposing into the product of the two 1D densities p(x₁) and p(x₂).]
Factorization
Independence = density factorization:
p(x₁, x₂) = p(x₁) · p(x₂)
- Storage cost: a full joint table over d variables with k values each needs O(kᵈ) entries; a factorized density needs only d tables of k entries, i.e. O(d·k) (see the sketch below).
[Figure: the k × k joint table versus two length-k factor tables.]
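A small Python sketch of the storage argument (illustrative numbers, not from the slides): the factorized form stores d·k values, while the equivalent joint table has kᵈ entries.

```python
# Sketch: with d independent variables of k values each, d·k numbers
# determine the same distribution as the full k^d joint table.
import numpy as np

k, d = 4, 3
factors = [np.random.dirichlet(np.ones(k)) for _ in range(d)]  # d marginals

joint = factors[0]
for f in factors[1:]:
    joint = np.multiply.outer(joint, f)       # build the full joint table

print(joint.size)                             # k**d = 64 entries
print(sum(f.size for f in factors))           # d*k  = 12 entries
print(np.isclose(joint.sum(), 1.0))           # still normed
```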
Marginals
Example
- Two random variables a, b ∈ [0, 1]
- Joint distribution p(a, b)
- We do not know b (it could be anything)
- What is the distribution of a?
“Marginal probability” (see the sketch below):
p(a) = ∫₀¹ p(a, b) db
[Figure: a joint density p(a, b) on [0, 1]² and the marginal density of a obtained by integrating out b.]
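A discrete Python sketch of marginalization (an assumed 2×3 joint table; for continuous densities the sum becomes the integral above):

```python
# Sketch: marginals of a discrete joint table by summing out a variable.
import numpy as np

p_ab = np.random.dirichlet(np.ones(6)).reshape(2, 3)  # joint Pr(a, b)

p_a = p_ab.sum(axis=1)     # Pr(a) = sum_b Pr(a, b)
p_b = p_ab.sum(axis=0)     # Pr(b) = sum_a Pr(a, b)

print(p_a, p_a.sum())      # the marginal is itself normed: sums to 1
```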
Conditional Probability
Bayes’ Rule: Derivation
- Pr(A ∩ B) = Pr(A | B) · Pr(B) and Pr(A ∩ B) = Pr(B | A) · Pr(A)
- Hence: Pr(A | B) · Pr(B) = Pr(B | A) · Pr(A)
- Dividing by Pr(B):
Pr(A | B) = Pr(B | A) · Pr(A) / Pr(B)
Bayesian Inference
Example: Statistical Inference
- A medical test checks for a medical condition
- A: medical test positive?
- 99% correct if the patient is ill
- but in 1 of 100 cases, it reports illness for healthy patients
- B: patient has the disease?
- We know: one in 10 000 people has it
A patient is diagnosed with the disease:
- How likely is it that the patient is actually sick?
Bayesian Inference
Apply Bayes’ rule (A: medical test positive? B: patient has the disease?):
Pr(B | A) = Pr(A | B) · Pr(B) / Pr(A)
Pr(disease | test pos.) = Pr(test pos. | disease) · Pr(disease) / [Pr(test pos. | disease) · Pr(disease) + Pr(test pos. | ¬disease) · Pr(¬disease)]
= (0.99 · 0.0001) / (0.99 · 0.0001 + 0.01 · 0.9999)
= 0.000099 / 0.010098 ≈ 0.0098 ≈ 1/100
⇒ the patient is most likely healthy
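The slide's arithmetic as a short Python sketch (no code in the original; numbers taken from the example):

```python
# Bayes' rule for the medical-test example.
prior = 0.0001                    # Pr(disease): one in 10 000
p_pos_ill = 0.99                  # Pr(test pos. | disease)
p_pos_healthy = 0.01              # Pr(test pos. | healthy)

evidence = p_pos_ill * prior + p_pos_healthy * (1.0 - prior)  # Pr(test pos.)
posterior = p_pos_ill * prior / evidence                      # Pr(disease | pos.)
print(posterior)                  # ≈ 0.0098: most likely healthy
```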
Intuition
Soccer stadium with 10 000 people:
- 1 person is actually sick
- ≈ 100 people get a positive test
Conclusion
Bayes’ Rule:
- Used to fuse knowledge
- “Prior” knowledge (prevalence of disease)
- “Measurement”: tests, sensor data, new information
- Can be used repeatedly to add more information
- Standard tool for interpreting sensor measurements
(Sensor fusion, reconstruction)
- Examples:
- Image reconstruction (noisy sensors)
- Face recognition
Pr(A | B) = Pr(B | A) · Pr(A) / Pr(B)
Chain Rule
Incremental update
- A joint probability can be split into a chain of conditional probabilities:
Pr(Xₙ, …, X₂, X₁) = Pr(Xₙ | Xₙ₋₁, Xₙ₋₂, …, X₁) ⋯ Pr(X₃ | X₂, X₁) · Pr(X₂ | X₁) · Pr(X₁)
- Example application (see the sketch below):
- Xᵢ is the measurement at time i
- Update the probability distribution as more data comes in
- Attention: although it might look like it, this does not reduce the complexity of the joint distribution
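A Python sketch of such an incremental update (not from the slides; it additionally assumes the measurements are conditionally independent given the hypothesis, so each chain factor reduces to a simple likelihood):

```python
# Sketch: fold in one measurement at a time, posterior ∝ prior · Pr(x_i | H).
import numpy as np

hypotheses = ["fair coin", "biased coin"]     # discrete hypothesis H (assumed)
p_heads = np.array([0.5, 0.9])                # Pr(heads | H)
belief = np.array([0.5, 0.5])                 # prior Pr(H)

for flip in ["H", "H", "H", "T", "H"]:        # measurements arriving over time
    likelihood = p_heads if flip == "H" else 1.0 - p_heads
    belief = belief * likelihood              # one factor of the chain per step
    belief /= belief.sum()                    # renormalize by the evidence
    print(dict(zip(hypotheses, belief.round(3))))
```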
Probability Theory
(a very brief summary) Part IV: Uniqueness – Philosophy Again...
Cox Axioms
Are there alternatives?
- Is this the right way to define probabilities?
- Are there no other uncertainty measures?
Answer (short):
- Yes.
- Any reasonable*) probability measure has the same properties
- up to a normalization constant; we could have Pr ∈ [0..42] if we liked
*) reasonable (Cox axioms): an ordering Pr(A) > Pr(B) > Pr(C) is well defined, Pr(Ā) = f(Pr(A)), and Pr(A ∩ B) = g(Pr(A | B), Pr(B)) for arbitrary but fixed f, g.
What is Probability?
Principle #1: [Hertzmann 2004]
“Probability theory is nothing more than common sense reduced to calculation.” (Pierre-Simon Laplace, 1814)
Principles #2 and #3: [Hertzmann 2004]
- Given a complete model, we can compute any other probability
- Use Bayes’ rule to infer unknown variables from observations
Probability Theory
(a very brief summary) Part V: Characteristics of Probability Measures
Moments of Distributions
Density function (1D):
- p: ℝ → ℝ₀⁺
Expected value / mean:
- E(p) = μ := ⟨p, x⟩ = ∫ℝ p(x) · x dx
Variance:
- Var(p) = σ² := ⟨p, (x − μ)²⟩ = ∫ℝ p(x) · (x − μ)² dx
[Figure: a density p(x), its mean μ, and the quadratic weight (x − μ)².]
Standard Deviation
Bounds on spread
- Standard deviation: σ = √Var(p)
- Expected range of variations
- Bounds the spread of the distribution
- Formal bound: Chebyshev’s inequality (checked empirically in the sketch below)
Pr(|X − μ| ≥ kσ) ≤ 1/k²
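An empirical Python check of the bound (an assumed exponential distribution; the inequality holds for any distribution with finite variance):

```python
# Sketch: empirical tail frequencies never exceed Chebyshev's bound 1/k².
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)   # μ = 1, σ = 1
mu, sigma = x.mean(), x.std()

for k in (1.5, 2.0, 3.0):
    freq = np.mean(np.abs(x - mu) >= k * sigma)
    print(f"k={k}: empirical {freq:.4f} <= bound {1/k**2:.4f}")
```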
Remark: Other Moments
Higher-order moments:
- mₖ(p) := ⟨p, (x − μ)ᵏ⟩ = ∫ℝ p(x) · (x − μ)ᵏ dx
- Skewness: m₃ (asymmetry of the distribution)
- Kurtosis: m₄ (peakedness)
More general:
- ⟨p, fᵢ⟩ with basis functions fᵢ, for example:
- Fourier basis (“characteristic function”)
We will not use any of this in this lecture...
Moments of Distributions
Multi-variate density function:
- Density p: ℝᵈ → ℝ₀⁺
- E(p) = μ := ⟨p, x⟩ = ∫ℝᵈ p(x) · x dx
- Cov(xᵢ, xⱼ) := ⟨p, (xᵢ − μᵢ)(xⱼ − μⱼ)⟩ = ∫ℝᵈ p(x) · (xᵢ − μᵢ)(xⱼ − μⱼ) dx
- Covariance matrix: Σ = [Cov(xᵢ, xⱼ)]ᵢⱼ
[Figure: a 2D density p(x₁, x₂) with its covariance ellipse Σ.]
Properties
Expected value:
- E(X + Y) = E(X) + E(Y)
- E(αX) = α · E(X)
Variance:
- Var(αX) = α² · Var(X)
- Let X, Y be independent; then:
Var(X + Y) = Var(X) + Var(Y)
(a numerical check follows below)
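A quick numerical check of these rules (assumed distributions, large samples, so the estimates are only approximate):

```python
# Sketch: verify linearity of E and the variance rules by simulation.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(2.0, 3.0, size=1_000_000)   # independent of Y by construction
Y = rng.uniform(-1.0, 1.0, size=1_000_000)
a = 4.0

print(np.mean(X + Y), np.mean(X) + np.mean(Y))   # E(X+Y) = E(X) + E(Y)
print(np.mean(a * X), a * np.mean(X))            # E(aX)  = a·E(X)
print(np.var(a * X), a**2 * np.var(X))           # Var(aX) = a²·Var(X)
print(np.var(X + Y), np.var(X) + np.var(Y))      # additivity (independence)
```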
Entropy
How random is the randomness?
- Measure of disorder
- How much information remains in the events, knowing the distribution?
Idea
- Try to code the events
- Binary codes
- short codes for frequent events
- long codes for infrequent events
[Figure: three example densities of varying randomness over outcomes a and b.]
Entropy
Best solution
- Use codes of 𝒪(log(1/p)) bits for events with probability p
- Can be implemented: Huffman coding, arithmetic coding
Definition: Entropy
H(X) = − ∑ᵢ₌₁ⁿ p(xᵢ) log p(xᵢ)
- Coding efficiency of independent events
Examples (see the sketch below)
- Uniform distribution: H = − ∑ᵢ₌₁ⁿ (1/n) log(1/n) = log n
- Deterministic outcome: H = 0
[Figure: four example densities, from uniform (maximum entropy, log n) to a single spike (entropy 0).]
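A small Python sketch reproducing both examples (base-2 logarithm, i.e. entropy in bits):

```python
# Sketch: entropy of a discrete distribution, with the 0·log 0 = 0 convention.
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log2(nz))

n = 8
print(entropy(np.full(n, 1 / n)))            # uniform: log2(8) = 3 bits
print(entropy([1, 0, 0, 0]))                 # deterministic: H = 0
print(entropy([0.5, 0.25, 0.125, 0.125]))    # in between: 1.75 bits
```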
Probability Theory
(a very brief summary) Part VI: Large Numbers
Law of Large Numbers
Intuition for Probabilities:
- Single outcomes are random
- But averaged over a larger number of trials, the behavior is known
- It can be shown that probability measures naturally have this property
Law of Large Numbers
Let
- X₁, X₂, …, Xₙ be i.i.d. random variables (independent, identically distributed)
We look at the mean:
X̄ₙ = (1/n) ∑ᵢ₌₁ⁿ Xᵢ
(Weak) law of large numbers:
limₙ→∞ Pr(|X̄ₙ − μ| > ε) = 0
Proof
- Additional assumption: finite variance Var(Xᵢ) = σ²
- The theorem then follows from
- additivity of variances
- Chebyshev’s bound
Var(X̄ₙ) = Var((1/n) ∑ᵢ₌₁ⁿ Xᵢ) = (1/n²) ∑ᵢ₌₁ⁿ Var(Xᵢ) = nσ²/n² = σ²/n ⇒ σ(X̄ₙ) = σ/√n
- Chebyshev: Pr(|X − μ| ≥ kσ) ≤ 1/k²
(the σ/√n behavior is simulated in the sketch below)
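A simulation sketch of this σ/√n behavior (uniform variables are an assumed example):

```python
# Sketch: the standard deviation of the sample mean shrinks like σ/√n.
import numpy as np

rng = np.random.default_rng(2)
sigma = np.sqrt(1.0 / 12.0)            # std of Uniform(0, 1)

for n in (10, 100, 1000, 10000):
    means = rng.uniform(size=(1000, n)).mean(axis=1)   # 1000 repeated trials
    print(f"n={n:6d}: std of mean = {means.std():.5f}, "
          f"sigma/sqrt(n) = {sigma / np.sqrt(n):.5f}")
```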
Additional Insight
Averaging of independent trials
- Reduces the variance
- For independent sampling, the convergence rate (of the standard deviation) is 1/√n
- This is usually lousy...
- Rapid progress at first
- Then it takes forever to converge
[Figure: error decaying like 1/√n as the number of trials n grows.]
Central Limit Theorem
Why are so many phenomena normally distributed?
- Let X₁, …, Xₙ be real (1D) random variables with means μᵢ and finite variances σᵢ².
- Then the distribution of the standardized sum
(∑ᵢ₌₁ⁿ Xᵢ − ∑ᵢ₌₁ⁿ μᵢ) / √(∑ᵢ₌₁ⁿ σᵢ²) → 𝒩(0, 1)
converges to a normal distribution (simulated in the sketch below).
Multi-dimensional variant:
- A similar result holds for the multi-dimensional case
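A simulation sketch of the 1D statement (an assumed exponential distribution, which is itself far from normal):

```python
# Sketch: standardized sums of i.i.d. exponentials approach N(0, 1).
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n = 1.0, 1.0, 500           # moments of Exponential(1)

x = rng.exponential(size=(20_000, n))
z = (x.sum(axis=1) - n * mu) / (np.sqrt(n) * sigma)   # standardized sum

# Quantiles should match N(0, 1): roughly -1.645, 0, +1.645
print(np.quantile(z, [0.05, 0.5, 0.95]).round(3))
```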
Probability Theory
(a very brief summary) Part VII: Gaussian Distributions
Well-known probability distributions
Important distributions
- Uniform distribution
- Only defined for finite domains
- Maximum entropy among all distributions
- Gaussian / normal distribution
- Infinite domains
- Maximizes entropy for fixed variance
- Heavy-tailed distributions
- “Outlier-robust”
[Figure: densities of a uniform distribution on [a, b], a Gaussian, and a heavy-tailed distribution.]
Gaussians
Gaussian normal distribution
- Two parameters: μ, σ
- Density:
𝒩_{μ,σ}(x) := (1/√(2πσ²)) · exp(−(x − μ)² / (2σ²))
- Mean: μ
- Variance: σ²
[Figure: the bell curve of the Gaussian normal distribution.]
Log Space
Neg-log-density:
−log 𝒩_{μ,σ}(x) = (x − μ)² / (2σ²) + ½ ln(2πσ²) ~ (1 / (2σ²)) · (x − μ)²
Calculations in log space:
- Neg-log-densities of products of Gaussians are sums of quadratic polynomials (see the sketch below)
- Calculations are simplified in log space
- Exception: sums of Gaussians do not work (no closed form in log space)
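A 1D Python sketch of the product rule implied by adding the quadratics (the precision form is standard, but the variable names are illustrative):

```python
# Sketch: N(mu1, var1) · N(mu2, var2) ∝ N(mu, var) in precision form.
def product_of_gaussians(mu1, var1, mu2, var2):
    lam = 1.0 / var1 + 1.0 / var2             # precisions (1/σ²) add
    mu = (mu1 / var1 + mu2 / var2) / lam      # precision-weighted mean
    return mu, 1.0 / lam                      # variance can only decrease

print(product_of_gaussians(0.0, 1.0, 4.0, 1.0))   # -> (2.0, 0.5)
```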
Multi-Variate Gaussians
Gaussian normal distribution in d dimensions
- Two parameters: μ (d-dimensional vector), Σ (d × d matrix)
- Density:
𝒩_{μ,Σ}(x) := (2π)^{−d/2} · det(Σ)^{−1/2} · exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))
- Mean: μ
- Covariance matrix: Σ
[Figure: a 2D Gaussian density p(x₁, x₂) with covariance Σ.]
Log Space
Neg-log density:
½ (x − μ)ᵀ Σ⁻¹ (x − μ) + const
- a quadratic multivariate polynomial
Consequences:
- Optimization (maximum probability density) reduces to solving a linear system
- Gaussians are ellipsoids:
- eigenvectors of Σ are the main axes (principal component analysis, PCA; see the sketch below)
- eigenvalues are the extremal variances
[Figure: iso-density ellipses with semi-axes σ₁, σ₂.]
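A short numpy sketch of the PCA statement (an assumed 2×2 covariance matrix):

```python
# Sketch: eigendecomposition of Σ gives main axes and extremal variances.
import numpy as np

Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(Sigma)   # symmetric eigendecomposition
print("extremal variances:", eigvals)      # variances along the main axes
print("main axes (columns):\n", eigvecs)   # orthonormal directions
```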
More Rules for Gaussians
More rules for computations with Gaussians:
- Products of Gaussians are Gaussians (up to normalization)
- Algorithm: add the quadratic polynomials
- Variance can only decrease
- Marginals (“projections”) of Gaussians are Gaussians
- Unknown values: leave out the dimensions in μ, Σ
- Known values: Schur complement
- Affine mappings of Gaussians are Gaussians
- Algorithm: apply the map to the argument x; this yields a different quadric
- General sums of Gaussians do not have closed-form log-densities
More Rules for Gaussians
Coordinate transforms
- General Gaussians are affine transforms of the unit Gaussian
- Quadric: ½ (x − μ)ᵀ Σ⁻¹ (x − μ) + c
- Main-axis transform:
Σ⁻¹ = V E Vᵀ = V · diag(σ₁⁻², σ₂⁻², …) · Vᵀ
Σ^{−1/2} = V E^{1/2} Vᵀ = V · diag(σ₁⁻¹, σ₂⁻¹, …) · Vᵀ
More Rules for Gaussians
Unit Gaussian:
- We get:
½ (x − μ)ᵀ (Σ^{−1/2})ᵀ (Σ^{−1/2}) (x − μ) + c = ½ (Σ^{−1/2} x − Σ^{−1/2} μ)ᵀ (Σ^{−1/2} x − Σ^{−1/2} μ) + c
- This is a unit quadric / Gaussian xᵀx,
- rotated into the coordinate frame Σ^{−1/2}
- and translated accordingly by Σ^{−1/2} μ
[Figure: the general ellipse with semi-axes σ₁, σ₂ versus the unit quadric xᵀx.]
More Rules for Gaussians
Unit Gaussian:
- In addition, we have to recompute the (log) normalization factor
c = ln((2π)^{−d/2} · det(Σ)^{−1/2})
to ensure a unit integral.
Rule of thumb:
- All Gaussians are related by
- translation,
- rotation and non-uniform scaling,
- and adapting the density to integrate to 1.
[Figure: the general ellipse (σ₁, σ₂) and the unit quadric xᵀx.]
Mahalanobis Distance
Given:
- a Gaussian distribution with parameters μ, Σ
- sample points x, y ∈ ℝᵈ
Mahalanobis distance:
d_M(x) = √((x − μ)ᵀ Σ⁻¹ (x − μ)),  d_M(x, y) = √((x − y)ᵀ Σ⁻¹ (x − y))
Interpretation:
- Measures distances in “unit Gaussian space” (see the sketch below)
- One unit = one standard deviation
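A numpy sketch (assumed parameters μ, Σ for illustration):

```python
# Sketch: Mahalanobis distance d_M(x) = √((x−μ)ᵀ Σ⁻¹ (x−μ)).
import numpy as np

mu = np.array([1.0, 2.0])
Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])

def mahalanobis(x, mu, Sigma):
    d = x - mu
    return np.sqrt(d @ np.linalg.solve(Sigma, d))   # avoids explicit inverse

print(mahalanobis(np.array([4.0, 3.0]), mu, Sigma)) # distance in std units
```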
Applications
Example
- Given a sample and a Gaussian distribution
- How likely is it that this sample came from that distribution?
- The density value alone is not a good measure:
- the absolute density depends on the breadth of the distribution
[Figure: a narrow and a broad density, both integrating to 1, with very different peak values.]
Estimation from Data
Task
- Data x₁, …, xₙ generated by a Gaussian distribution (i.i.d.)
- Estimate the parameters
Maximum likelihood estimation (see the sketch below):
- Most likely parameters: argmax_{μ,Σ} p(μ, Σ | x₁, …, xₙ)
- Mean: μ_ML = (1/n) ∑ᵢ₌₁ⁿ xᵢ
- Covariance: Σ_ML = (1/(n−1)) ∑ᵢ₌₁ⁿ (xᵢ − μ)(xᵢ − μ)ᵀ
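A numpy sketch of the two estimators (synthetic data from assumed ground-truth parameters):

```python
# Sketch: estimate μ and Σ from i.i.d. samples; np.cov uses 1/(n−1).
import numpy as np

rng = np.random.default_rng(4)
true_mu = np.array([1.0, -2.0])
true_Sigma = np.array([[2.0, 0.5],
                       [0.5, 1.0]])

X = rng.multivariate_normal(true_mu, true_Sigma, size=10_000)  # n × d data

mu_hat = X.mean(axis=0)                # (1/n) Σ x_i
Sigma_hat = np.cov(X, rowvar=False)    # (1/(n−1)) Σ (x_i−μ)(x_i−μ)ᵀ
print(mu_hat.round(3))
print(Sigma_hat.round(3))
```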
Conclusions
Bayesian Statistics
- Uncertainty captured in numbers
- Mathematics gives us the rules to derive the consequences of our assumptions
The rest of the theory:
- Formal tools to work with uncertainty