Bayes and Lancaster at the Chinese restaurant. Statistical uses of the Fleming-Viot Process.

Dario Spanò, University of Warwick. 1st Berlin-Padova Young Researchers Workshop, 23-25 October 2014.

Based on joint works with Bob Griffiths (Oxford), Paul Jenkins (Warwick), Matteo Ruggiero (Torino) and Omiros Papaspiliopoulos (Barcelona).

Outline. Chinese Restaurant Process and Bayes; Computable filters; Fleming-Viot; Lancaster joins the restaurant.

Dirichlet measures and the Chinese restaurant process. Infinitely many delegates take part in an important young researchers' probability workshop. Day 1: dinner at the Chinese restaurant. Delegates enter the room one by one. If $k$ tables are occupied by $n_1, \dots, n_k$ persons ($\sum_{i=1}^{k} n_i = n$), then the $(n+1)$-th delegate joins the table with $n_j$ people with probability $n_j/(n+\theta)$ ($j = 1, \dots, k$), or chooses a new table with probability $\theta/(n+\theta)$. Each new table is labelled with a color chosen at random from a space $E$ of colors, using a probability distribution $P_0$.

Let $X_n$ = "color of the table occupied by the $n$-th delegate", $n \in \mathbb{N}$. Denote $X^{(n)} = (X_1, \dots, X_n)$ and
$$e_n(X^{(n)}) := \frac{1}{n} \sum_{i=1}^{n} \delta_{X_i}, \qquad n \in \mathbb{N}.$$
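The seating rule above can be simulated directly. The following is a minimal sketch, not code from the talk: the function names `simulate_crp` and `empirical_measure`, and the choice of a uniform $P_0$ on $[0,1]$ in the usage lines, are assumptions made for illustration.

```python
import random
from collections import Counter

def simulate_crp(n, theta, draw_color, rng):
    """Seat n delegates; return the colors X_1..X_n and the table sizes."""
    table_sizes = []   # n_1, ..., n_k
    table_colors = []  # color attached to each table, drawn from P_0
    colors = []        # X_1, ..., X_n
    for seated in range(n):
        # Open a new table with probability theta / (seated + theta).
        if rng.random() < theta / (seated + theta):
            table_sizes.append(1)
            table_colors.append(draw_color(rng))
            j = len(table_sizes) - 1
        else:
            # Otherwise join table j with probability proportional to n_j,
            # which gives n_j / (seated + theta) overall.
            j = rng.choices(range(len(table_sizes)), weights=table_sizes)[0]
            table_sizes[j] += 1
        colors.append(table_colors[j])
    return colors, table_sizes

def empirical_measure(colors):
    """e_n as a dict: color -> fraction of delegates seeing that color."""
    n = len(colors)
    return {c: m / n for c, m in Counter(colors).items()}

# Usage (assumption: P_0 is uniform on [0, 1], theta = 2).
rng = random.Random(0)
colors, sizes = simulate_crp(1000, 2.0, lambda r: r.random(), rng)
e_n = empirical_measure(colors)
```

The first delegate always opens a table, because $\theta/(0+\theta) = 1$.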

Bayes at the Chinese restaurant. The sequence $(X_1, X_2, \dots)$ is infinitely exchangeable.

Prior: $e_n(X^{(n)}) \to F$ almost surely as $n \to \infty$, where $F \sim \pi_{\theta, P_0}$ (Ferguson-Dirichlet). Under $\pi_{\theta, P_0}$,
$$(F(A_1), \dots, F(A_d)) \sim \mathrm{Dir}(\theta P_0(A_1), \dots, \theta P_0(A_d))$$
for every $d \in \mathbb{N}$ and every partition $(A_1, \dots, A_d)$ of $E$.

Likelihood: $\mathcal{L}(X^{(n)} \mid F = \mu) = \mu^{\otimes n}$.

Posterior: $\mathcal{L}(F \mid x^{(n)}) = \pi_{\theta + n,\; \frac{\theta}{\theta+n} P_0 + \frac{n}{\theta+n} e_n(x^{(n)})}$.

How crowded is your table? Under $\pi_{\theta, P_0}$, the pdf of $(F(A_1), \dots, F(A_d))$ is
$$\propto \Big( \prod_{j=1}^{d} x_j^{\theta P_0(A_j) - 1} \Big)\, \mathbb{I}\big( (x_1, \dots, x_d) \in [0,1]^d : |x| = 1 \big).$$
If $E = \{0, 1\}$, then $P_0 = p_0 \in [0,1]$, so $\pi_{\theta, p_0} = \mathrm{beta}(\theta p_0, \theta(1 - p_0))$. If $E$ is any Polish space, then $F(A) \sim \mathrm{beta}(\theta P_0(A), \theta(1 - P_0(A)))$.
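For $E = \{0,1\}$ the posterior $\pi_{\theta+n,\, \frac{\theta}{\theta+n} P_0 + \frac{n}{\theta+n} e_n(x^{(n)})}$ reduces to the familiar beta update. A small sketch of that arithmetic, with hypothetical helper names `posterior_parameters` and `beta_shape`:

```python
def posterior_parameters(theta, p0, data):
    """Return (theta + n, updated p) for 0/1 data under the prior pi_{theta, p0}."""
    n = len(data)
    ones = sum(data)  # n * e_n({1})
    theta_post = theta + n
    # theta/(theta+n) * p0 + n/(theta+n) * (ones/n)
    p_post = (theta * p0 + ones) / (theta + n)
    return theta_post, p_post

def beta_shape(theta, p):
    """Shape parameters of beta(theta * p, theta * (1 - p))."""
    return theta * p, theta * (1.0 - p)

# Usage: prior beta(1, 1); observing three 1s and one 0 gives beta(4, 2).
theta_post, p_post = posterior_parameters(2.0, 0.5, [1, 1, 0, 1])
a_post, b_post = beta_shape(theta_post, p_post)
```

The resulting shapes are $\theta p_0$ plus the number of ones and $\theta(1-p_0)$ plus the number of zeros, which is the standard beta-binomial conjugate update.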

Diffusion model. The time-evolution of a genetic variant, or allele, is well approximated by a diffusion process on the interval $[0,1]$.

[Figure: allele frequency (between 0 and 1) against time.]

Wright-Fisher SDE:
$$dF_t = b_\theta(F_t)\,dt + \sqrt{F_t(1 - F_t)}\,dW_t, \qquad F_0 = \mu, \quad t \ge 0.$$
The infinitesimal drift, $b_\theta(x)$, encapsulates directional forces such as natural selection, migration, mutation, ...

Filtering with genetic time series data. We do not observe the path of the frequency diffusion $F = (F_t : t \ge 0)$, but only samples taken at distinct time points $t_1 < \dots < t_k$.

Key assumption on the likelihood:
$$X_1(t), \dots, X_{n(t)}(t) \mid F_t \overset{\text{iid}}{\sim} F_t, \qquad t \in \{t_1, \dots, t_k\}.$$
[Figure: allele frequency (between 0 and 1) against time.]

How can we infer sample path properties of the diffusion given the data?

Optimal filter. Assume the diffusion has stationary measure $\pi$ and transition function $P_t(\mu, d\nu)$. Let $f_\mu(x)$ be the likelihood of the data given the signal, both at stationarity. Two operators are involved.

Update operator (Bayes' rule):
$$\phi_x(\pi)(d\mu) = \frac{f_\mu(x)\,\pi(d\mu)}{\int f_\nu(x)\,\pi(d\nu)},$$
where the denominator is the marginal likelihood of $x$ under $\pi$.

Prediction operator (propagator):
$$\psi_t(\pi)(d\nu) = \int_{\mathcal{M}_1} \pi(d\mu)\,P_t(\mu, d\nu).$$

Definition. The optimal filter is the solution of the recursion
$$\pi_0 = \phi_{x_{t_0}}(\pi), \qquad \pi_n = \phi_{x_{t_n}}\big(\psi_{t_n - t_{n-1}}(\pi_{n-1})\big).$$
It is called a computable filter if iterating update and propagation $n$ times involves finite sums whose number of terms depends on $n$.
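To make the two operators concrete, here is a minimal sketch of the update/prediction recursion on a finite grid of allele frequencies. It is a grid approximation chosen for illustration, not the computable filter of the talk; the binomial likelihood (two types, $E = \{0,1\}$), the transition matrices and all function names are assumptions.

```python
from math import comb

def update(prior, likelihood):
    """Bayes' rule on a grid: weights proportional to f_mu(x) * pi(mu)."""
    unnorm = [f * p for f, p in zip(likelihood, prior)]
    z = sum(unnorm)  # marginal likelihood of the data
    return [u / z for u in unnorm]

def predict(weights, transition):
    """Propagation: new weight at nu is sum over mu of pi(mu) * P_t(mu, nu)."""
    m = len(weights)
    return [sum(weights[i] * transition[i][j] for i in range(m)) for j in range(m)]

def binomial_likelihood(grid, ones, n):
    """Likelihood of seeing `ones` type-1 alleles among n iid draws from F_t."""
    return [comb(n, ones) * mu**ones * (1 - mu) ** (n - ones) for mu in grid]

def run_filter(prior, grid, observations, transitions):
    """observations: list of (ones, n); transitions: one matrix per time gap."""
    pi = update(prior, binomial_likelihood(grid, *observations[0]))
    history = [pi]
    for (ones, n), P in zip(observations[1:], transitions):
        pi = update(predict(pi, P), binomial_likelihood(grid, ones, n))
        history.append(pi)
    return history

# Usage with a hypothetical 3-point grid and a "no movement" transition matrix.
grid = [0.1, 0.5, 0.9]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
history = run_filter([1 / 3] * 3, grid, [(3, 3), (0, 3)], [identity])
```

On a grid every step is a finite sum by construction; the point of a computable filter is that the same holds exactly, without discretising the state space.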

Filtering with genetic time series data, step by step. A priori, $F_{t_1} \sim \pi$. Update: compute the law of $F_{t_1}$ given the data at $t_1$. Predict: obtain the law of $F_{t_2}$ from the posterior update at $t_1$ via $P_{t_2 - t_1}(F_{x^{(n_1)}(t_1)}, \cdot)$. Update: compute the distribution of $F_{t_2}$ given the data at $t_2$. Carry on for $t_3, t_4, \dots$

Tractability of a filter.
$$dF_t = b_\theta(F_t)\,dt + \sqrt{F_t(1 - F_t)}\,dW_t, \qquad F_0 = \mu, \quad t \ge 0.$$
Ideally we would like to: know the stationary distribution $\pi$ (here the Beta($\alpha$, $\beta$) distribution); know how to compute the posterior $P(F_t \mid \text{data at } t)$ (here via the CRP); know how to compute $P_t(\mu, d\nu)$ (here a Lancaster probability). Generally all three aspects are intractable. Neutral Fleming-Viot models have them all, with drift
$$b_{\alpha,\beta}(x) = \tfrac{1}{2}[\alpha(1 - x) - \beta x], \qquad \alpha, \beta > 0.$$
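For intuition, paths of the neutral Wright-Fisher SDE with drift $b_{\alpha,\beta}$ can be approximated with an Euler-Maruyama scheme. This is a rough sketch under stated assumptions: the step size, the clipping to $[0,1]$ (an ad hoc fix, since Euler steps can leave the interval) and the function names are illustrative choices, not part of the talk.

```python
import math
import random

def drift(x, alpha, beta):
    """Neutral drift b_{alpha,beta}(x) = 1/2 * [alpha * (1 - x) - beta * x]."""
    return 0.5 * (alpha * (1.0 - x) - beta * x)

def simulate_wf(x0, alpha, beta, t_end, dt, rng):
    """Euler-Maruyama approximation of the path on [0, t_end]."""
    x = x0
    path = [x0]
    for _ in range(int(round(t_end / dt))):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x += drift(x, alpha, beta) * dt + math.sqrt(max(x * (1.0 - x), 0.0)) * dw
        x = min(1.0, max(0.0, x))  # clip back into [0, 1]
        path.append(x)
    return path

# Usage: one path started at 0.2 with alpha = 1, beta = 2.
path = simulate_wf(0.2, 1.0, 2.0, 1.0, 0.001, random.Random(0))
```

The drift vanishes at $x = \alpha/(\alpha+\beta)$, the mean of the Beta($\alpha$, $\beta$) stationary distribution.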

What are Lancaster probabilities? Definition. Let $(X, Y)$ be an exchangeable pair of random variables with (identical) marginal distribution $\pi$. The joint distribution of $(X, Y)$ is a Lancaster probability distribution if, for every $n$,
$$E[Y^n \mid X = x] = \rho_n x^n + \text{polynomial in } x \text{ of degree less than } n.$$
The coefficients $\{\rho_n\}$ are termed canonical correlation coefficients.

In the neutral Fleming-Viot model,
$$P_t \mu^n = e^{-\frac{1}{2} n(n + \theta - 1)t} \mu^n + \dots, \qquad \theta = \alpha + \beta.$$
Benefit in filtering: given $F_0 = \mu$, a sample of size $n$ from $\mu$ is sufficient to predict a sample of size $n$ at time $t$.
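The case $n = 1$ can be worked out by hand: with drift $b_{\alpha,\beta}$ the conditional mean solves $\frac{d}{dt} m_t = \frac{1}{2}(\alpha - \theta m_t)$, so $E[F_t \mid F_0 = \mu] = \rho_1 \mu + (1 - \rho_1)\,\alpha/\theta$ with $\rho_1 = e^{-\theta t/2}$, which matches the displayed expansion. A short sketch of the coefficients, with hypothetical function names:

```python
import math

def canonical_correlation(n, theta, t):
    """rho_n(t) = exp(-1/2 * n * (n + theta - 1) * t)."""
    return math.exp(-0.5 * n * (n + theta - 1.0) * t)

def conditional_mean(mu, alpha, beta, t):
    """E[F_t | F_0 = mu] for the neutral drift: the n = 1 case."""
    theta = alpha + beta
    rho1 = canonical_correlation(1, theta, t)
    return rho1 * mu + (1.0 - rho1) * alpha / theta

# Usage: the mean relaxes from mu = 0.3 towards alpha / theta = 0.25.
means = [conditional_mean(0.3, 1.0, 3.0, t) for t in (0.0, 0.5, 5.0)]
```

Higher-order coefficients decay faster in $t$, since the exponent grows quadratically in $n$.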

Genealogy and eigenvalues. The canonical correlation coefficients $e^{-\frac{1}{2} n(n + \theta - 1)t}$ are the eigenvalues of the semigroup $P_t$. A probabilistic interpretation is in terms of the model's genealogy (dual to the diffusion).

[Figure: allele frequency (between 0 and 1) against time.]

Neutral model, a closer look. Consider a finite population of size $N$ with discrete, non-overlapping generations. At each generation, the types of the individuals, $J_1, \dots, J_N$, are labelled with points in some Polish space $E$ (e.g. $E = \{0, 1\}$). At each time $k$, each individual picks her parent uniformly at random from the previous generation $k - 1$. With probability $1 - u$ an individual inherits her parent's type; with probability $u$ she mutates to a new type chosen from $E$ according to a probability distribution $P_0$ on $E$ (if $E = \{0, 1\}$, then $P_0\{1\} \in [0,1]$). Let
$$F_N(k) := \frac{1}{N} \sum_{i=1}^{N} \delta_{J_i(k)}, \qquad k = 0, 1, \dots$$
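One generation of this finite-population model takes only a few lines to simulate. The sketch below assumes types are stored in a plain list and that `draw_type` samples from $P_0$; both names, and the parameter values in the usage lines, are hypothetical.

```python
import random

def next_generation(types, u, draw_type, rng):
    """Each individual picks a uniform parent, then mutates with probability u."""
    new_types = []
    for _ in range(len(types)):
        parent_type = rng.choice(types)  # parent chosen uniformly at random
        if rng.random() < u:
            new_types.append(draw_type(rng))  # new type drawn from P_0
        else:
            new_types.append(parent_type)     # inherit the parent's type
    return new_types

def frequency_of(types, a):
    """F_N(k)({a}): fraction of individuals of type a."""
    return sum(1 for j in types if j == a) / len(types)

# Usage: E = {0, 1}, N = 100, P_0{1} = 0.5, u = 0.01, run for 50 generations.
rng = random.Random(0)
population = [0] * 50 + [1] * 50
for _ in range(50):
    population = next_generation(
        population, 0.01, lambda r: 1 if r.random() < 0.5 else 0, rng
    )
freq_one = frequency_of(population, 1)
```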
