A Bayesian nonparametric method for the LR assessment in case of rare type match

Random partitions of $[n]$
Let $[n]$ denote the set $[n] = \{1, 2, \dots, n\}$. A partition of the set $[n]$ will be denoted as $\pi_{[n]}$. Random partitions on the set $[n]$ will be denoted as $\Pi_{[n]}$.
Giulia Cereda () Short title October 8, 2015 9 / 33

DNA database can be reduced
Database of size 10:
Person 1    (4-10-6-7)
Person 2    (3-5-6-8)
Person 3    (3-7-8-10)
Person 4    (10-1-4-5)
Person 5    (3-7-8-10)
Person 6    (3-7-8-10)
Person 7    (1-5-7-2)
Person 8    (3-7-8-10)
Person 9    (3-5-6-8)
Person 10   (3-7-8-10)
Assumption 2 → the data can be replaced by the equivalence classes, on the indices, of the relation "to have the same DNA type". This is a partition of the set $[n]$: $\{\{1\}, \{2, 9\}, \{3, 5, 6, 8, 10\}, \{4\}, \{7\}\}$
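The reduction from DNA profiles to a partition of the indices can be sketched in a few lines of Python. This is only an illustration: the function name `partition_from_types` and the tuple encoding of the profiles are assumptions, not part of the talk.

```python
from collections import defaultdict


def partition_from_types(profiles):
    """Group 1-based person indices by identical DNA type.

    Blocks are returned ordered by their smallest index.
    (Hypothetical helper, written for illustration only.)
    """
    blocks = defaultdict(list)
    for index, profile in enumerate(profiles, start=1):
        blocks[profile].append(index)
    return sorted(blocks.values(), key=lambda block: block[0])


# The database of size 10 from the slide, one tuple per person.
database = [
    (4, 10, 6, 7), (3, 5, 6, 8), (3, 7, 8, 10), (10, 1, 4, 5), (3, 7, 8, 10),
    (3, 7, 8, 10), (1, 5, 7, 2), (3, 7, 8, 10), (3, 5, 6, 8), (3, 7, 8, 10),
]

partition = partition_from_types(database)
print(partition)  # [[1], [2, 9], [3, 5, 6, 8, 10], [4], [7]]
```

Only the grouping of the indices survives; the DNA types themselves are discarded, which is exactly what Assumption 2 allows.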

Reduced data
The data $D$ consist of the database plus 2 new observations: $D = \pi_{[n+2]}$, a partition of the set $\{1, 2, \dots, n+2\}$.
Example:
Database → $\pi_{[10]} = \{\{1\}, \{2, 9\}, \{3, 5, 6, 8, 10\}, \{4\}, \{7\}\}$
$D$ → $\pi_{[12]} = \{\{1\}, \{2, 9\}, \{3, 5, 6, 8, 10\}, \{4\}, \{7\}, \{11, 12\}\}$
We can also see the data as a random variable. In that case, $D = \Pi_{[n+2]}$.

The distribution of $D = \Pi_{[n+2]}$ depends on $p$. However, it does not depend on the order of the $p_i$.
↓
We can therefore consider directly the ordered vector $p \in \nabla_\infty = \{(p_1, p_2, \dots) : p_1 \geq p_2 \geq \dots > 0, \ \sum_i p_i = 1\}$.
For instance, $p_3$ is the frequency of the third most frequent DNA type in Nature.

Prior distribution on $p \in \nabla_\infty$
Bayesian nonparametrics: we need a prior for the parameter $p$.
Two-parameter Poisson-Dirichlet distribution.
Parameters: $0 < \alpha < 1$, $\theta > -\alpha$
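To get a feeling for this prior, one can draw an approximate sample from it. The sketch below uses the standard stick-breaking (GEM) construction, in which $V_i \sim \mathrm{Beta}(1-\alpha, \theta + i\alpha)$ and the resulting weights are ranked in decreasing order. The construction is not mentioned in the talk; the function name `sample_pd_truncated` and the truncation at `n_sticks` weights are assumptions made for illustration.

```python
import random


def sample_pd_truncated(alpha, theta, n_sticks=1000, rng=None):
    """Approximate draw from PD(alpha, theta) by truncated stick-breaking.

    Returns the ranked weights and the mass left unassigned by the truncation.
    (Illustrative sketch; the truncation level is an arbitrary choice.)
    """
    if not (0.0 < alpha < 1.0) or theta <= -alpha:
        raise ValueError("need 0 < alpha < 1 and theta > -alpha")
    rng = rng or random.Random()
    remaining = 1.0
    weights = []
    for i in range(1, n_sticks + 1):
        # Break off a Beta(1 - alpha, theta + i * alpha) fraction of what is left.
        v = rng.betavariate(1.0 - alpha, theta + i * alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Ranking the weights gives a point of the ordered simplex (up to truncation).
    return sorted(weights, reverse=True), remaining


p, leftover = sample_pd_truncated(alpha=0.5, theta=10.0, n_sticks=500,
                                  rng=random.Random(0))
print(p[:3], leftover)
```

With $\alpha$ close to 1 the weights decay slowly, so many more sticks are needed before the leftover mass becomes negligible.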

The model (first part)
$(A, \Theta) \sim f$
$P \mid \alpha, \theta \sim \mathrm{PD}(\alpha, \theta)$, with $P \in \nabla_\infty$
$X_1, \dots, X_{n+1} \mid p \sim \text{i.i.d. } p$, with $X_i \in \mathbb{N}$; $X_{n+1}$ is the suspect
$H \in \{H_p, H_d\}$
$$X_{n+2} \mid p, H, x_{n+1} \sim \begin{cases} \delta_{x_{n+1}} & \text{if } H = H_p \\ p & \text{if } H = H_d \end{cases}$$
$X_{n+2}$ is the crime stain.
$X_i = j$ → the $i$-th observation has the $j$-th most common type in Nature.

The model (first part)
[Figure: graphical model with nodes $(A, \Theta)$, $P$, $H$, $X_1, X_2, \dots, X_n$, $X_{n+1}$, $X_{n+2}$]
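The sampling part of the model, conditional on $p$, can be simulated directly. The sketch below is a toy version: it uses a short finite vector `p` in place of the infinite ordered vector of the model, and the names `simulate_types`, `"Hp"` and `"Hd"` are hypothetical labels chosen for this example.

```python
import random


def simulate_types(p, n, hypothesis, rng):
    """Draw X_1, ..., X_{n+2} given a (finite, toy) ranked frequency vector p.

    X_1..X_n play the role of the database, X_{n+1} the suspect and
    X_{n+2} the crime stain. Values are ranks: 1 = most common type.
    """
    ranks = range(1, len(p) + 1)
    # Database and suspect: i.i.d. draws from p.
    xs = rng.choices(ranks, weights=p, k=n + 1)
    if hypothesis == "Hp":
        # Under H_p the crime stain has the suspect's type (point mass at x_{n+1}).
        xs.append(xs[-1])
    elif hypothesis == "Hd":
        # Under H_d the crime stain is one more independent draw from p.
        xs.append(rng.choices(ranks, weights=p, k=1)[0])
    else:
        raise ValueError("hypothesis must be 'Hp' or 'Hd'")
    return xs


toy_p = [0.5, 0.3, 0.2]  # assumed toy frequencies, already in decreasing order
xs_hp = simulate_types(toy_p, n=10, hypothesis="Hp", rng=random.Random(1))
xs_hd = simulate_types(toy_p, n=10, hypothesis="Hd", rng=random.Random(1))
print(xs_hp)
print(xs_hd)
```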

Random partitions
Some notation: given random variables $X_1, \dots, X_n \in \mathbb{N}$, $\Pi_{[n]}(X_1, X_2, \dots, X_n)$ is the random partition defined by the equivalence classes of $i \sim j$ iff $X_i = X_j$.
Db: $X_1, \dots, X_n \longrightarrow \Pi_{[n]} = \pi^{Db}_{[n]}$
Db+: $X_1, \dots, X_n, X_{n+1} \longrightarrow \Pi_{[n+1]} = \pi^{Db+}_{[n+1]}$
Db++: $X_1, \dots, X_n, X_{n+1}, X_{n+2} \longrightarrow \Pi_{[n+2]} = \pi^{Db++}_{[n+2]}$
$X_1, \dots, X_n$ are not observed, but they generate the same partition as the original database. The data can be defined as $D = \Pi_{[n+2]}$.
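The point that the $X_i$ need not be observed can be checked numerically: any relabelling of the types induces the same partition. The following sketch uses made-up label vectors (`x` and `x_relabelled` are hypothetical values chosen so that the first ten labels reproduce the example database partition).

```python
def partition_from_labels(labels):
    """Equivalence classes of i ~ j iff labels[i] == labels[j], 1-based indices."""
    blocks = {}
    for index, label in enumerate(labels, start=1):
        blocks.setdefault(label, []).append(index)
    return sorted(blocks.values(), key=lambda block: block[0])


# Hypothetical ranks X_1..X_12: ten database entries, the suspect, the crime stain.
x = [5, 2, 1, 7, 1, 1, 9, 1, 2, 1, 4, 4]
# A different labelling of the same types (the true ranks are unknown in practice).
x_relabelled = [label + 100 for label in x]

pi_db = partition_from_labels(x[:10])        # Db
pi_db_plus = partition_from_labels(x[:11])   # Db+
pi_db_plus_plus = partition_from_labels(x)   # Db++

print(pi_db_plus_plus)
print(partition_from_labels(x_relabelled) == pi_db_plus_plus)
```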

The complete model
[Figure: graphical model with nodes $(A, \Theta)$, $P$, $H$, $X_1, X_2, \dots, X_n$, $X_{n+1}$, $X_{n+2}$, $D$]

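Pitman sampling formula
If $P \sim \mathrm{PD}(\alpha, \theta)$ and $X_1, X_2, \dots, X_n \mid P = p \sim \text{i.i.d. } p$, then $\Pi_{[n]} = \Pi_{[n]}(X_1, \dots, X_n)$ has the following distribution:
$$P^{n}_{\alpha,\theta}(\pi_{[n]}) = \Pr(\Pi_{[n]} = \pi_{[n]} \mid \alpha, \theta) = \frac{[\theta + \alpha]_{k-1;\alpha}}{[\theta + 1]_{n-1;1}} \prod_{i=1}^{k} [1 - \alpha]_{n_i - 1;1},$$
where $k$ is the number of blocks of $\pi_{[n]}$, $n_i$ is the size of the $i$-th block, and $[x]_{m;a} = x(x+a)\cdots(x+(m-1)a)$, with $[x]_{0;a} = 1$.
In our model
$$\Pr(D \mid \alpha, \theta, h) = \Pr(\Pi_{[n+2]} = \pi^{Db++}_{[n+2]} \mid \alpha, \theta, h) = \begin{cases} P^{n+2}_{\alpha,\theta}(\pi^{Db++}_{[n+2]}) & \text{if } h = H_d \\ P^{n+1}_{\alpha,\theta}(\pi^{Db+}_{[n+1]}) & \text{if } h = H_p \end{cases}$$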
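The sampling formula depends on the partition only through its block sizes, so it is easy to evaluate. The sketch below works on the log scale to avoid underflow; the function names `log_rising` and `log_pitman` are assumptions introduced for this example.

```python
import math


def log_rising(x, m, a):
    """log of [x]_{m;a} = x (x + a) ... (x + (m - 1) a), with [x]_{0;a} = 1."""
    return sum(math.log(x + j * a) for j in range(m))


def log_pitman(block_sizes, alpha, theta):
    """Log-probability of a partition with the given block sizes under PD(alpha, theta)."""
    if not (0.0 < alpha < 1.0) or theta <= -alpha:
        raise ValueError("need 0 < alpha < 1 and theta > -alpha")
    n = sum(block_sizes)
    k = len(block_sizes)
    value = log_rising(theta + alpha, k - 1, alpha)
    value -= log_rising(theta + 1.0, n - 1, 1.0)
    value += sum(log_rising(1.0 - alpha, size - 1, 1.0) for size in block_sizes)
    return value


alpha, theta = 0.5, 1.0
# Block sizes of the example partitions: Db+ (suspect alone) and Db++ ({11, 12}).
sizes_db_plus = [1, 2, 5, 1, 1, 1]
sizes_db_plus_plus = [1, 2, 5, 1, 1, 2]
ratio = math.exp(log_pitman(sizes_db_plus_plus, alpha, theta)
                 - log_pitman(sizes_db_plus, alpha, theta))
print(ratio)  # equals (1 - alpha) / (n + 1 + theta) with n = 10
```

For the example partitions the ratio of the two probabilities collapses to $(1-\alpha)/(n+1+\theta)$, because the two partitions differ only in the block that contains the suspect.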

The model, simplified
[Figure: graphical model with nodes $(A, \Theta)$, $H$, $D$]
$D = \Pi_{[n+2]}$.

Lemma
[Figure: graphical model with nodes $A$, $H$, $X$, $Y$]
Lemma. Given four random variables $A$, $H$, $X$ and $Y$, related as in the graph above, the likelihood function for $h$, given $X = x$ and $Y = y$, satisfies
$$\mathrm{lik}(h \mid x, y) \propto E\big(p(y \mid x, A, h) \mid X = x\big).$$
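A possible derivation of the lemma is sketched below. The graph itself did not survive in the text, so the independence structure used here is an assumption: $A$ and $H$ are taken to be independent, $X$ depends only on $A$, and $Y$ depends on $X$, $A$ and $H$.

```latex
\begin{align*}
\mathrm{lik}(h \mid x, y)
  &= p(x, y \mid h) \\
  &= \int p(y \mid x, a, h)\, p(x \mid a)\, p(a)\, \mathrm{d}a
     && \text{(assumed: } A \perp H,\ X \text{ depends only on } A\text{)} \\
  &= p(x) \int p(y \mid x, a, h)\, p(a \mid x)\, \mathrm{d}a
     && \text{(Bayes: } p(x \mid a)\, p(a) = p(a \mid x)\, p(x)\text{)} \\
  &= p(x)\, E\big(p(y \mid x, A, h) \mid X = x\big) \\
  &\propto E\big(p(y \mid x, A, h) \mid X = x\big),
\end{align*}
```

where the last step uses that $p(x)$ does not depend on $h$ under the stated assumptions.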

Lemma
[Figure: graphical model with nodes $(A, \Theta)$, $H$, $\Pi_{[n+1]}$, $\Pi_{[n+2]}$]
$$\mathrm{lik}(h \mid \pi_{[n+1]}, \pi_{[n+2]}) \propto E\big(p(\pi_{[n+2]} \mid \pi_{[n+1]}, A, \Theta, h) \mid \Pi_{[n+1]} = \pi_{[n+1]}\big).$$

$$LR = \frac{1}{E\left(\dfrac{1 - A}{n + 1 + \Theta} \,\middle|\, \Pi_{[n+1]} = \pi_{[n+1]}\right)}$$
By defining the random variable $\Phi = n \dfrac{1 - A}{n + 1 + \Theta}$ we can write the LR as
$$LR = \frac{n}{E(\Phi \mid \Pi_{[n+1]} = \pi_{[n+1]})}.$$
We are interested in the distribution of $\Phi, \Theta \mid \Pi_{[n+1]}$:
$$p(\phi, \theta \mid \pi_{[n+1]}) \propto p(\pi_{[n+1]} \mid \phi, \theta) \, f(\phi, \theta)$$
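The factor $(1 - A)/(n + 1 + \Theta)$ can be traced back to the Pitman sampling formula. The following is a reconstruction rather than a quotation from the slides; it assumes that "rare type match" means the suspect's type does not occur in the database, so that index $n+1$ is a singleton block of $\pi^{Db+}_{[n+1]}$ and $\{n+1, n+2\}$ is a block of $\pi^{Db++}_{[n+2]}$.

```latex
\begin{align*}
p(\pi_{[n+2]} \mid \pi_{[n+1]}, \alpha, \theta, H_p) &= 1, \\
p(\pi_{[n+2]} \mid \pi_{[n+1]}, \alpha, \theta, H_d)
  &= \frac{P^{n+2}_{\alpha,\theta}(\pi^{Db++}_{[n+2]})}
          {P^{n+1}_{\alpha,\theta}(\pi^{Db+}_{[n+1]})}
   = \frac{[\theta+1]_{n;1}}{[\theta+1]_{n+1;1}}
     \cdot \frac{[1-\alpha]_{1;1}}{[1-\alpha]_{0;1}}
   = \frac{1-\alpha}{n+1+\theta}, \\
LR = \frac{\mathrm{lik}(H_p \mid \pi_{[n+1]}, \pi_{[n+2]})}
          {\mathrm{lik}(H_d \mid \pi_{[n+1]}, \pi_{[n+2]})}
  &= \frac{1}{E\left(\dfrac{1-A}{n+1+\Theta} \,\middle|\, \Pi_{[n+1]} = \pi_{[n+1]}\right)}.
\end{align*}
```

The number of blocks $k$ is the same in both partitions, so the factor $[\theta+\alpha]_{k-1;\alpha}$ cancels; the last line applies the lemma to numerator and denominator.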

Log likelihood with $\phi$ and $\theta$
[Figure: contour plot of $\log_{10} p(\pi_{[n+1]} \mid \phi, \theta)$, with $\theta$ on one axis (ticks from 150 to 250) and $\phi$ on the other (ticks from 0.40 to 0.55); the point $(\phi_{MLE}, \theta_{MLE})$ is marked. A second build overlays the contours of $\log L(\phi, \theta \mid \pi_{[n+1]})$ and of $\log N\big((\phi_{MLE}, \theta_{MLE}), I_{obs}^{-1}(\phi_{MLE}, \theta_{MLE})\big)$, together with the 95% and 99% confidence intervals.]
Dutch Y-STR database, 7 loci, N = 18,925

Log likelihood as a function of $\phi$ and $\theta$
[Figure: contours of $\log L(\phi, \theta \mid \pi_{[n+1]})$ and of $\log N\big((\phi_{MLE}, \theta_{MLE}), I_{obs}^{-1}(\phi_{MLE}, \theta_{MLE})\big)$, with the 95% and 99% confidence intervals, plotted against $\theta$ and $\phi$]
$$p(\pi_{[n+1]} \mid \phi, \theta) \approx N\big((\phi_{MLE}, \theta_{MLE}), I^{-1}_{MLE}\big)$$
$$p(\phi, \theta \mid \pi_{[n+1]}) \propto p(\pi_{[n+1]} \mid \phi, \theta) \, f(\phi, \theta)$$
If the prior is smooth around the MLE, then
$$p(\phi, \theta \mid \pi_{[n+1]}) \approx N\big((\phi_{MLE}, \theta_{MLE}), I^{-1}_{MLE}\big).$$
It follows that $E(\Phi \mid \Pi_{[n+1]} = \pi_{[n+1]}) \approx \phi_{MLE}$.

Since $LR = n / E(\Phi \mid \Pi_{[n+1]} = \pi_{[n+1]})$ and $E(\Phi \mid \Pi_{[n+1]} = \pi_{[n+1]}) \approx \phi_{MLE}$, the likelihood ratio is approximately $LR \approx n / \phi_{MLE}$.
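The approximation $E(\Phi \mid \Pi_{[n+1]} = \pi_{[n+1]}) \approx \phi_{MLE}$ suggests a plug-in recipe: maximise the Pitman likelihood of $\pi^{Db+}_{[n+1]}$ and evaluate $n / \phi_{MLE}$. The sketch below does this by a crude grid search over $(\alpha, \theta)$, which is equivalent to maximising over $(\phi, \theta)$ because the map between the two parametrisations is one-to-one. The grid, the function names and the toy block sizes are assumptions; the talk's own numbers come from the Dutch Y-STR database, not from this example.

```python
import math


def log_rising(x, m, a):
    """log of [x]_{m;a} = x (x + a) ... (x + (m - 1) a), with [x]_{0;a} = 1."""
    return sum(math.log(x + j * a) for j in range(m))


def log_pitman(block_sizes, alpha, theta):
    """Pitman sampling formula on the log scale (depends only on block sizes)."""
    n = sum(block_sizes)
    k = len(block_sizes)
    value = log_rising(theta + alpha, k - 1, alpha)
    value -= log_rising(theta + 1.0, n - 1, 1.0)
    value += sum(log_rising(1.0 - alpha, size - 1, 1.0) for size in block_sizes)
    return value


def plug_in_lr(sizes_db_plus, alphas, thetas):
    """Grid-search MLE of (alpha, theta) on Db+, then LR ~ n / phi_MLE."""
    n = sum(sizes_db_plus) - 1  # Db+ holds the n database entries plus the suspect
    best = None
    for alpha in alphas:
        for theta in thetas:
            if theta <= -alpha:  # outside the parameter space
                continue
            loglik = log_pitman(sizes_db_plus, alpha, theta)
            if best is None or loglik > best[0]:
                best = (loglik, alpha, theta)
    _, alpha_mle, theta_mle = best
    phi_mle = n * (1.0 - alpha_mle) / (n + 1.0 + theta_mle)
    return n / phi_mle, alpha_mle, theta_mle


# Toy input: block sizes of the example Db+ partition (suspect is a singleton).
alphas = [i / 100 for i in range(1, 100)]
thetas = [j / 10 for j in range(-9, 501)]
lr, alpha_mle, theta_mle = plug_in_lr([1, 2, 5, 1, 1, 1], alphas, thetas)
print(lr, alpha_mle, theta_mle)
```

On a database this small the maximiser may sit on the edge of the grid and the normal approximation is not trustworthy, so the output only illustrates the mechanics. A numerical optimiser would be a natural replacement for the grid on real data.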

A Bayesian nonparametric method for the LR assessment in case of rare type match. Giulia Cereda, October 8, 2015.
