
The statistical evaluation of DNA crime stains in R

Miriam Marušiaková
Department of Statistics, Charles University, Prague
Institute of Computer Science, CBI, Academy of Sciences of CR

UseR! 2008, Dortmund, August 12-14

Outline

◮ Introduction
◮ Single Crime Scene
    Independence
    Population substructure
    Relatedness
    R package forensic
◮ Mixed stain
    Independence
    Population substructure
    People versus Simpson
    More general situation
◮ Conclusion


Introduction: Single crime scene stain

◮ Blood stain found at the crime scene
◮ It is believed to have been left by the offender
◮ Suspect arrested for reasons unconnected with his DNA profile
◮ Two samples: crime sample, suspect sample

Hypotheses
◮ H_p (prosecution): The suspect left the crime stain.
◮ H_d (defense): Some other person left the crime stain.

Notation
◮ DNA typing results: E = {G_C, G_S} (genotypes of the crime sample and of the suspect)
◮ Non-DNA evidence: I

Introduction: Errors and fallacies

Statement
The probability of observing this type if the blood came from someone other than the suspect is 1 in 100.

$$\Pr(G_C \mid H_d, I) = 1/100$$

Common error
The probability that the blood came from someone else is 1 in 100, so there is a 99% probability that it came from the suspect.

$$\Pr(H_d \mid G_C, I) = 1/100$$

Introduction: Errors and fallacies (cont'd)

Statement
The evidence is 100 times more probable if the suspect left the crime stain than if some unknown person left it.

$$\frac{1}{\Pr(G_C \mid H_d, I)} = 100$$

Common error
It is 100 times more probable that the suspect left the crime stain than that some unknown person did.

$$\frac{\Pr(H_p \mid G_C, I)}{\Pr(H_d \mid G_C, I)} = 100$$
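The gap between the correct statement and the common error can be made concrete with Bayes' rule in odds form: posterior odds = likelihood ratio × prior odds. The sketch below uses base R only; the prior odds are hypothetical illustration values and do not come from the talk.

```r
# Posterior odds = likelihood ratio * prior odds (Bayes' rule in odds form).
# ASSUMPTION: the prior odds below are made-up values for illustration only.
lr <- 100
prior_odds <- c(even = 1, one_in_1000 = 1 / 1000, one_in_1e6 = 1e-6)

posterior_odds <- lr * prior_odds
# Convert odds to the probability Pr(H_p | G_C, I).
posterior_prob <- posterior_odds / (1 + posterior_odds)

print(round(posterior_prob, 6))
```

Only under even prior odds does a likelihood ratio of 100 correspond to posterior odds of 100; with smaller prior odds the posterior probability that the suspect left the stain is far below 99%.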


Single Crime Scene, Independence: Product rule

Model of an ideal population
◮ Infinite size
◮ Random mating
◮ The model is reliable for most real-world problems

Hardy-Weinberg equilibrium

$$\Pr(G = A_i A_i) = p_i^2$$
$$\Pr(G = A_i A_j) = 2 p_i p_j, \quad i \neq j$$

Likelihood ratio

$$LR = \frac{1}{\Pr(G_C \mid G_S, H_d, I)} = \frac{1}{\Pr(G_C \mid H_d, I)} = \begin{cases} 1 / p_i^2 & \text{homozygote } A_i A_i \\ 1 / (2 p_i p_j) & \text{heterozygote } A_i A_j \end{cases}$$
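The product-rule likelihood ratio is a one-line computation. The following is a minimal base R sketch; the function name `lr_hwe` and its arguments are hypothetical and are not part of the forensic package.

```r
# Single-locus likelihood ratio under Hardy-Weinberg equilibrium.
# ASSUMPTION: lr_hwe is an illustrative helper, not a function from the talk.
lr_hwe <- function(p_i, p_j = p_i, homozygote = FALSE) {
  # Match probability: p_i^2 for A_i A_i, 2 * p_i * p_j for A_i A_j.
  match_prob <- if (homozygote) p_i^2 else 2 * p_i * p_j
  1 / match_prob
}

lr_het <- lr_hwe(0.01, 0.01)               # heterozygote, p_i = p_j = 0.01
lr_hom <- lr_hwe(0.01, homozygote = TRUE)  # homozygote, p_i = 0.01
print(c(heterozygote = lr_het, homozygote = lr_hom))
```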


Single Crime Scene, Population substructure: Population substructure

◮ F: measure of uncertainty about allele proportions in the population of suspects
◮ Genetic interpretation of F (Wright, 1951)
◮ How to estimate F (Weir and Cockerham, 1984)
◮ Recommendations (National Research Council, 1996)
    F = 0.01 for large subpopulations (USA)
    F = 0.03 for small isolated subpopulations
◮ Match probabilities (Balding and Nichols, 1994)

$$P(G_C = A_i A_i \mid G_S = A_i A_i) = \frac{[2F + (1 - F) p_i]\,[3F + (1 - F) p_i]}{(1 + F)(1 + 2F)}$$

$$P(G_C = A_i A_j \mid G_S = A_i A_j) = \frac{2\,[F + (1 - F) p_i]\,[F + (1 - F) p_j]}{(1 + F)(1 + 2F)}$$

Single Crime Scene, Population substructure: Effects of F corrections

Likelihood ratio, some numerical values

◮ Heterozygotes A_i A_j, p_i = p_j = p

                F = 0   F = 0.001   F = 0.01   F = 0.03
    p = 0.01     5000        4152       1295        346
    p = 0.05      200         193        145         89
    p = 0.10       50          49         43         34

◮ Homozygotes A_i A_i, p_i = p

                F = 0   F = 0.001   F = 0.01   F = 0.03
    p = 0.01    10000        6439        863        157
    p = 0.05      400         364        186         73
    p = 0.10      100          96         67         37
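The Balding and Nichols match probabilities can be evaluated directly to see how quickly the likelihood ratio falls as F grows. This is a base R sketch under stated assumptions: the function name `match_prob_bn` and the argument `f_coef` (standing for F) are hypothetical, and the code does not call the forensic package.

```r
# Balding-Nichols match probability for one locus.
# ASSUMPTION: match_prob_bn and f_coef are illustrative names; f_coef is F.
# Leave p_j as NULL for a homozygote A_i A_i.
match_prob_bn <- function(p_i, p_j = NULL, f_coef = 0) {
  denom <- (1 + f_coef) * (1 + 2 * f_coef)
  if (is.null(p_j)) {
    (2 * f_coef + (1 - f_coef) * p_i) * (3 * f_coef + (1 - f_coef) * p_i) / denom
  } else {
    2 * (f_coef + (1 - f_coef) * p_i) * (f_coef + (1 - f_coef) * p_j) / denom
  }
}

f_values <- c(0, 0.001, 0.01, 0.03)
p_values <- c(0.01, 0.05, 0.10)

# Rows index p, columns index F; entries are likelihood ratios 1 / match probability.
lr_het <- sapply(f_values, function(f) round(1 / match_prob_bn(p_values, p_values, f_coef = f)))
lr_hom <- sapply(f_values, function(f) round(1 / match_prob_bn(p_values, f_coef = f)))
labels <- list(paste("p =", p_values), paste("F =", f_values))
dimnames(lr_het) <- labels
dimnames(lr_hom) <- labels

print(lr_het)
print(lr_hom)
```

Evaluated this way, most cells agree with the tabulated values, a few differ by one unit in the last digit, and the heterozygote cell for p = 0.01, F = 0.01 comes out at about 1301 where the table lists 1295.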


Single Crime Scene, Relatedness: Inbreeding

◮ Individuals with common ancestors are related
◮ Their children are inbred
◮ Alleles are ibd (identical by descent) if they are copies of the same allele

Example
Parent H transmits alleles h_1 and h_2 to X and Y respectively, and X and Y transmit alleles a and b to their child I.

Pedigree: H → X (allele h_1), H → Y (allele h_2); X → I (allele a), Y → I (allele b).

$$\Pr(h_1 \text{ is ibd to } h_2) = \Pr(h_1 \equiv h_2) = 0.5$$

$$\Pr(a \equiv b) = \Pr(a \equiv h_1, b \equiv h_2 \mid h_1 \equiv h_2)\,\Pr(h_1 \equiv h_2) = 0.5^3 = 0.125$$

The factor 0.5^3 combines three events of probability 0.5 each: a is the allele X received from H, b is the allele Y received from H, and h_1 and h_2 are copies of the same allele of H.

Single Crime Scene, Relatedness: Match probabilities for close relatives

Balding and Nichols (1994)

$$\Pr(G_C \mid G_S, H_d, I) = \begin{cases} \left( k_0 p_i^4 + k_1 p_i^3 + k_2 p_i^2 \right) / p_i^2 & \text{homozygote } A_i A_i \\ \left( 4 k_0 p_i^2 p_j^2 + k_1 p_i p_j (p_i + p_j) + 2 k_2 p_i p_j \right) / (2 p_i p_j) & \text{heterozygote } A_i A_j \end{cases}$$

Kinship coefficients

    Relationship                k_0    k_1    k_2
    Parent - child                0      1      0
    Siblings                    1/4    1/2    1/4
    Grandparent - grandchild    1/2    1/2      0
    Uncle - nephew              1/2    1/2      0
    Cousins                     3/4    1/4      0
    Unrelated                     1      0      0

k_i is the probability that two persons share i alleles ibd, i = 0, 1, 2.
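The effect of relatedness on the match probability can be tabulated with a small helper. This is a base R sketch; `match_prob_relative` and the list `kinship` are hypothetical names, and the allele proportions 0.057 and 0.160 are taken from the Pmatch example in this talk purely for illustration.

```r
# Single-locus match probability when the alternative donor is a relative.
# ASSUMPTION: match_prob_relative is an illustrative helper, not package code.
# k = c(k0, k1, k2); leave p_j as NULL for a homozygote A_i A_i.
match_prob_relative <- function(p_i, p_j = NULL, k = c(1, 0, 0)) {
  if (is.null(p_j)) {
    (k[1] * p_i^4 + k[2] * p_i^3 + k[3] * p_i^2) / p_i^2
  } else {
    (4 * k[1] * p_i^2 * p_j^2 + k[2] * p_i * p_j * (p_i + p_j) +
       2 * k[3] * p_i * p_j) / (2 * p_i * p_j)
  }
}

# Kinship coefficients from the table above.
kinship <- list(
  unrelated    = c(1, 0, 0),
  cousins      = c(3 / 4, 1 / 4, 0),
  parent_child = c(0, 1, 0),
  siblings     = c(1 / 4, 1 / 2, 1 / 4)
)

# Heterozygote with p_i = 0.057, p_j = 0.160 under each relationship.
het_match <- sapply(kinship, function(k) match_prob_relative(0.057, 0.160, k = k))
print(het_match)
```

With k = c(1, 0, 0) the formula reduces to the product rule value 2 p_i p_j, and the match probability rises steadily as the relationship gets closer.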


Single Crime Scene, R package forensic: R function Pmatch

Usage

    Pmatch(prob, k = c(1, 0, 0), theta = 0)

Example

    > p <- c(0.057, 0.160, 0.182, 0, 0.024, 0.122)
    > Pmatch(p)
    $prob
             locus 1 locus 2 locus 3
    allele 1   0.057   0.182   0.024
    allele 2   0.160   0.000   0.122

    $match
         locus 1  locus 2  locus 3
    [1,] 0.01824 0.033124 0.005856

    $total_match
    [1] 3.538088e-06
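The numbers in this output can be checked by hand with the product rule. The sketch below uses base R only and does not call Pmatch; it assumes, as the printed output suggests, that a second allele proportion of 0 marks a homozygous locus. The variable names are hypothetical.

```r
# Re-derive the example output with the product rule (defaults k = c(1, 0, 0), theta = 0).
# ASSUMPTION: a second allele proportion of 0 denotes a homozygous locus,
# inferred from locus 2 in the printed output (0.182^2 = 0.033124).
p <- c(0.057, 0.160, 0.182, 0, 0.024, 0.122)
prob <- matrix(p, nrow = 2,
               dimnames = list(c("allele 1", "allele 2"), paste("locus", 1:3)))

# Per-locus match probability: p^2 for homozygotes, 2 * p_i * p_j for heterozygotes.
locus_match <- apply(prob, 2, function(a) if (a[2] == 0) a[1]^2 else 2 * a[1] * a[2])

# Loci are treated as independent, so the profile probability is the product.
total_match <- prod(locus_match)

print(locus_match)
print(total_match)
```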
