Applications of R in Forensic Science: Pushing Out the Frontiers

slide-1
SLIDE 1

Applications of R in Forensic Science

Pushing Out the Frontiers

Nick D. K. Petraco and Many Others
John Jay College of Criminal Justice

slide-2
SLIDE 2

Outline

  • Admissibility of Scientific Evidence is a problem!
  • Frye and the Daubert Standards
  • How chemistry, engineering, math and computers can help forensic science
  • Current Projects At John Jay:
      • Petroleum Distillates (Fire Debris)
      • Dust (Trace Evidence)
      • Cartridge cases (Firearms)
slide-3
SLIDE 3

Admissibility of scientific evidence

  • Principal legal standards: Frye and Daubert
  • Frye (1923) – Testimony offered as “scientific” must “...have gained general acceptance in the particular field in which it belongs”.
  • New York is still a “Frye State”
slide-4
SLIDE 4
Frye and Daubert

  • Daubert (1993) – Judges are the “gatekeepers” of scientific evidence.
  • Must determine if the science is reliable:
      • Has empirical testing been done?
          • Falsifiability
      • Has the science been subject to peer review?
      • Are there known error rates?
      • Is there general acceptance?
  • Federal Government and 26(-ish) States are “Daubert States”

slide-5
SLIDE 5

Raising Standards with Data and Statistics

  • DNA profiling is the most successful application of statistics in forensic science.
  • It is responsible for the current interest in “raising standards” in other branches of forensics.
  • There are no protocols for the application of statistics to physical evidence.
  • Our goal: application of objective, numerical, computational pattern comparison to physical evidence

slide-6
SLIDE 6
What Statistics Can Be Used?

  • Statistical pattern comparison!
  • Modern algorithms are called machine learning
  • Idea is to measure features of the physical evidence that characterize it
  • Train algorithm to recognize “major” differences between groups of features while taking into account natural variation and measurement error.

slide-7
SLIDE 7
Why R?

  • R is not a proprietary black box!
  • Open-source and totally transparent!
  • R is maintained by a professional group of statisticians and computational scientists
  • From very simple to state-of-the-art procedures available
  • Very good graphics for exhibits and papers
  • R is extensible (it is a full scripting language)
  • R has “commercial versions” too: Revolution R, S+

slide-8
SLIDE 8

Fire Debris Analysis Casework

  • Liquid gasoline samples recovered during investigation:
      • Unknown history
      • Subjected to various real-world conditions
  • If an individual sample can be discriminated from the larger group, this can be of forensic interest.
  • Gas chromatography is commonly used to identify gasoline.
  • Peak comparisons of chromatograms are difficult and time-consuming.
  • Does “eye-balling” satisfy Daubert, or even Frye?
slide-9
SLIDE 9

Study Design

  • This study was undertaken to examine the variability of gasoline components in:
      • Twenty liquid gasoline samples
      • Samples from fire investigations in the New York City area
  • All samples were analyzed using gas chromatography-mass spectrometry (GC-MS)
  • Keto and Wineman target compounds
  • Fifteen peaks were chosen that represent the common components present in gasoline.
  • Normalized GC-MS peak areas were used to test the discrimination potential of multiple multivariate methods.
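The slide does not say how the peak areas were normalized. As a minimal sketch, assuming total-area normalization (each peak divided by the sum of the chosen peaks), the step could look like this in Python; the function name and the three raw areas are hypothetical and are not data from the study.

```python
def normalize_peak_areas(areas):
    """Scale raw peak areas so they sum to 1 (total-area normalization).

    Assumption: the study's actual normalization is not stated on the slide;
    dividing by the total area is just one common choice.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("peak areas must have a positive sum")
    return [a / total for a in areas]


# Hypothetical raw areas for three peaks (illustration only).
raw_areas = [1200.0, 300.0, 500.0]
profile = normalize_peak_areas(raw_areas)
print(profile)
```

Normalizing this way removes differences in injected amount, so that samples are compared on the relative pattern of peaks rather than on absolute signal.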

slide-10
SLIDE 10

Chosen Peaks

  • M. Gil
slide-11
SLIDE 11
  • Use prcomp, lda (from the MASS package) and rgl: 10-dimensional PCA followed by 3-dimensional CVA
  • Hold-one-out cross-validation (HOO-CV) correct classification rate: 100%
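To make the hold-one-out idea concrete, here is a small Python sketch. It is not the PCA-CVA pipeline named on the slide: a nearest-centroid classifier is swapped in purely to keep the example short, and the two-feature samples and group labels are hypothetical.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the group whose centroid is closest (Euclidean distance)."""
    groups = {}
    for row, label in zip(train_x, train_y):
        groups.setdefault(label, []).append(row)
    return min(groups, key=lambda label: math.dist(centroid(groups[label]), x))


def hold_one_out_rate(xs, ys):
    """Hold each sample out in turn, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical normalized peak-area features for two gasoline sources.
xs = [[0.60, 0.15], [0.62, 0.14], [0.58, 0.16],
      [0.30, 0.40], [0.31, 0.42], [0.29, 0.39]]
ys = ["A", "A", "A", "B", "B", "B"]
rate = hold_one_out_rate(xs, ys)
print(f"HOO-CV correct classification rate: {rate:.0%}")
```

Because the held-out sample never contributes to the centroids it is tested against, the rate is an estimate of performance on unseen samples rather than a measure of fit.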
slide-12
SLIDE 12

Dust

  • N. Petraco
slide-13
SLIDE 13

Dust

Hans Gross

A 19th-century Austrian magistrate, influenced by the writings of Sir Arthur Conan Doyle, suggested that dust and other traces be allowed in legal proceedings.

  • N. Petraco
slide-14
SLIDE 14

What can it tell you?

  • It enables one to identify the people, places and things involved in an event.
  • It helps one to associate the people, places and things involved in an event.
  • It can often tell a story.
  • It can help one reconstruct the event.

slide-15
SLIDE 15

  • Develop a simple method that enables you to identify the trace materials commonly found in dust samples
  • Develop a simple generic data sheet (tool) that allows you to quickly collect data on the trace materials commonly found in dust samples
  • Write some analysis scripts
  • Analyze the data
  • Convert the data sheet into an Excel spreadsheet and load it into R

  • N. Petraco
slide-16
SLIDE 16
  • Conformal Prediction Theory (Vovk et al.)
  • New, but has roots in the 1960s with Kolmogorov’s ideas on randomness and algorithmic complexity.
  • Can be used with any statistical pattern classification algorithm.
  • Independent of the data’s underlying probability distribution.
      • This is a very important property for forensic pattern recognition!
      • Well, ... the sample should be I.I.D.
  • For identification of patterns, the method produces:
      • “Confidence region” at level of confidence 1 − α
      • Confidence: measure of how likely the I.D. procedure is to be correct
  • Results are valid: P(identification error) ≤ α
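The confidence-region idea can be sketched in a few lines of Python. This is a split (inductive) conformal predictor under stated assumptions, not the authors' implementation: the nonconformity scores, the label names, and both function names are hypothetical, and a larger score is taken to mean "fits the label worse".

```python
def conformal_p_value(calibration_scores, test_score):
    """p-value: share of scores (calibration plus the test one) at least as strange as the test score."""
    n_ge = sum(1 for s in calibration_scores if s >= test_score)
    return (n_ge + 1) / (len(calibration_scores) + 1)


def prediction_region(calibration_scores, test_scores_by_label, alpha):
    """Keep every candidate label whose p-value exceeds alpha (confidence level 1 - alpha)."""
    return sorted(
        label
        for label, score in test_scores_by_label.items()
        if conformal_p_value(calibration_scores, score) > alpha
    )


# Hypothetical nonconformity scores from 19 held-out, correctly labeled examples.
calibration = [k / 10 for k in range(1, 20)]  # 0.1, 0.2, ..., 1.9

# Hypothetical nonconformity of one unknown pattern under each candidate label.
candidate_scores = {"gun_1": 0.3, "gun_2": 2.5, "gun_3": 1.85}

region = prediction_region(calibration, candidate_scores, alpha=0.05)
print(region)  # labels that cannot be rejected at the 95% level
```

In this toy run the region holds two labels, which corresponds to an interval with more than one I.D.; a region with a single label is a unique identification, and a region containing every label is uninformative.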

slide-17
SLIDE 17

3D PCA-Clustering can show potential for discrimination

  • Use e1071, caret, pls and custom scripts:
      • PCA-SVM 27D, refined bootstrap error rate estimate = 0.7%
          • 95% CI [0.0%, 3.3%]
      • CPT 99% level of confidence “I.D.”
          • Empirical error rate = 0%
          • Unique and correct ID intervals = 93.1%
      • PLS-DA 35D, refined bootstrap error rate estimate = 0.8%
          • 95% CI [0.0%, 3.3%]
slide-18
SLIDE 18

Tool Marks

Known Match Comparisons

5/8” consecutively manufactured chisels

  • G. Petillo

slide-19
SLIDE 19
Approach For Striated Tool Marks

  • Obtain striation pattern profiles from 3D confocal microscopy

slide-20
SLIDE 20

Glock 19 firing pin impression

Primer shear

  • P. Diaczuk
slide-21
SLIDE 21
  • 3D confocal image of entire shear pattern
slide-22
SLIDE 22

Shear marks on primer of two different Glock 19s

slide-23
SLIDE 23

[Figure panels: mean total profile; mean “waviness” profile; mean “roughness” profile]

slide-24
SLIDE 24
Primer Shear

  • Primer shears (82-91 profiles)
      – PCA-SVM, CPT at the 95% level of confidence
          • Empirical error rate was 4.7%
          • 90.7% of I.D. intervals were unique and correct
          • 7% of I.D. intervals had more than 1 I.D.
          • No “uninformative” intervals were returned
      – PCA-SVM, HOO-CV
          • Error rate estimate is 0.0%-4.4%, depending on the number of replicates
      – PLS-DA, Bootstrap (>10 replicates only)
          • 95% confidence interval for error rate: [0%, 0%]
          • 95% confidence interval for average false positive rate: [0%, 0%]
          • 95% confidence interval for average false negative rate: [0%, 0%]
      – PLS-DA, HOO-CV
          • Error rate estimate is 0.0%-4.3%, depending on the number of replicates
  • Results so far are on par with expectations

slide-25
SLIDE 25
  • 3D PCA: 36 Glocks, 1080 simulated and real primer shear profiles
  • 18D PCA-SVM, refined bootstrap gun I.D. error rate 0.3%, 95% CI [0%, 0.8%]
slide-26
SLIDE 26

Empirical Bayes’

  • Bayes’ Rule: can we realistically estimate posterior error probabilities empirically/falsifiably?

\[
\Pr(S^{-} \mid t^{+}) = \frac{\Pr(t^{+} \mid S^{-})\,\Pr(S^{-})}{\Pr(t^{+})}
\]

Probability of no actual association given that a test/algorithm indicates a positive ID

  • Perhaps. Genomics has spawned similar questions:
      • What is the probability of no disease (S-) given the differences in expression “scores” of thousands of genes?
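A worked number may help show why this posterior matters. The Python sketch below applies Bayes' rule to Pr(S-|t+), expanding Pr(t+) with the law of total probability; the function name and all three input rates are hypothetical illustrations, not results from any study in this talk.

```python
def posterior_no_association(p_pos_given_no, p_no, p_pos_given_yes):
    """Pr(S-|t+) = Pr(t+|S-) Pr(S-) / Pr(t+), with Pr(t+) from total probability."""
    p_yes = 1.0 - p_no
    p_pos = p_pos_given_no * p_no + p_pos_given_yes * p_yes
    return p_pos_given_no * p_no / p_pos


# Hypothetical inputs: 1% false positive rate, 99% true positive rate,
# and a 90% prior probability of no association.
post = posterior_no_association(p_pos_given_no=0.01, p_no=0.9, p_pos_given_yes=0.99)
print(f"Pr(S-|t+) = {post:.4f}")
```

With these made-up inputs the posterior is 0.009 / 0.108, about 0.083: even a test with a 1% false positive rate leaves roughly an 8% chance of no association after a positive result when true associations are rare.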

slide-27
SLIDE 27

Empirical Bayes’

  • Efron’s machinery for the “empirical Bayes’ two-groups model” (Efron 2007)
  • Surprisingly simple!
      • S-, truly no association, null hypothesis
      • S+, truly an association, non-null hypothesis
      • z, a Gaussian random variate derived from a machine learning task to ID an unknown pattern with a group
  • Scheme yields an estimate of Pr(S-|z) along with its standard error
  • Called the local false discovery rate (fdr) or posterior error probability
  • Given a similarity score, fdr is an estimate of the probability that the computer is wrong in calling a “match”
  • Catch: you need A LOT of z values, they should be fairly independent, and Pr(S-) > 0.9
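In the two-groups model the local fdr is Pr(S-|z) = p0 f0(z) / f(z), where f is the mixture density p0 f0 + (1 - p0) f1. The Python sketch below evaluates that formula under a strong simplifying assumption: both component densities and the null proportion are taken as known. In the empirical Bayes scheme these quantities are estimated from the z values themselves. The alternative distribution N(-4, 1) and p0 = 0.95 are hypothetical.

```python
from statistics import NormalDist


def local_fdr(z, p0, null, alt):
    """Two-groups local fdr: p0*f0(z) / (p0*f0(z) + (1 - p0)*f1(z))."""
    f0 = null.pdf(z)
    f1 = alt.pdf(z)
    mixture = p0 * f0 + (1.0 - p0) * f1
    return p0 * f0 / mixture


null = NormalDist(mu=0.0, sigma=1.0)   # S-: no association (theoretical null)
alt = NormalDist(mu=-4.0, sigma=1.0)   # S+: association (hypothetical choice)
p0 = 0.95                              # hypothetical Pr(S-), above 0.9 as required

fdr_typical = local_fdr(0.0, p0, null, alt)   # z in the bulk of the null
fdr_extreme = local_fdr(-4.5, p0, null, alt)  # z far in the "match" tail
print(f"fdr(0.0)  = {fdr_typical:.5f}")
print(f"fdr(-4.5) = {fdr_extreme:.5f}")
```

A z value near zero gets an fdr close to 1 (a claimed match there would almost surely be wrong), while a z value far in the tail gets an fdr close to 0.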

slide-28
SLIDE 28

Empirical Bayes’

  • Machine learning algorithms dump out tons of “scores” measuring how much they think each unknown “piece of evidence” “matches” with a known
  • Platt SVM “probability scores” from Glock 19 ID study (rows: primer shear #, unknowns; columns: gun ID, knowns):

Shear #  1            2            3            4            5            6            7            8
1        0.819231296  0.02159198   0.029272183  0.025411563  0.0225503    0.010915617  0.024680949  0.046346112
2        0.765918964  0.02255741   0.050851857  0.030990821  0.02792472   0.016217858  0.028200947  0.057337426
3        0.879527253  0.01795078   0.0184986    0.022467998  0.01577359   0.007162571  0.011941812  0.026677397
4        0.800343998  0.02064226   0.045323988  0.024858598  0.02244252   0.012063858  0.022874428  0.051450344
5        0.767143734  0.02275608   0.040918155  0.035104182  0.02878297   0.012899321  0.028778064  0.063617494
6        0.85119471   0.02110206   0.023900293  0.019113873  0.02001155   0.008916055  0.018147978  0.037613483
7        0.74589297   0.0218173    0.046976363  0.042961529  0.03109739   0.019528189  0.033183337  0.058542922
8        0.858658608  0.01868246   0.028668362  0.019978748  0.01658029   0.008302535  0.015666725  0.033462271
9        0.757389572  0.02335122   0.031192719  0.041061437  0.03268829   0.015409691  0.036750576  0.062156499
10       0.861134581  0.01798733   0.019716501  0.032786291  0.01893542   0.009636956  0.012723958  0.027078964
11       0.705447085  0.04062574   0.039299857  0.038929197  0.03988337   0.021602265  0.031111236  0.083101251
12       0.880200022  0.02285615   0.015669536  0.011783052  0.01489933   0.024089363  0.00854029   0.021962262
13       0.846512708  0.03225958   0.01772897   0.018825927  0.02479879   0.012076544  0.015157749  0.032639732
14       0.906210922  0.01286627   0.015271791  0.014028597  0.01511204   0.006311254  0.009610058  0.020589068
15       0.924204618  0.01267558   0.010580565  0.012674802  0.01076464   0.007229735  0.006838076  0.015031985
16       0.743561031  0.04290822   0.013293883  0.020473147  0.04545213   0.080523563  0.019370505  0.034417526
17       0.863369692  0.01235278   0.045802525  0.02472657   0.01088378   0.00856469   0.007809193  0.026490772
18       0.918347765  0.01613136   0.008822265  0.011735291  0.01131915   0.012266314  0.006711902  0.01466595
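Each row of the Platt score table on this slide behaves like a probability vector over the eight known guns. As a small illustration in Python, the sketch below takes the first row of that table, checks that its scores sum to roughly 1, and picks the gun with the largest score; the helper name `best_match` is hypothetical.

```python
# First row of the slide's table: primer shear #1 scored against gun IDs 1-8.
scores_shear_1 = [0.819231296, 0.02159198, 0.029272183, 0.025411563,
                  0.0225503, 0.010915617, 0.024680949, 0.046346112]


def best_match(scores):
    """Return (1-based gun ID, score) for the largest score in a row."""
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx + 1, scores[idx]


gun_id, top_score = best_match(scores_shear_1)
row_total = sum(scores_shear_1)
print(f"best match: gun {gun_id} (score {top_score:.3f}); row total {row_total:.6f}")
```

Taking the largest score gives an identification but says nothing about how often such a call is wrong, which is the question the empirical Bayes machinery addresses.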
slide-29
SLIDE 29

Empirical Bayes’

  • Use these SVM “Platt scores” to form p-values
  • Transform p-values with probit function
  • Produces A LOT of z values
  • The z are fairly independent……
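The slide does not spell out how the Platt scores become p-values. As a sketch under that caveat, the Python below assumes an empirical p-value (the share of known non-match scores at least as large as the observed score) and then applies the probit function, which is the inverse of the standard normal CDF. The null scores and both function names are hypothetical.

```python
from statistics import NormalDist


def empirical_p_value(null_scores, score):
    """Share of null (non-match) scores >= score, with the usual +1 correction.

    Assumption: this is one plausible construction, not necessarily the authors'.
    """
    n_ge = sum(1 for s in null_scores if s >= score)
    return (n_ge + 1) / (len(null_scores) + 1)


def probit(p):
    """Inverse standard normal CDF; p must lie strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("probit is only defined for 0 < p < 1")
    return NormalDist().inv_cdf(p)


# Hypothetical non-match scores (illustration only).
null_scores = [0.02, 0.03, 0.01, 0.05, 0.04, 0.02, 0.03, 0.06, 0.01]

p_strong = empirical_p_value(null_scores, 0.82)   # score far above every null score
p_typical = empirical_p_value(null_scores, 0.03)  # score typical of a non-match
z_strong = probit(p_strong)
z_typical = probit(p_typical)
print(f"p = {p_strong:.2f} -> z = {z_strong:.3f}")
print(f"p = {p_typical:.2f} -> z = {z_typical:.3f}")
```

Under this construction strong matches map to large negative z values. A score below every null score would give p = 1, which the probit cannot transform, so such cases need separate handling.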

[Figure: null (non-match) histogram (via HOO), density vs. z-values; normal Q-Q plot, sample quantiles vs. theoretical quantiles. Caption: checking assumptions on z-scores of Glock 19 primer shears.]

slide-30
SLIDE 30

Empirical Bayes’

  • Use locfdr: gives a “calibrated posterior association probability” model

[Figure: Estimated Posterior Error Probabilities (Local FDRs), Pr(S−|z) est. vs. z value, shown over the full z range and for z between −4.8 and −4.2. Annotations: “This is the est. prob of no association”; “Computer outputs ‘match’ for: unknown-known from ‘Bob the burglar’, falls here!”; “This is an uncertainty in the estimate”.]

slide-31
SLIDE 31

Empirical Bayes’

  • For the Glock 19 primer shear study:
      • Posterior error probs. (black circles) estimated by HOO-CV

[Figure: Pr(S−|z) est. vs. z value for z between −5.5 and −2.5. Annotation: “The SVM alg got these primer shear IDs wrong!”]

slide-32
SLIDE 32

Just Getting Started: Things to Come

  • Footwear
  • Soil
  • Wrenches
  • Chisels
  • Q.D. (questioned documents)
  • Tire Tracks
  • Hair
  • Blood Spatter
  • Gun Shot Residue

  • N. Petraco
slide-33
SLIDE 33

Acknowledgements

  • Research Team:
  • Mr. Peter Diaczuk
  • Dr. Peter De Forest
  • Ms. Carol Gambino
  • Mr. Mark Gil
  • Dr. James Hamby
  • Dr. Thomas Kubic
  • Off. Patrick McLaughlin
  • Dr. Linton Mohammed
  • Mr. Jerry Petillo
  • Mr. Nicholas Petraco
  • Dr. Peter A. Pizzola
  • Dr. Graham Rankin
  • Dr. Jacqueline Speir
  • Dr. Peter Shenkin
  • Mr. Peter Tytell
  • Helen Chan
  • Manny Chaparro
  • Julie Cohen
  • Aurora Dimitrova
  • Eric Gosslin
  • Frani Kammerman
  • Brooke Kammrath
  • Loretta Kuo
  • Dale Purcel
  • Stephanie Pollut
  • Rebecca Smith
  • Elizabeth Willie
  • Chris Singh
  • Melodie Yu