Sequential Estimation in the Group Testing


1. Sequential Estimation in the Group Testing. Yaakov Malinovsky, University of Maryland, Baltimore County. Joint work with Gregory Haber (UMBC) and Paul Albert (NCI). QPRC 2017: The 34th Quality and Productivity Research Conference, Department of Statistics, University of Connecticut, June 13, 2017.

2. Group Testing for Estimating a Prevalence Rate. An early example of group testing used to estimate the prevalence of a trait is due to Marion A. Watson (1936). In that study, aphids were grouped onto potential host plants and observations were made on the subsequent development of disease transmitted by the aphids. The maximum likelihood estimator (MLE) indicated that the probability of disease transmission was about 0.05–0.15. Watson, M. A. (1936). Factors Affecting the Amount of Infection Obtained by Aphis Transmission of the Virus Hy. III. Philos. Trans. Roy. Soc. London, Ser. B, 226, 457–489.

3. Probabilistic Model. Let members of a population be represented by independent random variables $\phi_i \sim \mathrm{Bernoulli}(p)$, $i = 1, 2, 3, \ldots$, where $p$ is the quantity we wish to estimate. For group tests with groups of size $k$, we have the new random variable
$$\vartheta^{(k)} = \max\{\phi_{i_1}, \phi_{i_2}, \ldots, \phi_{i_k}\} \sim \mathrm{Bernoulli}(1 - q^k),$$
where $q = 1 - p$.
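As a quick check of this model, here is a minimal Python sketch (NumPy assumed; the function name simulate_group_tests is illustrative, not from the talk) that simulates pooled tests and compares the empirical positive rate with $1 - q^k$:

```python
import numpy as np

def simulate_group_tests(p, k, n, seed=None):
    """Simulate n pooled tests of size k: a group is positive iff at
    least one of its k members (i.i.d. Bernoulli(p)) is positive."""
    rng = np.random.default_rng(seed)
    members = rng.random((n, k)) < p          # individual statuses phi_i
    return members.any(axis=1).astype(int)    # theta^(k) ~ Bernoulli(1 - q^k)

# Empirical positive rate vs. the model value 1 - (1 - p)^k:
tests = simulate_group_tests(p=0.05, k=5, n=100_000, seed=1)
print(tests.mean(), 1 - (1 - 0.05) ** 5)
```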

4. Fisher Information Contained in One Observation.
$$I(\theta) = E_\theta\left[\left(\frac{\partial}{\partial \theta} \log p(X, \theta)\right)^2\right].$$
For a single group test of size $k$,
$$I_k(p) = \frac{k^2 q^k}{q^2 (1 - q^k)}, \qquad q = 1 - p.$$
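A direct transcription of this formula (plain Python; the function name is illustrative) shows why pooling helps when $p$ is small:

```python
def fisher_info(p, k):
    """Fisher information about p in a single group test of size k:
    I_k(p) = k^2 q^k / (q^2 (1 - q^k)), with q = 1 - p."""
    q = 1.0 - p
    return k**2 * q**k / (q**2 * (1.0 - q**k))

# At p = 0.05, a pooled test of size 5 carries far more information
# about p than an individual test (k = 1):
for k in (1, 5):
    print(k, fisher_info(0.05, k))
```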

5. Example: Fisher Information Contained in One Observation. [Figure: Fisher information $I_k(p)$ as a function of $p$ for $k = 1$ and $k = 5$.]

6. Fixed Sample Design: Model (a). We observe a random sample $\vartheta^{(k)}_1, \vartheta^{(k)}_2, \ldots, \vartheta^{(k)}_n$. Define
$$X = \sum_{i=1}^{n} \vartheta^{(k)}_i \sim \mathrm{Binomial}\left(n, 1 - q^k\right).$$
The MLE of $1 - q^k$ is $X/n$, so
$$\hat{p}_{\mathrm{MLE}(a)}(X) = 1 - \left(1 - \frac{X}{n}\right)^{1/k}.$$

7. Burrows Estimator: Model (a). An alternative estimator was proposed by Burrows (1987), which removes the bias of order $1/n$ from the MLE. Choosing $(a, b)$ to minimize the leading term of the bias $E\left[1 - \left(\frac{n - X + a}{n + b}\right)^{1/k}\right] - p$ gives $a = b = \frac{k-1}{2k}$, so
$$\hat{p}_{B(a)}(X) = 1 - \left(1 - \frac{X}{n + b_k}\right)^{1/k}, \qquad b_k = \frac{k-1}{2k}.$$
Burrows, P. M. (1987). Improved Estimation of Pathogen Transmission Rates by Group Testing. Phytopathology, 77, 363–365.
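For concreteness, a minimal sketch of both model-(a) estimators (plain Python; the function names are illustrative):

```python
def p_mle_a(x, n, k):
    """Model (a) MLE: x positive groups out of n groups of size k."""
    return 1.0 - (1.0 - x / n) ** (1.0 / k)

def p_burrows_a(x, n, k):
    """Burrows (1987) correction with constant b_k = (k - 1)/(2k)."""
    b_k = (k - 1) / (2.0 * k)
    return 1.0 - (1.0 - x / (n + b_k)) ** (1.0 / k)

# With 3 positive groups out of n = 10 groups of size k = 5:
print(p_mle_a(3, 10, 5), p_burrows_a(3, 10, 5))
```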

8. Example: Relative Bias $E\left[\frac{\hat{p} - p}{p}\right]$. [Figure: relative bias in percent, $n = 10$, $k = 5$, for the MLE and the Burrows estimator.]

9. Example: MSE. [Figure: MSE, $n = 10$, $k = 5$, for the MLE, the Burrows estimator, and individual ($k = 1$) testing.]

10. Binomial Sampling Plans $S$. All plans begin at the origin and, until a point $\gamma = (X(\gamma), Y(\gamma)) \in B_S$ (the set of boundary points) is reached, the $X$ or $Y$ coordinate is increased by one with probability $\theta$ or $1 - \theta$, respectively. The boundary point $\gamma \in B_S$ at which sampling stops is a sufficient statistic for $\theta$. For each such point $\gamma$, define $N_S(\gamma) = X(\gamma) + Y(\gamma)$. An important characteristic of any plan is then $E(N_S)$. If $N_S(\gamma) = n$ for some positive integer $n$ and all $\gamma \in B_S$, then $S$ is a fixed binomial sampling plan. If $N_S(\gamma) < M$ for some positive integer $M$ and all $\gamma \in B_S$, then $S$ is a finite binomial sampling plan. Girshick, M. A., Mosteller, F., and Savage, L. J. (1946). Unbiased Estimates for Certain Binomial Sampling Problems with Applications. Annals of Mathematical Statistics, 17, 13–23.
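To make the definitions concrete, this sketch (standard library only; run_plan and the stop predicate encoding $B_S$ are illustrative assumptions) runs a binomial sampling plan as a random walk with a stopping boundary:

```python
import random

def run_plan(theta, stop, seed=None):
    """Walk (X, Y) from the origin: X += 1 w.p. theta, else Y += 1,
    until the boundary predicate stop(x, y) is true; return (x, y)."""
    rnd = random.Random(seed)
    x = y = 0
    while not stop(x, y):
        if rnd.random() < theta:
            x += 1
        else:
            y += 1
    return x, y

# A fixed plan (stop when N = x + y = 10) and an inverse plan (stop when y = 3):
print(run_plan(0.3, lambda x, y: x + y == 10, seed=0))
print(run_plan(0.3, lambda x, y: y == 3, seed=0))
```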

11. Unbiased Estimation under Finite Sampling Plans. Result: Let $\mathcal{F}$ be the set of all finite binomial sampling plans with probability of success $\theta$, and let $k$ be any integer greater than one. Then there does not exist an estimator $f$ under any sampling plan $S \in \mathcal{F}$ such that $f$ is an unbiased estimator of $\theta^{1/k}$ or $(1 - \theta)^{1/k}$. For the group testing problem, where $\theta = 1 - (1 - p)^k$ or $\theta = (1 - p)^k$, it follows immediately that the non-existence of an unbiased estimator of $p$ extends to this broader class of sampling plans as well. Remark: a randomized binomial sampling scheme for estimating a function of the form $p^\alpha$, $\alpha > 0$, is presented in Banerjee and Sinha (1979). Banerjee, P. K. and Sinha, B. K. (1979). Generating an Event with Probability $p^\alpha$, $\alpha > 0$. Sankhyā, Series B, 41, 282–285.

12. Inverse Binomial Sampling: Models (b) and (c). Model (b): sample the groups $\vartheta^{(k)}_1, \vartheta^{(k)}_2, \ldots$ until $c$ positive groups have been observed. Model (c): sample the groups $\vartheta^{(k)}_1, \vartheta^{(k)}_2, \ldots$ until $c$ negative groups have been observed.
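A minimal simulation sketch of both inverse sampling models (standard library only; the function name sample_until is illustrative):

```python
import random

def sample_until(p, k, c, until_positive, seed=None):
    """Test groups of size k until c positive groups (model (b),
    until_positive=True) or c negative groups (model (c)) are seen.
    Returns the number of groups of the other kind observed en route."""
    rnd = random.Random(seed)
    theta = 1.0 - (1.0 - p) ** k      # P(a group tests positive)
    stops = other = 0
    while stops < c:
        if (rnd.random() < theta) == until_positive:
            stops += 1
        else:
            other += 1
    return other                      # Y under model (b), X under model (c)

print(sample_until(0.05, 5, 10, True, seed=0),
      sample_until(0.05, 5, 10, False, seed=0))
```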

13. DeGroot (1959) Result. Result: Let $W \sim \mathrm{NB}(c, \theta)$ with
$$P(W = w) = \binom{c + w - 1}{w} (1 - \theta)^c \theta^w, \qquad w = 0, 1, 2, \ldots$$
Then a function $h(\theta)$ is estimable unbiasedly if and only if it can be expanded in a Taylor series on the interval $|\theta| < 1$. If $h(\theta)$ is estimable unbiasedly, then its unique unbiased estimator is given by
$$\hat{h}(w) = \frac{(c - 1)!}{(w + c - 1)!} \left[\frac{d^w}{d\theta^w} \frac{h(\theta)}{(1 - \theta)^c}\right]_{\theta = 0}, \qquad w = 0, 1, 2, \ldots$$
DeGroot, M. H. (1959). Unbiased Sequential Estimation for Binomial Populations. Annals of Mathematical Statistics, 30, 80–101.

14. Construction of the Unbiased Estimator: Model (c). Here $B = \{\gamma : Y(\gamma) = c\}$. Define $X$ to be the number of positive groups observed prior to this event:
$$P(X = x) = \binom{c + x - 1}{x} (q^k)^c (1 - q^k)^x, \qquad x = 0, 1, 2, \ldots$$
With $\theta = 1 - q^k$, we want to estimate $h(\theta) = (1 - \theta)^{1/k} = q$. Applying DeGroot's formula gives
$$\hat{p}_{D(c)}(x) = \begin{cases} 0, & x = 0, \\ 1 - \prod_{j=1}^{x} \dfrac{j + c - 1 - 1/k}{j + c - 1}, & x = 1, 2, 3, \ldots \end{cases}$$
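A short sketch of this estimator with a Monte Carlo check of its unbiasedness (NumPy assumed; note that NumPy's negative_binomial counts failures before the $c$-th success, which here are the positive groups):

```python
import numpy as np

def p_degroot_c(x, c, k):
    """DeGroot's unbiased estimator of p under model (c):
    1 - prod_{j=1..x} (j + c - 1 - 1/k)/(j + c - 1); equals 0 at x = 0."""
    j = np.arange(1, x + 1)
    return 1.0 - np.prod((j + c - 1 - 1.0 / k) / (j + c - 1))

# Monte Carlo check: X ~ NB(c, q^k), counting positive groups observed
# before the c-th negative group. The mean estimate should be close to p.
rng = np.random.default_rng(0)
p, k, c = 0.1, 5, 10
qk = (1.0 - p) ** k
xs = rng.negative_binomial(c, qk, 50_000)   # "failures" = positive groups
print(np.mean([p_degroot_c(x, c, k) for x in xs]), p)
```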

15. Example. Under model (c), $E(N_{\mathrm{DeGroot}}) = \dfrac{c}{q^k}$. [Figure: MSE ($\times 10^{-3}$), $c = 10$, $k = 5$, for the MLE, the Burrows estimator, and the DeGroot estimator.]

16. Model (b): No Unbiased Estimator. Here $B = \{\gamma : X(\gamma) = c\}$. Define $Y$ to be the number of negative groups observed prior to this event:
$$P(Y = y) = \binom{c + y - 1}{y} (1 - q^k)^c (q^k)^y, \qquad y = 0, 1, 2, \ldots$$
We have $\theta = q^k$, so that $h(\theta) = \theta^{1/k} = q$. However, for $k > 1$ the function $h$ does not have a Taylor expansion at the point $\theta = 0$. Therefore, by DeGroot's theorem, no unbiased estimator exists under this model.

17. Extension of Burrows to Models (b) and (c). We extend Burrows's idea from the fixed sampling case to the sequential models discussed here, with the modification that we seek to remove terms of order $O(1/E[N])$ from the bias.
Model (b):
$$\hat{p}_{B(b)}(y) = 1 - \left(\frac{y + b_k}{y + c + b_k - 1}\right)^{1/k}, \qquad b_k = \frac{k - 1}{2k}.$$
Model (c):
$$\hat{p}_{B(c)}(x) = 1 - \left(\frac{c + b_k - 1}{x + c + b_k - 1}\right)^{1/k}, \qquad b_k = \frac{k - 1}{2k}.$$
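A minimal sketch of the two corrected estimators (plain Python; function names are illustrative):

```python
def p_burrows_b(y, c, k):
    """Bias-corrected estimator under model (b): y negative groups seen
    before the c-th positive group; b_k = (k - 1)/(2k)."""
    b_k = (k - 1) / (2.0 * k)
    return 1.0 - ((y + b_k) / (y + c + b_k - 1)) ** (1.0 / k)

def p_burrows_c(x, c, k):
    """Bias-corrected estimator under model (c): x positive groups seen
    before the c-th negative group."""
    b_k = (k - 1) / (2.0 * k)
    return 1.0 - ((c + b_k - 1) / (x + c + b_k - 1)) ** (1.0 / k)

print(p_burrows_b(20, 10, 5), p_burrows_c(4, 10, 5))
```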

18. Model (c): Relative Bias. [Figure: relative bias in percent, $c = 10$, $k = 5$, for the MLE and the Burrows estimator.]

19. Model (c): MSE. [Figure: MSE ($\times 10^{-3}$), $c = 10$, $k = 5$, for the MLE, the Burrows estimator, and the DeGroot estimator.]

20. Numerical Comparisons. We present comparisons based on MSE. Comparisons can be challenging because of the number of variables that must be considered (including $p$, $E(N)$, and $k$). To deal with this, we fix $p$ and $E(N)$ and then choose, for each estimator, the value of $k \in \{2, \ldots, 50\}$ that yields the smallest MSE (a Monte Carlo sketch of this selection procedure follows).
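The selection step can be sketched as follows for the model-(c) MLE (NumPy assumed; the simulation size and the rounding of $c$ to keep $E(N) = c/q^k$ near 25 are illustrative choices, not the authors' exact procedure):

```python
import numpy as np

def mse_mle_c(p, c, k, reps=20_000, seed=None):
    """Monte Carlo MSE of the model-(c) MLE: 1 - (c/(X + c))^(1/k)."""
    rng = np.random.default_rng(seed)
    qk = (1.0 - p) ** k
    xs = rng.negative_binomial(c, qk, reps)   # positive groups observed
    est = 1.0 - (c / (xs + c)) ** (1.0 / k)
    return float(np.mean((est - p) ** 2))

# Fix p and E(N) = c / q^k ~= 25, so c ~= 25 q^k for each k; scan for the best k.
p, EN = 0.05, 25
scores = {k: mse_mle_c(p, max(1, round(EN * (1.0 - p) ** k)), k, seed=1)
          for k in range(2, 51)}
best = min(scores, key=scores.get)
print("best k:", best, "10^4 x MSE:", 1e4 * scores[best])
```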

21. MSE Comparisons for $E(N) = 25$ (table entries are $10^4 \times$ MSE):

    Estimator   p=0.01   p=0.05    p=0.1     p=0.2     p=0.3      p=0.5
    MLE(a)      0.1119   1.9982    7.3243   24.5901   46.1621    82.4696
    MLE(b)      1.3059   4.9489   12.8547   38.6643   62.1209   101.5301
    MLE(c)      0.1010   1.6105    6.0341   22.6446   43.7033    96.6345
    B(a)        0.1039   1.6010    3.6165   13.2301   26.7432    56.3798
    B(b)        0.1477   1.5911    4.8515   17.2066   33.6451    64.2978
    B(c)        0.1046   1.6237    6.1142   22.8252   42.7642    90.0256
    D(c)        0.1046   1.6230    6.1124   22.8217   42.7695    90.0741

22. Thank you!
