
SLIDE 1

Sequential Estimation in the Group Testing

Yaakov Malinovsky

University of Maryland, Baltimore County Joint work with Gregory Haber (UMBC) and Paul Albert (NCI)

QPRC 2017: The 34th Quality and Productivity Research Conference, Department of Statistics, University of Connecticut, June 13, 2017

  • Y. Malinovsky (UMBC)

Estimation in the Group Testing 1 / 22

SLIDE 2

Group Testing for Estimating the Prevalence Rate

An early example of group testing to estimate the prevalence of a trait is due to Marion A. Watson (1936). In this example, aphids are grouped onto potential host plants, and observations are made on the subsequent development of disease transmitted by the aphids. The maximum likelihood estimator (MLE) indicates that the probability of disease transmission was about 0.05 to 0.15.

Watson, M. A. (1936). Factors Affecting the Amount of Infection Obtained by Aphis Transmission of the Virus Hy. III. Trans. Roy. Soc. London, Ser. B 226, 457–489.

SLIDE 3

Probabilistic Model

Let members of a population be represented by independent random variables ϕ_i ~ Bernoulli(p), i = 1, 2, 3, …, where p is the quantity we wish to estimate. For group tests with groups of size k, we have the new random variable

ϑ_i^(k) = max{ϕ_i1, ϕ_i2, …, ϕ_ik} ~ Bernoulli(1 − q^k), where q = 1 − p.
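The model above is straightforward to simulate; a minimal Python sketch (the function name and seed are my own choices, not from the slides):

```python
import random

def simulate_group_tests(p, k, n, rng=random.Random(0)):
    """Draw n group-test outcomes for groups of size k.

    Each individual is phi_i ~ Bernoulli(p); the group outcome is the
    maximum over its k members, i.e. Bernoulli(1 - (1 - p)**k).
    """
    return [int(any(rng.random() < p for _ in range(k))) for _ in range(n)]

# The empirical positive-group rate should approach 1 - q**k.
p, k, n = 0.1, 5, 20000
rate = sum(simulate_group_tests(p, k, n)) / n
```

With p = 0.1 and k = 5, the positive-group probability is 1 − 0.9^5 ≈ 0.41, so `rate` lands close to that value.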


SLIDE 4

Fisher Information Contained in One Observation

I(θ) = E_θ[(∂/∂θ log p(X, θ))²].

For one group test of size k,

I_k(p) = (k²/q²) · q^k/(1 − q^k), q = 1 − p.
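The information formula is easy to evaluate numerically; a small sketch (function name mine), which for k = 1 reduces to the familiar Bernoulli information 1/(pq):

```python
def fisher_information(p, k):
    """Fisher information about p contained in one group test of size k:
    I_k(p) = (k**2 / q**2) * q**k / (1 - q**k), with q = 1 - p."""
    q = 1.0 - p
    return (k ** 2 / q ** 2) * q ** k / (1.0 - q ** k)
```

For small p, grouping increases the per-observation information, e.g. `fisher_information(0.05, 5)` exceeds `fisher_information(0.05, 1)`; this is the point of the k = 1 versus k = 5 comparison on the next slide.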


SLIDE 5

Example: Fisher Information Contained in One Observation

[Figure: Fisher information as a function of p, for k = 1 and k = 5.]


SLIDE 6

Fixed Sample Design: Model (a)

We observe a random sample ϑ_1^(k), ϑ_2^(k), …, ϑ_n^(k). Define

X = Σ_{i=1}^n ϑ_i^(k) ~ Binomial(n, 1 − q^k).

The MLE of 1 − q^k is X/n, and hence

p̂_MLE(a)(X) = 1 − (1 − X/n)^(1/k).
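As a quick sketch (function name mine), the model (a) MLE simply inverts the relation 1 − q^k = X/n:

```python
def p_mle_fixed(x, n, k):
    """MLE of p under model (a): x positive groups observed out of n
    fixed group tests of size k; inverts 1 - q**k = x/n."""
    return 1.0 - (1.0 - x / n) ** (1.0 / k)
```

By construction (1 − p̂)^k = 1 − x/n, and for k = 1 the estimator is just the sample proportion.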


SLIDE 7

Burrows Estimator: Model (a)

An alternative estimator was proposed by Burrows (1987), which removes the order-1/n term from the bias of the MLE. Choosing a and b to minimize the bias

E[1 − ((n − X + a)/(n + b))^(1/k)] − p

yields a = b = (k − 1)/(2k), giving

p̂_B(a)(X) = 1 − (1 − X/(n + b_k))^(1/k), b_k = (k − 1)/(2k).

Burrows, P. M. (1987). Improved Estimation of Pathogen Transmission Rates by Group Testing. Phytopathology 77, 363–365.
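A sketch of the Burrows adjustment (function name mine); for k = 1 the correction b_k vanishes and the estimator coincides with the MLE:

```python
def p_burrows_fixed(x, n, k):
    """Burrows (1987) bias-adjusted estimator under model (a):
    replace x/n by x/(n + b_k) with b_k = (k - 1) / (2 * k)."""
    b_k = (k - 1) / (2 * k)
    return 1.0 - (1.0 - x / (n + b_k)) ** (1.0 / k)
```

Since b_k > 0 for k > 1, the adjusted estimate is always slightly below the MLE, which counteracts the MLE's upward bias.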


SLIDE 8

Example: Relative Bias E(p̂ − p)/p

[Figure: relative bias (%) of the MLE and Burrows estimators as a function of p, for n = 10, k = 5.]


SLIDE 9

Example: MSE

[Figure: MSE of the MLE, Burrows, and individual-testing estimators as a function of p, for n = 10, k = 5.]


SLIDE 10

Binomial Sampling Plans S

All plans begin at the origin and, until a point γ = (X(γ), Y(γ)) ∈ B_S (the set of boundary points) is reached, the X or Y coordinate is increased iteratively by one with probability θ or 1 − θ, respectively. The boundary point γ ∈ B_S at which sampling stops is a sufficient statistic for θ. For each such point γ, we define N_S(γ) = X(γ) + Y(γ). An important characteristic of any plan is then E(N_S). If N_S(γ) = n for some positive integer n and all γ ∈ B_S, then S is a fixed binomial sampling plan. If N_S(γ) < M for some positive integer M and all γ ∈ B_S, then S is a finite binomial sampling plan.

Girshick, M. A., Mosteller, F., and Savage, L. J. (1946). Unbiased Estimates for Certain Binomial Sampling Problems with Applications. Annals of Mathematical Statistics 17, 13–23.

SLIDE 11

Unbiased Estimator under Finite Sampling Plans

Result

Let F be the set of all finite binomial sampling plans with probability of success θ, and let k be any integer greater than one. Then there does not exist an estimator f under any sampling plan F ∈ F such that f is an unbiased estimator of θ^(1/k) or (1 − θ)^(1/k). For the group testing problem, where θ = 1 − (1 − p)^k or θ = (1 − p)^k, it follows immediately that the non-existence of an unbiased estimator of p extends to this broader class of sampling plans as well.

Remark: A randomized binomial sampling scheme for estimating a function of the form p^α, α > 0, is presented in Banerjee and Sinha (1979).

Banerjee, P. K. and Sinha, B. K. (1979). Generating an Event with Probability p^α, α > 0. Sankhyā, Series B 41, 282–285.


SLIDE 12

Inverse Binomial Sampling: Models (b) and (c)

Model (b): Sample the groups ϑ_1^(k), ϑ_2^(k), … until c positive groups are observed.

Model (c): Sample the groups ϑ_1^(k), ϑ_2^(k), … until c negative groups are observed.
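Both stopping rules are simple to simulate; a minimal sketch (function and argument names are my own), which also lets one check the model (c) expected sample size E(N) = c/q^k quoted later in the talk:

```python
import random

def sample_until(p, k, c, stop_on_positive, rng):
    """Sequentially test groups of size k until c positive groups
    (model (b), stop_on_positive=True) or c negative groups
    (model (c), stop_on_positive=False) have been observed.
    Returns (positives, negatives) at the stopping time."""
    theta = 1.0 - (1.0 - p) ** k      # P(a group tests positive)
    pos = neg = 0
    while (pos if stop_on_positive else neg) < c:
        if rng.random() < theta:
            pos += 1
        else:
            neg += 1
    return pos, neg
```

Under model (c) the stopped coordinate is always exactly c negatives, and the random part of the data is the number of positive groups seen along the way.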


SLIDE 13

DeGroot (1959) Result

Result

Let W ~ NB(c, θ), i.e.

P(W = w) = (c + w − 1 choose w) (1 − θ)^c θ^w, w = 0, 1, 2, ….

Then a function h(θ) is unbiasedly estimable if and only if it can be expanded in a Taylor series on the interval |θ| < 1. If h(θ) is unbiasedly estimable, then its unique unbiased estimator is given by

ĥ(w) = [(c − 1)!/(w + c − 1)!] · d^w/dθ^w [h(θ)/(1 − θ)^c] |_{θ=0}, w = 0, 1, 2, ….

DeGroot, M. H. (1959). Unbiased Sequential Estimation for Binomial Populations. Annals of Mathematical Statistics 30, 80–101.


SLIDE 14

Construction of Unbiased Estimator: Model (c)

Model (c) stops at the boundary B = {γ : Y(γ) = c}. Define X to be the number of positive groups observed prior to this event:

P(X = x) = (c + x − 1 choose x) (q^k)^c (1 − q^k)^x, x = 0, 1, 2, ….

Here θ = 1 − q^k and we want to estimate h(θ) = (1 − θ)^(1/k) = q. Applying DeGroot's result and setting p̂ = 1 − ĥ gives

p̂_D(c)(x) = 0, x = 0,
p̂_D(c)(x) = 1 − ∏_{j=1}^{x} (j + c − 1 − 1/k)/(j + c − 1), x = 1, 2, 3, ….
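The product form is straightforward to evaluate; a sketch (function name mine). For x = 0 the product is empty, so the estimate is 0, matching the case split above:

```python
def p_degroot(x, c, k):
    """DeGroot unbiased estimator of p under model (c): x positive
    groups observed before the c-th negative group, groups of size k."""
    prod = 1.0
    for j in range(1, x + 1):
        prod *= (j + c - 1 - 1.0 / k) / (j + c - 1)
    return 1.0 - prod
```

Each factor lies in (0, 1), so the estimate is increasing in x, as one would expect: more positive groups before the c-th negative one suggests a larger p.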


SLIDE 15

Example: E(N_DeGroot) = c/q^k.

[Figure: MSE of the MLE, Burrows, and DeGroot estimators as a function of p, for c = 10, k = 5.]


SLIDE 16

Model (b): No Unbiased Estimator

Model (b) stops at the boundary B = {γ : X(γ) = c}. Define Y to be the number of negative groups observed prior to this event:

P(Y = y) = (c + y − 1 choose y) (1 − q^k)^c (q^k)^y, y = 0, 1, 2, ….

We have θ = q^k, so that h(θ) = θ^(1/k) = q. However, h does not have a Taylor expansion at the point θ = 0 (θ^(1/k) has an unbounded derivative there for k > 1). Therefore, by DeGroot's theorem, no unbiased estimator exists under this model.


SLIDE 17

Extension of Burrows to Models (b) and (c)

We extend the idea of Burrows in the fixed-sampling case to the sequential models discussed here, with the modification that we seek to remove terms of order O(1/E[N]) from the bias.

Model (b): p̂_B(b)(y) = 1 − ((y + b_k)/(y + c + b_k − 1))^(1/k), b_k = (k − 1)/(2k).

Model (c): p̂_B(c)(x) = 1 − ((c + b_k − 1)/(x + c + b_k − 1))^(1/k), b_k = (k − 1)/(2k).


SLIDE 18

Model (c): Relative Bias

[Figure: relative bias (%) of the MLE and Burrows estimators as a function of p, for c = 10, k = 5.]


SLIDE 19

Model (c): MSE

[Figure: MSE of the MLE, Burrows, and DeGroot estimators as a function of p, for c = 10, k = 5.]


SLIDE 20

Numerical Comparisons

We present comparisons based on MSE. Comparisons can be challenging because of the number of quantities involved (including p, E(N), and k). To deal with this, we fix p and E(N), and then choose, for each estimator, the value of k ∈ {2, …, 50} that yields the smallest MSE.
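The selection procedure just described can be sketched in Python; `mse_for_k` and `best_k` are hypothetical names, and the Monte Carlo setup (model (a) with n = E(N) fixed groups, MLE as the example estimator) is my own illustration, not the talk's actual computation:

```python
import random

def mse_for_k(estimator, p, k, n, reps=2000, seed=0):
    """Monte Carlo MSE of a fixed-sample estimator(x, n, k) under
    model (a): x ~ Binomial(n, 1 - (1 - p)**k), for given p and k."""
    rng = random.Random(seed)
    theta = 1.0 - (1.0 - p) ** k
    total = 0.0
    for _ in range(reps):
        x = sum(rng.random() < theta for _ in range(n))
        total += (estimator(x, n, k) - p) ** 2
    return total / reps

def best_k(estimator, p, n, k_grid=range(2, 51)):
    """Pick the group size k in the grid minimizing estimated MSE."""
    return min(k_grid, key=lambda k: mse_for_k(estimator, p, k, n))

p_mle = lambda x, n, k: 1.0 - (1.0 - x / n) ** (1.0 / k)
k_star = best_k(p_mle, p=0.05, n=25)
```

The same grid search can be repeated for each estimator and each (p, E(N)) cell to reproduce a table of the kind shown on the next slide.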


SLIDE 21

MSE Comparisons for E(N) = 25 (10000 × MSE)

p̂ \ p        0.01     0.05      0.1      0.2      0.3       0.5
p̂_MLE(a)   0.1119   1.9982   7.3243  24.5901  46.1621   82.4696
p̂_MLE(b)   1.3059   4.9489  12.8547  38.6643  62.1209  101.5301
p̂_MLE(c)   0.1010   1.6105   6.0341  22.6446  43.7033   96.6345
p̂_B(a)     0.1039   1.6010   3.6165  13.2301  26.7432   56.3798
p̂_B(b)     0.1477   1.5911   4.8515  17.2066  33.6451   64.2978
p̂_B(c)     0.1046   1.6237   6.1142  22.8252  42.7642   90.0256
p̂_D(c)     0.1046   1.6230   6.1124  22.8217  42.7695   90.0741


SLIDE 22

Thank you!
