SLIDE 1

Sequential Detection and Isolation of a Correlated Pair

Anamitra Chaudhuri
Department of Statistics, University of Illinois, Urbana-Champaign
Joint work with Georgios Fellouris
2020 IEEE International Symposium on Information Theory
Los Angeles, California, 21-26 June 2020

SLIDE 2

Introduction

SLIDE 3

Motivation

– Quickest inference about the underlying dependence structure.
– Environmental monitoring, sensor networks, fault detection in power grids, neural coding, etc.
– In this context,
    – data are observed sequentially and the sample size is not fixed in advance,
    – there are multiple hypotheses regarding the dependence structure.
– Goal: stop sampling as quickly as possible and identify the true hypothesis while controlling the probability of errors.

SLIDE 4

Related works

– Detection and isolation of the correlation structure in a p-variate Gaussian random vector.
    – p = 2: sequential hypothesis testing for the correlation coefficient ρ in a bivariate Gaussian.
        – Binary hypothesis testing [Choi, 1971, Kowalski, 1971, Pradhan and Sathe, 1975, Wolde-Tsadik, 1976, Wald, 1945, . . . ]
        – Two-sided version [Woodroofe, 1979]
    – p > 2: sequential multiple testing and design.
        – Observation from only one component is taken at each time, temporal dependence [Heydari and Tajer, 2017]
– Sequentially observed data from independent streams, simultaneous testing of multiple binary hypotheses [Song and Fellouris, 2017]

SLIDE 5

Goal

In this work,
– data from all sources are observed sequentially,
– the observations are independent over time,
– at most one pair of components of the random vector is correlated.

Goal:
– stop sampling as quickly as possible,
– identify the correlated pair, if there is any,
– control three kinds of errors:
    – False Alarm: detecting a correlated pair when there is none.
    – Missed Detection: failing to detect a correlated pair when there is one.
    – Wrong Isolation: identifying the wrong correlated pair when there is one.

SLIDE 6

Problem formulation

SLIDE 7

Problem Setup

– p information sources: $\{X_i(t) : t \in \mathbb{N}\}$, $i = 1, \ldots, p$.
    – For a fixed source $i \in \{1, \ldots, p\}$, $X_i(t) \overset{\text{iid}}{\sim} N(0, 1)$, $t \in \mathbb{N}$.
    – The set of all (unordered) pairs: $E := \{(i, j) : 1 \le i < j \le p\}$.
    – At each time $t \in \mathbb{N}$, $\mathrm{Corr}(X_k(t), X_l(t)) = \rho_e$, where $e = (k, l) \in E$.
– Given a user-specified value $\rho^* \in (0, 1)$, we perform multiple testing: for each $e \in E$,
  \[ H_0 : \rho_e = 0 \quad \text{vs.} \quad H_1 : |\rho_e| = \rho^*, \]
  when at most one of the $\binom{p}{2}$ nulls should be rejected.

SLIDE 8

Problem Setup

– $\mathcal{F}_t = \sigma(X(1), \ldots, X(t))$, where $X(t) = (X_1(t), X_2(t), \ldots, X_p(t))$.
– A sequential test $(\tau, d)$ consists of:
    – an $\{\mathcal{F}_t\}$-stopping time, $\tau$, at which we stop sampling,
    – and an $\mathcal{F}_\tau$-measurable decision rule $d$, which denotes the subset of pairs declared to be correlated upon stopping.
– Since there is at most one correlated pair, let
    – $P_0$: the probability measure when all sources are independent,
    – $P_{e+}$ (resp. $P_{e-}$): the probability measure when the pair $e$ has correlation $\rho^*$ (resp. $-\rho^*$) and all other sources are independent.

SLIDE 9

Problem Setup

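– $\Delta(\alpha, \beta, \gamma)$: the class of sequential tests $(\tau, d)$ for which
    – False alarm: $P_0(d \neq \emptyset) \le \alpha$,
    – Missed detection: for all $e \in E$, $P_{e+}(d = \emptyset),\ P_{e-}(d = \emptyset) \le \beta$,
    – Wrong isolation: for all $e \in E$, $P_{e+}(d \neq \emptyset,\ d \neq \{e\}),\ P_{e-}(d \neq \emptyset,\ d \neq \{e\}) \le \gamma$.
– Problem: find $(\tau, d) \in \Delta(\alpha, \beta, \gamma)$ that minimizes $E[\tau]$ under $P_0$ and under $P_{e+}$, $P_{e-}$ for every $e \in E$, to a first-order asymptotic approximation as $\alpha, \beta, \gamma \to 0$.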

SLIDE 10

Notations and Statistics

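– For each $e \in E$, the likelihood ratios
  \[ \Lambda_{e+}(n) := \frac{dP_{e+}}{dP_0}(\mathcal{F}_n), \qquad \Lambda_{e-}(n) := \frac{dP_{e-}}{dP_0}(\mathcal{F}_n). \]
– Mixture likelihood ratio statistic for the two-sided testing problem:
  \[ \Lambda_e(n) := \frac{\Lambda_{e+}(n) + \Lambda_{e-}(n)}{2}. \]
– At time $n$, the ordered mixture likelihood ratio statistics are
  \[ \Lambda_{(1)}(n) \ge \ldots \ge \Lambda_{(K)}(n), \quad \text{and} \quad \Lambda_{i_k}(n) \equiv \Lambda_{(k)}(n), \quad k = 1, \ldots, K := \binom{p}{2}. \]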
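The mixture statistics can be computed recursively in the log domain. The sketch below is one possible way to do it, assuming the standard bivariate Gaussian density with unit variances; the per-observation log-likelihood ratio formula in the comments is derived from that density and does not appear on the slides. The function name `log_mixture_lr` and the 0-based pair indexing are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def log_mixture_lr(X, rho_star):
    """Return (pairs, L) with L[n-1, k] = log Lambda_e(n) for e = pairs[k].

    X is a (T, p) array whose row t holds X(t+1); pairs are 0-based (i, j), i < j.
    Per-observation log-likelihood ratio of correlation rho versus 0 (unit variances):
        -0.5*log(1 - rho^2) - (rho^2*(x^2 + y^2) - 2*rho*x*y) / (2*(1 - rho^2))
    """
    T, p = X.shape
    pairs = list(itertools.combinations(range(p), 2))
    r2 = rho_star ** 2
    const = -0.5 * np.log1p(-r2)
    L = np.empty((T, len(pairs)))
    for k, (i, j) in enumerate(pairs):
        x, y = X[:, i], X[:, j]
        quad = r2 * (x ** 2 + y ** 2) / (2 * (1 - r2))
        cross = rho_star * x * y / (1 - r2)
        log_plus = np.cumsum(const - quad + cross)    # log Lambda_{e+}(n)
        log_minus = np.cumsum(const - quad - cross)   # log Lambda_{e-}(n)
        # log of the equal-weight mixture, computed stably
        L[:, k] = np.logaddexp(log_plus, log_minus) - np.log(2)
    return pairs, L

# Example usage on independent data (p = 3, 30 observations).
rng = np.random.default_rng(0)
pairs, L = log_mixture_lr(rng.standard_normal((30, 3)), rho_star=0.7)
```

Working with log Λ rather than Λ avoids overflow and underflow when n is large, and sorting a row of `L` gives the ordered statistics.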

SLIDE 11

Proposed Procedure

SLIDE 12

Proposed Rule

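Inspired by the gap-intersection rule proposed in [Song and Fellouris, 2017], our proposed procedure is $(\tau^*, d^*)$, where
– $\tau^* := \min\{\tau_1, \tau_2\}$, with
  \[ \tau_1 := \inf\{n \ge 1 : \Lambda_{(1)}(n) \le 1/A\}, \]
  \[ \tau_2 := \inf\{n \ge 1 : \Lambda_{(1)}(n) \ge B,\ \Lambda_{(1)}(n)/\Lambda_{(2)}(n) \ge C\}. \]
– \[ d^* := \begin{cases} \emptyset & \text{if } \tau_1 < \tau_2, \\ i_1(\tau^*) & \text{if } \tau_2 < \tau_1. \end{cases} \]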
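The stopping rule above can be sketched in a few lines, assuming the log-statistics log Λ_e(n) have already been computed and stored row by row in an array. The function name `gap_rule`, the array layout and the return convention (a Python set for the decision, `(None, None)` when no stop occurs within the available data) are hypothetical choices for illustration, not part of the presentation.

```python
import numpy as np

def gap_rule(logL, A, B, C):
    """Apply the proposed rule to logL, a (T, K) array with logL[n-1, k] = log Lambda_k(n).

    Requires K >= 2 so that the second-largest statistic exists.
    Returns (tau, decision): decision is set() for the empty set, or {k} with k
    the column index of the largest statistic at the stopping time.
    """
    logA, logB, logC = np.log(A), np.log(B), np.log(C)
    for n, row in enumerate(logL, start=1):
        order = np.argsort(row)[::-1]            # indices sorted by decreasing statistic
        top, second = row[order[0]], row[order[1]]
        if top <= -logA:                         # tau_1: even the largest statistic is small
            return n, set()
        if top >= logB and top - second >= logC: # tau_2: large top statistic and large gap
            return n, {int(order[0])}
    return None, None

# Example usage with a tiny hand-made array of log-statistics (K = 3).
demo = np.array([[0.0, 0.0, 0.0],
                 [5.0, 1.0, 0.0],
                 [10.0, 2.0, -1.0]])
tau, decision = gap_rule(demo, A=100, B=1000, C=1000)
```

Because A, B > 1, the conditions defining τ1 and τ2 cannot hold at the same time, so the order of the two checks inside the loop does not matter.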

SLIDE 13

Illustration

[Figure: two panels, each plotting log(statistic) against sample size for the pairs (1,2), (2,3), (3,1) and marking the time at which sampling stops. First panel: Σ has unit diagonal and 0.8 in the (1,2) and (2,1) entries, zero elsewhere; the thresholds log(B), log(C) and −log(A) are shown. Second panel: Σ is the identity matrix; the thresholds log(B) and −log(A) are shown.]

SLIDE 14

Error Control

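Recall, $K = \binom{p}{2}$.

Theorem. For any $A, B, C > 1$, we have
\[ P_0(d^* \neq \emptyset) \le K/B, \]
\[ P_{e+}(d^* = \emptyset) = P_{e-}(d^* = \emptyset) \le 1/A, \]
\[ P_{e+}(d^* \neq \emptyset,\ d^* \neq \{e\}) = P_{e-}(d^* \neq \emptyset,\ d^* \neq \{e\}) \le (K - 1)/C. \]
In particular, $(\tau^*, d^*) \in \Delta(\alpha, \beta, \gamma)$ when
\[ A = \frac{1}{\beta}, \quad B = \frac{K}{\alpha} \quad \text{and} \quad C = \frac{K - 1}{\gamma}. \tag{1} \]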
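As a small illustration of the threshold choice (1), the following sketch maps the error tolerances to A, B and C. The function name `thresholds` is hypothetical; the example values p = 10, α = β = 10^-2, γ = 10^-3 are the ones used in the simulation study of this presentation.

```python
from math import comb

def thresholds(p, alpha, beta, gamma):
    """Thresholds from (1): A = 1/beta, B = K/alpha, C = (K - 1)/gamma, K = p choose 2."""
    K = comb(p, 2)
    return 1 / beta, K / alpha, (K - 1) / gamma

# Example usage: p = 10 gives K = 45 pairs.
A, B, C = thresholds(p=10, alpha=1e-2, beta=1e-2, gamma=1e-3)
```

Only B and C grow with the number of pairs K, which reflects the union bounds over K (respectively K − 1) pairs in the theorem.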

SLIDE 15

Asymptotic Upper Bound

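– For each $e \in E$, the KL information numbers
  \[ D_0 := E_0[-\log \Lambda_{e+}(1)] = E_0[-\log \Lambda_{e-}(1)], \qquad D_1 := E_{e+}[\log \Lambda_{e+}(1)] = E_{e-}[\log \Lambda_{e-}(1)]. \]
– Let $x \wedge y := \min\{x, y\}$, $x \vee y := \max\{x, y\}$.

Lemma. Let $e \in E$. As $A, B, C \to \infty$ we have
\[ E_0[\tau^*] \le \frac{\log A}{D_0}\,(1 + o(1)), \]
\[ E_{e-}[\tau^*],\ E_{e+}[\tau^*] \le \left( \frac{\log B}{D_1} \vee \frac{\log C}{D_0 + D_1} \right) (1 + o(1)). \]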
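The slides do not state closed forms for the KL information numbers. As a sketch under the Gaussian model of the problem setup (unit variances, correlation $\pm\rho^*$ for the pair $e = (k, l)$ under the alternative), a direct calculation from the bivariate normal density suggests the expressions below. This derivation is an addition for illustration, not part of the original presentation.

```latex
\begin{align*}
\log \Lambda_{e+}(1)
  &= -\tfrac{1}{2}\log\bigl(1-\rho^{*2}\bigr)
     - \frac{\rho^{*2}\bigl(X_k(1)^2 + X_l(1)^2\bigr) - 2\rho^{*} X_k(1) X_l(1)}{2\bigl(1-\rho^{*2}\bigr)}, \\
D_1 &= E_{e+}\bigl[\log \Lambda_{e+}(1)\bigr]
     = -\tfrac{1}{2}\log\bigl(1-\rho^{*2}\bigr)
     && \text{since } E_{e+}[X_k^2] = E_{e+}[X_l^2] = 1,\ E_{e+}[X_k X_l] = \rho^{*}, \\
D_0 &= E_0\bigl[-\log \Lambda_{e+}(1)\bigr]
     = \tfrac{1}{2}\log\bigl(1-\rho^{*2}\bigr) + \frac{\rho^{*2}}{1-\rho^{*2}}
     && \text{since } E_0[X_k X_l] = 0, \\
D_0 + D_1 &= \frac{\rho^{*2}}{1-\rho^{*2}}.
\end{align*}
```

Replacing $\rho^*$ by $-\rho^*$ leaves both expressions unchanged, which agrees with the equalities between the $e+$ and $e-$ versions in the definitions of $D_0$ and $D_1$.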

SLIDE 16

Asymptotic Optimality

SLIDE 17

Universal Lower Bound

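Let
\[ h(x, y) := x \log\frac{x}{1 - y} + (1 - x) \log\frac{1 - x}{y}, \qquad x, y \in (0, 1). \]

Lemma. If $\alpha, \beta, \gamma \in (0, 1)$ are such that $\alpha + \beta < 1$ and $\beta + 2\gamma < 1$, $e \in E$, and $(\tau, d) \in \Delta(\alpha, \beta, \gamma)$, then
\[ E_0[\tau] \ge \frac{h(\alpha, \beta)}{D_0}, \]
\[ E_{e+}[\tau],\ E_{e-}[\tau] \ge \frac{h(\beta, \alpha)}{D_1} \vee \frac{h(\beta + \gamma, \gamma) \vee h(\gamma, \beta + \gamma)}{D_0 + D_1}. \]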
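The lower bounds in the lemma are easy to evaluate numerically. The sketch below is a minimal illustration that takes the KL numbers D0 and D1 as user-supplied inputs; the function names `h` and `lower_bounds` are hypothetical, and the numerical values of D0 and D1 in the usage line are arbitrary placeholders rather than values from the presentation.

```python
import math

def h(x, y):
    """h(x, y) = x*log(x/(1-y)) + (1-x)*log((1-x)/y) for x, y in (0, 1)."""
    return x * math.log(x / (1 - y)) + (1 - x) * math.log((1 - x) / y)

def lower_bounds(alpha, beta, gamma, D0, D1):
    """Return the lemma's lower bounds on E_0[tau] and on E_{e+}[tau], E_{e-}[tau]."""
    if not (alpha + beta < 1 and beta + 2 * gamma < 1):
        raise ValueError("need alpha + beta < 1 and beta + 2*gamma < 1")
    under_null = h(alpha, beta) / D0
    isolation = max(h(beta + gamma, gamma), h(gamma, beta + gamma)) / (D0 + D1)
    under_alt = max(h(beta, alpha) / D1, isolation)
    return under_null, under_alt

# Example usage with placeholder KL numbers (assumed values, for illustration only).
lb_null, lb_alt = lower_bounds(alpha=1e-2, beta=1e-2, gamma=1e-3, D0=0.6, D1=0.3)
```

Note that h(x, y) is the Kullback-Leibler divergence between Bernoulli(x) and Bernoulli(1 − y), so it is nonnegative and vanishes when x = 1 − y.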

SLIDE 18

Main Result: Asymptotic Optimality

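The definition of the function $h$ implies that, as $x, y \to 0$,
\[ h(x, y) \sim |\log y|, \qquad h(x, y) \vee h(y, x) \sim |\log(x \wedge y)|. \]

Theorem. Suppose the thresholds in $(\tau^*, d^*)$ are selected according to (1). Then, for every $e \in E$, as $\alpha, \beta, \gamma \to 0$ we have
\[ E_0[\tau^*] \sim \inf_{(\tau, d) \in \Delta(\alpha, \beta, \gamma)} E_0[\tau] \sim \frac{|\log \beta|}{D_0}, \]
\[ E_{e+}[\tau^*] \sim \inf_{(\tau, d) \in \Delta(\alpha, \beta, \gamma)} E_{e+}[\tau] \sim \frac{|\log \alpha|}{D_1} \vee \frac{|\log \gamma|}{D_0 + D_1}, \]
\[ E_{e-}[\tau^*] \sim \inf_{(\tau, d) \in \Delta(\alpha, \beta, \gamma)} E_{e-}[\tau] \sim \frac{|\log \alpha|}{D_1} \vee \frac{|\log \gamma|}{D_0 + D_1}. \]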
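To get a feel for the first-order expressions in the theorem, the following sketch evaluates them for given tolerances. It assumes closed forms for D0 and D1 obtained from the bivariate Gaussian density with unit variances, namely D1 = −½ log(1 − ρ*²) and D0 = ρ*²/(1 − ρ*²) − D1; these closed forms are not stated on the slides and should be treated as an assumption of this example. The function names are hypothetical.

```python
import math

def kl_numbers(rho_star):
    """Assumed closed forms of (D0, D1) for unit-variance Gaussians with correlation rho_star."""
    r2 = rho_star ** 2
    D1 = -0.5 * math.log1p(-r2)
    D0 = r2 / (1 - r2) - D1
    return D0, D1

def first_order_sample_sizes(alpha, beta, gamma, rho_star):
    """First-order approximations of E_0[tau*] and of E_{e+}[tau*] = E_{e-}[tau*]."""
    D0, D1 = kl_numbers(rho_star)
    under_null = abs(math.log(beta)) / D0
    under_alt = max(abs(math.log(alpha)) / D1, abs(math.log(gamma)) / (D0 + D1))
    return under_null, under_alt

# Example usage with rho* = 0.7, alpha = beta = 1e-2, gamma = 1e-3.
approx_null, approx_alt = first_order_sample_sizes(1e-2, 1e-2, 1e-3, rho_star=0.7)
```

These are leading-order terms only: they do not depend on the number of pairs K, whereas the thresholds in (1) do, so the values can differ noticeably from simulated expected sample sizes at moderate error levels.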

SLIDE 19

Simulation Study

SLIDE 20

An Alternate Rule

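– An alternate rule $(\tau_{int}, d_{int})$ is a modification of the intersection rule proposed in [De and Baron, 2012], where
  \[ \tau_{int} := \inf\{n \ge 1 : 0 \le p(n) \le 1 \ \text{and} \ \Lambda_e(n) \notin (1/A, B) \ \text{for all } e \in E\}, \]
  \[ d_{int} := \begin{cases} \emptyset & \text{if } p(\tau_{int}) = 0, \\ i_1(\tau_{int}) & \text{otherwise,} \end{cases} \]
  \[ p(n) = |\{e \in E : \Lambda_e(n) > 1\}|. \]
– $(\tau_{int}, d_{int}) \in \Delta(\alpha, \beta, \gamma)$ when the thresholds are
  \[ A = \frac{1}{\beta} \quad \text{and} \quad B = \max\left\{\frac{K}{\alpha}, \frac{K - 1}{\gamma}\right\}. \]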
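For comparison, the modified intersection rule can be sketched in the same style, assuming the log-statistics log Λ_e(n) are supplied row by row in an array. The function name `intersection_rule`, the array layout and the return convention (a Python set for the decision, `(None, None)` if the rule does not stop within the data) are hypothetical choices for illustration.

```python
import numpy as np

def intersection_rule(logL, A, B):
    """Apply the modified intersection rule to logL, a (T, K) array of log Lambda_e(n).

    Stops at the first n where at most one statistic exceeds 1 (log > 0) and
    every statistic lies outside the open interval (1/A, B).
    """
    logA, logB = np.log(A), np.log(B)
    for n, row in enumerate(logL, start=1):
        num_above_one = int(np.sum(row > 0))                 # p(n)
        all_outside = bool(np.all((row <= -logA) | (row >= logB)))
        if num_above_one <= 1 and all_outside:
            if num_above_one == 0:
                return n, set()
            return n, {int(np.argmax(row))}                  # pair with the largest statistic
    return None, None

# Example usage with a tiny hand-made array of log-statistics (K = 3).
demo = np.array([[10.0, 1.0, -6.0],
                 [10.0, -5.0, -6.0]])
tau_int, d_int = intersection_rule(demo, A=100, B=1000)
```

Unlike a rule that only tracks the top statistics, this one has to wait until every pair's statistic has left the interval (1/A, B), which is a plausible reason for it to stop later.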

SLIDE 21

Illustration

[Figure: two panels, each plotting log(statistic) against sample size for the pairs (1,2), (2,3), (3,1) and marking where the proposed rule stops and where the modified intersection rule stops. First panel: Σ has unit diagonal and 0.8 in the (1,2) and (2,1) entries, zero elsewhere; the thresholds log(B), log(C) and −log(A) are shown. Second panel: Σ is the identity matrix; the thresholds log(B) and −log(A) are shown.]

SLIDE 22

Comparison

– $p = 10$, $\rho^* = 0.7$, $\alpha = \beta = 10^{-2}$, $\gamma = 10^{-3}$.
– Only one pair is correlated, with correlation coefficient ρ; all others are uncorrelated.
– The value of ρ is varied over the interval (−0.9, 0.9).

[Figure: expected sample size against the true value of the correlation in the correlated pair (axis marked at −0.7, 0.0, 0.7), for the intersection rule and the proposed rule.]

SLIDE 23

Summary

SLIDE 24

Summary

– Proposed the problem of quick detection and isolation of a correlated pair in a Gaussian random vector.
    – Sequential multiple testing that controls three kinds of error: false alarm, missed detection and wrong isolation.
    – Goal: minimize the average sample size subject to three error constraints.
– Proposed a very simple rule based on the mixture likelihood ratios of the pairs and established its asymptotic optimality.
– Compared our rule numerically with an alternative one and showed that our rule performs significantly better, especially when the true value of the correlation is much higher.

SLIDE 25

References

SLIDE 26

References i

Choi, S. C. (1971). Sequential test for correlation coefficients. Journal of the American Statistical Association, 66(335):575–576.

De, S. K. and Baron, M. (2012). Sequential Bonferroni methods for multiple hypothesis testing with strong control of family-wise error rates I and II. Sequential Analysis, 31(2):238–262.

Heydari, J. and Tajer, A. (2017). Quickest search for local structures in random graphs. IEEE Transactions on Signal and Information Processing over Networks, 3(3):526–538.

SLIDE 27

References ii

Kowalski, C. J. (1971). The OC and ASN functions of some SPRT's for the correlation coefficient. Technometrics, 13(4):833–841.

Pradhan, M. and Sathe, Y. S. (1975). An unbiased estimator and a sequential test for the correlation coefficient. Journal of the American Statistical Association, 70(349):160–161.

Song, Y. and Fellouris, G. (2017). Asymptotically optimal, sequential, multiple testing procedures with prior information on the number of signals. Electronic Journal of Statistics, 11(1):338–363.

SLIDE 28

References iii

Wald, A. (1945). Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2):117–186.

Wolde-Tsadik, G. (1976). A generalization of an SPRT for the correlation coefficient. Journal of the American Statistical Association, 71(355):709–710.

Woodroofe, M. (1979). Repeated likelihood ratio tests. Biometrika, 66(3):453–463.

SLIDE 29

Thank you!