sequential detection and isolation of a correlated pair



  1. Sequential Detection and Isolation of a Correlated Pair
     Anamitra Chaudhuri, Department of Statistics, University of Illinois, Urbana-Champaign
     Joint work with Georgios Fellouris
     2020 IEEE International Symposium on Information Theory, Los Angeles, California, 21-26 June 2020

  2. Introduction

  3. Motivation
     – Quickest inference about the underlying dependence structure.
     – Applications: environmental monitoring, sensor networks, fault detection in power grids, neural coding, etc.
     – In this context,
       - data are observed sequentially and the sample size is not fixed in advance,
       - there are multiple hypotheses regarding the dependence structure.
     – Goal: stop sampling as quickly as possible and identify the true hypothesis while controlling the probabilities of error.

  4. Related works
     – Detection and isolation of the correlation structure in a $p$-variate Gaussian random vector.
     – $p = 2$: sequential hypothesis testing for the correlation coefficient $\rho$ of a bivariate Gaussian
       - Binary hypothesis testing [Choi, 1971, Kowalski, 1971, Pradhan and Sathe, 1975, Wolde-Tsadik, 1976, Wald, 1945, ...]
       - Two-sided version [Woodroofe, 1979]
     – $p > 2$: sequential multiple testing and design
       - An observation from only one component is taken at each time; temporal dependence [Heydari and Tajer, 2017]
     – Sequentially observed data from independent streams, simultaneous testing of multiple binary hypotheses [Song and Fellouris, 2017]

  5. Goal
     In this work,
     – data from all sources are observed sequentially,
     – the observations are independent over time,
     – at most one pair of components of the random vector is correlated.
     Goal:
     – stop sampling as quickly as possible,
     – identify the correlated pair, if there is one,
     – control three kinds of error:
       - False alarm: detecting a correlated pair when there is none.
       - Missed detection: failing to detect a correlated pair when there is one.
       - Wrong isolation: identifying the wrong correlated pair when there is one.

  6. Problem formulation

  7. Problem Setup
     – $p$ information sources: $\{X_i(t) : t \in \mathbb{N}\}$, $i = 1, \dots, p$.
       - For a fixed source $i \in \{1, \dots, p\}$, $X_i(t) \overset{\text{iid}}{\sim} N(0, 1)$, $t \in \mathbb{N}$.
       - The set of all (unordered) pairs: $\mathcal{E} := \{(i, j) : 1 \le i < j \le p\}$.
       - At each time $t \in \mathbb{N}$, $\mathrm{Corr}(X_k(t), X_l(t)) = \rho_e$, where $e = (k, l) \in \mathcal{E}$.
     – Given a user-specified value $\rho^* \in (0, 1)$, we perform multiple testing:
       - for each $e \in \mathcal{E}$, $H_0 : \rho_e = 0$ vs. $H_1 : |\rho_e| = \rho^*$,
       - when at most one of the $\binom{p}{2}$ nulls should be rejected.
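As a rough illustration of this setup, the sketch below draws data from $p$ standard Gaussian sources in which at most one pair is correlated. The function names `sample_sources` and `all_pairs`, the 1-based pair indexing and the seeding scheme are assumptions made for this example and do not come from the talk.

```python
import math
import random


def all_pairs(p):
    """The set of unordered pairs (i, j) with 1 <= i < j <= p."""
    return [(i, j) for i in range(1, p + 1) for j in range(i + 1, p + 1)]


def sample_sources(p, n, pair=None, rho=0.0, seed=0):
    """Draw n time steps from p standard Gaussian sources.

    If pair = (k, l) is given (1-based, k < l), sources k and l have
    correlation rho at each time; all other pairs are independent.
    Observations are independent over time.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        if pair is not None:
            k, l = pair
            # X_l = rho * X_k + sqrt(1 - rho^2) * Z keeps unit variance
            # and gives Corr(X_k, X_l) = rho.
            x[l - 1] = rho * x[k - 1] + math.sqrt(1.0 - rho * rho) * x[l - 1]
        data.append(x)
    return data
```

For example, `sample_sources(p=3, n=30, pair=(1, 2), rho=0.8)` produces data with the same dependence structure as a covariance matrix whose only non-zero off-diagonal entry is 0.8 for the pair (1,2).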

  8. Problem Setup
     – $\mathcal{F}_t = \sigma(X(1), \dots, X(t))$, where $X(t) = (X_1(t), X_2(t), \dots, X_p(t))$.
     – A sequential test $(\tau, d)$ consists of:
       - an $\{\mathcal{F}_t\}$-stopping time $\tau$, at which we stop sampling,
       - an $\mathcal{F}_\tau$-measurable decision rule $d$, which denotes the subset of pairs declared to be correlated upon stopping.
     – Since there is at most one correlated pair, let
       - $P_0$: the probability measure when all sources are independent,
       - $P_e^+$ (resp. $P_e^-$): the probability measure when the pair $e$ has correlation $\rho^*$ (resp. $-\rho^*$) and all other sources are independent.

  9. Problem Setup
     – $\Delta(\alpha, \beta, \gamma)$: the class of sequential tests $(\tau, d)$ for which
       - False alarm: $P_0(d \neq \emptyset) \le \alpha$,
       - Missed detection: for all $e \in \mathcal{E}$, $P_e^+(d = \emptyset),\ P_e^-(d = \emptyset) \le \beta$,
       - Wrong isolation: for all $e \in \mathcal{E}$, $P_e^+(d \neq \emptyset, d \neq \{e\}),\ P_e^-(d \neq \emptyset, d \neq \{e\}) \le \gamma$.
     – Problem: find $(\tau, d) \in \Delta(\alpha, \beta, \gamma)$ that minimizes $E[\tau]$ under $P_0$ and under $P_e^+$, $P_e^-$ for every $e \in \mathcal{E}$, to a first-order asymptotic approximation as $\alpha, \beta, \gamma \to 0$.

  10. Notations and Statistics
     – For each $e \in \mathcal{E}$, the likelihood ratios
       $$\Lambda_e^+(n) := \frac{dP_e^+}{dP_0}(\mathcal{F}_n), \qquad \Lambda_e^-(n) := \frac{dP_e^-}{dP_0}(\mathcal{F}_n).$$
     – Mixture likelihood ratio statistic for the two-sided testing problem:
       $$\Lambda_e(n) := \frac{\Lambda_e^+(n) + \Lambda_e^-(n)}{2}.$$
     – At time $n$, the ordered mixture likelihood ratio statistics are
       $$\Lambda_{(1)}(n) \ge \dots \ge \Lambda_{(K)}(n), \quad \text{and} \quad \Lambda_{i_k}(n) \equiv \Lambda_{(k)}(n), \quad k = 1, \dots, K := \binom{p}{2},$$
       so that $i_k$ is the pair with the $k$-th largest statistic at time $n$.
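For standard Gaussian sources these statistics have a closed form, since only the two coordinates of the pair $e$ enter the density ratio. The sketch below computes them on the log scale; the function names are hypothetical, and the per-observation formula is the usual log ratio of a bivariate normal density with correlation $\rho$ to the independent one, worked out here rather than taken from the slides.

```python
import math


def log_lr_increment(x, y, rho):
    """log of f_rho(x, y) / f_0(x, y) for one bivariate observation.

    f_rho is the bivariate standard normal density with correlation rho,
    f_0 the density of two independent standard normals.
    """
    one_minus = 1.0 - rho * rho
    quad = 2.0 * rho * x * y - rho * rho * (x * x + y * y)
    return -0.5 * math.log(one_minus) + quad / (2.0 * one_minus)


def log_mixture_lr(xs, ys, rho_star):
    """log Lambda_e(n), with Lambda_e = (Lambda_e^+ + Lambda_e^-) / 2.

    xs and ys hold the first n observations of the two sources in pair e.
    """
    log_plus = sum(log_lr_increment(x, y, rho_star) for x, y in zip(xs, ys))
    log_minus = sum(log_lr_increment(x, y, -rho_star) for x, y in zip(xs, ys))
    # Numerically stable log((e^a + e^b) / 2).
    hi, lo = max(log_plus, log_minus), min(log_plus, log_minus)
    return hi + math.log1p(math.exp(lo - hi)) - math.log(2.0)
```

Working with logarithms avoids overflow when one of the two one-sided likelihood ratios grows exponentially in $n$.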

  11. Proposed Procedure

  12. Proposed Rule
     Inspired by the gap-intersection rule proposed in [Song and Fellouris, 2017], our proposed procedure is $(\tau^*, d^*)$, where
     – $\tau^* := \min\{\tau_1, \tau_2\}$, with
       - $\tau_1 := \inf\{n \ge 1 : \Lambda_{(1)}(n) \le 1/A\}$,
       - $\tau_2 := \inf\{n \ge 1 : \Lambda_{(1)}(n) \ge B,\ \Lambda_{(1)}(n)/\Lambda_{(2)}(n) \ge C\}$,
     – $d^* := \begin{cases} \emptyset & \text{if } \tau_1 < \tau_2, \\ i_1(\tau^*) & \text{if } \tau_2 < \tau_1. \end{cases}$
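The rule can be sketched as a loop over time that ranks the log statistics and checks the two stopping conditions. This is an illustrative sketch, not the authors' code: the name `gap_rule`, the input format (one dict of log mixture statistics per time step) and the `(None, None)` return value for a stream that ends early are all assumptions of this example. It needs at least two pairs, that is $p \ge 3$.

```python
import math


def gap_rule(log_stats_stream, A, B, C):
    """Run the proposed rule on a stream of {pair: log Lambda_e(n)} dicts.

    Returns (n, decision), where decision is None when no pair is declared
    correlated, or the pair with the largest statistic otherwise.
    Returns (None, None) if the stream ends before the rule stops.
    """
    log_A, log_B, log_C = math.log(A), math.log(B), math.log(C)
    for n, stats in enumerate(log_stats_stream, start=1):
        ranked = sorted(stats.items(), key=lambda kv: kv[1], reverse=True)
        (top_pair, top), (_, second) = ranked[0], ranked[1]
        # tau_1: even the largest statistic has fallen below 1/A.
        if top <= -log_A:
            return n, None
        # tau_2: the largest statistic exceeds B and leads the
        # second largest by a factor of at least C.
        if top >= log_B and top - second >= log_C:
            return n, top_pair
    return None, None
```

Because $A, B > 1$, the two conditions cannot hold at the same time, so the order in which they are checked does not matter.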

  13. Illustration
     [Figure: two panels, one for each covariance matrix below. Each panel plots log(statistic) against sample size (0 to 30) for the pairs (1,2), (2,3) and (3,1), with threshold lines labelled log(B), log(C) and −log(A), and marks the time at which sampling stops.]
     $$\Sigma = \begin{pmatrix} 1 & 0.8 & 0 \\ 0.8 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

  14. Error Control
     Recall $K = \binom{p}{2}$.
     Theorem. For any $A, B, C > 1$, we have
     $$P_0(d^* \neq \emptyset) \le K/B,$$
     $$P_e^+(d^* = \emptyset) = P_e^-(d^* = \emptyset) \le 1/A,$$
     $$P_e^+(d^* \neq \emptyset, d^* \neq \{e\}) = P_e^-(d^* \neq \emptyset, d^* \neq \{e\}) \le (K - 1)/C.$$
     In particular, $(\tau^*, d^*) \in \Delta(\alpha, \beta, \gamma)$ when
     $$A = \frac{1}{\beta}, \qquad B = \frac{K}{\alpha}, \qquad \text{and} \qquad C = \frac{K - 1}{\gamma}. \tag{1}$$
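The threshold choice follows from setting each bound in the theorem equal to its target level: $K/B = \alpha$, $1/A = \beta$ and $(K-1)/C = \gamma$. A minimal sketch of that computation, with a hypothetical function name:

```python
from math import comb


def thresholds(p, alpha, beta, gamma):
    """Thresholds (A, B, C) that make each error bound meet its target.

    With K = C(p, 2):  K / B = alpha,  1 / A = beta,  (K - 1) / C = gamma.
    """
    K = comb(p, 2)
    A = 1.0 / beta
    B = K / alpha
    C = (K - 1) / gamma
    return A, B, C
```

With the values used in the comparison slide ($p = 10$, $\alpha = \beta = 10^{-2}$, $\gamma = 10^{-3}$) there are $K = 45$ pairs, so the thresholds come out at about $A = 100$, $B = 4500$ and $C = 44000$.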

  15. Asymptotic Upper Bound
     – For each $e \in \mathcal{E}$, the KL information numbers
       $$D_0 := E_0[-\log \Lambda_e^+(1)] = E_0[-\log \Lambda_e^-(1)], \qquad D_1 := E_e^+[\log \Lambda_e^+(1)] = E_e^-[\log \Lambda_e^-(1)].$$
     – Let $x \wedge y := \min\{x, y\}$, $x \vee y := \max\{x, y\}$.
     Lemma. Let $e \in \mathcal{E}$. As $A, B, C \to \infty$ we have
     $$E_0[\tau^*] \le \frac{\log A}{D_0}\,(1 + o(1)),$$
     $$E_e^-[\tau^*],\ E_e^+[\tau^*] \le \left( \frac{\log B}{D_1} \vee \frac{\log C}{D_0 + D_1} \right)(1 + o(1)).$$
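In this Gaussian setting the two information numbers can be written in closed form, since each is a Kullback-Leibler divergence between a bivariate standard normal with correlation $\pm\rho^*$ and one with independent coordinates. The sketch below evaluates them; the closed forms are the standard Gaussian KL expressions worked out for this example, and the function name is hypothetical.

```python
import math


def kl_numbers(rho_star):
    """Closed-form (D_0, D_1) for a pair of standard Gaussian sources.

    D_1 = -0.5 * log(1 - rho^2)                   (correlated vs. independent)
    D_0 = rho^2 / (1 - rho^2) + 0.5 * log(1 - rho^2)  (independent vs. correlated)
    Both depend on rho only through rho^2, so the sign does not matter.
    """
    r2 = rho_star * rho_star
    d1 = -0.5 * math.log(1.0 - r2)
    d0 = r2 / (1.0 - r2) + 0.5 * math.log(1.0 - r2)
    return d0, d1
```

For $\rho^* = 0.7$ this gives roughly $D_0 \approx 0.624$ and $D_1 \approx 0.337$.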

  16. Asymptotic Optimality

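  17. Universal Lower Bound
     – Let
       $$h(x, y) := x \log\left(\frac{x}{1 - y}\right) + (1 - x) \log\left(\frac{1 - x}{y}\right), \qquad x, y \in (0, 1).$$
     Lemma. If $\alpha, \beta, \gamma \in (0, 1)$ are such that $\alpha + \beta < 1$ and $\beta + 2\gamma < 1$, $e \in \mathcal{E}$, and $(\tau, d) \in \Delta(\alpha, \beta, \gamma)$, then
     $$E_0[\tau] \ge \frac{h(\alpha, \beta)}{D_0},$$
     $$E_e^+[\tau],\ E_e^-[\tau] \ge \frac{h(\beta, \alpha)}{D_1} \vee \frac{h(\beta + \gamma, \gamma) \vee h(\gamma, \beta + \gamma)}{D_0 + D_1}.$$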
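To get a feel for the size of these bounds, the sketch below evaluates $h$ and the two lower bounds for given error levels and information numbers. The function names are hypothetical, and $D_0$, $D_1$ are passed in as plain numbers rather than computed.

```python
import math


def h(x, y):
    """h(x, y) = x log(x / (1 - y)) + (1 - x) log((1 - x) / y), x, y in (0, 1)."""
    return x * math.log(x / (1.0 - y)) + (1.0 - x) * math.log((1.0 - x) / y)


def lower_bounds(alpha, beta, gamma, d0, d1):
    """Lower bounds on E_0[tau] and on E_e^+[tau], E_e^-[tau]."""
    under_null = h(alpha, beta) / d0
    detection = h(beta, alpha) / d1
    isolation = max(h(beta + gamma, gamma), h(gamma, beta + gamma)) / (d0 + d1)
    return under_null, max(detection, isolation)
```

For small arguments $h(x, y)$ is close to $|\log y|$; for instance $h(0.01, 0.01) = 0.98 \log 99 \approx 4.50$, against $|\log 0.01| \approx 4.61$.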

  18. Main Result: Asymptotic Optimality
     From the definition of the function $h$, as $x, y \to 0$,
     - $h(x, y) \sim |\log y|$,
     - $h(x, y) \vee h(y, x) \sim |\log(x \wedge y)|$.
     Theorem. Suppose the thresholds in $(\tau^*, d^*)$ are selected according to (1). Then, for every $e \in \mathcal{E}$, as $\alpha, \beta, \gamma \to 0$ we have
     $$E_0[\tau^*] \sim \inf_{(\tau, d) \in \Delta(\alpha, \beta, \gamma)} E_0[\tau] \sim \frac{|\log \beta|}{D_0},$$
     $$E_e^+[\tau^*] \sim \inf_{(\tau, d) \in \Delta(\alpha, \beta, \gamma)} E_e^+[\tau] \sim \frac{|\log \alpha|}{D_1} \vee \frac{|\log \gamma|}{D_0 + D_1},$$
     $$E_e^-[\tau^*] \sim \inf_{(\tau, d) \in \Delta(\alpha, \beta, \gamma)} E_e^-[\tau] \sim \frac{|\log \alpha|}{D_1} \vee \frac{|\log \gamma|}{D_0 + D_1}.$$
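The first-order expressions in the theorem are easy to evaluate numerically. The sketch below does so for given error levels and information numbers; the function name is hypothetical, and the outputs are leading-order approximations only, not exact expected sample sizes.

```python
import math


def first_order_sample_sizes(alpha, beta, gamma, d0, d1):
    """Leading-order expected sample sizes of the optimal test.

    Under P_0:              |log beta| / D_0.
    Under P_e^+ or P_e^-:   max(|log alpha| / D_1, |log gamma| / (D_0 + D_1)).
    """
    under_null = abs(math.log(beta)) / d0
    under_alt = max(abs(math.log(alpha)) / d1,
                    abs(math.log(gamma)) / (d0 + d1))
    return under_null, under_alt
```

As an example, with $\alpha = \beta = 10^{-2}$, $\gamma = 10^{-3}$ and the assumed values $D_0 \approx 0.624$, $D_1 \approx 0.337$, the approximations are about 7.4 observations under $P_0$ and about 13.7 under a correlated pair, where the detection term $|\log \alpha| / D_1$ dominates the isolation term.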

  19. Simulation Study

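  20. An Alternate Rule
     – An alternate rule $(\tau_{\text{int}}, d_{\text{int}})$ is a modification of the intersection rule proposed in [De and Baron, 2012], where
       - $\tau_{\text{int}} := \inf\{n \ge 1 : 0 \le p(n) \le 1 \text{ and } \Lambda_e(n) \notin (1/A, B) \text{ for all } e \in \mathcal{E}\}$,
       - $d_{\text{int}} := \begin{cases} \emptyset & \text{if } p(\tau_{\text{int}}) = 0, \\ i_1(\tau_{\text{int}}) & \text{otherwise}, \end{cases}$
       - $p(n) = |\{e \in \mathcal{E} : \Lambda_e(n) > 1\}|$.
     – $(\tau_{\text{int}}, d_{\text{int}}) \in \Delta(\alpha, \beta, \gamma)$ when the thresholds are
       $$A = \frac{1}{\beta} \qquad \text{and} \qquad B = \max\left\{\frac{K}{\alpha}, \frac{K - 1}{\gamma}\right\}.$$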
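A sketch of this modified intersection rule on the log scale is given below. As before, this is only an illustration: the name `intersection_rule`, the dict-per-time-step input format and the `(None, None)` return for a stream that ends early are assumptions of this example.

```python
import math


def intersection_rule(log_stats_stream, A, B):
    """Run the modified intersection rule on {pair: log Lambda_e(n)} dicts.

    Stops at the first n at which every statistic lies outside (1/A, B)
    and at most one statistic exceeds 1. Returns (n, decision), where
    decision is None (no correlated pair) or the pair with the largest
    statistic; returns (None, None) if the stream ends first.
    """
    log_A, log_B = math.log(A), math.log(B)
    for n, stats in enumerate(log_stats_stream, start=1):
        # p(n): number of pairs whose statistic exceeds 1 (log above 0).
        num_above_one = sum(1 for v in stats.values() if v > 0.0)
        all_outside = all(v <= -log_A or v >= log_B for v in stats.values())
        if num_above_one <= 1 and all_outside:
            if num_above_one == 0:
                return n, None
            return n, max(stats, key=stats.get)
    return None, None
```

Unlike a rule that looks only at the two largest statistics, this one has to wait until every pair's statistic has left the interval $(1/A, B)$.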

  21. Illustration
     [Figure: two panels, one for each covariance matrix below. Each panel plots log(statistic) against sample size (0 to 30) for the pairs (1,2), (2,3) and (3,1), with threshold lines labelled log(B), log(C) and −log(A), and marks where the proposed rule stops and where the (modified) intersection rule stops.]
     $$\Sigma = \begin{pmatrix} 1 & 0.8 & 0 \\ 0.8 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

  22. Comparison
     – $p = 10$, $\rho^* = 0.7$, $\alpha = \beta = 10^{-2}$, $\gamma = 10^{-3}$.
     – Only one pair is correlated, with correlation coefficient $\rho$; all others are uncorrelated.
     – The value of $\rho$ is varied in the interval $(-0.9, 0.9)$.
     [Figure: expected sample size (vertical axis, 20 to 100) against the true value of the correlation in the correlated pair (horizontal axis, ticks at −0.7, 0.0, 0.7), for the Intersection Rule and the Proposed Rule.]

  23. Summary

  24. Summary
     – Proposed the problem of quick detection and isolation of a correlated pair in a Gaussian random vector.
     – Sequential multiple testing that controls three kinds of error: false alarm, missed detection and wrong isolation.
     – Goal: minimize the average sample size subject to the three error constraints.
     – Proposed a very simple rule based on the mixture likelihood ratios of the pairs and established its asymptotic optimality.
     – Compared our rule numerically with an alternative one and showed that our rule performs significantly better, especially when the true value of the correlation is much higher.

  25. References

  26. References
     - Choi, S. C. (1971). Sequential test for correlation coefficients. Journal of the American Statistical Association, 66(335):575–576.
     - De, S. K. and Baron, M. (2012). Sequential Bonferroni methods for multiple hypothesis testing with strong control of family-wise error rates I and II. Sequential Analysis, 31(2):238–262.
     - Heydari, J. and Tajer, A. (2017). Quickest search for local structures in random graphs. IEEE Transactions on Signal and Information Processing over Networks, 3(3):526–538.
