
Second-Order Asymptotics of Sequential Hypothesis Testing



  1. Second-Order Asymptotics of Sequential Hypothesis Testing. Yonglong Li and Vincent Y. F. Tan, June 4, 2020

  2. Outline
     ◮ Problem Setup
     ◮ Literature Review
     ◮ Main Result
     ◮ Numerical Examples
     ◮ Proof of the Main Result

  7. Sequential Hypothesis Testing: Problem Setup
     ◮ Binary hypothesis testing: $H_0 : P = P_0$ and $H_1 : P = P_1$, where $P_0$ and $P_1$ are probability distributions defined on the same alphabet $\mathcal{X}$.
     ◮ $\{X_k\}_{k=1}^{\infty}$ are i.i.d. random variables distributed according to $P$.
     ◮ $\mathcal{F}_n$ is the $\sigma$-algebra generated by $X_1^n$.
     ◮ $T$ is a stopping time adapted to the filtration $\{\mathcal{F}_n\}_{n=1}^{\infty}$ and $\mathcal{F}_T$ is the $\sigma$-algebra associated with $T$.
     ◮ $\delta$ is a $\{0,1\}$-valued function measurable with respect to $\mathcal{F}_T$. $\delta = i$ means $H_i$ is the underlying hypothesis. A pair $(\delta, T)$ is called a sequential hypothesis test (SHT).

  8. Sequential Hypothesis Testing: Problem Setup
     ◮ $P_{1|0}(\delta, T) = P_0^T(\delta = 1)$ and $P_{0|1}(\delta, T) = P_1^T(\delta = 0)$.
     ◮ Expectation constraint on the sample size $T$: for any integer $n$, $\max_{i=0,1} E_{P_i}[T] \le n$.
     Sequential Probability Ratio Test
     One important class of SHTs is the family of sequential probability ratio tests (SPRTs). Let $Y_k = \log \frac{p_0(X_k)}{p_1(X_k)}$ and $S_n = \sum_{k=1}^{n} Y_k$. For any pair of positive real numbers $\alpha$ and $\beta$, an SPRT with parameters $(\alpha, \beta)$ is defined as follows:
     $$\delta = \begin{cases} 0 & \text{if } S_T > \beta, \\ 1 & \text{if } S_T < -\alpha, \end{cases}$$
     where $T = \inf\{n \ge 1 : S_n \notin [-\alpha, \beta]\}$.
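The SPRT rule above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the presentation: the names `sprt`, `gaussian_llr` and `gaussian_stream`, the unit-variance Gaussian pair, and the thresholds in the usage part are all assumptions chosen only to make the sketch runnable.

```python
import random


def sprt(samples, llr, alpha, beta):
    """Run an SPRT with parameters (alpha, beta) on an iterable of samples.

    llr(x) must return log(p0(x) / p1(x)). Returns (delta, T): the decision
    (0 or 1) and the number of samples consumed.
    """
    s = 0.0
    for t, x in enumerate(samples, start=1):
        s += llr(x)          # S_t = Y_1 + ... + Y_t
        if s > beta:         # S_t left [-alpha, beta] through the top: decide H0
            return 0, t
        if s < -alpha:       # S_t left [-alpha, beta] through the bottom: decide H1
            return 1, t
    raise ValueError("sample stream ended before the test stopped")


def gaussian_llr(theta0, theta1):
    """Log-likelihood ratio for N(theta0, 1) versus N(theta1, 1) (illustrative pair)."""
    return lambda x: (theta0 - theta1) * (x - 0.5 * (theta0 + theta1))


def gaussian_stream(theta, rng):
    """Endless stream of i.i.d. N(theta, 1) samples."""
    while True:
        yield rng.gauss(theta, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    llr = gaussian_llr(0.0, 1.0)
    # Simulate under H0 and estimate P_{1|0} and E_{P_0}[T] empirically.
    runs = [sprt(gaussian_stream(0.0, rng), llr, alpha=4.0, beta=4.0)
            for _ in range(2000)]
    error_rate = sum(delta for delta, _ in runs) / len(runs)
    mean_T = sum(t for _, t in runs) / len(runs)
    print(f"P(1|0) estimate: {error_rate:.4f}, mean T under H0: {mean_T:.2f}")
```

Raising the thresholds lowers the empirical error rate and lengthens the test, which is the trade-off the expectation constraint controls.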

  9. Problem Setup: Error Exponents
     Given a sequence of SHTs $\{(\delta_n, T_n)\}_{n=1}^{\infty}$ that satisfy the expectation constraint, we are concerned with the error exponents $(E_0, E_1)$ defined as
     $$E_0 = \liminf_{n\to\infty} \frac{1}{n} \log \frac{1}{P_{1|0}(\delta_n, T_n)} \quad \text{and} \quad E_1 = \liminf_{n\to\infty} \frac{1}{n} \log \frac{1}{P_{0|1}(\delta_n, T_n)}.$$

  12. Literature Review
     ◮ In 1948, Wald and Wolfowitz showed that $E_0 \le D(P_1 \| P_0)$ and $E_1 \le D(P_0 \| P_1)$ and the error exponent can be achieved by a sequence of SPRTs.
     ◮ In 1986 and 1988, Lotov studied the series expansion of the error probabilities for a sequence of SPRTs.
     ◮ For the fixed length binary hypothesis testing problem, Strassen showed that the backoff from the optimal exponent $D(P_0 \| P_1)$ is of the order $\Theta(\frac{1}{\sqrt{n}})$ and characterized the implied constant as a function of the relative entropy variance and the Gaussian cumulative distribution function.
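For comparison, the fixed-length expansion attributed to Strassen in the last bullet is usually written as below. It is stated here from the standard fixed-length literature, not from the slides. The symbols $P^{*}_{0|1}(n, \varepsilon)$ (smallest type-II error of a length-$n$ test whose type-I error is at most $\varepsilon$), $V(P_0 \| P_1)$ (relative entropy variance) and $\Phi^{-1}$ (inverse Gaussian CDF) are notation assumed for this sketch.

```latex
-\frac{1}{n}\log P^{*}_{0|1}(n,\varepsilon)
  = D(P_0\|P_1)
  + \sqrt{\frac{V(P_0\|P_1)}{n}}\;\Phi^{-1}(\varepsilon)
  + O\!\left(\frac{\log n}{n}\right),
\qquad
V(P_0\|P_1) = \operatorname{Var}_{P_0}\!\left[\log\frac{p_0(X)}{p_1(X)}\right].
```

For $\varepsilon < 1/2$ the term $\Phi^{-1}(\varepsilon)$ is negative, which is the $\Theta(\frac{1}{\sqrt{n}})$ backoff mentioned in the bullet.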

  13. Main Result: Second-order Term under the Expectation Constraint
     For fixed $\lambda \in [0, 1]$, let
     $$F_n(\lambda) = \sup_{(\delta_n, T_n)\,:\,\max_{i=0,1} E_{P_i}[T_n] \le n} \left\{ \lambda \left( \frac{1}{n} \log P_{1|0}(\delta_n, T_n) + D(P_1 \| P_0) \right) + (1 - \lambda) \left( \frac{1}{n} \log P_{0|1}(\delta_n, T_n) + D(P_0 \| P_1) \right) \right\}. \quad (1)$$
     Let $\overline{F}(\lambda) = \limsup_{n\to\infty} F_n(\lambda)$ and $\underline{F}(\lambda) = \liminf_{n\to\infty} F_n(\lambda)$. If $\overline{F}(\lambda) = \underline{F}(\lambda)$, then we call this common value the second-order exponent of SHT under the expectation constraint and denote it simply by $F(\lambda)$.

  14. Main Result: Second-order Asymptotics under the Expectation Constraint
     Let $\{\alpha_k\}_{k=1}^{\infty}$ and $\{\beta_k\}_{k=1}^{\infty}$ be two increasing sequences of positive real numbers such that $\alpha_k \to \infty$ and $\beta_k \to \infty$ as $k \to \infty$. Let $T(\beta_k) = \inf\{n \ge 1 : S_n > \beta_k\}$ and $\tilde{T}(\alpha_k) = \inf\{n \ge 1 : -S_n > \alpha_k\}$. Furthermore, let $R_k = S_{T(\beta_k)} - \beta_k$ and $\tilde{R}_k = -S_{\tilde{T}(\alpha_k)} - \alpha_k$. It is known that
     ◮ if the true hypothesis is $H_0$, $\{R_k\}_{k=1}^{\infty}$ converges in distribution to some random variable $R$ and the limit is independent of the choice of $\{\beta_k\}_{k=1}^{\infty}$;
     ◮ if the true hypothesis is $H_1$, $\{\tilde{R}_k\}_{k=1}^{\infty}$ converges in distribution to some random variable $\tilde{R}$ and the limit is independent of the choice of $\{\alpha_k\}_{k=1}^{\infty}$.

  15. Main Result: Second-order Asymptotics under the Expectation Constraint
     Define
     $$A(P_0, P_1) = E[R], \quad \tilde{A}(P_0, P_1) = E[\tilde{R}], \quad B(P_0, P_1) = \log E[e^{-R}], \quad \tilde{B}(P_0, P_1) = \log E[e^{-\tilde{R}}].$$
     We note that $\tilde{A}(P_0, P_1) \neq A(P_1, P_0)$ and $\tilde{B}(P_0, P_1) \neq B(P_1, P_0)$ in general.
     Theorem 1
     Let $P_0$ and $P_1$ be such that $\max_{i=0,1} E_{P_i}\left[ \left( \log \frac{p_0(X_1)}{p_1(X_1)} \right)^2 \right] < \infty$ and $\log \frac{p_0(X_1)}{p_1(X_1)}$ is non-arithmetic when $X_1 \sim P_0$. Then for every $\lambda \in [0, 1]$,
     $$\overline{F}(\lambda) = \underline{F}(\lambda) = F(\lambda) = \lambda \left( \tilde{A}(P_0, P_1) + \tilde{B}(P_0, P_1) \right) + (1 - \lambda) \left( A(P_0, P_1) + B(P_0, P_1) \right).$$

  16. Second-order Results: Remark 1
     The rate of convergence of the optimal $\lambda$-weighted finite-length exponents
     $$\sup_{(\delta_n, T_n)} \left\{ -\frac{\lambda}{n} \log P_{1|0}(\delta_n, T_n) - \frac{1 - \lambda}{n} \log P_{0|1}(\delta_n, T_n) \right\}$$
     to the $\lambda$-weighted exponents $\lambda D(P_1 \| P_0) + (1 - \lambda) D(P_0 \| P_1)$ is $\Theta(\frac{1}{n})$.

  17. Numerical Examples: Example 1
     Let $\gamma_0$ and $\gamma_1$ be two positive real numbers such that $\gamma_0 < \gamma_1$. Let $p_0(x) = \gamma_0 e^{-\gamma_0 x}$ and $p_1(x) = \gamma_1 e^{-\gamma_1 x}$ for $x > 0$. We can numerically compute the second-order exponent under the expectation constraint. This is illustrated in the figure below for various values of $\lambda$.
     [Figure: $F$ versus $\gamma$ for $\lambda = 0.1$, $0.5$ and $0.9$. Exponential distributions as in Example 1 with $\gamma_0 = \gamma$ and $\gamma_1 = 1$.]
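One way such a numerical computation could be set up is a Monte Carlo estimate of the overshoot constants $A$, $B$, $\tilde{A}$, $\tilde{B}$. The sketch below is an assumption about method, not the procedure used for the figure. The function names, the finite threshold standing in for the limit, and the sample sizes are all hypothetical. For this exponential pair, the memoryless property makes the overshoot under $H_0$ exactly exponential with mean $(\gamma_1 - \gamma_0)/\gamma_0$, so $A + B = \gamma_1/\gamma_0 - 1 - \log(\gamma_1/\gamma_0)$, which serves as a sanity check.

```python
import math
import random


def overshoot_samples(draw_increment, threshold, n_paths, rng):
    """Sample S_T - threshold, where T = inf{n >= 1 : S_n > threshold}."""
    out = []
    for _ in range(n_paths):
        s = 0.0
        while s <= threshold:
            s += draw_increment(rng)
        out.append(s - threshold)
    return out


def second_order_exponent(gamma0, gamma1, lam, threshold=5.0,
                          n_paths=20000, seed=0):
    """Monte Carlo estimate of lam*(A~ + B~) + (1 - lam)*(A + B).

    The finite threshold approximates the limiting overshoot (assumption).
    """
    rng = random.Random(seed)
    c = math.log(gamma0 / gamma1)  # constant part of Y = log p0(X)/p1(X)
    d = gamma1 - gamma0            # slope of Y as a function of X

    # Under H0: X ~ Exp(gamma0); the walk S_n has increments Y.
    r = overshoot_samples(lambda g: c + d * g.expovariate(gamma0),
                          threshold, n_paths, rng)
    # Under H1: X ~ Exp(gamma1); the walk -S_n has increments -Y.
    r_tilde = overshoot_samples(lambda g: -c - d * g.expovariate(gamma1),
                                threshold, n_paths, rng)

    def a_plus_b(samples):
        mean = sum(samples) / len(samples)                      # estimates E[R]
        laplace = sum(math.exp(-x) for x in samples) / len(samples)
        return mean + math.log(laplace)                         # E[R] + log E[e^{-R}]

    return lam * a_plus_b(r_tilde) + (1 - lam) * a_plus_b(r)


if __name__ == "__main__":
    for lam in (0.1, 0.5, 0.9):
        print(lam, second_order_exponent(0.5, 1.0, lam))
```

Under $H_1$ the overshoot has no comparably simple form, since the increments of $-S_n$ are bounded above by $\log(\gamma_1/\gamma_0)$, so that side is where the simulation does real work.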

  18. Numerical Examples: Example 2
     Let $\theta_0$ and $\theta_1$ be two distinct real numbers. Let $p_0(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{(x - \theta_0)^2}{2}}$ and $p_1(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{(x - \theta_1)^2}{2}}$ for $x \in \mathbb{R}$. Let $\Delta(\theta) = |\theta_1 - \theta_0|$. Then we can numerically compute the second-order exponent under the expectation constraint. This is illustrated in the figure below. We note that for this case of discriminating between two Gaussians, $F(\lambda)$ does not depend on $\lambda \in [0, 1]$.
     [Figure: $F$ versus $|\Delta\theta|$. Gaussian distributions.]
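The independence from $\lambda$ can be seen from a short calculation, sketched here from the definitions of $Y_k$, $R$ and $\tilde{R}$ on the earlier slides, writing $\Delta = |\theta_1 - \theta_0|$. The step is not spelled out in the presentation.

```latex
\begin{aligned}
Y_k &= \log\frac{p_0(X_k)}{p_1(X_k)}
     = (\theta_0-\theta_1)\Big(X_k - \frac{\theta_0+\theta_1}{2}\Big), \\
X_k \sim P_0 &\;\Longrightarrow\; Y_k \sim \mathcal{N}\!\Big(\tfrac{\Delta^2}{2},\, \Delta^2\Big), \\
X_k \sim P_1 &\;\Longrightarrow\; -Y_k \sim \mathcal{N}\!\Big(\tfrac{\Delta^2}{2},\, \Delta^2\Big).
\end{aligned}
```

The walk $S_n$ under $H_0$ and the walk $-S_n$ under $H_1$ therefore have the same law, so $R$ and $\tilde{R}$ have the same distribution. Hence $\tilde{A}(P_0, P_1) = A(P_0, P_1)$ and $\tilde{B}(P_0, P_1) = B(P_0, P_1)$, and the expression in Theorem 1 reduces to $A(P_0, P_1) + B(P_0, P_1)$ for every $\lambda$.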

  19. Proof of the Main Result: Auxiliary Tools
     In the proof of Theorem 1, we use the following results on the asymptotics of the first passage time. Let $\{\alpha_i\}_{i=1}^{\infty}$ and $\{\beta_i\}_{i=1}^{\infty}$ be two increasing sequences of positive real numbers such that $\alpha_i \to \infty$ and $\beta_i \to \infty$ as $i \to \infty$. Let $(\delta_i, T_i)$ be an SPRT with parameters $(\alpha_i, \beta_i)$.
     Theorem 2 (Woodroofe)
     Assume that $\max\{E_{P_1}[Y_1^2], E_{P_0}[Y_1^2]\} < \infty$ and $Y_1$ is non-arithmetic. Then as $i \to \infty$,
     $$E_{P_0}[T_i] = \frac{\beta_i}{D(P_0 \| P_1)} + \frac{A(P_0, P_1)}{D(P_0 \| P_1)} + o(1),$$
     and
     $$E_{P_1}[T_i] = \frac{\alpha_i}{D(P_1 \| P_0)} + \frac{\tilde{A}(P_0, P_1)}{D(P_1 \| P_0)} + o(1).$$

  20. Proof of the Main Result: Auxiliary Tools
     Theorem 3 (Woodroofe)
     Assume that $\max\{E_{P_1}[Y_1^2], E_{P_0}[Y_1^2]\} < \infty$ and $Y_1$ is non-arithmetic. Then,
     $$\lim_{i\to\infty} P_{1|0}(\delta_i, T_i)\, e^{\alpha_i} = e^{\tilde{B}(P_0, P_1)}, \qquad \lim_{i\to\infty} P_{0|1}(\delta_i, T_i)\, e^{\beta_i} = e^{B(P_0, P_1)}.$$
     The following lemma characterizes the optimality of the SPRT.
     Lemma 4 (Ferguson)
     Let $(\delta, T)$ be an SPRT. Let $(\tilde{\delta}, \tilde{T})$ be any SHT such that $E_{P_0}[\tilde{T}] \le E_{P_0}[T]$ and $E_{P_1}[\tilde{T}] \le E_{P_1}[T]$. Then
     $$P_{0|1}(\delta, T) \le P_{0|1}(\tilde{\delta}, \tilde{T}) \quad \text{and} \quad P_{1|0}(\delta, T) \le P_{1|0}(\tilde{\delta}, \tilde{T}).$$
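A heuristic sketch of how these tools combine is given below. It is not the full proof: it ignores how the $o(1)$ terms are controlled and assumes the thresholds can be tuned so that the expectation constraint holds with equality.

```latex
\begin{aligned}
E_{P_0}[T_i] = n
  &\;\Longrightarrow\; \beta_i = n\,D(P_0\|P_1) - A(P_0,P_1) + o(1)
  && \text{(first-passage expansion of } E_{P_0}[T_i]\text{)} \\
\log P_{0|1}(\delta_i,T_i)
  &= -\beta_i + B(P_0,P_1) + o(1)
  && \text{(limit of } P_{0|1}(\delta_i,T_i)\,e^{\beta_i}\text{)} \\
\frac{1}{n}\log P_{0|1}(\delta_i,T_i) + D(P_0\|P_1)
  &= \frac{A(P_0,P_1) + B(P_0,P_1)}{n} + o\!\left(\frac{1}{n}\right).
\end{aligned}
```

The same steps with $\alpha_i$, $\tilde{A}$, $\tilde{B}$ and $D(P_1 \| P_0)$ give the corresponding statement for $P_{1|0}$. Lemma 4 then indicates that no other SHT with the same expected sample sizes has smaller error probabilities, which is why the constants $A + B$ and $\tilde{A} + \tilde{B}$ appear in Theorem 1 and why the gap to the first-order exponents is of order $\frac{1}{n}$.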
