Lecture 8: Hypothesis Testing


  1. Lecture 8: Hypothesis Testing. I-Hsiang Wang, Department of Electrical Engineering, National Taiwan University. ihwang@ntu.edu.tw. December 21, 2016.

  2. In this lecture, we elaborate more on binary hypothesis testing, focusing on the following aspects:
     1. Fundamental performance limits of binary hypothesis testing.
        - Log likelihood, Neyman-Pearson test.
        - Optimal trade-off between α and β.
          (α: probability of false alarm / type-I error / false positive)
          (β: probability of miss detection / type-II error / false negative)
     2. Asymptotic performance of testing from n i.i.d. samples as n → ∞.
        - Stein's regime vs. Chernoff's regime.
        - Error exponents.
     Along the way, we will introduce large deviation theory, an important set of probabilistic tools that not only helps characterize the asymptotic performance limits of binary hypothesis testing but also plays an important role in other problems.

  3. Outline: Binary Hypothesis Testing: More Details
     - Recap: Log Likelihood, Neyman-Pearson Test
     - Tradeoff between α and β
     - Asymptotic Performance: Prelude
     - Asymptotic Performance in Stein's Regime

  4. Outline (current section: Recap: Log Likelihood, Neyman-Pearson Test).

  5. Setup (Recap)
     Binary hypothesis testing:
       H0 : X ∼ P0 (null hypothesis, θ = 0)
       H1 : X ∼ P1 (alternative hypothesis, θ = 1)        (1)
     - Unknown binary parameter θ; data-generating distribution P_θ; data/observation/sample X ∼ P_θ.
     - Decision rule (randomized test) φ : X → [0, 1]; the outcome is θ̂ = 1 with probability φ(X).
     - Loss function: 0-1 loss 1{θ̂ ≠ θ}.
     Probabilities of errors (prove the following as an exercise):
       Probability of type-I error:  α_φ = E_{X∼P0}[φ(X)]
       Probability of type-II error: β_φ = E_{X∼P1}[1 − φ(X)]        (2)
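     A minimal numerical sketch (not from the slides) of equation (2): the alphabet, the distributions P0, P1, and the test φ below are made-up toy values for illustration only.

     ```python
     # Sketch (made-up numbers): type-I / type-II error probabilities of a
     # randomized test phi on a small finite alphabet, per equation (2).
     P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}   # null hypothesis P0
     P1 = {'a': 0.2, 'b': 0.3, 'c': 0.5}   # alternative hypothesis P1
     phi = {'a': 0.0, 'b': 0.4, 'c': 1.0}  # phi(x) = Pr[declare H1 | observe x]

     alpha = sum(P0[x] * phi[x] for x in P0)        # E_{X~P0}[phi(X)], false alarm
     beta  = sum(P1[x] * (1 - phi[x]) for x in P1)  # E_{X~P1}[1 - phi(X)], miss
     print(alpha, beta)
     ```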

  6. Likelihood, Log Likelihood Ratio, and Likelihood Ratio Test
     1. L(θ|x) ≜ P_θ(x), viewed as a function of the parameter θ given the data x, is called the likelihood function of θ.
     2. For binary HT, the likelihood ratio is L(x) ≜ L(1|x) / L(0|x) = P1(x) / P0(x).
     3. The log likelihood ratio (LLR) is l(x) ≜ log L(x) = log P1(x) − log P0(x).
     4. A (randomized) likelihood ratio test (LRT) is a test φ_{τ,γ}, parametrized by constants τ ∈ ℝ and γ ∈ (0, 1), defined as follows:
          φ_{τ,γ}(x) = 1 if l(x) > τ;  γ if l(x) = τ;  0 if l(x) < τ.
     Remark: In this lecture we assume the logarithm above is base 2.
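     A short sketch of the base-2 LLR and the randomized LRT φ_{τ,γ} defined above, again on a hypothetical three-symbol alphabet with made-up probabilities:

     ```python
     import math

     # Sketch (made-up P0, P1): base-2 LLR l(x) and the randomized LRT phi_{tau,gamma}.
     P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
     P1 = {'a': 0.2, 'b': 0.3, 'c': 0.5}

     def llr(x):
         # l(x) = log2 P1(x) - log2 P0(x)
         return math.log2(P1[x]) - math.log2(P0[x])

     def lrt(x, tau, gamma):
         # phi_{tau,gamma}(x) = Pr[declare H1 | observe x]
         l = llr(x)
         if l > tau:
             return 1.0
         elif l == tau:
             return gamma
         else:
             return 0.0

     print({x: llr(x) for x in P0}, lrt('c', 0.0, 0.5))
     ```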

  7. Performance of LRT
     For an LRT φ_{τ,γ}, the probabilities of errors are
       α = P0{l(X) > τ} + γ P0{l(X) = τ} = L0{l > τ} + γ L0{l = τ}
       β = P1{l(X) < τ} + (1 − γ) P1{l(X) = τ} = L1{l ≤ τ} − γ L1{l = τ}        (3)
     where L0, L1 are the distributions of the LLR under P0 and P1 respectively.
     The following facts will be useful later; the proofs are left as an exercise.
     Proposition 1: For an LRT φ_{τ,γ}, its probabilities of type-I and type-II errors satisfy
       α ≤ 2^{−τ},  β ≤ 2^{τ},  L0{l > τ} ≤ α ≤ L0{l ≥ τ},  and  L1{l < τ} ≤ β ≤ L1{l ≤ τ}.
     Furthermore, the distributions of the LLR satisfy L1(l) = 2^l L0(l).
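     A sketch evaluating (3) and checking the bounds of Proposition 1 numerically; the distributions and the pair (τ, γ) are made-up toy values (exact float equality is safe here because the tied LLR value is exactly 0):

     ```python
     import math

     # Sketch (made-up P0, P1): evaluate (3) for an LRT and check alpha <= 2^{-tau},
     # beta <= 2^{tau} from Proposition 1.
     P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
     P1 = {'a': 0.2, 'b': 0.3, 'c': 0.5}
     llr = {x: math.log2(P1[x]) - math.log2(P0[x]) for x in P0}

     tau, gamma = 0.0, 0.5
     alpha = sum(P0[x] for x in P0 if llr[x] > tau) + gamma * sum(P0[x] for x in P0 if llr[x] == tau)
     beta  = sum(P1[x] for x in P1 if llr[x] < tau) + (1 - gamma) * sum(P1[x] for x in P1 if llr[x] == tau)
     assert alpha <= 2 ** (-tau) + 1e-12 and beta <= 2 ** tau + 1e-12
     print(alpha, beta)
     ```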

  8. Neyman-Pearson Theorem and Neyman-Pearson Test
     Recall the Neyman-Pearson problem, which aims to find the lowest probability of type-II error under the constraint that the probability of type-I error is at most α:
       β*(α) ≜ inf { β_φ : φ : X → [0, 1], α_φ ≤ α }        (4)
     Let us re-state the Neyman-Pearson theorem to emphasize that β*(α) can be attained by a randomized LRT φ_{τ,γ}, called the Neyman-Pearson test.
     Theorem 1 (Neyman-Pearson: (Randomized) LRT is Optimal): For any α ∈ [0, 1], β*(α) is attained by a (randomized) LRT φ_{τ*,γ*}, where the pair (τ*, γ*) ∈ ℝ × [0, 1] is the unique solution to α = L0{l > τ} + γ L0{l = τ}.
     Hence, the inf{·} in (4) is attainable and becomes min{·}.
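     A sketch (not the lecture's code) of how the Neyman-Pearson test can be found on a finite alphabet: scan candidate thresholds (the distinct LLR values) from largest to smallest and solve α = L0{l > τ} + γ L0{l = τ} for γ. The distributions and the target α are made up.

     ```python
     import math

     # Sketch (made-up P0, P1): Neyman-Pearson test for a target type-I error alpha.
     P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
     P1 = {'a': 0.2, 'b': 0.3, 'c': 0.5}
     llr = {x: math.log2(P1[x]) - math.log2(P0[x]) for x in P0}

     def neyman_pearson(alpha):
         # Candidate thresholds are the distinct LLR values, scanned from largest down.
         for tau in sorted(set(llr.values()), reverse=True):
             above = sum(P0[x] for x in P0 if llr[x] > tau)   # L0{l > tau}
             at    = sum(P0[x] for x in P0 if llr[x] == tau)  # L0{l = tau}
             if above + at >= alpha:
                 gamma = (alpha - above) / at
                 return tau, gamma
         return float('-inf'), 0.0   # unreachable for alpha in [0, 1]

     tau, gamma = neyman_pearson(alpha=0.25)
     beta = sum(P1[x] for x in P1 if llr[x] < tau) + (1 - gamma) * sum(P1[x] for x in P1 if llr[x] == tau)
     print(tau, gamma, beta)   # beta equals beta*(0.25) for this toy example
     ```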

  9. Outline (current section: Tradeoff between α and β).

  10. Tradeoff between Probability of Type-I and Type-II Errors
     Define the collection of all feasible pairs (probability of type-I error, probability of type-II error) as follows:
       R(P0, P1) ≜ { (α_φ, β_φ) | φ : X → [0, 1] }        (5)
     Proposition 2 (Properties of R(P0, P1)): R(P0, P1) satisfies the following properties:
       1. It is closed and convex.
       2. It contains the diagonal line { (a, 1 − a) | a ∈ [0, 1] }.
       3. It is symmetric w.r.t. the diagonal line: (α, β) ∈ R(P0, P1) ⟺ (1 − α, 1 − β) ∈ R(P0, P1).
       4. The lower boundary (below the diagonal line), { β*(α) | α ∈ [0, 1] }, is attained by the Neyman-Pearson test.

  11. [Figure: the region R(P0, P1) in the (α, β) plane, i.e., (P_FA, P_MD), for (a) |X| = ∞ and (b) |X| < ∞.]
     Intuition: R(P0, P1) tells how "dissimilar" P0 and P1 are. The larger R(P0, P1) is, the easier it is to distinguish P0 and P1.

  12. Proof Sketch of Proposition 2
     - Closedness follows from the Neyman-Pearson theorem (the inf{·} is attainable and becomes min{·}) and the symmetry property (Property 3).
     - Convexity is proved by considering a convex combination φ^{(λ)} of two tests φ^{(0)} and φ^{(1)}, where φ^{(λ)}(x) ≜ (1 − λ) φ^{(0)}(x) + λ φ^{(1)}(x). Derive its (α, β), and convexity follows immediately.
     - For Property 2, consider a blind test which flips a biased (Ber(a)) coin to make the decision regardless of which x is observed; in other words, φ(x) = a for all x ∈ X. Then show that its type-I and type-II error probabilities are indeed a and 1 − a, respectively.
     - Symmetry is proved by considering, for a test φ achieving (α, β), the opposite test φ̄ defined by φ̄(x) = 1 − φ(x) for all x ∈ X.
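     A quick numerical check (made-up distributions) of two steps in this sketch: the blind test achieves (a, 1 − a), and the opposite test 1 − φ achieves (1 − α, 1 − β).

     ```python
     # Sketch (made-up P0, P1): verify the blind-test and opposite-test steps.
     P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
     P1 = {'a': 0.2, 'b': 0.3, 'c': 0.5}

     def errors(phi):
         alpha = sum(P0[x] * phi[x] for x in P0)
         beta  = sum(P1[x] * (1 - phi[x]) for x in P1)
         return alpha, beta

     a = 0.3
     blind = {x: a for x in P0}
     print(errors(blind))                      # (0.3, 0.7) = (a, 1 - a)

     phi = {'a': 0.0, 'b': 0.4, 'c': 1.0}      # an arbitrary test
     opposite = {x: 1 - phi[x] for x in phi}
     print(errors(phi), errors(opposite))      # (alpha, beta) and (1 - alpha, 1 - beta)
     ```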

  13. Example 1
     Draw R(P0, P1) for the following cases:
       - P0 = Ber(a) and P1 = Ber(b).
       - P0 = P1.
       - P0 ⊥ P1, that is, ⟨P0, P1⟩ = 0.
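     A sketch for the first case, with hypothetical values a = 0.2 and b = 0.6: trace the lower boundary β*(α) by sweeping the Neyman-Pearson test over α. (The other two cases give the diagonal line only and the full unit square, respectively.)

     ```python
     import math

     # Sketch: lower boundary of R(Ber(0.2), Ber(0.6)) via the Neyman-Pearson test.
     a, b = 0.2, 0.6
     P0 = {0: 1 - a, 1: a}
     P1 = {0: 1 - b, 1: b}
     llr = {x: math.log2(P1[x]) - math.log2(P0[x]) for x in P0}

     def np_beta(alpha):
         # Solve alpha = L0{l > tau} + gamma*L0{l = tau}, then evaluate beta.
         for tau in sorted(set(llr.values()), reverse=True):
             above = sum(P0[x] for x in P0 if llr[x] > tau)
             at    = sum(P0[x] for x in P0 if llr[x] == tau)
             if above + at >= alpha:
                 gamma = (alpha - above) / at
                 return sum(P1[x] for x in P1 if llr[x] < tau) \
                      + (1 - gamma) * sum(P1[x] for x in P1 if llr[x] == tau)
         return 0.0

     boundary = [(al / 20, np_beta(al / 20)) for al in range(21)]
     print(boundary)   # piecewise linear, with a kink at (a, 1 - b) = (0.2, 0.4)
     ```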

  14. Bounds on R(P0, P1)
     We can run Neyman-Pearson tests over all α ∈ [0, 1] and obtain the lower boundary { β*(α) | α ∈ [0, 1] } of the region R(P0, P1), which suffices to characterize the entire region. However, this can be more challenging than it first appears, especially when the observation is high-dimensional (as in the decoding of a channel code). Hence, we often seek inner and outer bounds on the region R(P0, P1).
     - Inner bound (achievability): come up with tests whose performance admits tractable bounds. Often we use a deterministic LRT with a carefully chosen threshold (see the sketch below).
     - Outer bound (converse): show that the performance of all feasible tests must satisfy certain properties.
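     The sketch referenced above, with made-up P0, P1: an inner bound on R(P0, P1) obtained from deterministic LRTs (γ ∈ {0, 1}), one (α, β) point per threshold. On a finite alphabet these are the corner points of the lower boundary; the points in between are reached by randomizing between neighbouring corners.

     ```python
     import math

     # Sketch (made-up P0, P1): achievable (alpha, beta) points from deterministic LRTs.
     P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
     P1 = {'a': 0.2, 'b': 0.3, 'c': 0.5}
     llr = {x: math.log2(P1[x]) - math.log2(P0[x]) for x in P0}

     corners = []
     for tau in sorted(set(llr.values())):
         # Deterministic LRT: declare H1 iff l(x) >= tau.
         alpha = sum(P0[x] for x in P0 if llr[x] >= tau)
         beta  = sum(P1[x] for x in P1 if llr[x] < tau)
         corners.append((alpha, beta))
     corners.append((0.0, 1.0))   # the test that never declares H1
     print(corners)
     ```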

  15. Outer Bounds
     Lemma 1 (Weak Converse): For all (α, β) ∈ R(P0, P1),
       d_b(1 − α ∥ β) ≤ D(P0 ∥ P1),   d_b(β ∥ 1 − α) ≤ D(P1 ∥ P0).
     Remark: The weak converse bound is characterized by the information divergence. Interestingly, the information divergences are expectations of the LLR:
       D(P1 ∥ P0) = E_{X∼P1}[l(X)] = E_{L1}[l],   D(P0 ∥ P1) = E_{X∼P0}[−l(X)] = −E_{L0}[l].
     Lemma 2 (Strong Converse): For all (α, β) ∈ R(P0, P1) and τ ∈ ℝ,
       α + 2^{−τ} β ≥ L0{l > τ},   β + 2^{τ} α ≥ L1{l ≤ τ}.
     Remark: The strong converse requires knowledge of the distributions of the LLR, while the weak converse only needs the expected values of the LLR.
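     A numerical sanity check of the weak converse on made-up P0, P1 and a made-up test φ; d_b here is the binary divergence in bits, matching the base-2 convention of this lecture.

     ```python
     import math

     # Sketch (made-up P0, P1, phi): check d_b(1 - alpha || beta) <= D(P0 || P1).
     P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
     P1 = {'a': 0.2, 'b': 0.3, 'c': 0.5}
     phi = {'a': 0.0, 'b': 0.4, 'c': 1.0}

     def d_bin(p, q):
         # Binary divergence d_b(p || q) = p log2(p/q) + (1-p) log2((1-p)/(1-q)),
         # with the convention 0 log 0 = 0.
         return sum(t * math.log2(t / s) for t, s in [(p, q), (1 - p, 1 - q)] if t > 0)

     D01 = sum(P0[x] * math.log2(P0[x] / P1[x]) for x in P0)   # D(P0 || P1)
     alpha = sum(P0[x] * phi[x] for x in P0)
     beta  = sum(P1[x] * (1 - phi[x]) for x in P1)
     print(d_bin(1 - alpha, beta), '<=', D01)
     ```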
