Minimax testing of a composite null hypothesis defined via a quadratic functional


  1. Minimax testing of a composite null hypothesis defined via a quadratic functional
     Joint work with L. Comminges
     Asymptotic Statistics and Related Topics, Tokyo, Japan
     Arnak S. Dalalyan, ENSAE / CREST / GENES

  2. Motivation 1
     Testing the relevance of a group of variables
     • We observe a sampled signal $f : \mathbb{R}^d \to \mathbb{R}$, $t = (t_1, \dots, t_d)^\top \mapsto f(t)$, in a noisy environment.
     • The dimension $d$ is large.
     • Based on a training sample, some variable selection procedure suggests the irrelevance of the subset of variables $t_{J^c} := \{t_j : j \in J^c\}$.
     • Based on a testing sample, we would like to check the irrelevance of $J^c$. This amounts to testing the hypothesis $\mathbf{E}[\mathrm{Var}(f(t) \mid t_J)] = 0$.
     Dalalyan, A.S. © Sept. 2, 2013

  3. Motivation 2
     Testing the validity of a partial linear model
     • We observe a sampled signal obeying the partial linear model $f(t) = g(t_J) + \beta^\top t_{J^c}$ in a noisy environment.
     • $g$, $J$ and $\beta$ are unknown.
     • The dimension $d$ is large, but the cardinality of $J$ is small.
     • For a given set $J_0$, we would like to test the hypothesis $J = J_0$. This amounts to testing the hypothesis $\mathrm{Var}[\nabla_{J_0^c} f(t)] = 0$.

  4. Motivation 3
     Testing the equality of two norms
     • Two noisy (sub)images $g_1$ and $g_2$ are observed.
     • The goal is to check whether they coincide up to a rotation and an illumination change: $g_1(z) = g_2(Rz) + a$ for all $z \in D \subset \mathbb{R}^2$, for some orthogonal matrix $R$ and some $a \in \mathbb{R}$.
     • This requires testing the hypothesis
       $H_0 : \exists (R, a)$ s.t. $g_1(z) = g_2(Rz) + a$, $\forall z \in D$,   (1)
       which is usually very time-consuming (it involves a nonlinear and nonconvex minimization step).
     • A simpler strategy is to start with testing $H_0' : \mathrm{Var}[g_1(Z)] = \mathrm{Var}[g_2(Z)]$, and to reject $H_0$ if $H_0'$ is rejected.

  5. Unifying framework
     Testing the nullspace of a quadratic functional in regression

  6. Relation to previous work

     Reference                              Non-Gaussian   Sampled   Multivariate   Beyond Q = I   Beyond Q ⪰ 0
     Ingster & Stepanova 2011                    x            x           x              x              ✓
     Ingster & Sapatinas 2009                    x            ✓           ✓              x              x
     Ingster, Sapatinas & Suslina 2012           x            x           x              ✓              x
     Laurent, Loubes & Marteau 2011              x            x           x              x              ✓
     Comminges & D. 2012                         ✓            ✓           ✓              ✓              ✓

     Remark: The approach adopted in the first three references is purely asymptotic, whereas Laurent et al. (2011) obtained nonasymptotic rates of separation.

  7. Overview of our results
     Testing procedure
     • We observe $\{(x_i, t_i)\}_{i=1,\dots,n} \subset \mathbb{R} \times [0,1]^d$ such that
       $x_i = f(t_i) + \xi_i$, $\quad f(t) = \sum_{\ell \in \mathcal{L}} \theta_\ell[f]\, \varphi_\ell(t)$,
       where the $\xi_i$ are iid with $\mathbf{E}[\xi_1] = 0$ and $t_i \overset{\text{iid}}{\sim} U[0,1]^d$.
     • We wish to test the hypothesis
       $H_0 : Q[f] = \sum_{\ell \in \mathcal{L}} q_\ell\, \theta_\ell[f]^2 = 0$ against $H_1 : |Q[f]| > \rho^2$.
     • Each $\theta_\ell[f]^2$ is unbiasedly estimated by
       $\widehat{\theta_\ell^2} = \frac{1}{n(n-1)} \sum_{i \neq i'} x_i x_{i'}\, \varphi_\ell(t_i)\, \varphi_\ell(t_{i'})$.
     • Given a sequence of weights $w = \{w_\ell\}$, we estimate $Q[f]$ by
       $\widehat{Q}_n^w = \sum_{\ell \in \mathcal{L}} w_\ell\, q_\ell\, \widehat{\theta_\ell^2}$.
     • Test: we fix a threshold $u > 0$ and reject $H_0$ if $|\widehat{Q}_n^w| > u$.
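     The testing procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`estimate_theta_sq`, `quadratic_statistic`, `reject_h0`) and the list-based data layout are hypothetical, the index set is truncated to finitely many basis functions, and the values $\varphi_\ell(t_i)$ are assumed to be precomputed. The sum over pairs $i \neq i'$ is evaluated through the identity $\sum_{i \neq i'} a_i a_{i'} = (\sum_i a_i)^2 - \sum_i a_i^2$ with $a_i = x_i \varphi_\ell(t_i)$.

```python
def estimate_theta_sq(x, phi_vals):
    """U-statistic estimate of theta_l^2 for one basis function.

    x        : list of the n responses x_i
    phi_vals : list of the n values phi_l(t_i) (assumed precomputed)
    """
    n = len(x)
    a = [x_i * p_i for x_i, p_i in zip(x, phi_vals)]
    total = sum(a)
    # sum over i != i' of a_i * a_i'  ==  (sum a_i)^2 - sum a_i^2
    return (total * total - sum(a_i * a_i for a_i in a)) / (n * (n - 1))


def quadratic_statistic(x, phi_by_index, q, w):
    """Weighted estimate of Q[f]: sum over l of w_l * q_l * theta_sq_hat_l.

    phi_by_index[l] holds the values phi_l(t_i), i = 1..n.
    """
    return sum(
        w_l * q_l * estimate_theta_sq(x, phi_l)
        for w_l, q_l, phi_l in zip(w, q, phi_by_index)
    )


def reject_h0(x, phi_by_index, q, w, u):
    """Reject H0 when the absolute value of the statistic exceeds u."""
    return abs(quadratic_statistic(x, phi_by_index, q, w)) > u
```

     The closed-form identity keeps the cost linear in $n$ per basis function, instead of quadratic for the naive double loop.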

  8. Overview of our results
     Basics on the minimax rates of separation
     For any estimator $\widehat{Q}_n$, we can write $\widehat{Q}_n = Q[f] + \epsilon_n[f]$.
     • Under $H_0$: $|\widehat{Q}_n| \le \sup_{f \in \mathcal{F}_0} |\epsilon_n[f]|$.
     • Under $H_1$: $|\widehat{Q}_n| \ge \rho^2 - \sup_{f \in \mathcal{F}_1(\rho)} |\epsilon_n[f]|$.
     • The test statistic $\widehat{Q}_n$ leads to a consistent test if
       $\sup_{f \in \mathcal{F}_0} |\epsilon_n[f]| < \rho^2 - \sup_{f \in \mathcal{F}_1(\rho)} |\epsilon_n[f]|$ (with prob. $1 - \gamma$).
     • Let $\rho_n(\widehat{Q})$ be the smallest possible $\rho > 0$ satisfying
       $\sup_{f \in \mathcal{F}_0} |\epsilon_n[f]| + \sup_{f \in \mathcal{F}_1(\rho)} |\epsilon_n[f]| < \rho^2$ (with prob. $1 - \gamma$).
     • Minimax rate of separation: $\rho_n^* \asymp \inf_{\widehat{Q}_n} \rho_n(\widehat{Q})$.
     Where the difference with the minimax rate of estimation comes from: replacing $\sup_{f \in \mathcal{F}_1(\rho)}$ with $\sup_{\rho > 0} \sup_{f \in \mathcal{F}_1(\rho)}$ leads to the minimax rate of estimation, but this is sub-optimal!

  9. Overview of our results
     Minimax rates of separation
     • Let us call the ratio $|q_\ell| / c_\ell$ the importance of the axis $\varphi_\ell$.
     • Let $\mathcal{N}(T)$ be the set of indices with importance $\ge T > 0$.
     • Let $M(T) = \sum_{\ell \in \mathcal{N}(T)} q_\ell^2$.
     • In the general case, the minimax rate of separation is given by
       $(\rho_{n,\gamma}^*)^2 = \inf_{T > 0} \Big\{ \frac{4}{n \gamma^{1/2}} \big( B_1 M(T) + B_2 n \big)^{1/2} + 2\sqrt{2}\, T \Big\} \asymp \inf_{T > 0} \Big\{ \frac{M(T)^{1/2}}{n} + T \Big\} \vee n^{-1/2}$.
     • Interestingly, in the case of positive $Q \succeq 0$,
       $(\rho_{n,\gamma}^*)^2 \asymp \inf_{T > 0} \Big\{ \frac{M(T)^{1/2}}{n} + T \Big\}$.
     • In both cases, the test defined using the statistic $\widehat{Q}_n^w$ with the weights $w_\ell = \mathbb{1}(|q_\ell| / c_\ell \ge T)$ achieves the optimal rate.
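     To make the role of the threshold $T$ concrete, here is a small Python sketch that evaluates the $T$-dependent part of the rate, $M(T)^{1/2}/n + T$, on a finite list of coefficients and picks the best $T$ on a user-supplied grid. The function names and the grid search are illustrative assumptions, not part of the talk; a real index set is infinite and would need truncation.

```python
import math


def separation_objective(T, q, c, n):
    """Return M(T)^(1/2) / n + T, where M(T) sums q_l^2 over the
    indices whose importance |q_l| / c_l is at least T."""
    m = sum(q_l * q_l for q_l, c_l in zip(q, c) if abs(q_l) >= T * c_l)
    return math.sqrt(m) / n + T


def best_threshold(q, c, n, grid):
    """Grid value of T minimising the objective (hypothetical helper)."""
    return min(grid, key=lambda T: separation_objective(T, q, c, n))


def hard_threshold_weights(T, q, c):
    """Weights w_l = 1(|q_l| / c_l >= T) used by the test statistic."""
    return [1.0 if abs(q_l) >= T * c_l else 0.0 for q_l, c_l in zip(q, c)]
```

     A small $T$ keeps many axes (large $M(T)$, hence a large variance term), while a large $T$ discards important axes (large bias term); the infimum balances the two.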

  11. Relation to norm estimation
      Phase transition / "Elbow" effect
      Let us assume the simple case $q_\ell^2 = 1$ and $c_\ell = \sum_{j=1}^d \ell_j^{2\sigma_j}$, $\ell \in \mathbb{Z}^d$.
      One can check that $M(T) \asymp T^{-d/(2\bar\sigma)}$, where $\bar\sigma^{-1} = \frac{1}{d} \sum_{j=1}^d \sigma_j^{-1}$.
      In hypotheses testing:
      • If $Q$ is positive, the minimax rate of separation is $(\rho_n^*)^2 \asymp n^{-4\bar\sigma/(4\bar\sigma + d)}$.
      • If $Q$ is neither positive nor negative, the minimax rate of separation is $(\rho_n^*)^2 \asymp n^{-\left( \frac{4\bar\sigma}{4\bar\sigma + d} \wedge \frac{1}{2} \right)}$.
      In functional estimation:
      • If $Q[f] = \|f\|_2$, the minimax rate of estimation is (Lepski et al. '99) $r_n^* \asymp n^{-2\bar\sigma/(4\bar\sigma + d)}$.
      • If $Q[f] = \|f\|_2^2$, the minimax rate of estimation is (Donoho and Nussbaum '90) $r_n^* \asymp n^{-\left( \frac{4\bar\sigma}{4\bar\sigma + d} \wedge \frac{1}{2} \right)}$.
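      As a worked check, the exponent for positive $Q$ can be recovered by balancing the two terms of $\inf_{T>0} \{ M(T)^{1/2}/n + T \}$, the expression for the squared separation rate in the positive case. The derivation below is a sketch up to constants; the last line assumes that the general case only adds a term of order $n^{-1/2}$.

```latex
\begin{align*}
M(T)^{1/2} &\asymp T^{-d/(4\bar\sigma)}, \\
\frac{T^{-d/(4\bar\sigma)}}{n} = T
  &\iff T^{(4\bar\sigma + d)/(4\bar\sigma)} = n^{-1}
  \iff T = n^{-4\bar\sigma/(4\bar\sigma + d)}, \\
(\rho_n^*)^2 &\asymp n^{-4\bar\sigma/(4\bar\sigma + d)}
  \quad \text{(positive } Q\text{)}, \\
(\rho_n^*)^2 &\asymp n^{-4\bar\sigma/(4\bar\sigma + d)} \vee n^{-1/2}
  = n^{-\left( \frac{4\bar\sigma}{4\bar\sigma + d} \wedge \frac{1}{2} \right)}
  \quad \text{(general } Q\text{)}.
\end{align*}
```

      The two exponents coincide exactly when $4\bar\sigma/(4\bar\sigma + d) = 1/2$, that is, when $\bar\sigma = d/4$, which is where the elbow sits.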

  12. Main result I
      Positive functionals
      Theorem 1. Assume that $\mathbf{E}[\xi_1^4] < \infty$ and that, for every $T > 0$, the set $\mathcal{N}(T) = \{\ell : q_\ell \ge T c_\ell\}$ is finite. For a $\gamma \in (0, 1)$, let $T_{n,\gamma}$ be such that
        $\Big( \frac{n(n-1)}{2} \sum_\ell (q_\ell - T c_\ell)_+^2 \Big)^{1/2} = \sum_\ell c_\ell (q_\ell - T c_\ell)_+ \, \big( 2 z_{1-\gamma/2} + o(1) \big)$.
      Let us define
        $\rho_{n,\gamma}^* = \Big( \frac{\sum_{\ell \in \mathcal{L}} q_\ell (q_\ell - T_{n,\gamma} c_\ell)_+}{\sum_{\ell \in \mathcal{L}} c_\ell (q_\ell - T_{n,\gamma} c_\ell)_+} \Big)^{1/2}$.
      If several conditions are fulfilled, then the test $\widehat{\phi}_n^*$ based on the array
        $w_{\ell,n}^* = \Big( 1 - \frac{T_{n,\gamma} c_\ell}{q_\ell} \Big)_+$
      satisfies $\gamma_n\big( \mathcal{F}_0, \mathcal{F}_1(\rho_{n,\gamma}^*), \widehat{\phi}_n^* \big) \le \gamma + o(1)$ as $n \to \infty$.

  13. Testing partial derivatives
      • Let $\alpha \in \mathbb{R}_+^d$ and $\sigma \in \mathbb{R}_+^d$ be two given vectors.
      • Let $Q[f] = \big\| \partial^{\sum_j \alpha_j} f / \partial t_1^{\alpha_1} \cdots \partial t_d^{\alpha_d} \big\|_2^2$ and $C[f] = \sum_{j=1}^d \big\| \partial^{\sigma_j} f / \partial t_j^{\sigma_j} \big\|_2^2$.
      • Let us define $\delta$ and $\bar\sigma$ by $\delta = \sum_{j=1}^d \alpha_j / \sigma_j$ and $\frac{1}{\bar\sigma} = \frac{1}{d} \sum_{j=1}^d \frac{1}{\sigma_j}$.
      • If $\delta < 1$ and $\bar\sigma > d/4$, then the exact minimax rate $\rho_{n,\gamma}^*$ is given by $\rho_{n,\gamma}^* = C_\gamma^* \rho_n^* (1 + o(1))$,
      • where the minimax rate is $\rho_n^* = n^{-2\bar\sigma(1-\delta)/(4\bar\sigma + d)}$ and the exact separation constant $C_\gamma^*$ is an explicit function of $z_{1-\gamma/2}$, of $\kappa = \sum_{j=1}^d \kappa_j$ (with explicit exponents $\kappa_j$ depending on $\alpha$ and $\sigma$), and of a constant $C(d, \sigma, \alpha)$ involving $\pi^{-d}$, $\prod_{i=1}^d \Gamma(\kappa_i)$, $\prod_{i=1}^d \sigma_i$, $(1 - \delta)$ and $\Gamma(\kappa + 2)$.

  14. Conclusion
      • We established minimax rates of separation in the model of regression with random design for null hypotheses corresponding to the nullspace of general quadratic functionals.
      • In the case of positive functionals, we also proved sharp-minimax optimality of the proposed procedure.
      • When comparing two norms, the minimax rate of separation is $\rho_n^* = n^{-\left( \frac{2\bar\sigma}{4\bar\sigma + d} \wedge \frac{1}{4} \right)}$. This rate shows that the watershed between the two regimes corresponds to the condition $\bar\sigma = d/4$. In other terms, we are in the regular regime when $\bar\sigma > d/4$. It is interesting to note, even if we are unable to establish a direct connection, that this is also the regime under which the Sobolev embedding $W_2^{\bar\sigma} \subset L_4([0,1]^d)$ holds true.
      • Open questions: adaptation to the unknown smoothness, unknown noise level, the case of (sparse) Besov bodies, ...
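      The two regimes in the rate for comparing two norms can be illustrated with a short Python sketch. It assumes the rate $\rho_n^* = n^{-(2\bar\sigma/(4\bar\sigma+d) \wedge 1/4)}$ with $\bar\sigma$ the harmonic mean of the smoothness indices $\sigma_j$; the function names are hypothetical.

```python
def harmonic_mean_smoothness(sigmas):
    """sigma_bar defined by 1 / sigma_bar = (1 / d) * sum_j 1 / sigma_j."""
    d = len(sigmas)
    return d / sum(1.0 / s for s in sigmas)


def separation_exponent(sigmas):
    """Exponent e such that rho_n^* = n^(-e),
    with e = min(2 * sigma_bar / (4 * sigma_bar + d), 1/4)."""
    d = len(sigmas)
    sbar = harmonic_mean_smoothness(sigmas)
    return min(2.0 * sbar / (4.0 * sbar + d), 0.25)


def is_regular_regime(sigmas):
    """True when sigma_bar > d / 4, where the rate is n^(-1/4)."""
    return harmonic_mean_smoothness(sigmas) > len(sigmas) / 4.0
```

      For instance, with $d = 2$ and $\sigma = (1, 1)$ the sketch returns the exponent $1/4$ (regular regime), while $\sigma = (1/4, 1/4)$ gives $1/6$, below the watershed $\bar\sigma = d/4$.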
