
Gram Matrix estimation in high dimension, Ilaria Giulini (INRIA)

Gram Matrix estimation in high dimension. Ilaria Giulini, INRIA (project CLASSIC), Département de Mathématiques et Applications, ENS, 45 rue d'Ulm, 75005 Paris. Joint work with Olivier Catoni. Journée DIM RDM-IdF 2013, 12 September.


  1. Gram Matrix estimation in high dimension. Ilaria Giulini, INRIA (project CLASSIC), Département de Mathématiques et Applications, ENS, 45 rue d'Ulm, 75005 Paris. Joint work with Olivier Catoni. Journée DIM RDM-IdF 2013, 12 September 2013.

  2. General Setting. Let $P \in \mathcal{M}_+^1(\mathbb{R}^d)$. The Gram matrix is
     $$G = \int x x^\top \, dP(x).$$
     Estimating $G$ is equivalent to estimating
     $$N(\theta) = \int \langle x, \theta \rangle^2 \, dP(x),$$
     since $N(\theta) = \theta^\top G \theta$. $P$ is unknown; $X_1, \dots, X_n \in \mathbb{R}^d$ is an i.i.d. sample drawn from $P$. Goal: estimate $N(\theta)$ for every $\theta \in \mathbb{R}^d$ from the sample.
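
A short derivation of the equivalence stated on this slide, added here as an illustration (the polarization step is a standard identity and is not spelled out in the slides). Knowing $N(\theta)$ for every $\theta$ recovers every bilinear form $\theta^\top G \theta'$, hence every entry of $G$:

```latex
\begin{align*}
N(\theta) &= \int \langle x, \theta \rangle^2 \, dP(x)
           = \int \theta^\top x \, x^\top \theta \, dP(x)
           = \theta^\top \Bigl( \int x x^\top \, dP(x) \Bigr) \theta
           = \theta^\top G \theta, \\
\theta^\top G \theta' &= \tfrac{1}{4} \bigl( N(\theta + \theta') - N(\theta - \theta') \bigr)
           \quad \text{(polarization, using } G = G^\top \text{)}.
\end{align*}
```

Taking $\theta = e_j$ and $\theta' = e_k$ (canonical basis vectors) gives the entry $G_{jk}$.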

  6. Assumption: $\operatorname{Tr}(G) = \int \|x\|^2 \, dP(x) < +\infty$. Our goal: estimate
     $$N(\theta) = \int \langle \theta, x \rangle^2 \, dP(x),$$
     that is, build $\hat{N}$ (depending on $X_1, \dots, X_n$) such that, with probability $1 - \epsilon$, for any $\theta \in \mathbb{R}^d$,
     $$|N(\theta) - \hat{N}(\theta)| \le \eta(n, \theta, \epsilon),$$
     where $\eta(n, \theta, \epsilon) \to 0$ as $n \to \infty$. Techniques: PAC-Bayesian.

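  7. Dimension Dependent Bound. Let
     $$\kappa = \sup_{\theta \neq 0} \frac{\int \langle \theta, x \rangle^4 \, dP(x)}{\bigl( \int \langle \theta, x \rangle^2 \, dP(x) \bigr)^2} < +\infty.$$
     For any $\epsilon > 0$ and any $n$ above an explicit threshold of the form $n > 27\,[\,\cdots]^2$ (depending on $\kappa$, $d$ and $\log(\epsilon^{-1})$), with probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$,
     $$\bigl| \hat{N}(\theta) - N(\theta) \bigr| \le N(\theta) \, \frac{\mu}{1 - 3\mu}, \qquad (1)$$
     where
     $$\mu = \sqrt{\frac{2(\kappa - 1)}{n} \bigl( \log(\epsilon^{-1}) + 1.11\, d \bigr)} + \sqrt{\frac{2\kappa \times 89\, d}{n}}.$$
     Remark: $\operatorname{Var}\bigl( \langle \theta, X \rangle^2 \bigr) \sim (\kappa - 1)\, N(\theta)^2$.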
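
To make the constant $\kappa$ concrete, here is a minimal sketch in plain Python that estimates the fourth-to-squared-second moment ratio along one fixed direction $\theta$. The function name `moment_ratio` and the two test distributions are illustrative choices, not taken from the slides. Since $\kappa$ is a supremum over all directions, the value along a single direction only gives a lower bound on it.

```python
import random

def moment_ratio(sample, theta):
    """Empirical ratio mean(<theta,x>^4) / mean(<theta,x>^2)^2 along theta."""
    squares = [sum(t * xi for t, xi in zip(theta, x)) ** 2 for x in sample]
    second = sum(squares) / len(squares)                 # estimates N(theta)
    fourth = sum(s * s for s in squares) / len(squares)  # estimates the 4th moment
    return fourth / second ** 2

rng = random.Random(0)  # fixed seed so the run is reproducible

# Standard Gaussian sample in dimension 2: the population ratio is 3 in every direction.
gaussian = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(20000)]

# Random signs in dimension 1: <theta, x>^2 is constant, so the ratio is exactly 1.
signs = [[rng.choice((-1.0, 1.0))] for _ in range(1000)]

print(moment_ratio(gaussian, [1.0, 1.0]))  # close to 3, up to sampling noise
print(moment_ratio(signs, [2.0]))          # 1.0
```

The sign example matches the remark on the slide: when the ratio equals 1, $\langle \theta, X \rangle^2$ has zero variance, and the first term of $\mu$ vanishes.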

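  9. Dimension-free Bound. With probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$, the same estimator $\hat{N}$ is such that
     $$\mathbb{1}_{\{4\mu < 1\}} \left| \frac{\hat{N}(\theta)}{N(\theta)} - 1 \right| \le \frac{\mu}{1 - 4\mu},$$
     where, for $n < 10^{20}$,
     $$\mu = \sqrt{\frac{2.07\,(\kappa - 1)}{n} \left[ \log(\epsilon^{-1}) + 4.3 + 1.6 \times \frac{\|\theta\|^2 \operatorname{Tr}(G)}{N(\theta)} \right]} + \sqrt{\frac{2\kappa \times 92\, \|\theta\|^2 \operatorname{Tr}(G)}{N(\theta)\, n}}.$$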

  10. Remark. Let $\theta_i$, $i = 1, \dots, d$, be an orthonormal basis. Then
      $$\operatorname{Tr}(G) = \int \|x\|^2 \, dP(x) = \sum_{i=1}^{d} N(\theta_i).$$
      If the energy is equally distributed, that is, $N(\theta_i) = N(\theta)$ for any $i = 1, \dots, d$, then
      $$\frac{\operatorname{Tr}(G)}{N(\theta)} = \frac{\sum_{i=1}^{d} N(\theta_i)}{N(\theta)} = \frac{d\, N(\theta)}{N(\theta)} = d.$$
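
The ratio $\|\theta\|^2 \operatorname{Tr}(G) / N(\theta)$ from this remark can be checked numerically. The sketch below uses plain Python lists; the helper names and the two example matrices are assumptions made for illustration, not part of the talk. It shows that the ratio equals $d$ when the energy is equally distributed and can be much smaller than $d$ along a dominant direction.

```python
def trace(gram):
    """Sum of the diagonal entries of a square matrix given as a list of rows."""
    return sum(gram[i][i] for i in range(len(gram)))

def energy(gram, theta):
    """Quadratic form N(theta) = theta^T G theta."""
    d = len(theta)
    return sum(theta[j] * gram[j][k] * theta[k]
               for j in range(d) for k in range(d))

def dimension_ratio(gram, theta):
    """The quantity ||theta||^2 Tr(G) / N(theta)."""
    squared_norm = sum(t * t for t in theta)
    return squared_norm * trace(gram) / energy(gram, theta)

# Equally distributed energy: G = 2.5 * I in dimension 4, so the ratio is d = 4.
isotropic = [[2.5 if j == k else 0.0 for k in range(4)] for j in range(4)]
print(dimension_ratio(isotropic, [1.0, 0.0, 0.0, 0.0]))  # 4.0

# One dominant direction: G = diag(4, 1, 1); along e_1 the ratio is 6 / 4.
spiked = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(dimension_ratio(spiked, [1.0, 0.0, 0.0]))  # 1.5
```

The ratio does not change when $\theta$ is rescaled, which is why the factor $\|\theta\|^2$ appears in it.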

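  12. PAC-Bayesian approach (D. McAllester; O. Catoni (2012)). Let $X_1, \dots, X_n \sim P$ be an i.i.d. sample. Let $\nu \in \mathcal{M}_+^1(\Theta)$ be a prior probability measure. For any $f$ and any posterior $\rho \in \mathcal{M}_+^1(\Theta)$ such that $K(\rho, \nu) < +\infty$,
      $$\mathbb{P} \left( \frac{1}{n} \sum_{i=1}^{n} \int \log\bigl( 1 + f(X_i, \theta', \lambda) \bigr) \, d\rho(\theta') \le \iint f(x, \theta', \lambda) \, dP(x) \, d\rho(\theta') + \frac{K(\rho, \nu) + \log(\epsilon^{-1})}{n} \right) \ge 1 - \epsilon,$$
      where the Kullback divergence of $\rho$ with respect to $\nu$ is
      $$K(\rho, \nu) = \begin{cases} \int \log\Bigl( \frac{d\rho}{d\nu} \Bigr) \, d\rho & \text{if } \rho \ll \nu, \\ +\infty & \text{otherwise.} \end{cases}$$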
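
A sketch of the standard argument behind an inequality of this type, given here as an illustration rather than as the authors' own proof. It assumes $f > -1$ so that the logarithms are defined, and it writes $h$ for an auxiliary function that does not appear in the slides:

```latex
\begin{align*}
&\text{(i) Independence: for each fixed } \theta', \quad
  \mathbb{E} \prod_{i=1}^{n} \frac{1 + f(X_i, \theta', \lambda)}{1 + \int f(x, \theta', \lambda) \, dP(x)} = 1. \\
&\text{(ii) Set } h(\theta') = \sum_{i=1}^{n} \log\bigl(1 + f(X_i, \theta', \lambda)\bigr)
  - n \log\Bigl(1 + \int f(x, \theta', \lambda) \, dP(x)\Bigr),
  \text{ so that } \mathbb{E} \int e^{h} \, d\nu = 1. \\
&\text{(iii) Change of measure: for every } \rho, \quad
  \int h \, d\rho \le K(\rho, \nu) + \log \int e^{h} \, d\nu. \\
&\text{(iv) Markov's inequality: } \mathbb{P}\Bigl( \int e^{h} \, d\nu \ge \epsilon^{-1} \Bigr) \le \epsilon. \\
&\text{(v) Divide by } n \text{ and use } \log(1 + u) \le u \text{ on the term } \log\Bigl(1 + \int f \, dP\Bigr).
\end{align*}
```

Steps (iii) and (iv) together give, with probability at least $1 - \epsilon$, the bound $\int h \, d\rho \le K(\rho, \nu) + \log(\epsilon^{-1})$ simultaneously for all posteriors $\rho$.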

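  13. 1. With probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$,
         $$\hat{B}_-(\theta) \le N(\theta) \le \hat{B}_+(\theta).$$
      2. Definition of $\hat{N}$:
         $$\hat{N}(\theta) = \frac{\hat{B}_+(\theta) + \hat{B}_-(\theta)}{2}.$$
      3. Results: with probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$,
         $$\bigl| N(\theta) - \hat{N}(\theta) \bigr| \le \frac{\hat{B}_+(\theta) - \hat{B}_-(\theta)}{2}.$$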
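
The midpoint construction on this slide can be sketched in a few lines of Python. How the bounds $\hat{B}_-(\theta)$ and $\hat{B}_+(\theta)$ are computed is not shown in the slides, so the function below (a hypothetical helper named `midpoint_estimate`) simply takes them as inputs.

```python
def midpoint_estimate(lower, upper):
    """Return (estimate, half_width) for a quantity known to lie in [lower, upper].

    The estimate is the midpoint of the interval; any value inside the
    interval is then at most half_width away from it.
    """
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    estimate = (upper + lower) / 2
    half_width = (upper - lower) / 2
    return estimate, half_width

# Hypothetical confidence bounds for N(theta) at some fixed theta.
estimate, half_width = midpoint_estimate(1.0, 3.0)
print(estimate, half_width)  # 2.0 1.0
```

On the event where both bounds hold, the true $N(\theta)$ lies in the interval, so the error of the midpoint is at most the half-width, which is the inequality in step 3.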

  16. Work in progress:
      - dimension-free bounds for the quadratic form associated to the empirical Gram matrix
        $$\hat{G} = \frac{1}{n} \sum_{i=1}^{n} X_i X_i^\top;$$
      - stability of algorithms for spectral clustering (PCA).
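
For reference, the empirical Gram matrix and its quadratic form can be computed as follows. This is a minimal plain-Python sketch with illustrative function names and a simulated Gaussian sample; it is the plain empirical estimator mentioned on this slide, not the estimator $\hat{N}$ built from PAC-Bayesian bounds.

```python
import random

def empirical_gram(sample):
    """Return G_hat = (1/n) * sum_i X_i X_i^T as a list of rows."""
    n, d = len(sample), len(sample[0])
    return [[sum(x[j] * x[k] for x in sample) / n for k in range(d)]
            for j in range(d)]

def quadratic_form(gram, theta):
    """Return theta^T G theta."""
    d = len(theta)
    return sum(theta[j] * gram[j][k] * theta[k]
               for j in range(d) for k in range(d))

def empirical_energy(sample, theta):
    """Return (1/n) * sum_i <theta, X_i>^2, computed without forming G_hat."""
    return sum(sum(t * xi for t, xi in zip(theta, x)) ** 2
               for x in sample) / len(sample)

rng = random.Random(0)  # fixed seed; the sample itself is simulated
sample = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
theta = [1.0, -2.0, 0.5]

g_hat = empirical_gram(sample)
# Both expressions agree up to floating-point rounding.
print(quadratic_form(g_hat, theta), empirical_energy(sample, theta))
```

The two printed numbers coincide because $\theta^\top \hat{G} \theta = \frac{1}{n} \sum_i \langle \theta, X_i \rangle^2$, the empirical counterpart of $N(\theta) = \theta^\top G \theta$.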

  17. Bibliography
      - O. Catoni, Estimating the Gram matrix through PAC-Bayes bounds, preprint.
      - O. Catoni, Challenging the empirical mean and empirical variance: a deviation study, Ann. Inst. H. Poincaré Probab. Statist., Vol. 48, No. 4 (2012).
      - G. Biau, A. Mas, PCA-Kernel Estimation, Stat. Risk. Model. 29, No. 1 (2012).
      - J. Langford, J. Shawe-Taylor, PAC-Bayes & Margins, Advances in Neural Information Processing Systems (2002).
      - D. McAllester, Simplified PAC-Bayesian margin bounds, in COLT (2003).
