Gram Matrix estimation in high dimension, Ilaria Giulini (INRIA)



SLIDE 1

Gram Matrix estimation in high dimension

Ilaria Giulini

INRIA (project CLASSIC), Département de Mathématiques et Applications, ENS, 45 rue d'Ulm, 75005 Paris. Joint work with Olivier Catoni. Journée DIM RDM-IdF 2013, 12 September 2013.

SLIDE 2

General Setting

Let $P \in \mathcal{M}^1_+(\mathbb{R}^d)$ be a probability measure. The Gram matrix is
$$G = \int x x^\top \, dP(x).$$
Estimating $G$ is equivalent to estimating
$$N(\theta) = \int \langle x, \theta \rangle^2 \, dP(x),$$
since $N(\theta) = \theta^\top G \theta$. The distribution $P$ is unknown; we observe an i.i.d. sample $X_1, \dots, X_n \in \mathbb{R}^d$ drawn from $P$.

Goal: estimate $N(\theta)$ for every $\theta \in \mathbb{R}^d$ from the sample.
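Since the slides carry no code, here is a small numerical sketch (my own illustration, using NumPy) of the identity $N(\theta) = \theta^\top G \theta$ for a discrete distribution $P$ with finite support:

```python
import numpy as np

# Hypothetical discrete distribution P on R^3: support points x_k with weights p_k.
rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))        # support points
p = np.full(5, 1 / 5)               # uniform weights

# Gram matrix G = ∫ x x^T dP(x) = Σ_k p_k x_k x_k^T
G = sum(pk * np.outer(xk, xk) for pk, xk in zip(p, xs))

theta = np.array([1.0, -2.0, 0.5])

# N(θ) = ∫ <x, θ>^2 dP(x) = Σ_k p_k <x_k, θ>^2
N_theta = sum(pk * (xk @ theta) ** 2 for pk, xk in zip(p, xs))

# The slide's identity: N(θ) = θ^T G θ
assert np.isclose(N_theta, theta @ G @ theta)
```

The equality holds exactly (up to rounding) because both sides are the same quadratic form written two ways.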


SLIDE 6

Assumption: $\mathrm{Tr}(G) = \int \|x\|^2 \, dP(x) < +\infty$.

Our goal: estimate $N(\theta) = \int \langle \theta, x \rangle^2 \, dP(x)$, that is, build an estimator $\hat{N}$ (depending on $X_1, \dots, X_n$) such that, with probability $1 - \epsilon$, for any $\theta \in \mathbb{R}^d$,
$$|N(\theta) - \hat{N}(\theta)| \le \eta(n, \theta, \epsilon),$$
where $\eta(n, \theta, \epsilon) \to 0$ as $n \to \infty$.

Techniques: PAC-Bayesian.
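As a sanity check of the goal (my own sketch; this is the naive plug-in estimate, not the talk's robust PAC-Bayesian estimator), the empirical average $\hat{N}(\theta) = \frac{1}{n}\sum_i \langle X_i, \theta \rangle^2$ approaches $N(\theta)$ as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(42)
d = 4
theta = np.ones(d)

# For P = N(0, I_d), the Gram matrix is G = I, so N(θ) = ||θ||^2 = 4.
N_true = theta @ theta

def plug_in(n):
    """Naive plug-in estimate of N(θ) from an i.i.d. sample of size n."""
    X = rng.normal(size=(n, d))
    return np.mean((X @ theta) ** 2)

err_small = abs(plug_in(100) - N_true)
err_large = abs(plug_in(100_000) - N_true)
# η(n, θ, ε) shrinks as n → ∞: err_large is typically far smaller than err_small.
```

The Gaussian choice of $P$ and the sample sizes are illustrative assumptions only.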

SLIDE 7

Dimension Dependent Bound

Let
$$\kappa = \sup_{\theta \neq 0} \frac{\int \langle \theta, x \rangle^4 \, dP(x)}{\left( \int \langle \theta, x \rangle^2 \, dP(x) \right)^2} < +\infty.$$
For any $\epsilon > 0$ and $n$ such that
$$n > \left( \sqrt{27 \sqrt{\kappa d} + 5\kappa - 4} + \sqrt{2(\kappa - 1)\left(\log(\epsilon^{-1}) + 1.11\,d\right)} \right)^2,$$
with probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$,
$$\left| \hat{N}(\theta) - N(\theta) \right| \le N(\theta) \, \frac{\mu}{1 - 3\mu}, \qquad (1)$$
where
$$\mu = \sqrt{\frac{2(\kappa - 1)}{n}\left(\log(\epsilon^{-1}) + 1.11\,d\right)} + \frac{\sqrt{2\kappa} \times 8.9\,d}{n}.$$

Remark: $\mathrm{Var}\bigl(\langle \theta, X \rangle^2\bigr) \sim (\kappa - 1)\, N(\theta)^2$.
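To illustrate the Remark (my own check): for Gaussian data, $\langle \theta, X \rangle$ is itself Gaussian, so the kurtosis ratio is $\kappa = 3$ and $\mathrm{Var}(\langle \theta, X \rangle^2) = (\kappa - 1)\, N(\theta)^2$ holds exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 1_000_000
theta = np.array([1.0, 2.0, -1.0])

X = rng.normal(size=(n, d))           # P = N(0, I_d), so N(θ) = ||θ||^2
proj2 = (X @ theta) ** 2              # samples of <θ, X>^2

N_theta = theta @ theta               # exact N(θ) = 6
kappa = 3.0                           # fourth-moment ratio of any 1D Gaussian

var_mc = proj2.var()                  # Monte Carlo variance of <θ, X>^2
var_exact = (kappa - 1) * N_theta**2  # Remark: (κ − 1) N(θ)^2 = 72
```

The Monte Carlo estimate agrees with the closed form to within a few percent at this sample size.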


SLIDE 9

Dimension-free Bound

With probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$, the same estimator $\hat{N}$ satisfies
$$\mathbb{1}\{4\mu < 1\} \left| \frac{\hat{N}(\theta)}{N(\theta)} - 1 \right| \le \frac{\mu}{1 - 4\mu},$$
where, for $n < 10^{20}$,
$$\mu = \sqrt{\frac{2.07(\kappa - 1)}{n}\left(\log(\epsilon^{-1}) + 4.3 + 1.6 \times \frac{\|\theta\|^2 \,\mathrm{Tr}(G)}{N(\theta)}\right)} + \frac{9.2\,\|\theta\|^2 \,\mathrm{Tr}(G)}{n\, N(\theta)}.$$

SLIDE 10

Remark

Let $\theta_i$, $i = 1, \dots, d$, be an orthonormal basis. Then
$$\mathrm{Tr}(G) = \int \|x\|^2 \, dP(x) = \sum_{i=1}^d N(\theta_i).$$
If the energy is equally distributed, that is, $N(\theta_i) = N(\theta)$ for any $i = 1, \dots, d$, then
$$\frac{\mathrm{Tr}(G)}{N(\theta)} = \frac{\sum_{i=1}^d N(\theta_i)}{N(\theta)} = \frac{d\, N(\theta)}{N(\theta)} = d.$$
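A quick numerical check (my illustration) that $\mathrm{Tr}(G) = \sum_i N(\theta_i)$ holds for any orthonormal basis, not just the coordinate one:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

# Hypothetical Gram matrix G: any symmetric PSD matrix works.
A = rng.normal(size=(d, d))
G = A @ A.T

# Orthonormal basis θ_1, ..., θ_d from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))

# N(θ_i) = θ_i^T G θ_i; their sum is Tr(Q^T G Q) = Tr(G).
N_vals = [Q[:, i] @ G @ Q[:, i] for i in range(d)]
assert np.isclose(sum(N_vals), np.trace(G))
```

The identity is the invariance of the trace under orthogonal change of basis.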


SLIDE 12

PAC-Bayesian approach

Let $X_1, \dots, X_n \sim P$ be an i.i.d. sample.

  • D. McAllester; O. Catoni (2012)

Let $\nu \in \mathcal{M}^1_+(\Theta)$ be a prior probability measure. For all $f$ and all posteriors $\rho \in \mathcal{M}^1_+(\Theta)$ such that $\mathcal{K}(\rho, \nu) < +\infty$,
$$P\left[ \int \frac{1}{n} \sum_{i=1}^n \log\bigl(1 + f(X_i, \theta', \lambda)\bigr) \, d\rho(\theta') \le \iint f(x, \theta', \lambda) \, dP(x) \, d\rho(\theta') + \frac{\mathcal{K}(\rho, \nu) + \log(\epsilon^{-1})}{n} \right] \ge 1 - \epsilon,$$
where the Kullback divergence of $\rho$ with respect to $\nu$ is
$$\mathcal{K}(\rho, \nu) = \begin{cases} \displaystyle\int \log\left(\frac{d\rho}{d\nu}\right) d\rho & \text{if } \rho \ll \nu, \\ +\infty & \text{otherwise.} \end{cases}$$
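The Kullback term $\mathcal{K}(\rho, \nu)$ is explicit for simple posterior/prior pairs; for one-dimensional Gaussians it has the standard closed form below (my own illustration, not from the slides):

```python
import math

def kl_gauss(m1, s1, m0, s0):
    """K(ρ, ν) for ρ = N(m1, s1^2), ν = N(m0, s0^2), in nats (closed form)."""
    return math.log(s0 / s1) + (s1**2 + (m1 - m0) ** 2) / (2 * s0**2) - 0.5

# K(ρ, ν) = 0 iff ρ = ν, and grows as the posterior moves away from the prior.
assert kl_gauss(0.0, 1.0, 0.0, 1.0) == 0.0   # ρ = ν
assert kl_gauss(1.0, 1.0, 0.0, 1.0) == 0.5   # mean shifted by 1
```

In PAC-Bayes bounds this term penalizes posteriors that concentrate far from the prior.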
SLIDE 13

1. With probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$, $\hat{B}_-(\theta) \le N(\theta) \le \hat{B}_+(\theta)$.

2. Definition of $\hat{N}$:
$$\hat{N}(\theta) = \frac{\hat{B}_+(\theta) + \hat{B}_-(\theta)}{2}.$$

3. Result: with probability at least $1 - 2\epsilon$, for any $\theta \in \mathbb{R}^d$,
$$\left| N(\theta) - \hat{N}(\theta) \right| \le \frac{\hat{B}_+(\theta) - \hat{B}_-(\theta)}{2}.$$
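Points 1–3 are the interval-midpoint argument: if $N(\theta)$ lies in $[\hat{B}_-, \hat{B}_+]$, then the midpoint misses it by at most half the interval width. A toy check with hypothetical bound values:

```python
# Hypothetical confidence bounds for some fixed θ (illustrative numbers only).
B_minus, B_plus = 3.8, 4.4
N_hat = (B_plus + B_minus) / 2            # midpoint estimator

# Whenever B_minus <= N <= B_plus, the error is at most the half-width.
half_width = (B_plus - B_minus) / 2
for N in (3.8, 4.0, 4.4):
    assert abs(N - N_hat) <= half_width + 1e-12
```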


SLIDE 16

Work in progress

  • Dimension-free bounds for the quadratic form associated with the empirical Gram matrix
$$\hat{G} = \frac{1}{n} \sum_{i=1}^n X_i X_i^\top.$$

  • Stability of algorithms for spectral clustering (PCA).
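The empirical Gram matrix above is easy to form directly; a short sketch (my own, in NumPy) checking that its quadratic form is exactly the empirical average of $\langle X_i, \theta \rangle^2$:

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 500, 3
X = rng.normal(size=(n, d))          # rows are the i.i.d. sample X_1, ..., X_n

# Empirical Gram matrix: Ĝ = (1/n) Σ_i X_i X_i^T
G_hat = (X.T @ X) / n

theta = np.array([0.5, -1.0, 2.0])

# θ^T Ĝ θ is the empirical counterpart of N(θ).
emp_N = np.mean((X @ theta) ** 2)
assert np.isclose(theta @ G_hat @ theta, emp_N)
assert np.allclose(G_hat, G_hat.T)   # Ĝ is symmetric (and PSD by construction)
```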

SLIDE 17

Bibliography

  • O. Catoni, Estimating the Gram matrix through PAC-Bayes bounds, preprint.
  • O. Catoni, Challenging the empirical mean and empirical variance: a deviation study, Ann. Inst. H. Poincaré Probab. Statist., Vol. 48, No. 4 (2012).
  • G. Biau, A. Mas, PCA-Kernel Estimation, Stat. Risk Model., Vol. 29, No. 1 (2012).
  • J. Langford, J. Shawe-Taylor, PAC-Bayes & Margins, Advances in Neural Information Processing Systems (2002).
  • D. McAllester, Simplified PAC-Bayesian margin bounds, in COLT (2003).