Fast adaptive estimation of log-additive exponential models


  1. Fast adaptive estimation of log-additive exponential models in Kullback-Leibler divergence
     Colloque Jeunes Probabilistes et Statisticiens, 18/04/2016
     Richard Fischer (EDF R&D MRI, CERMICS, LAMA)
     Supervisors: Cristina Butucea (LAMA), Jean-François Delmas (CERMICS), Anne Dutfoy (EDF R&D MRI)

  2. Summary
     (1) Theoretic results
     (2) Simulation study

  4. Estimation problem

     Suppose that we have an i.i.d. sample $X^n = (X_1, X_2, \ldots, X_n)$ from a $d$-dimensional distribution whose density has a product form on $\triangle = \{x = (x_1, \ldots, x_d) \in \mathbb{R}^d : 0 \le x_1 \le x_2 \le \ldots \le x_d \le 1\}$:

     $$f(x) = \prod_{i=1}^{d} p_i(x_i)\, \mathbf{1}_{\triangle}(x) = \exp\Big(\sum_{i=1}^{d} \ell^0_i(x_i) - a_0\Big)\, \mathbf{1}_{\triangle}(x),$$

     such that $\int_{[0,1]} \ell^0_i\, q_i\, dx = 0$, with $q_i$ the $i$-th marginal of the Lebesgue measure on $\triangle$ and $a_0$ a normalizing constant.

     Suppose that for all $1 \le i \le d$, $\ell^0_i$ belongs to a Sobolev space $W^2_{r_i}(q_i)$ with $r_i \in \mathbb{N}^*$ unknown:

     $$W^2_{r_i}(q_i) = \big\{ h \in L^2(q_i);\ h^{(r_i - 1)} \text{ is abs. cont. and } h^{(r_i)} \in L^2(q_i) \big\}.$$

     The product structure of the density suggests a log-additive model, which reduces the $d$-variate problem to $d$ univariate problems.

  5. Log-Additive Exponential Series Estimator

     Log-additive exponential family. For $\theta = (\theta_{i,k};\ 1 \le i \le d,\ 1 \le k \le m_i)$:

     $$f_\theta(x) = \exp\Big(\sum_{i=1}^{d} \sum_{k=1}^{m_i} \theta_{i,k}\, \varphi_{i,k}(x_i) - \psi(\theta)\Big)\, \mathbf{1}_{\triangle}(x).$$

     We require a family of functions $(\varphi_{i,k}(x_i);\ 1 \le i \le d,\ k \in \mathbb{N})$ adapted to $\triangle$ ("orthonormality" w.r.t. the Lebesgue measure on $\triangle$).

     Basis functions. For $1 \le i \le d$, $k \in \mathbb{N}$, we define for $t \in I$:

     $$\varphi_{i,k}(t) = \rho_{i,k}\, P_k^{(d-i,\, i-1)}(2t - 1),$$

     where $P_k^{(d-i,\, i-1)}$ is the $k$-th degree Jacobi polynomial and $\rho_{i,k}$ a constant.
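
A minimal sketch of why Jacobi polynomials with parameters $(d-i, i-1)$ fit the $i$-th coordinate: the marginal $q_i$ of the Lebesgue measure on $\triangle$ is proportional to $t^{i-1}(1-t)^{d-i}$, which is the Jacobi weight after the change of variable $u = 2t - 1$. The code below checks the orthogonality exactly with rational arithmetic. The function names are hypothetical, and the normalizing constant $\rho_{i,k}$ is left out because the slides do not specify it.

```python
from fractions import Fraction
from math import comb, factorial

def jacobi_coeffs(k, alpha, beta):
    """Coefficients c_s with P_k^{(alpha,beta)}(2t-1) = sum_s c_s (1-t)^s t^(k-s).

    Uses the explicit sum formula for Jacobi polynomials with integer
    parameters, evaluated at x = 2t - 1 so that (x-1)/2 = -(1-t), (x+1)/2 = t.
    """
    return [(-1) ** s * comb(k + alpha, k - s) * comb(k + beta, s)
            for s in range(k + 1)]

def inner_product(k, l, d, i):
    """Exact integral of P_k P_l against the weight t^(i-1) (1-t)^(d-i) on [0, 1]."""
    alpha, beta = d - i, i - 1
    ck = jacobi_coeffs(k, alpha, beta)
    cl = jacobi_coeffs(l, alpha, beta)
    total = Fraction(0)
    for s, a in enumerate(ck):
        for u, b in enumerate(cl):
            p = (k - s) + (l - u) + beta   # total power of t
            q = s + u + alpha              # total power of (1 - t)
            # Beta integral: int_0^1 t^p (1-t)^q dt = p! q! / (p+q+1)!
            total += a * b * Fraction(factorial(p) * factorial(q),
                                      factorial(p + q + 1))
    return total

# Example with d = 3, coordinate i = 2: distinct degrees are orthogonal.
off_diagonal = inner_product(1, 2, d=3, i=2)
squared_norm = inner_product(2, 2, d=3, i=2)
```

The squared norm is what a constant such as $\rho_{i,k}$ would have to compensate for if the basis is to be orthonormal with respect to $q_i$.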

  6. Maximum likelihood estimator

     We have a sample of size $n$: $X^n = \big(X^j = (X^j_1, \ldots, X^j_d)\big)_{j = 1..n}$.

     The maximum likelihood estimator $\hat f_{m,n} = f_{\hat\theta_{m,n}}$ verifies, for $1 \le i \le d$, $1 \le k \le m_i$:

     $$\mathbb{E}_{\hat f_{m,n}}\big[\varphi_{i,k}(X_i)\big] = \hat\mu_{m,n,i,k} = \frac{1}{n} \sum_{j=1}^{n} \varphi_{i,k}(X^j_i) \quad \text{(empirical mean)}.$$

     This is equivalent to (with $|m| = \sum_{i=1}^{d} m_i$):

     $$\hat\theta_{m,n} = \operatorname*{argmax}_{\theta \in \mathbb{R}^{|m|}}\ \theta \cdot \hat\mu_{m,n} - \psi(\theta) = \operatorname*{argmax}_{\theta \in \mathbb{R}^{|m|}}\ \frac{1}{n} \sum_{j=1}^{n} \log\big(f_\theta(X^j)\big) \quad \text{(log-likelihood)}.$$
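
The moment-matching characterization can be illustrated in the simplest case $d = 1$, where $\triangle = [0, 1]$ and the Jacobi parameters are $(0, 0)$, so the basis reduces to shifted Legendre polynomials. The sketch below is not the optimizer used in the talk: it assumes plain gradient ascent on the concave objective $\theta \cdot \hat\mu - \psi(\theta)$, a midpoint quadrature for $\psi$, two basis functions, and a Beta(2, 3) toy sample. All names are hypothetical.

```python
import math
import random

def phi(t):
    """First two orthonormal shifted Legendre polynomials on [0, 1]."""
    return (math.sqrt(3) * (2 * t - 1),
            math.sqrt(5) * (6 * t * t - 6 * t + 1))

NODES = [(j + 0.5) / 200 for j in range(200)]   # midpoint quadrature nodes
PHI_NODES = [phi(t) for t in NODES]

def model_moments(theta):
    """E_theta[phi_k(X)] under f_theta(t) = exp(theta . phi(t) - psi(theta))."""
    weights = [math.exp(theta[0] * p[0] + theta[1] * p[1]) for p in PHI_NODES]
    z = sum(weights)
    return [sum(w * p[k] for w, p in zip(weights, PHI_NODES)) / z
            for k in range(2)]

def fit_mle(sample, step=0.15, iters=2000):
    """Gradient ascent: the gradient of theta . mu_hat - psi(theta) is
    mu_hat - E_theta[phi], so a stationary point matches the moments."""
    n = len(sample)
    mu_hat = [sum(phi(x)[k] for x in sample) / n for k in range(2)]
    theta = [0.0, 0.0]
    for _ in range(iters):
        m = model_moments(theta)
        theta = [theta[k] + step * (mu_hat[k] - m[k]) for k in range(2)]
    return theta, mu_hat

random.seed(0)
sample = [random.betavariate(2, 3) for _ in range(200)]  # toy data, an assumption
theta_hat, mu_hat = fit_mle(sample)
fitted = model_moments(theta_hat)  # should agree with mu_hat
```

At convergence the fitted moments coincide with the empirical means, which is the first-order condition stated on the slide.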

  7. Result of non-adaptive convergence rate I.

     Theorem. Let $f_0(x) = \exp\Big(\sum_{i=1}^{d} \ell^0_i(x_i) - a_0\Big)\, \mathbf{1}_{\triangle}(x)$. Assume that $\ell^0_i \in W^2_{r_i}(q_i)$, $r_i \in \mathbb{N}$, $r_i > d$. Choose $m_i = m_i(n) \to \infty$ such that:

     $$|m|^{2d+1} / n \to 0 \quad \text{and} \quad |m|^{2d} \sum_{i=1}^{d} m_i^{-2 r_i} \to 0.$$

     Then the Kullback-Leibler divergence between $f_0$ and $\hat f_{m,n}$ satisfies:

     $$D\big(f_0 \,\|\, \hat f_{m,n}\big) = O_P\Big(\sum_{i=1}^{d} \Big(m_i^{-2 r_i} + \frac{m_i}{n}\Big)\Big).$$

  8. Result of non-adaptive convergence rate II.

     Optimal convergence rate. If we choose $m_i$ proportional to $n^{1/(2 r_i + 1)}$, we obtain the optimal univariate rate:

     $$D\big(f_0 \,\|\, \hat f_{m,n}\big) = O_P\Big(\sum_{i=1}^{d} n^{-\frac{2 r_i}{2 r_i + 1}}\Big) = O_P\Big(n^{-\frac{2 \min(r)}{2 \min(r) + 1}}\Big).$$

     The same rate holds with $m_i = n^{1/(2 \min(r) + 1)}$ for all $1 \le i \le d$.

     Uniform convergence. Define the set of densities

     $$\mathcal{K}_r(\kappa) = \Big\{ f_0 = e^{\sum_{i=1}^{d} \ell^0_i(x_i) - a_0};\ \|\ell^0_i\|_\infty \le \kappa,\ \|(\ell^0_i)^{(r_i)}\|_{L^2(q_i)} \le \kappa \Big\}.$$

     The convergence in probability is uniform on $\mathcal{K}_r(\kappa)$:

     $$\lim_{K \to \infty}\ \limsup_{n \to \infty}\ \sup_{f_0 \in \mathcal{K}_r(\kappa)} \mathbb{P}\Big( D\big(f_0 \,\|\, \hat f_{m,n}\big) \ge K \Big(\sum_{i=1}^{d} m_i^{-2 r_i} + \frac{|m|}{n}\Big) \Big) = 0.$$
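
A short numerical check of where the rate comes from, for one coordinate: the choice $m = n^{1/(2r+1)}$ balances the approximation term $m^{-2r}$ against the estimation term $m/n$, and both then equal $n^{-2r/(2r+1)}$. The function names and the values of $n$ and $r$ below are illustrative assumptions.

```python
def risk_bound(m, n, r):
    """One-coordinate bound: approximation term m^(-2r) plus estimation term m/n."""
    return m ** (-2 * r) + m / n

def balanced_m(n, r):
    """Choice of m (not rounded) that makes the two terms equal."""
    return n ** (1 / (2 * r + 1))

n, r = 10**6, 3
m_star = balanced_m(n, r)
rate = n ** (-2 * r / (2 * r + 1))

bound_at_m_star = risk_bound(m_star, n, r)      # equals 2 * rate up to rounding
bound_too_small = risk_bound(m_star / 4, n, r)  # approximation term dominates
bound_too_large = risk_bound(4 * m_star, n, r)  # estimation term dominates
```

Moving $m$ away from the balanced value in either direction makes one of the two terms dominate, which is why the unknown regularity $r$ matters for tuning.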

  9. Adaptive estimation

     The optimal choice $m_i \sim n^{1/(2 \min(r) + 1)}$ depends on $r$, which is unknown.

     Adaptation method:
     (1) Split the sample $X^n$ into two parts: $X^n_1$ (used to build the estimators) and $X^n_2$ (used for the aggregation).
     (2) Create multiple estimators $\hat f_{m,n} = f_{\hat\theta_{m,n}}$ with $m \in \mathcal{M}_n$, based on the sample $X^n_1$.
         Number of estimators: $N_n$, increasing with $n$.
         Each $m \in \mathcal{M}_n$ corresponds to regularity parameters $r$ with $\min(r)$ fixed.
     (3) Perform a convex aggregation on the logarithms of $\hat f_{m,n}$ with the sample $X^n_2$ to obtain the final estimator $f_{\hat\lambda^*_n}$.

  10. Choice of estimators

      Number of estimators: $N_n = o(\log(n))$, $\lim_{n \to \infty} N_n = +\infty$.

      The grid:

      $$\mathcal{N}_n = \Big\{ \big\lfloor n^{\frac{1}{2(d + j) + 1}} \big\rfloor,\ 1 \le j \le N_n \Big\}.$$

      Same number of basis functions in each direction:

      $$\mathcal{M}_n = \big\{ m = (v, \ldots, v) \in \mathbb{R}^d,\ v \in \mathcal{N}_n \big\}.$$

      [Figure: the grid $\mathcal{M}_n$, axes $m_1$ and $m_2$.]
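
The grid can be computed directly from its definition. The sketch below takes the number of estimators as an explicit argument, since the slides only require $N_n = o(\log(n))$ with $N_n \to \infty$ and do not fix a particular sequence. The helper names are hypothetical, and the integer root is computed without floating point to avoid rounding issues in the floor.

```python
def integer_root_floor(n, p):
    """Largest integer v with v**p <= n, i.e. floor(n^(1/p)) in exact arithmetic."""
    v = 1
    while (v + 1) ** p <= n:
        v += 1
    return v

def build_grid(n, d, num_estimators):
    """Return M_n as a list of d-tuples (v, ..., v), one per distinct grid value.

    num_estimators plays the role of N_n; how it grows with n is left to the caller.
    """
    values = []
    for j in range(1, num_estimators + 1):
        v = integer_root_floor(n, 2 * (d + j) + 1)
        if v not in values:          # the grid is a set, so drop repeats
            values.append(v)
    return [(v,) * d for v in values]

grid = build_grid(n=10**6, d=2, num_estimators=3)
# -> [(7, 7), (4, 4), (3, 3)]
```

Each index $j$ corresponds to a candidate regularity $d + j$, so the grid values decrease as $j$ grows.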

  14. Convex aggregation of log-densities

      Convex combination of log-densities. Let $\hat\ell_{m,n}(x) = \sum_{i=1}^{d} \sum_{k=1}^{m_i} \hat\theta_{i,k}\, \varphi_{i,k}(x_i)$ for $m \in \mathcal{M}_n$, and

      $$f_\lambda(x) = \exp\Big(\sum_{m \in \mathcal{M}_n} \lambda_m\, \hat\ell_{m,n}(x) - \psi_\lambda\Big)\, \mathbf{1}_{\triangle}(x),$$

      with $\lambda \in \Lambda^+ = \{(\lambda_m,\ m \in \mathcal{M}_n),\ \lambda_m \ge 0 \text{ and } \sum_{m \in \mathcal{M}_n} \lambda_m = 1\}$.

      Selection of the weights $\hat\lambda^*_n$ based on the sample $X^n_2$:

      $$\hat\lambda^*_n = \operatorname*{argmax}_{\lambda \in \Lambda^+}\ \underbrace{\frac{1}{|X^n_2|} \sum_{X^j \in X^n_2} \log\big(f_\lambda(X^j)\big)}_{\text{log-likelihood}} - \underbrace{\frac{1}{2}\, \mathrm{pen}(\lambda)}_{\text{penalty}},$$

      with $\mathrm{pen}(\lambda) = \sum_{m \in \mathcal{M}_n} \lambda_m\, D\big(f_\lambda \,\|\, \hat f_{m,n}\big)$.
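
The penalized criterion can be sketched in a toy setting with $d = 1$ (so $\triangle = [0, 1]$) and two fixed candidate log-densities standing in for $\hat\ell_{m,n}$. Everything specific here is an assumption made for illustration: the two candidates, the Beta(2, 2) sample playing the role of $X^n_2$, the midpoint quadrature, and the grid search over $\lambda$, which is not necessarily how the weights were computed for the talk.

```python
import math
import random

NODES = [(j + 0.5) / 200 for j in range(200)]  # midpoint rule on [0, 1]

def integrate(g):
    return sum(g(t) for t in NODES) / len(NODES)

def log_f_lambda(lam, log_cands):
    """Log of f_lambda: convex combination of candidate log-densities,
    minus the log-normalizer psi_lambda."""
    mix = lambda t: sum(l * c(t) for l, c in zip(lam, log_cands))
    psi = math.log(integrate(lambda t: math.exp(mix(t))))
    return lambda t: mix(t) - psi

def kl(log_f, log_g):
    """Kullback-Leibler divergence D(f || g) by quadrature."""
    return integrate(lambda t: math.exp(log_f(t)) * (log_f(t) - log_g(t)))

def penalty(lam, log_cands):
    """pen(lambda) = sum_m lambda_m D(f_lambda || f_m)."""
    log_f = log_f_lambda(lam, log_cands)
    total = 0.0
    for m, weight in enumerate(lam):
        vertex = [1.0 if k == m else 0.0 for k in range(len(lam))]
        total += weight * kl(log_f, log_f_lambda(vertex, log_cands))
    return total

def criterion(lam, log_cands, sample):
    """Average log-likelihood on the second sample minus half the penalty."""
    log_f = log_f_lambda(lam, log_cands)
    loglik = sum(log_f(x) for x in sample) / len(sample)
    return loglik - 0.5 * penalty(lam, log_cands)

# Two hypothetical candidates: a flat one and a peaked one.
log_cands = [lambda t: 0.0, lambda t: -8.0 * (t - 0.5) ** 2]
random.seed(1)
sample = [random.betavariate(2, 2) for _ in range(100)]

# Grid search over the simplex {(w, 1 - w)}; fine for two candidates only.
simplex_grid = [(w / 50, 1 - w / 50) for w in range(51)]
best_lam = max(simplex_grid, key=lambda lam: criterion(lam, log_cands, sample))
```

The penalty vanishes at the vertices of the simplex and is positive inside it, so the criterion trades the likelihood of a mixed $f_\lambda$ against its divergence from the individual candidates.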

  15. Sharp oracle inequality for aggregation

      Lemma. Let $n \in \mathbb{N}^*$ be fixed. The convex aggregate estimator $f_{\hat\lambda^*_n}$ verifies, for any $x > 0$, with probability greater than $1 - \exp(-x)$:

      $$D\big(f_0 \,\|\, f_{\hat\lambda^*_n}\big) - \min_{m \in \mathcal{M}_n} D\big(f_0 \,\|\, \hat f_{m,n}\big) \le \frac{\beta\,(\log(N_n) + x)}{n},$$

      with a constant $\beta = \beta\big(\|\ell^0\|_\infty,\ \|(\ell^0_i)^{(r_i)}\|_{L^2(q_i)}\big)$.

      The order of the remainder term, $\log(N_n)/n$, is negligible compared to $n^{-2\min(r)/(2\min(r)+1)}$.
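
To make the last remark concrete, the ratio of the remainder $\log(N_n)/n$ to the rate $n^{-2r/(2r+1)}$ equals $\log(N_n)\, n^{-1/(2r+1)}$, which tends to zero for any $N_n = o(\log(n))$. The sketch below evaluates it for a few sample sizes. The sequence $N_n = \lceil \sqrt{\log n} \rceil$ and the value $r = 3$ are assumptions chosen only for illustration.

```python
import math

def num_estimators(n):
    """A hypothetical sequence with N_n = o(log n) and N_n -> infinity."""
    return math.ceil(math.sqrt(math.log(n)))

def remainder_over_rate(n, r):
    """Ratio of the aggregation remainder log(N_n)/n to the rate n^(-2r/(2r+1))."""
    remainder = math.log(num_estimators(n)) / n
    rate = n ** (-2 * r / (2 * r + 1))
    return remainder / rate

ratios = [remainder_over_rate(n, r=3) for n in (10**3, 10**5, 10**7)]
```

For these sample sizes the ratio decreases with $n$, although slowly, since the exponent $1/(2r+1)$ is small when $r$ is large.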

  16. Adaptive estimation - Main result

      Theorem. The convex aggregate estimator $f_{\hat\lambda^*_n}$ converges to $f_0$ in probability with the convergence rate:

      $$D\big(f_0 \,\|\, f_{\hat\lambda^*_n}\big) = O_P\Big(n^{-\frac{2 \min(r)}{2 \min(r) + 1}}\Big).$$

      Uniform convergence. The convergence is uniform for $r \in (\mathcal{R}_n)^d$, with $\mathcal{R}_n = \{j,\ d + 1 \le j \le R_n\}$:

      $$\lim_{K \to \infty}\ \limsup_{n \to \infty}\ \sup_{r \in (\mathcal{R}_n)^d}\ \sup_{f_0 \in \mathcal{K}_r(\kappa)} \mathbb{P}\Big( D\big(f_0 \,\|\, f_{\hat\lambda^*_n}\big) \ge K\, n^{-\frac{2 \min(r)}{2 \min(r) + 1}} \Big) = 0,$$

      where $R_n$ satisfies:

      $$R_n \le N_n + d, \qquad R_n \le n^{\frac{1}{2(d + N_n) + 1}}, \qquad R_n \le \frac{\log(n)}{2 \log(\log(N_n))} - \frac{1}{2}.$$

  17. Summary
      (1) Theoretic results
      (2) Simulation study
