1. On Nonparametric Estimation of the Fisher Information

Wei Cao¹, Alex Dytso², Michael Fauß², H. Vincent Poor², and Gang Feng¹

¹ University of Electronic Science and Technology of China
² Princeton University

IEEE International Symposium on Information Theory (ISIT), June 2020

2. Presentation Outline

1 Introduction
2 The Bhattacharya Estimator
3 A Clipped Estimator
4 The Gaussian Noise Case
5 Conclusion

3. Introduction

Fisher information for the location of a pdf $f$:

$$I(f) = \int_{t \in \mathbb{R}} \frac{(f'(t))^2}{f(t)} \, \mathrm{d}t, \tag{1}$$

where $f'$ is the derivative of $f$.

▶ An important quantity providing fundamental performance bounds
▶ In practice: no closed-form solutions, and distributions are rarely known exactly
▶ In Gaussian noise: the Fisher information of the received signal allows for optimal power allocation at the transmitter

Problem: Estimate $I(f)$ based on $n$ random samples $Y_1, \dots, Y_n$ independently drawn from $f$.
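As a quick sanity check of (1), not part of the original slides: for a Gaussian density with variance $\sigma^2$ the location Fisher information is $1/\sigma^2$ in closed form, and a direct numerical integration reproduces it. A minimal sketch, assuming NumPy and SciPy are available:

```python
# Numerical check of Eq. (1): for f = N(0, sigma^2) the location Fisher
# information is known in closed form, I(f) = 1 / sigma^2.
import numpy as np
from scipy import integrate
from scipy.stats import norm

sigma = 2.0
f = lambda t: norm.pdf(t, scale=sigma)   # density f
df = lambda t: -t / sigma**2 * f(t)      # its derivative f'

# Integrate (f'(t))^2 / f(t) over the real line.
I, _ = integrate.quad(lambda t: df(t) ** 2 / f(t), -np.inf, np.inf)
print(I, 1 / sigma**2)                   # both approximately 0.25
```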

4. Introduction

Available estimators:
▶ Bhattacharya estimator¹: kernel based, straightforward and easy to implement
▶ Donoho estimator²: lower-bounds the Fisher information over a neighborhood of the empirical CDF
▶ Asymptotic results with inexplicit constants

Main contributions:
▶ Explicit and tighter non-asymptotic results for the Bhattacharya estimator
▶ A new estimator with better bounds on the convergence rate
▶ Evaluation for the case of a r.v. contaminated by Gaussian noise, and a consistent estimator for the MMSE

¹ P. K. Bhattacharya. "Estimation of a probability density function and its derivatives". In: Sankhyā: The Indian Journal of Statistics, Series A 29.4 (1967), pp. 373–382.
² David L. Donoho. "One-sided inference about functionals of a density". In: The Annals of Statistics 16.4 (1988), pp. 1390–1420.

5. Presentation Outline

1 Introduction
2 The Bhattacharya Estimator
3 A Clipped Estimator
4 The Gaussian Noise Case
5 Conclusion

6. The Bhattacharya Estimator

Kernel density estimator:

$$f_n(t) = \frac{1}{na} \sum_{i=1}^{n} K\!\left(\frac{t - Y_i}{a}\right), \tag{2}$$

where $a > 0$ is the bandwidth parameter. The kernel $K(\cdot)$ is assumed to be a continuously differentiable pdf.

The Bhattacharya estimator:

$$I_n(f_n) = \int_{|t| \le k_n} \frac{(f_n'(t))^2}{f_n(t)} \, \mathrm{d}t, \tag{3}$$

for some $k_n \ge 0$.
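A minimal plug-in sketch of (2) and (3), assuming a Gaussian kernel and a simple Riemann sum in place of the integral; the function names, grid resolution, and the small floor `eps` are illustrative choices, not from the paper:

```python
# Sketch of the Bhattacharya estimator (2)-(3) with a Gaussian kernel K.
import numpy as np

def kde_and_derivative(t, samples, a):
    """Kernel density estimate f_n(t) and its derivative f_n'(t), Eq. (2)."""
    u = (t[:, None] - samples[None, :]) / a       # shape (grid, n)
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)  # Gaussian kernel K(u)
    f = K.mean(axis=1) / a                        # f_n(t)
    df = (-u * K).mean(axis=1) / a**2             # f_n'(t), via K'(u) = -u K(u)
    return f, df

def bhattacharya_estimate(samples, a, k_n, grid_size=801, eps=1e-12):
    """Plug-in estimate I_n(f_n) of Eq. (3), integrating over |t| <= k_n."""
    t = np.linspace(-k_n, k_n, grid_size)
    f, df = kde_and_derivative(t, samples, a)
    integrand = df**2 / np.maximum(f, eps)        # guard against f_n(t) ~ 0
    return np.sum(integrand) * (t[1] - t[0])      # Riemann sum

rng = np.random.default_rng(0)
Y = rng.normal(size=5_000)                        # f = N(0, 1), so I(f) = 1
print(bhattacharya_estimate(Y, a=0.3, k_n=10.0))  # should be close to 1
```

The `eps` guard hints at the core difficulty: where $f_n$ is nearly zero, the ratio $(f_n')^2/f_n$ can blow up, which is exactly the instability the clipped estimator of the next sections is designed to tame.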

7. The Bhattacharya Estimator

Estimating the density and its derivatives:

Theorem 1. Let $r \in \{0, 1\}$, $v_r = \int \left| K^{(r+1)}(t) \right| \mathrm{d}t$, and $\delta_{r,a} = \sup_{t \in \mathbb{R}} \left| \mathbb{E}\!\left[ f_n^{(r)}(t) \right] - f^{(r)}(t) \right|$. Then, for any $\epsilon > \delta_{r,a}$ and any $n \ge 1$ the following bound holds:

$$P\!\left[ \sup_{t \in \mathbb{R}} \left| f_n^{(r)}(t) - f^{(r)}(t) \right| > \epsilon \right] \le 2 e^{-2 n a^{2r+2} (\epsilon - \delta_{r,a})^2 / v_r^2}. \tag{4}$$

▶ Based on the proof by Schuster³
▶ Uses the best possible constant for the DKW inequality⁴

³ Eugene F. Schuster. "Estimation of a probability density function and its derivatives". In: The Annals of Mathematical Statistics 40.4 (1969), pp. 1187–1195.
⁴ Pascal Massart. "The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality". In: The Annals of Probability 18.3 (1990), pp. 1269–1283.

8. The Bhattacharya Estimator

Analysis of the Bhattacharya estimator:

Theorem 2. Assume there exists a function $\phi$ such that $\sup_{|t| \le x} \frac{1}{f(t)} \le \phi(x)$ for all $x$. Then, provided that $\sup_{|t| \le k_n} \left| f_n^{(i)}(t) - f^{(i)}(t) \right| \le \epsilon_i$, $i \in \{0, 1\}$, and $\epsilon_0 \phi(k_n) < 1$, the following bound holds:

$$|I(f) - I_n(f_n)| \le \frac{4 \epsilon_1 k_n \rho_{\max}(k_n) + 2 \epsilon_1^2 k_n \phi(k_n) + \epsilon_0 \phi(k_n) I(f)}{1 - \epsilon_0 \phi(k_n)} + c(k_n), \tag{5}$$

where $\rho_{\max}(k_n) = \sup_{|t| \le k_n} \left| \frac{f'(t)}{f(t)} \right|$ and $c(k_n) = \int_{|t| \ge k_n} \frac{(f'(t))^2}{f(t)} \, \mathrm{d}t$.

▶ A non-asymptotic refinement of the result in [1, Theorem 3], which contains $\epsilon_0 \phi^4(k_n)$
▶ $\phi(k_n)$ increases with $k_n$ (usually very fast, e.g., exponentially for a r.v. contaminated by Gaussian noise), preventing the estimator from being practical

9. Presentation Outline

1 Introduction
2 The Bhattacharya Estimator
3 A Clipped Estimator
4 The Gaussian Noise Case
5 Conclusion

10. A Clipped Estimator

Let $\rho(t) = \frac{f'(t)}{f(t)}$ and $\rho_n(t) = \frac{f_n'(t)}{f_n(t)}$, and assume there exists a function $\bar\rho$ such that

$$|\rho(t)| \le |\bar\rho(t)| \quad \text{for all } t \in \mathbb{R}. \tag{6}$$

The clipped estimator:

$$I_n^c(f_n) = \int_{-k_n}^{k_n} \min\{ |\rho_n(t)|, |\bar\rho(t)| \} \, |f_n'(t)| \, \mathrm{d}t. \tag{7}$$

▶ We can set $\bar\rho(t) = \rho_{\max}(k_n)$, where $\rho_{\max}(k_n) = \sup_{|t| \le k_n} \left| \frac{f'(t)}{f(t)} \right|$. $\tag{8}$
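A sketch of (7) in the same illustrative setup, reusing `kde_and_derivative` from the earlier snippet; for $f = N(0,1)$ the score is $f'/f = -t$, so $\bar\rho(t) = |t|$ is a valid clipping bound in this example:

```python
# Sketch of the clipped estimator (7); rho_bar is the bound from Eq. (6).
import numpy as np

def clipped_estimate(samples, a, k_n, rho_bar, grid_size=801, eps=1e-12):
    t = np.linspace(-k_n, k_n, grid_size)
    f, df = kde_and_derivative(t, samples, a)   # from the earlier sketch
    rho_n = df / np.maximum(f, eps)             # plug-in score f_n'/f_n
    integrand = np.minimum(np.abs(rho_n), np.abs(rho_bar(t))) * np.abs(df)
    return np.sum(integrand) * (t[1] - t[0])    # Riemann sum over [-k_n, k_n]

rng = np.random.default_rng(0)
Y = rng.normal(size=5_000)
print(clipped_estimate(Y, a=0.3, k_n=10.0, rho_bar=np.abs))  # close to 1
```

Note how the clipping bound caps $|\rho_n|$ in regions where $f_n$ is small, so noisy tails can no longer inflate the estimate.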

11. A Clipped Estimator

Analysis of the clipped estimator:

Theorem 3. Under the assumptions $\sup_{|t| \le k_n} \left| f_n^{(i)}(t) - f^{(i)}(t) \right| \le \epsilon_i$, $i \in \{0, 1\}$, it holds that

$$|I(f) - I_n^c(f_n)| \le 4 \epsilon_1 \Phi_{\max}^1(k_n) + 2 \epsilon_0 \Phi_{\max}^2(k_n) + c(k_n), \tag{9}$$

where

$$c(k_n) = \int_{|t| \ge k_n} \frac{(f'(t))^2}{f(t)} \, \mathrm{d}t \tag{10}$$

and

$$\Phi_{\max}^m(x) = \int_{-x}^{x} |\bar\rho^m(t)| \, \mathrm{d}t. \tag{11}$$

▶ The proof is based on two auxiliary estimators that under- and overestimate $I_n^c$.

12. Presentation Outline

1 Introduction
2 The Bhattacharya Estimator
3 A Clipped Estimator
4 The Gaussian Noise Case
5 Conclusion

13. Estimation of the FI of a R.V. Contaminated by Gaussian Noise

Let $f_Y$ denote the pdf of a random variable

$$Y = \sqrt{\mathsf{snr}}\, X + Z, \tag{12}$$

where:
▶ $X$ is an arbitrary random variable, under only the very mild assumption of a finite second moment
▶ $Z$ is a standard Gaussian random variable
▶ $X$ and $Z$ are independent

Goal: estimating the Fisher information of $f_Y$, using a Gaussian kernel. Lemma 1 evaluates the quantities appearing in Theorems 2 and 3.
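For concreteness, a sketch of how samples from this model can be generated; the binary input is an illustrative choice of mine, since any $X$ with $\mathbb{E}[X^2] < \infty$ qualifies:

```python
# Generating samples of Y = sqrt(snr) * X + Z from Eq. (12).
import numpy as np

rng = np.random.default_rng(1)
snr, n = 4.0, 10_000
X = rng.choice([-1.0, 1.0], size=n)   # arbitrary input with finite 2nd moment
Z = rng.standard_normal(n)            # standard Gaussian noise, independent of X
Y = np.sqrt(snr) * X + Z              # n i.i.d. samples from f_Y
```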

14. Estimation of the FI of a R.V. Contaminated by Gaussian Noise

Convergence of the Bhattacharya estimator:

Theorem 4. If $a = n^{-w}$, where $w \in \left(0, \tfrac{1}{6}\right)$, and $k_n = \sqrt{u \log(n)}$, where $u \in (0, w)$, then

$$P\left[ |I_n(f_n) - I(f_Y)| \ge \varepsilon_n \right] \le 2 e^{-c_1 n^{1-4w}} + 2 e^{-c_2 n^{1-6w}}, \tag{13}$$

where

$$\varepsilon_n \le \frac{\left( c_3 + 12 c_4 c_5 \sqrt{u \log(n)} + 2 c_5 n^{u-w} \right) n^{-w}}{1 - n^{u-w}} + \frac{n^{w-u-1}}{\sqrt{u \log(n)}}. \tag{14}$$

▶ $I_n(f_n)$ converges to $I(f_Y)$ with probability 1.
▶ $u$ and $w$ give a trade-off between the convergence rate and the precision.

15. Estimation of the FI of a R.V. Contaminated by Gaussian Noise

Convergence of the clipped estimator:

Theorem 5. If $a = n^{-w}$, where $w \in \left(0, \tfrac{1}{6}\right)$, and $k_n = n^u$, where $u \in \left(0, \tfrac{w}{3}\right)$, then

$$P\left[ |I_n^c(f_n) - I(f_Y)| \ge \varepsilon_n \right] \le 2 e^{-c_1 n^{1-4w}} + 2 e^{-c_2 n^{1-6w}}, \tag{15}$$

where

$$\varepsilon_n \le 12 n^{u-w} \left( c_3 + 2 n^u + n^{2u} \right) + c_4 n^{-u}. \tag{16}$$

▶ Improved precision: $\varepsilon_n$ decays polynomially in $n$ instead of logarithmically.
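The admissible exponents translate directly into concrete parameter choices; a small sketch with illustrative values (any $w < 1/6$, with $u < w$ for Theorem 4 and $u < w/3$ for Theorem 5, would do):

```python
# Illustrative parameter choices satisfying Theorems 4 and 5.
import numpy as np

n = 10_000
w = 0.15                                    # bandwidth exponent, w in (0, 1/6)
a = n ** (-w)                               # bandwidth a = n^(-w), both theorems

k_bhattacharya = np.sqrt(0.10 * np.log(n))  # k_n = sqrt(u log n), u in (0, w)
k_clipped = n ** 0.04                       # k_n = n^u, u in (0, w/3)
```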

16. Estimation of the MMSE

Brown's identity:

$$I(f_Y) = 1 - \mathsf{snr} \cdot \mathsf{mmse}(X|Y), \tag{17}$$

where

$$\mathsf{mmse}(X|Y) = \mathbb{E}\left[ (X - \mathbb{E}[X|Y])^2 \right]. \tag{18}$$

An estimator for the MMSE:

$$\mathsf{mmse}_n(X|Y) = \frac{1 - I_n^c(f_n)}{\mathsf{snr}}. \tag{19}$$

Proposition 1. If $a = n^{-w}$, where $w \in \left(0, \tfrac{1}{6}\right)$, and $k_n = n^u$, where $u \in \left(0, \tfrac{w}{3}\right)$, then

$$P\left[ |\mathsf{mmse}_n(X|Y) - \mathsf{mmse}(X|Y)| \ge \frac{\varepsilon_n}{\mathsf{snr}} \right] \le 2 e^{-c_1 n^{1-4w}} + 2 e^{-c_2 n^{1-6w}}. \tag{20}$$
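A sketch of (19), reusing `clipped_estimate` and the `(snr, Y)` samples from the earlier snippets. For $Y = \sqrt{\mathsf{snr}}\, X + Z$ the score of $f_Y$ is $\mathbb{E}[\sqrt{\mathsf{snr}}\, X \mid Y = t] - t$, so for the binary $\pm 1$ input $\bar\rho(t) = |t| + \sqrt{\mathsf{snr}}$ is a valid clipping bound; that choice is mine, not from the slides:

```python
# Sketch of the MMSE estimator, Eq. (19), on the Y samples generated above.
import numpy as np

rho_bar = lambda t: np.abs(t) + np.sqrt(snr)  # valid score bound for |X| <= 1
I_c = clipped_estimate(Y, a=0.3, k_n=10.0, rho_bar=rho_bar)
mmse_n = (1.0 - I_c) / snr                    # Brown's identity, inverted
print(mmse_n)
```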

17. Examples

[Figure 1: Fisher information $I(f_Y)$ and its estimates $\hat I$ (Bhattacharya) and $\hat I^c$ (clipped) versus snr, for $n = 10^4$, $k_n = 10$, and various bandwidths $a_0, a_1 \in \{0.15, 0.3, 0.6\}$, with: a) Gaussian input; and b) binary input.]

18. Examples

[Figure 2: Sample complexity $\log_{10}(n)$ of the Bhattacharya and clipped estimators with Gaussian input versus: a) varying $\varepsilon_n$ at fixed $P_{\mathrm{err}} = 0.2$; and b) varying $P_{\mathrm{err}}$ at fixed $\varepsilon_n = 0.5$.]

19. Presentation Outline

1 Introduction
2 The Bhattacharya Estimator
3 A Clipped Estimator
4 The Gaussian Noise Case
5 Conclusion

20. Conclusion

▶ Estimation of the Fisher information of a random variable:
  ▶ Bhattacharya estimator: new, sharper convergence results
  ▶ A clipped estimator: better bounds on convergence rates
▶ The case of a Gaussian noise contaminated random variable:
  ▶ Specialization of the results for both estimators
  ▶ A consistent estimator for the MMSE
▶ Interesting future directions:
  ▶ Studying the Gaussian noise case under further assumptions
  ▶ Applications in power allocation problems⁵

⁵ Wei Cao, Alex Dytso, Michael Fauß, Gang Feng, and H. Vincent Poor. "Robust Power Allocation for Parallel Gaussian Channels with Approximately Gaussian Input Distributions". In: IEEE Transactions on Wireless Communications (Early Access) (2020).

21. A full version can be found at:

W. Cao, A. Dytso, M. Fauß, H. V. Poor, and G. Feng, "Nonparametric estimation of the Fisher information and its applications." Available: https://arxiv.org/pdf/2005.03622.pdf

Email: clarissa.cao@hotmail.com, {adytso, mfauss, poor}@princeton.edu, fenggang@uestc.edu.cn

Thank You
