Statistical inference for Rényi entropy, David Källberg (PowerPoint presentation)



SLIDE 1

Statistical inference for Rényi entropy

David Källberg

Department of Mathematics and Mathematical Statistics, Umeå University

David Källberg, Statistical inference for Rényi entropy

SLIDE 2

Coauthor: Oleg Seleznjev, Department of Mathematics and Mathematical Statistics, Umeå University

SLIDE 3

Outline

Introduction
Measures of uncertainty
U-statistics
Estimation of entropy
Numerical experiment
Conclusion

SLIDE 4

Introduction

A system described by a probability distribution P.
Only partial information about P is available, e.g. the mean.
We want a measure of uncertainty (entropy) in P.
Which P should we use, if any?

SLIDE 5

Why entropy?

The entropy maximization principle: among the distributions P satisfying the given constraints, choose the one with maximum uncertainty. Objectivity: we don't use more information than we have.

SLIDE 6

Measures of uncertainty. The Shannon entropy.

Discrete P = {p(k), k ∈ D}:

h_1(P) := -\sum_{k} p(k) \log p(k)

Continuous P with density p(x), x ∈ R^d:

h_1(P) := -\int_{R^d} \log(p(x))\, p(x)\, dx

SLIDE 7

Measures of uncertainty. The Rényi entropy.

A class of entropies, of order s ≠ 1, given by:

Discrete P = {p(k), k ∈ D}:

h_s(P) := \frac{1}{1-s} \log\Big(\sum_{k} p(k)^s\Big)

Continuous P with density p(x), x ∈ R^d:

h_s(P) := \frac{1}{1-s} \log\Big(\int_{R^d} p(x)^s\, dx\Big)
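To make the definitions concrete, here is a minimal Python sketch of the discrete Rényi and Shannon entropies (the function names are illustrative, not from the talk):

```python
import math

def renyi_entropy(p, s):
    # h_s(P) = log(sum_k p(k)^s) / (1 - s), defined for s != 1
    return math.log(sum(pk ** s for pk in p)) / (1.0 - s)

def shannon_entropy(p):
    # h_1(P) = -sum_k p(k) log p(k), with the convention 0 * log 0 = 0
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

p = [0.5, 0.25, 0.25]
print(renyi_entropy(p, 2))         # quadratic (collision) entropy h_2
print(renyi_entropy(p, 1 + 1e-6))  # close to the Shannon entropy
print(shannon_entropy(p))
```

Evaluating the Rényi entropy at an order just above 1 gives a value close to the Shannon entropy, illustrating the limit on the next slide.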

SLIDE 8

Motivation

The Rényi entropy satisfies axioms on how a measure of uncertainty should behave, Rényi (1970). For both discrete and continuous P, the Rényi entropy is a generalization of the Shannon entropy, because

\lim_{q \to 1} h_q(P) = h_1(P)

SLIDE 9

Problem

Non-parametric estimation of integer-order Rényi entropy, for discrete and continuous P, from a sample {X_1, \ldots, X_n} of P-iid observations.

SLIDE 10

Overview of Rényi entropy estimation, continuous P

Consistency of nearest-neighbor estimators for any s: Leonenko et al. (2008).
Consistency and asymptotic normality for the quadratic case s = 2: Leonenko and Seleznjev (2010).

SLIDE 11

U-statistics: basic setup

For a P-iid sample {X_1, \ldots, X_n} and a symmetric kernel function h(x_1, \ldots, x_m) with

E\, h(X_1, \ldots, X_m) = \theta(P),

the U-statistic estimator of \theta is defined as

U_n = U_n(h) := \binom{n}{m}^{-1} \sum_{1 \le i_1 < \ldots < i_m \le n} h(X_{i_1}, \ldots, X_{i_m})
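As an illustration (not taken from the talk), a generic U-statistic can be computed by brute force over all index subsets; with the classical kernel h(x, y) = (x − y)²/2 and m = 2 it reproduces the unbiased sample variance:

```python
from itertools import combinations
from math import comb

def u_statistic(sample, kernel, m):
    # U_n = C(n, m)^{-1} * sum over 1 <= i1 < ... < im <= n of h(X_i1, ..., X_im)
    n = len(sample)
    return sum(kernel(*xs) for xs in combinations(sample, m)) / comb(n, m)

# Kernel h(x, y) = (x - y)^2 / 2: the corresponding U-statistic equals the
# unbiased sample variance.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(u_statistic(xs, lambda x, y: (x - y) ** 2 / 2, 2))
```

The brute-force sum has C(n, m) terms, so this sketch is only practical for small samples; it is meant to make the definition concrete, not to be efficient.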

SLIDE 12

U-statistics: properties

Symmetric and unbiased.
Optimality properties for a large class of P.
Asymptotically normally distributed.

SLIDE 13

Estimation, continuous case

The method relies on estimating the functional

q_s := \int_{R^d} p^s(x)\, dx = E\big(p^{s-1}(X)\big)

SLIDE 14

Estimation, some notation

d(x, y): the Euclidean distance in R^d
B_ε(x) := {y : d(x, y) ≤ ε}: the ball of radius ε with center x
b_ε: the volume of B_ε(x)
p_ε(x) := P(X ∈ B_ε(x)): the ε-ball probability at x

SLIDE 15

Estimation, useful limit

When p(x) is bounded and continuous, we can rewrite

q_s = \lim_{\varepsilon \to 0} E\big(p_\varepsilon^{s-1}(X)\big) / b_\varepsilon^{s-1}

So an unbiased estimate of q_{s,\varepsilon} := E\big(p_\varepsilon^{s-1}(X)\big) leads to an asymptotically unbiased estimate of q_s as ε → 0.

SLIDE 16

Estimation of q_{s,ε}

For s = 2, 3, 4, \ldots, let

I_{ij}(\varepsilon) := I(d(X_i, X_j) \le \varepsilon), \qquad \tilde{I}_i(\varepsilon) := \prod_{1 \le j \le s,\; j \ne i} I_{ij}(\varepsilon)

Define the U-statistic Q_{s,n} for q_{s,\varepsilon} by the kernel

h_s(x_1, \ldots, x_s) := \frac{1}{s} \sum_{i=1}^{s} \tilde{I}_i(\varepsilon)
SLIDE 17

Estimation of Rényi entropy

Denote by

\tilde{Q}_{s,n} := Q_{s,n} / b_\varepsilon^{s-1}

an estimator of q_s, and by

H_{s,n} := \frac{1}{1-s} \log\big(\max(\tilde{Q}_{s,n}, 1/n)\big)

the corresponding estimator of h_s.
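Putting the pieces together, here is a small brute-force Python sketch of the estimator H_{s,n} (illustrative code under my own naming, not from the talk; it enumerates all s-subsets, so it is O(n^s) and only practical for small n and s):

```python
import random
from itertools import combinations
from math import comb, gamma, log, pi

def ball_volume(d, eps):
    # b_eps: volume of a d-dimensional Euclidean ball of radius eps
    return pi ** (d / 2) / gamma(d / 2 + 1) * eps ** d

def dist(x, y):
    # Euclidean distance in R^d (points given as tuples)
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def kernel(points, eps):
    # h_s(x_1,...,x_s) = (1/s) * sum_i prod_{j != i} I(d(x_i, x_j) <= eps)
    s = len(points)
    hits = sum(
        all(dist(points[i], points[j]) <= eps for j in range(s) if j != i)
        for i in range(s)
    )
    return hits / s

def renyi_estimate(sample, s, eps, d):
    # Q_{s,n}: U-statistic over all s-subsets of the sample
    n = len(sample)
    q_hat = sum(kernel(c, eps) for c in combinations(sample, s)) / comb(n, s)
    q_tilde = q_hat / ball_volume(d, eps) ** (s - 1)   # tilde Q_{s,n}
    # H_{s,n} = log(max(tilde Q_{s,n}, 1/n)) / (1 - s)
    return log(max(q_tilde, 1.0 / n)) / (1 - s)

# Uniform(0,1) has quadratic entropy h_2 = 0; the estimate should be near 0.
random.seed(1)
sample = [(random.random(),) for _ in range(200)]
h2_hat = renyi_estimate(sample, s=2, eps=0.05, d=1)
print(h2_hat)
```

For s = 2 this reduces to counting ε-close pairs and normalizing by the ball volume 2ε, which matches the quadratic-entropy estimator of Leonenko and Seleznjev cited earlier.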

SLIDE 18

Consistency

Assume \varepsilon = \varepsilon(n) \to 0 as n \to \infty. Let

v_{s,n}^2 := \mathrm{Var}(\tilde{Q}_{s,n})

Then v_{s,n}^2 \to 0 as n\varepsilon^d \to a \in (0, \infty], so we get:

Theorem. Let n\varepsilon^d \to a, 0 < a \le \infty, and let p(x) be bounded and continuous. Then H_{s,n} is a consistent estimator of h_s.

SLIDE 19

Smoothness conditions

Denote by Hα(K), 0 < α ≤ 2, K > 0, a linear space of continuous functions in R^d satisfying an α-Hölder condition if 0 < α ≤ 1 or, if 1 < α ≤ 2, having continuous partial derivatives that satisfy an (α − 1)-Hölder condition with constant K.

SLIDE 20

Asymptotic normality

When n\varepsilon^d \to \infty, we have

v_{s,n}^2 \sim s(q_{2s-1} - q_s^2)/n

Let

K_{s,n} := \max\big(s(\tilde{Q}_{2s-1,n} - \tilde{Q}_{s,n}^2),\, 1/n\big)

and let L(n) > 0, n \ge 1, be a slowly varying function as n \to \infty.

Theorem. Let p^{s-1}(x) \in Hα(K) for \alpha > d/2. If \varepsilon \sim L(n)n^{-1/d} and n\varepsilon^d \to \infty, then

\sqrt{n}\, \frac{\tilde{Q}_{s,n}(1-s)}{\sqrt{K_{s,n}}}\, (H_{s,n} - h_s) \xrightarrow{D} N(0, 1)

SLIDE 21

Numerical experiment

χ² distribution with 4 degrees of freedom.

h_3 = -\frac{1}{2}\log(q_3), where q_3 = 1/54.

300 simulations, each of sample size n = 500. The quantile plot and the histogram support standard normality.
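The constant q_3 = 1/54 can be checked numerically: for the χ²(4) density p(x) = x e^{−x/2}/4, a simple quadrature of ∫ p(x)³ dx recovers it (an illustrative sketch, not the simulation code from the talk):

```python
import math

def chi2_4_density(x):
    # chi-square density with 4 degrees of freedom: p(x) = x * exp(-x/2) / 4
    return x * math.exp(-x / 2) / 4

# q3 = integral of p(x)^3 dx, approximated by a trapezoid rule on [0, 60]
# (the integrand is negligible beyond x = 60; both endpoint values are ~0)
N = 200_000
a, b = 0.0, 60.0
h = (b - a) / N
q3 = sum(chi2_4_density(a + i * h) ** 3 for i in range(1, N)) * h

print(q3)                   # close to 1/54 = 0.0185185...
print(-0.5 * math.log(q3))  # h3 = -(1/2) log(q3)
```

This confirms the target value h_3 = (1/2) log 54 that the 300 simulated estimates are standardized against.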

SLIDE 22

Figures

Figure: Normal Q-Q plot of the standardized estimates (theoretical quantiles vs. sample quantiles).

SLIDE 23

Figures

Figure: Histogram of the standardized estimates from the χ² sample, with the standard normal density overlaid.

SLIDE 24

Conclusion

Asymptotically normal estimates are possible for integer-order Rényi entropy.

SLIDE 25

References

Leonenko, N., Pronzato, L., Savani, V. (2008). A class of Rényi information estimators for multidimensional densities. Annals of Statistics 36, 2153-2182.
Leonenko, N. and Seleznjev, O. (2009). Statistical inference for ε-entropy and quadratic Rényi entropy. Univ. Umeå, Research Rep., Dep. Math. and Math. Stat., 1-21; J. Multivariate Analysis (submitted).
Rényi, A. (1970). Probability Theory. North-Holland Publishing Company.
