

slide-1
SLIDE 1

A conjecture regarding optimality of the dictator function under Hellinger distance

Chandra Nair

The Chinese University of Hong Kong
2017 ITA Workshop, UCSD
February 13, 2017

slide-2
SLIDE 2

Collaborators (co-authors)

Venkat Anantharam, U.C. Berkeley
Amit Chakrabarti, Dartmouth
Andrej Bogdanov, CUHK
Thathachar Jayram, IBM Almaden

Thanks: Simons Institute

slide-3
SLIDE 3

Introduction

Starting point: the following conjecture [1] by Kumar (’12).

X: uniform on $\{-1,+1\}^n$.
Y: obtained from X via the standard noise operator, i.e. flip each bit (independently) with probability $\frac{1-\rho}{2}$.

Conjecture-MI: The dictator function $f_d(X) = X_1$ maximizes the mutual information $I(f(X); Y)$ among all Boolean functions $f(X)$.

[1] Thomas A. Courtade and Gowtham R. Kumar. “Which Boolean functions maximize mutual information on noisy inputs?” In: IEEE Transactions on Information Theory 60.8 (2014), pp. 4515–4525.

chandra@ie.cuhk.edu.hk Z-interference 13-Feb-2017 3 / 15
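Conjecture-MI can be checked by brute force for very small n. The sketch below is only an illustration, not code from the talk: the function names, the choice n = 3, and the value ρ = 0.6 are all assumptions made here. It enumerates every Boolean function on three bits and compares the largest mutual information found with the value attained by the dictator.

```python
import itertools
import math


def binary_entropy(p):
    """Binary entropy in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def noisy_mean(f_table, points, y, rho):
    """(T_rho f)(y) = E[f(X) | Y = y] for uniform X, flip probability (1 - rho) / 2."""
    eps = (1 - rho) / 2
    n = len(y)
    total = 0.0
    for x, fx in zip(points, f_table):
        d = sum(a != b for a, b in zip(x, y))  # Hamming distance between x and y
        total += fx * eps ** d * (1 - eps) ** (n - d)
    return total


def mutual_information(f_table, points, rho):
    """I(f(X); Y) = H_b((1 - E f) / 2) - E_Y[ H_b((1 - (T_rho f)(Y)) / 2) ]."""
    mean = sum(f_table) / len(points)
    conditional = sum(
        binary_entropy((1 - noisy_mean(f_table, points, y, rho)) / 2)
        for y in points
    ) / len(points)
    return binary_entropy((1 - mean) / 2) - conditional


def check_conjecture_mi(n=3, rho=0.6):
    """Return (max over all Boolean f, value for the dictator f(x) = x_1)."""
    points = list(itertools.product([-1, 1], repeat=n))
    dictator = [x[0] for x in points]
    best = max(
        mutual_information(f, points, rho)
        for f in itertools.product([-1, 1], repeat=len(points))
    )
    return best, mutual_information(dictator, points, rho)
```

Calling `check_conjecture_mi()` returns the pair (maximum, dictator value); the conjecture predicts that the two agree. The enumeration has $2^{2^n}$ terms, so this direct approach is only practical for n up to about 4.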

slide-4
SLIDE 4

Alternate view

Let $\Phi_{JS}(x) := 1 - H_b(x) = \mathrm{JS}[(x, 1-x), (1-x, x)]$. Given $f(X) : \{-1,+1\}^n \to \{-1,+1\}$, let
$$Z_f(Y) := \frac{1 - (T_\rho f)(Y)}{2},$$
where $(T_\rho f)(Y) = E(f(X) \mid Y)$.

Conjecture-MI (restatement): The dictator function $f_d(X) = X_1$ maximizes the $\Phi_{JS}$-entropy, $E(\Phi_{JS}(Z_f(Y))) - \Phi_{JS}(E(Z_f(Y)))$, among all Boolean functions $f(X)$.

slide-6
SLIDE 6

Main Conjecture

Idea: Replace $\Phi_{JS}(x)$ by other convex functions. Consider the squared Hellinger distance between $(x, 1-x)$ and $(1-x, x)$:
$$\Phi_{H^2}(x) := 1 - 2\sqrt{x(1-x)}.$$
As before, given $f(X) : \{-1,+1\}^n \to \{-1,+1\}$, let $Z_f(Y) := \frac{1 - (T_\rho f)(Y)}{2}$.

Conjecture-SH: The dictator function $f_d(X) = X_1$ maximizes the $\Phi_{H^2}$-entropy, $E(\Phi_{H^2}(Z_f(Y))) - \Phi_{H^2}(E(Z_f(Y)))$, among all Boolean functions $f(X)$. Equivalently,
$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$

1. Why is this interesting? (or, why should one care about this Φ?)
2. Evidence to the veracity of the conjecture
3. Weaker forms
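The equivalent form of Conjecture-SH is easy to test exhaustively for tiny n. The following is a sketch under assumptions introduced here (the helper names, n = 3, and ρ = 0.6 are illustrative and not from the talk): it evaluates the left-hand side for every Boolean function and compares the maximum with $1 - \sqrt{1-\rho^2}$.

```python
import itertools
import math


def noise_kernel(points, rho):
    """kernel[y][x] = P(X = x | Y = y) for uniform X, flip probability (1 - rho) / 2."""
    eps = (1 - rho) / 2
    n = len(points[0])
    kernel = []
    for y in points:
        row = []
        for x in points:
            d = sum(a != b for a, b in zip(x, y))  # Hamming distance
            row.append(eps ** d * (1 - eps) ** (n - d))
        kernel.append(row)
    return kernel


def hellinger_gap(f_table, kernel):
    """sqrt(1 - E(f)^2) - E[ sqrt(1 - (T_rho f)^2(Y)) ] for a table of +/-1 values."""
    size = len(f_table)
    mean = sum(f_table) / size
    total = 0.0
    for row in kernel:
        g = sum(p * fx for p, fx in zip(row, f_table))  # (T_rho f)(y)
        total += math.sqrt(max(0.0, 1 - g * g))  # guard against rounding below 0
    return math.sqrt(1 - mean * mean) - total / size


def check_conjecture_sh(n=3, rho=0.6):
    """Return (max of the left-hand side over all Boolean f, conjectured bound)."""
    points = list(itertools.product([-1, 1], repeat=n))
    kernel = noise_kernel(points, rho)
    worst = max(
        hellinger_gap(f, kernel)
        for f in itertools.product([-1, 1], repeat=len(points))
    )
    return worst, 1 - math.sqrt(1 - rho ** 2)
```

For the dictator, $(T_\rho f)(Y) = \rho Y_1$, so the left-hand side equals the bound exactly; the conjecture says no other function exceeds it.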

slide-8
SLIDE 8

About the Hellinger conjecture

In short, two lemmas:

1. Conjecture-SH implies Conjecture-MI
2. Conjecture-SH is extremal

Proposition: Conjecture-SH implies Conjecture-MI.

Lemma: The function $H_b\left(\frac{1 - \sqrt{1 - x^2}}{2}\right)$ is non-negative, increasing, and convex in $x$ for $x \in [0, 1]$.

Let $\Psi(x) = \frac{1 - \sqrt{1 - x^2}}{2}$. Observe that
$$H\left(\frac{1 - x}{2}\right) = H\left(\Psi\left(\sqrt{1 - x^2}\right)\right).$$

Conjecture-MI can be expressed as
$$H\left(\Psi\left(\sqrt{1 - E(f)^2}\right)\right) - E\left(H\left(\Psi\left(\sqrt{1 - (T_\rho f)^2(Y)}\right)\right)\right) \le H(\Psi(1)) - H\left(\Psi\left(\sqrt{1 - \rho^2}\right)\right).$$
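The lemma and the identity on this slide can be sanity-checked numerically. The sketch below is not a proof and is not from the talk; the grid size and tolerances are arbitrary choices made here. It tests non-negativity, monotonicity, and discrete convexity of $H(\Psi(x))$ on a uniform grid, and compares both sides of the identity.

```python
import math


def binary_entropy(p):
    """Binary entropy in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def psi(x):
    """Psi(x) = (1 - sqrt(1 - x^2)) / 2 for x in [0, 1]."""
    return (1 - math.sqrt(max(0.0, 1 - x * x))) / 2


def h_psi(x):
    """The composite function H_b(Psi(x)) from the lemma."""
    return binary_entropy(psi(x))


def check_lemma(num=2001, tol=1e-12):
    """Grid check of (non-negative, increasing, convex) for H_b(Psi(x)) on [0, 1]."""
    xs = [i / (num - 1) for i in range(num)]
    vals = [h_psi(x) for x in xs]
    nonneg = all(v >= 0 for v in vals)
    increasing = all(b >= a - tol for a, b in zip(vals, vals[1:]))
    # Convexity on a uniform grid: every second difference is non-negative.
    convex = all(
        vals[i - 1] + vals[i + 1] - 2 * vals[i] >= -tol
        for i in range(1, num - 1)
    )
    return nonneg, increasing, convex


def identity_holds(num=2001, tol=1e-9):
    """Check H((1 - x) / 2) == H(Psi(sqrt(1 - x^2))) on a grid of x in [0, 1]."""
    for i in range(num):
        x = i / (num - 1)
        lhs = binary_entropy((1 - x) / 2)
        rhs = h_psi(math.sqrt(max(0.0, 1 - x * x)))
        if abs(lhs - rhs) > tol:
            return False
    return True
```

The identity follows from $\Psi(\sqrt{1-x^2}) = (1 - |x|)/2$, which is why the two entropy expressions coincide on $[0, 1]$.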

slide-9
SLIDE 9

Idea of proof

By the convexity of $H(\Psi(x))$ (lemma), it suffices to show
$$H\left(\Psi\left(\sqrt{1 - E(f)^2}\right)\right) - H\left(\Psi\left(E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right)\right)\right) \le H(\Psi(1)) - H\left(\Psi\left(\sqrt{1 - \rho^2}\right)\right).$$
However, Conjecture-SH implies
$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$
Apply the weak-majorization inequality: in particular, use the convexity, non-negativity, and monotonicity of $H(\Psi(x))$.
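One way to spell out the final step is sketched below. This is a reconstruction offered for illustration, not necessarily the authors' exact argument, and the shorthand $\varphi, a, b, c, \delta$ is introduced here.

```latex
\begin{align*}
&\varphi(x) := H(\Psi(x)), \quad
 a := \sqrt{1 - E(f)^2}, \quad
 b := E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right), \quad
 c := \sqrt{1 - \rho^2}. \\[4pt]
&\text{Jensen (concavity of } \sqrt{\cdot}\text{) and } E\big((T_\rho f)(Y)\big) = E(f):
 \quad b \le \sqrt{1 - E\big((T_\rho f)^2(Y)\big)} \le a \le 1. \\[4pt]
&\text{Conjecture-SH: } \delta := a - b \in [0,\, 1 - c]. \\[4pt]
&\varphi \text{ convex and } a \le 1:
 \quad \varphi(a) - \varphi(a - \delta) \le \varphi(1) - \varphi(1 - \delta). \\[4pt]
&\varphi \text{ increasing and } 1 - \delta \ge c:
 \quad \varphi(1 - \delta) \ge \varphi(c). \\[4pt]
&\text{Hence } \varphi(a) - \varphi(b) \le \varphi(1) - \varphi(c).
\end{align*}
```

The convexity step uses the fact that, for a convex function, the increment over an interval of fixed length $\delta$ does not decrease as the interval moves to the right.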

slide-11
SLIDE 11

Extremality of Hellinger Conjecture

Conjecture-SH states
$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$
Take $\lim \rho \to 1$ (clean channel). If Conjecture-SH is true, then (for balanced Boolean functions)
$$E\left(\sqrt{\mathrm{Sen}_f(X)}\right) \ge 1.$$
$\mathrm{Sen}_f(x)$: sensitivity at $x$, the number of neighbors of $x$ on which $f$ takes the opposite value.

The similar limit for Conjecture-MI would be equivalent to (for balanced Boolean functions) $E(\mathrm{Sen}_f(X)) \ge 1$. This is known to be true (Poincaré’s inequality, Parseval’s theorem, Harper’s isoperimetric inequality).

slide-12
SLIDE 12

Extremality of Hellinger Conjecture

On the other hand, the best known lower bound for balanced functions is
$$E\left(\sqrt{\mathrm{Sen}_f(X)}\right) \ge \sqrt{\frac{2}{\pi}}$$
(Bobkov ’98). Therefore, even in this limit, the conjecture would imply something new.

slide-13
SLIDE 13

Extremality of Hellinger Conjecture

Lemma: For any $\alpha < \frac{1}{2}$, let $\mathrm{maj}(Y)$ denote the majority function (assume that $n$ is odd). Then there exists large enough $n$ such that
$$E\left(\mathrm{Sen}^{\alpha}_{\mathrm{maj}}(Y)\right) < E\left(\mathrm{Sen}^{\alpha}_{\mathrm{dic}}(Y)\right) = 1.$$
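The lemma about majority can be illustrated with a closed-form computation. The sketch below rests on a fact worked out here rather than taken from the slides: for odd n, majority has sensitivity (n + 1)/2 exactly when the vote is as close as possible, and sensitivity 0 otherwise. The function name is illustrative.

```python
import math


def expected_sens_power_majority(n, alpha):
    """E[Sen_maj(X)^alpha] for odd n and X uniform on {-1, +1}^n."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    k = (n + 1) // 2
    # Majority changes under a single bit flip only when x has k or k - 1 ones.
    # In both cases exactly k coordinates are pivotal, so Sen_maj(x) = k there.
    prob_tight = 2 * math.comb(n, k) / 2 ** n
    return prob_tight * k ** alpha
```

For the dictator, every point has sensitivity 1, so the corresponding expectation is 1 for every α. With α = 1/2 the majority value stays at or above 1 for the small odd n tried here (it equals 1 at n = 1), while for α = 0.4 it drops below 1 once n is large enough, in line with the lemma.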

slide-15
SLIDE 15

Evidence to the veracity of Conjecture-SH

$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$

1. Verified numerically until n = 8.
2. Conjecture-SH is true if
$$\sqrt{1 - \rho^2} + \sqrt{1 - E(f)^2} \le 1 + (1 - \rho^2)(1 - E(f)^2).$$

On numerical verification

Issue: The number of Boolean functions is $2^{2^n}$.

Lemma: For any convex $\Phi$, there is a doubly monotone Boolean function that maximizes the $\Phi$-entropy, $E(\Phi(Z_f)) - \Phi(E(Z_f))$, where the maximization is over all Boolean functions.

While the number of doubly monotone Boolean functions also grows doubly exponentially, the search is still tractable up to n = 8 (or a bit more).
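The sufficient condition in item 2 depends only on ρ and E(f), so the parameter regime it covers is easy to probe. A minimal sketch, with a function name chosen here for illustration:

```python
import math


def sufficient_condition_holds(rho, mean):
    """Check sqrt(1 - rho^2) + sqrt(1 - E(f)^2) <= 1 + (1 - rho^2)(1 - E(f)^2)."""
    a = math.sqrt(1 - rho ** 2)
    b = math.sqrt(1 - mean ** 2)
    return a + b <= 1 + (a * b) ** 2
```

For example, `sufficient_condition_holds(0.9, 0.9)` is true, while `sufficient_condition_holds(0.5, 0.0)` is false. For a balanced function (E(f) = 0) the condition reduces to $\sqrt{1-\rho^2} \le 1 - \rho^2$, which fails for every $0 < \rho < 1$, so this regime covers biased functions only.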

slide-16
SLIDE 16

On lemma

Doubly-monotone: A Boolean function is said to be doubly monotone if it is monotone and, for any $1 \le i < j \le n$,
$$f(S \cup \{i, j\}) \ge f(S \cup \{i\}) \ge f(S \cup \{j\}) \ge f(S), \quad \forall S \subseteq [1 : n].$$

Proof: Follows from majorization and Karamata’s inequality. A similar argument is also present in (Courtade-Kumar ’14).
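The definition translates directly into a membership test. The sketch below makes two assumptions that are not stated on the slide: subsets S are encoded as bitmasks (bit i − 1 set when i ∈ S), and the chain of inequalities is only checked for S containing neither i nor j. Function names are illustrative.

```python
import itertools


def is_doubly_monotone(f, n):
    """f is a list indexed by bitmask S with values in {-1, +1}."""
    for S in range(1 << n):
        # Monotone: adding an element to S never decreases f.
        for i in range(n):
            if not (S >> i) & 1 and f[S | (1 << i)] < f[S]:
                return False
        # Extra condition: for i < j outside S, f(S + {i}) >= f(S + {j}).
        # The outer inequalities of the chain already follow from monotonicity.
        for i, j in itertools.combinations(range(n), 2):
            if (S >> i) & 1 or (S >> j) & 1:
                continue  # assumption: only S disjoint from {i, j} is tested
            if f[S | (1 << i)] < f[S | (1 << j)]:
                return False
    return True


def count_doubly_monotone(n):
    """Brute-force count over all 2^(2^n) Boolean functions (tiny n only)."""
    return sum(
        is_doubly_monotone(list(f), n)
        for f in itertools.product([-1, 1], repeat=1 << n)
    )
```

Under this encoding the dictator on the first coordinate and the majority function on three bits pass the test, whereas the dictator on the second coordinate fails because it favors a later coordinate over an earlier one.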

slide-18
SLIDE 18

2nd evidence: Proof in the parameter regime

$$G(\lambda) := \sqrt{1 - (1 - \lambda) E(f)^2} - E\left(\sqrt{1 - \lambda\rho^2 - (1 - \lambda) g^2(y)}\right),$$
where $g(y) = (T_\rho f)(y)$. Want to show $G(1) \ge G(0)$.
$$G'(\lambda) = \frac{E(f)^2}{2\sqrt{1 - (1 - \lambda) E(f)^2}} - E\left(\frac{g^2(y) - \rho^2}{2\sqrt{1 - \lambda\rho^2 - (1 - \lambda) g^2(y)}}\right)$$

Lemma: For any $0 \le \rho^2, \lambda \le 1$, the function
$$f(u) := \frac{u - \rho^2}{\sqrt{1 - \lambda\rho^2 - (1 - \lambda) u}}$$
is convex and increasing in $u$ when $u \in [0, 1]$. If $U$ is a random variable that takes values in $[0, 1]$, then
$$E(f(U)) \le (1 - E(U)) f(0) + E(U) f(1).$$

slide-19
SLIDE 19

Calculus of variations

Denoting $\alpha = E(g^2(Y))$, we obtain
$$G'(\lambda) \ge \frac{E(f)^2}{2\sqrt{1 - (1 - \lambda) E(f)^2}} + (1 - \alpha)\frac{\rho^2}{2\sqrt{1 - \lambda\rho^2}} - \alpha\frac{\sqrt{1 - \rho^2}}{2\sqrt{\lambda}}. \tag{1}$$
Thus, integrating both sides with respect to $\lambda$ from 0 to 1, we obtain
$$\int_0^1 G'(\lambda)\, d\lambda \ge 2 - \alpha - \sqrt{1 - E(f)^2} - \sqrt{1 - \rho^2}.$$
Since $\alpha \le E(f)^2 + \rho^2(1 - E(f)^2)$ (by Parseval), we are done if
$$1 + (1 - E(f)^2)(1 - \rho^2) \ge \sqrt{1 - E(f)^2} + \sqrt{1 - \rho^2}.$$
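The integration step can be written out term by term. The computation below is a worked sketch added for clarity; the Fourier notation $\hat{f}(S)$ for the Parseval step is an assumption made here and does not appear on the slides.

```latex
\begin{align*}
\int_0^1 \frac{E(f)^2}{2\sqrt{1 - (1-\lambda)E(f)^2}}\, d\lambda
  &= \Big[\sqrt{1 - (1-\lambda)E(f)^2}\Big]_0^1 = 1 - \sqrt{1 - E(f)^2}, \\
\int_0^1 \frac{\rho^2}{2\sqrt{1 - \lambda\rho^2}}\, d\lambda
  &= \Big[-\sqrt{1 - \lambda\rho^2}\Big]_0^1 = 1 - \sqrt{1 - \rho^2}, \\
\int_0^1 \frac{d\lambda}{2\sqrt{\lambda}} &= 1. \\[6pt]
\text{Sum: } \quad
  &\big(1 - \sqrt{1 - E(f)^2}\big) + (1-\alpha)\big(1 - \sqrt{1-\rho^2}\big) - \alpha\sqrt{1-\rho^2} \\
  &= 2 - \alpha - \sqrt{1 - E(f)^2} - \sqrt{1 - \rho^2}. \\[6pt]
\text{Parseval: } \quad
  \alpha = \sum_{S} \rho^{2|S|} \hat{f}(S)^2
  &\le \hat{f}(\emptyset)^2 + \rho^2 \sum_{S \ne \emptyset} \hat{f}(S)^2
   = E(f)^2 + \rho^2\big(1 - E(f)^2\big), \\
\text{so } \quad 2 - \alpha &\ge 1 + \big(1 - E(f)^2\big)\big(1 - \rho^2\big).
\end{align*}
```

Combining the last line with the integrated bound gives $G(1) - G(0) \ge 0$ whenever $1 + (1 - E(f)^2)(1 - \rho^2) \ge \sqrt{1 - E(f)^2} + \sqrt{1 - \rho^2}$.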

slide-21
SLIDE 21

Weaker form

Conjecture†-SH-W: For any pair of Boolean functions $f(X)$, $g(Y)$ on the hypercube taking values in $\{-1, +1\}$,
$$\sqrt{1 - E(f(X))^2} - E\left(\sqrt{1 - E(f(X) \mid g(Y))^2}\right) \le 1 - \sqrt{1 - \rho^2}.$$

If this is true, then it would imply the following:

Proposition (Pichler et al. ’16): For any pair of Boolean functions $f(X)$, $g(Y)$ on the hypercube,
$$I(f(X); g(Y)) \le 1 - H_b\left(\frac{1 - \rho}{2}\right).$$

However, the techniques used here are insufficient to prove Conjecture†-SH-W.

slide-23
SLIDE 23

Why is the weak form true†?

Conjecture†: For any pair of binary random variables $(U, V)$ with $V$ taking values in $\{-1, 1\}$, the following holds:
$$\sqrt{1 - E(V)^2} - E\left(\sqrt{1 - E(V \mid U)^2}\right) + \sqrt{1 - s^{\dagger}(U; V)} \le 1,$$
where $s^{\dagger}(U; V) = \lim_{p \to 0} s_p(U; V)$ and $s_p(U; V)$ is the reverse-hypercontractivity parameter.

Reverse-hypercontractivity: A pair of random variables $(U, V)$ is said to be $(p, q)$-reverse-hypercontractive for $1 > q \ge p$ if
$$E(f(U) g(V)) \ge \|f(U)\|_{p'} \|g(V)\|_{q}.$$
For a fixed $p < 1$ define
$$s_p(U; V) := \sup\left\{\frac{q - 1}{p - 1} : (U, V) \text{ is } (p, q)\text{-reverse-hypercontractive}\right\}.$$

slide-25
SLIDE 25

An inequality†

For any $s, c, d \in [0, 1]$, the inequality
$$\sqrt{1 - \big(s(\bar{d} - d) + \bar{s}(c - \bar{c})\big)^2} - s\sqrt{1 - (\bar{d} - d)^2} - \bar{s}\sqrt{1 - (\bar{c} - c)^2} + \sqrt{\frac{D(s\bar{c} + \bar{s}d \,\|\, s\bar{d} + \bar{s}c)}{s D(c \| d) + \bar{s} D(d \| c)}} \le 1.$$

Seems to be true (numerical simulations).
If true, will imply Conjecture† (hence Conjecture†-SH-W).
Can formally establish it for certain parameters, including a neighborhood of equality-achieving points∗.

Remarks:
Conjecture† may also be of independent interest.
Can possibly obtain a computer-assisted formal proof of the inequality above.

slide-27
SLIDE 27

Conclusion

Proposed $\Phi_{H^2}(x)$ to be an extremal function for which the dictator maximizes the $\Phi$-entropy.
Gave a proof in some (limited) parameter regimes.
Proposed an explicit three-variable inequality that establishes the weaker form.
This is done via another inequality involving hypercontractivity.

Lots of open conjectures.
Connections to deeper stuff: Talagrand’s inequality (via Bobkov).

Thank You