A conjecture regarding optimality of the dictator function under Hellinger distance
Chandra Nair, The Chinese University of Hong Kong
2017 ITA Workshop, UCSD, February 13, 2017

Collaborators (co-authors)
Venkat Anantharam (U.C. Berkeley)
Amit Chakrabarti (Dartmouth)
Andrej Bogdanov (CUHK)
Thathachar Jayram (IBM Almaden)
Thanks: Simons Institute
Introduction

Starting point: the following conjecture¹ by Kumar ('12).
$X$: uniform on $\{-1,+1\}^n$.
$Y$: obtained from $X$ via the standard noise operator, i.e. flip each bit (independently) with probability $\frac{1-\rho}{2}$.

Conjecture-MI: The dictator function $f_d(X) = X_1$ maximizes the mutual information $I(f(X); Y)$ among all Boolean functions $f(X)$.

¹ Thomas A. Courtade and Gowtham R. Kumar. “Which Boolean functions maximize mutual information on noisy inputs?” In: IEEE Transactions on Information Theory 60.8 (2014), pp. 4515–4525.
chandra@ie.cuhk.edu.hk Z-interference 13-Feb-2017 3 / 15
Alternate view
Let $\Phi_{JS}(x) := 1 - H_b(x) = \mathrm{JS}[(x, 1-x), (1-x, x)]$.
Given $f : \{-1,+1\}^n \to \{-1,+1\}$, let
$$Z_f(Y) := \frac{1 - (T_\rho f)(Y)}{2}, \qquad \text{where } (T_\rho f)(Y) = E(f(X) \mid Y).$$

Conjecture-MI (restatement): The dictator function $f_d(X) = X_1$ maximizes the $\Phi_{JS}$-entropy,
$$E(\Phi_{JS}(Z_f(Y))) - \Phi_{JS}(E(Z_f(Y))),$$
among all Boolean functions $f(X)$.
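The restatement can be checked by brute force on a tiny cube. The following is a minimal sketch, not taken from the talk: the helper names (`hb`, `noisy_mean`, `phi_js_entropy`) and the choices $n = 3$, $\rho = 0.5$ are illustrative assumptions. It evaluates the $\Phi_{JS}$-entropy for the dictator and for majority on three bits, and compares the dictator value with $1 - H_b\left(\frac{1-\rho}{2}\right)$.

```python
import math
from itertools import product

def hb(p):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def noisy_mean(f, y, rho):
    """(T_rho f)(y) = E[f(X) | Y = y] for uniform X and bit-flip prob (1 - rho) / 2."""
    total = 0.0
    for x in product((-1, 1), repeat=len(y)):
        # P(X = x | Y = y) = prod_i (1 + rho * x_i * y_i) / 2
        weight = math.prod((1 + rho * xi * yi) / 2 for xi, yi in zip(x, y))
        total += weight * f(x)
    return total

def phi_js_entropy(f, n, rho):
    """E[Phi_JS(Z_f(Y))] - Phi_JS(E[Z_f(Y)]) with Phi_JS(x) = 1 - hb(x)."""
    zs = [(1 - noisy_mean(f, y, rho)) / 2 for y in product((-1, 1), repeat=n)]
    mean_z = sum(zs) / len(zs)  # Y is uniform on the cube
    return sum(1 - hb(z) for z in zs) / len(zs) - (1 - hb(mean_z))

def dictator(x):
    return x[0]

def majority(x):
    return 1 if sum(x) > 0 else -1

rho = 0.5  # illustrative choice
mi_dictator = phi_js_entropy(dictator, 3, rho)
mi_majority = phi_js_entropy(majority, 3, rho)
print(mi_dictator, 1 - hb((1 - rho) / 2))  # the two values should agree
print(mi_majority)
```

Since $Z_f(Y) = P(f(X) = -1 \mid Y)$, the quantity computed here is $H_b(E(Z_f)) - E(H_b(Z_f)) = I(f(X); Y)$, which is why the two formulations coincide.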
Main Conjecture

Idea: Replace $\Phi_{JS}(x)$ by other convex functions. Consider the squared Hellinger distance between $(x, 1-x)$ and $(1-x, x)$:
$$\Phi_{H^2}(x) := 1 - 2\sqrt{x(1-x)}.$$
As before, given $f : \{-1,+1\}^n \to \{-1,+1\}$, let $Z_f(Y) := \frac{1 - (T_\rho f)(Y)}{2}$.

Conjecture-SH: The dictator function $f_d(X) = X_1$ maximizes the $\Phi_{H^2}$-entropy,
$$E(\Phi_{H^2}(Z_f(Y))) - \Phi_{H^2}(E(Z_f(Y))),$$
among all Boolean functions $f(X)$. Equivalently,
$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$
1. Why is this interesting? (Or: why should one care about this $\Phi$?)
2. Evidence for the veracity of the conjecture
3. Weaker forms
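The step from the $\Phi_{H^2}$-entropy to the "equivalently" form of Conjecture-SH is short. A sketch of the computation, writing $g = T_\rho f$ as a shorthand and using $E(g(Y)) = E(f)$:

```latex
\begin{align*}
Z_f(1 - Z_f) &= \frac{1-g}{2}\cdot\frac{1+g}{2} = \frac{1-g^2}{4}
  \quad\Longrightarrow\quad \Phi_{H^2}(Z_f) = 1 - \sqrt{1-g^2},\\
E\big(\Phi_{H^2}(Z_f(Y))\big) - \Phi_{H^2}\big(E(Z_f(Y))\big)
  &= \sqrt{1 - E(f)^2} - E\Big(\sqrt{1 - g^2(Y)}\Big),\\
\text{for } f = f_d:\quad g(Y) = \rho Y_1
  \quad&\Longrightarrow\quad \sqrt{1 - 0} - \sqrt{1-\rho^2} = 1 - \sqrt{1-\rho^2}.
\end{align*}
```

The last line is the value attained by the dictator, which is the right-hand side of the conjectured inequality.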
About the Hellinger conjecture

In short, two lemmas:
1. Conjecture-SH implies Conjecture-MI
2. Conjecture-SH is extremal
Proposition: Conjecture-SH implies Conjecture-MI.

Lemma: The function $H_b\left(\frac{1 - \sqrt{1 - x^2}}{2}\right)$ is non-negative, increasing, and convex in $x$ for $x \in [0, 1]$.

Let $\Psi(x) = \frac{1 - \sqrt{1 - x^2}}{2}$. Observe that
$$H_b\left(\frac{1 - x}{2}\right) = H_b\left(\Psi\left(\sqrt{1 - x^2}\right)\right).$$
Conjecture-MI can be expressed as
$$H_b\left(\Psi\left(\sqrt{1 - E(f)^2}\right)\right) - E\left(H_b\left(\Psi\left(\sqrt{1 - (T_\rho f)^2(Y)}\right)\right)\right) \le H_b(\Psi(1)) - H_b\left(\Psi\left(\sqrt{1 - \rho^2}\right)\right).$$
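The three properties in the lemma are easy to sanity-check numerically. The sketch below is an illustration rather than a proof: it samples $H_b(\Psi(x))$ on a uniform grid over $[0, 1]$ and tests non-negativity, monotonicity via first differences, and convexity via second differences. The grid size and tolerance are arbitrary assumptions.

```python
import math

def hb(p):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def psi(x):
    """Psi(x) = (1 - sqrt(1 - x^2)) / 2 for x in [0, 1]."""
    return (1 - math.sqrt(1 - x * x)) / 2

def check_lemma_on_grid(points=1001, tol=1e-12):
    """Return (non_negative, increasing, convex) as observed on a uniform grid."""
    xs = [k / (points - 1) for k in range(points)]
    vals = [hb(psi(x)) for x in xs]
    non_negative = all(v >= 0 for v in vals)
    increasing = all(b >= a - tol for a, b in zip(vals, vals[1:]))
    # Discrete convexity: second differences are non-negative.
    convex = all(a - 2 * b + c >= -tol
                 for a, b, c in zip(vals, vals[1:], vals[2:]))
    return non_negative, increasing, convex

print(check_lemma_on_grid())
```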
Idea of proof

By the convexity of $H_b(\Psi(x))$ (lemma) and Jensen's inequality, it suffices to show
$$H_b\left(\Psi\left(\sqrt{1 - E(f)^2}\right)\right) - H_b\left(\Psi\left(E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right)\right)\right) \le H_b(\Psi(1)) - H_b\left(\Psi\left(\sqrt{1 - \rho^2}\right)\right).$$
Now Conjecture-SH gives
$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$
Apply the weak-majorization inequality; in particular, use that $H_b(\Psi(x))$ is convex, non-negative, and increasing.
Extremality of Hellinger Conjecture

Conjecture-SH states
$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$
Take the limit $\rho \to 1$ (clean channel). If Conjecture-SH is true then, for balanced Boolean functions,
$$E\left(\sqrt{\mathrm{Sen}_f(X)}\right) \ge 1.$$
$\mathrm{Sen}_f(x)$: sensitivity at $x$, i.e. the number of neighbors of $x$ on which $f$ takes the value opposite to $f(x)$.
The similar limit for Conjecture-MI would be equivalent to (for balanced Boolean functions)
$$E\left(\mathrm{Sen}_f(X)\right) \ge 1.$$
This is known to be true (Poincaré's inequality, Parseval's theorem, Harper's isoperimetric inequality).
On the other hand, the best known lower bound for balanced functions is
$$E\left(\sqrt{\mathrm{Sen}_f(X)}\right) \ge \sqrt{\frac{2}{\pi}} \quad \text{(Bobkov '98)}.$$
Therefore even in this limit, the conjecture would imply something new.
Lemma: For any $\alpha < \frac{1}{2}$, let $\mathrm{maj}(Y)$ denote the majority function (assume that $n$ is odd). Then there exists large enough $n$ such that
$$E\left(\mathrm{Sen}^{\alpha}_{\mathrm{maj}}(Y)\right) < E\left(\mathrm{Sen}^{\alpha}_{\mathrm{dic}}(Y)\right) = 1.$$
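The lemma can be illustrated with a closed form. For majority on an odd number $n$ of bits, the only sensitive inputs are those with $\frac{n \pm 1}{2}$ ones, and each of them has sensitivity $\frac{n+1}{2}$. The sketch below uses that fact; the function name and the sample values of $n$ and $\alpha$ are illustrative assumptions, and $\alpha > 0$ is assumed so that insensitive inputs contribute zero.

```python
import math

def expected_sensitivity_power_majority(n, alpha):
    """E[Sen_maj(Y)^alpha] for majority on an odd number n of uniform bits (alpha > 0)."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    pivotal = (n + 1) // 2  # sensitivity at inputs decided by a single vote
    # Inputs with (n + 1) / 2 or (n - 1) / 2 ones are the only sensitive ones.
    prob_sensitive = 2 * math.comb(n, pivotal) / 2 ** n
    return prob_sensitive * pivotal ** alpha

# For the dictator, Sen is identically 1, so E[Sen^alpha] = 1 for every alpha.
for n in (1, 11, 101, 1001):
    print(n,
          expected_sensitivity_power_majority(n, 0.4),
          expected_sensitivity_power_majority(n, 0.5))
```

By Stirling's approximation the expectation behaves like a constant times $n^{\alpha - 1/2}$, so it tends to zero for every $\alpha < \frac{1}{2}$, while at $\alpha = \frac{1}{2}$ it stays above 1 in these samples.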
Evidence for the veracity of Conjecture-SH

$$\sqrt{1 - E(f)^2} - E\left(\sqrt{1 - (T_\rho f)^2(Y)}\right) \le 1 - \sqrt{1 - \rho^2}.$$
1. Verified numerically up to $n = 8$.
2. Conjecture-SH is true if
$$\sqrt{1 - \rho^2} + \sqrt{1 - E(f)^2} \le 1 + (1 - \rho^2)(1 - E(f)^2).$$
On numerical verification
Issue: The number of Boolean functions is $2^{2^n}$.
Lemma: For any convex $\Phi$, there is a doubly monotone Boolean function that maximizes the $\Phi$-entropy, $E(\Phi(Z_f)) - \Phi(E(Z_f))$, where the maximization is over all Boolean functions.
While the number of doubly monotone Boolean functions also grows doubly exponentially, the search is still tractable up to $n = 8$ (or a bit more).
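For very small $n$ the inequality can be checked without any reduction. The sketch below is not the author's verification code: it enumerates all $2^{2^n}$ Boolean functions for $n = 3$ (256 functions) and compares the largest left-hand side with the dictator value $1 - \sqrt{1 - \rho^2}$. The function names and the sample values of $\rho$ are assumptions.

```python
import math
from itertools import product

def sh_lhs(table, points, rho):
    """sqrt(1 - E(f)^2) - E sqrt(1 - (T_rho f)^2(Y)) for f given as a lookup table."""
    mean_f = sum(table[x] for x in points) / len(points)
    expected_root = 0.0
    for y in points:
        # (T_rho f)(y) = sum_x f(x) * prod_i (1 + rho * x_i * y_i) / 2
        g = sum(
            table[x] * math.prod((1 + rho * xi * yi) / 2 for xi, yi in zip(x, y))
            for x in points
        )
        # Clamp tiny negative values caused by floating-point rounding.
        expected_root += math.sqrt(max(0.0, 1 - g * g)) / len(points)
    return math.sqrt(1 - mean_f ** 2) - expected_root

def max_sh_lhs(n, rho):
    """Maximum of the left-hand side over every Boolean function on n bits."""
    points = list(product((-1, 1), repeat=n))
    best = -math.inf
    for values in product((-1, 1), repeat=len(points)):
        best = max(best, sh_lhs(dict(zip(points, values)), points, rho))
    return best

for rho in (0.3, 0.9):
    print(rho, max_sh_lhs(3, rho), 1 - math.sqrt(1 - rho ** 2))
```

Full enumeration stops being practical almost immediately ($n = 5$ already has $2^{32}$ functions), which is why a reduction such as the doubly monotone lemma is needed to reach $n = 8$.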
On lemma

Doubly-monotone: A Boolean function is said to be doubly-monotone if it is monotone and, for any $1 \le i < j \le n$,
$$f(S \cup \{i, j\}) \ge f(S \cup \{i\}) \ge f(S \cup \{j\}) \ge f(S), \quad \forall S \subseteq [1:n].$$
Proof: Follows from majorization and Karamata's inequality. A similar argument is also present in (Courtade-Kumar '14).
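The definition translates directly into a checker. The sketch below rests on two assumptions that the slide does not spell out: a set $S$ is identified with the input whose $+1$ coordinates are exactly the elements of $S$ (encoded as a bitmask), and the chain condition is tested only for sets $S$ containing neither $i$ nor $j$. All function names are hypothetical.

```python
from itertools import combinations

def is_doubly_monotone(f, n):
    """f maps a bitmask S (bit k set means element k + 1 is in S) to -1 or +1."""
    full = 1 << n
    # Monotone: adding an element never decreases f.
    for S in range(full):
        for k in range(n):
            if not S & (1 << k) and f(S | (1 << k)) < f(S):
                return False
    # Chain condition for every pair i < j and every S avoiding both (assumption).
    for i, j in combinations(range(n), 2):
        bi, bj = 1 << i, 1 << j
        for S in range(full):
            if S & (bi | bj):
                continue
            if not (f(S | bi | bj) >= f(S | bi) >= f(S | bj) >= f(S)):
                return False
    return True

def dictator_first(S):
    return 1 if S & 1 else -1

def dictator_second(S):
    return 1 if S & 2 else -1

def majority3(S):
    return 1 if bin(S).count("1") >= 2 else -1

print(is_doubly_monotone(dictator_first, 3))
print(is_doubly_monotone(majority3, 3))
print(is_doubly_monotone(dictator_second, 3))
```

Under this reading the dictator on the second coordinate fails the check even though it is a relabeling of the first-coordinate dictator, so the property fixes an ordering of the coordinates.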
2nd evidence: Proof in the parameter regime

$$G(\lambda) := \sqrt{1 - (1 - \lambda) E(f)^2} - E\left(\sqrt{1 - \lambda\rho^2 - (1 - \lambda) g^2(y)}\right),$$
where $g(y) = (T_\rho f)(y)$. Want to show $G(1) \ge G(0)$.
$$G'(\lambda) = \frac{E(f)^2}{2\sqrt{1 - (1 - \lambda) E(f)^2}} - E\left(\frac{g^2(y) - \rho^2}{2\sqrt{1 - \lambda\rho^2 - (1 - \lambda) g^2(y)}}\right)$$
Lemma: For any $0 \le \rho^2, \lambda \le 1$ the function
$$f(u) := \frac{u - \rho^2}{\sqrt{1 - \lambda\rho^2 - (1 - \lambda) u}}$$
(an auxiliary function of a real variable, not the Boolean function above) is convex and increasing in $u$ when $u \in [0, 1]$. If $U$ is a random variable that takes values in $[0, 1]$ then
$$E(f(U)) \le (1 - E(U)) f(0) + E(U) f(1).$$
Calculus of variations

Denoting $\alpha = E(g^2(Y))$ we obtain
$$G'(\lambda) \ge \frac{E(f)^2}{2\sqrt{1 - (1 - \lambda) E(f)^2}} + (1 - \alpha)\frac{\rho^2}{2\sqrt{1 - \lambda\rho^2}} - \alpha\frac{\sqrt{1 - \rho^2}}{2\sqrt{\lambda}}. \tag{1}$$
Thus, integrating both sides with respect to $\lambda$ from 0 to 1 we obtain
$$\int_0^1 G'(\lambda)\,d\lambda \ge 2 - \alpha - \sqrt{1 - E(f)^2} - \sqrt{1 - \rho^2}.$$
Since $\alpha \le E(f)^2 + \rho^2(1 - E(f)^2)$ (by Parseval) we are done if
$$1 + (1 - E(f)^2)(1 - \rho^2) \ge \sqrt{1 - E(f)^2} + \sqrt{1 - \rho^2}.$$
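The chain of bounds above can be traced numerically for a concrete function. The sketch below is illustrative and not part of the talk: it computes $G(1) - G(0)$, the integrated lower bound $2 - \alpha - \sqrt{1 - E(f)^2} - \sqrt{1 - \rho^2}$, the Parseval cap on $\alpha$, and whether the sufficient condition holds. The function name, the dictionary keys, and the test function (AND of three bits at $\rho = 0.95$) are assumptions.

```python
import math
from itertools import product

def check_integrated_bound(f, n, rho):
    """Evaluate the quantities in the G(lambda) argument for one Boolean function f."""
    points = list(product((-1, 1), repeat=n))
    mean_f = sum(f(x) for x in points) / len(points)
    # g(y) = (T_rho f)(y) for every y on the cube.
    g_vals = [
        sum(f(x) * math.prod((1 + rho * xi * yi) / 2 for xi, yi in zip(x, y))
            for x in points)
        for y in points
    ]
    alpha = sum(g * g for g in g_vals) / len(points)
    a = math.sqrt(1 - mean_f ** 2)
    b = math.sqrt(1 - rho ** 2)
    g0 = a - sum(math.sqrt(max(0.0, 1 - g * g)) for g in g_vals) / len(points)
    g1 = 1 - b
    return {
        "gain": g1 - g0,                      # G(1) - G(0)
        "lower_bound": 2 - alpha - a - b,     # integral of inequality (1)
        "alpha": alpha,
        "parseval_cap": mean_f ** 2 + rho ** 2 * (1 - mean_f ** 2),
        "sufficient": 1 + (a * b) ** 2 >= a + b,
    }

def and3(x):
    return 1 if all(v == 1 for v in x) else -1

print(check_integrated_bound(and3, 3, 0.95))
print(check_integrated_bound(lambda x: x[0], 3, 0.5))
```

For a balanced function the sufficient condition reduces to $\sqrt{1 - \rho^2} \le 1 - \rho^2$, which fails for every $\rho \in (0, 1)$, so this argument only covers sufficiently biased functions at high correlation.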
Weaker form

Conjecture†-SH-W: For any pair of Boolean functions $f(X)$, $g(Y)$ on the hypercube taking values in $\{-1, +1\}$,
$$\sqrt{1 - E(f(X))^2} - E\left(\sqrt{1 - E(f(X) \mid g(Y))^2}\right) \le 1 - \sqrt{1 - \rho^2}.$$
If this is true, then it would imply the following:
Proposition (Pichler et al. '16): For any pair of Boolean functions $f(X)$, $g(Y)$ on the hypercube,
$$I(f(X); g(Y)) \le 1 - H_b\left(\frac{1 - \rho}{2}\right).$$
However, the techniques used here are insufficient to prove Conjecture†-SH-W.
Why is the weak form true†?

Conjecture†: For any pair of binary random variables $(U, V)$ with $V$ taking values in $\{-1, 1\}$ the following holds:
$$\sqrt{1 - E(V)^2} - E\left(\sqrt{1 - E(V \mid U)^2}\right) + \sqrt{1 - s^{\dagger}(U; V)} \le 1,$$
where $s^{\dagger}(U; V) = \lim_{p \to 0} s_p(U; V)$ and $s_p(U; V)$ is the reverse-hypercontractivity parameter.
Reverse-hypercontractivity: A pair of random variables $(U, V)$ is said to be $(p, q)$-reverse-hypercontractive for $1 > q \ge p$ if
$$E(f(U) g(V)) \ge \|f(U)\|_{p'} \, \|g(V)\|_{q}.$$
For any fixed $p < 1$ define
$$s_p(U; V) := \sup\left\{\frac{q - 1}{p - 1} : (U, V) \text{ is } (p, q)\text{-reverse-hypercontractive}\right\}.$$
An inequality†
For any (s, c, d) ∈ [0, 1], the inequality
- 1 − (s( ¯
d − d) + ¯ s(c − ¯ c))2 − s
- 1 − ( ¯
d − d)2 − ¯ s
- 1 − (¯
c − c)2 +
- D(s¯
c + ¯ sds ¯ d + ¯ sc) sD(cd) + ¯ sD(dc) ≤ 1.
Seems to be true (numerical simulations).
If true, it will imply Conjecture† (hence Conjecture†-SH-W).
Can be formally established for certain parameters, including a neighborhood of the equality-achieving points∗.
Remarks:
Conjecture† may also be of independent interest.
It may be possible to obtain a computer-assisted formal proof of the inequality above.
Conclusion

Proposed $\Phi_{H^2}(x)$ as an extremal function for which the dictator maximizes the $\Phi$-entropy.
Gave a proof in some (limited) parameter regimes.
Proposed an explicit three-variable inequality that establishes the weaker form.
This is done via another inequality involving hypercontractivity.
Lots of open conjectures.
Connections to deeper stuff: Talagrand's inequality (via Bobkov).
Thank You