Large Deviations for Random Matrices and a Conjecture of Lukic
Jonathan Breuer, Hebrew University of Jerusalem
Joint work with B. Simon (Caltech) and O. Zeitouni (The Weizmann Institute)
Western States Mathematical Physics Meeting, 13.2.2017
Szegő's Theorem, Sum Rules, and Gems
Given a probability measure with infinite support on the unit circle, $d\mu(\theta) = w(\theta)\frac{d\theta}{2\pi} + d\mu_{\mathrm{sing}}$, let $\{\Phi_n\}_{n=0}^{\infty}$ be the monic orthogonal polynomials w.r.t. $\mu$. Then
$$\Phi_{n+1}(z) = z\Phi_n(z) - \bar{\alpha}_n \Phi_n^*(z), \qquad \Phi_n^*(z) = z^n\,\overline{\Phi_n(1/\bar{z})},$$
with the Verblunsky coefficients satisfying $|\alpha_n| < 1$ for all $n$. There exists a (continuous) bijection between coefficient sequences and probability measures with infinite support (Verblunsky's Theorem).
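The recursion can be run directly on coefficient lists. The following is a minimal sketch, not taken from the talk: the function name `szego_recursion` and the storage convention (lowest degree first) are assumptions made for illustration.

```python
def szego_recursion(alphas):
    """Return [Phi_0, ..., Phi_n] as coefficient lists, lowest degree first.

    Implements Phi_{k+1}(z) = z Phi_k(z) - conj(alpha_k) Phi_k^*(z).
    """
    phi = [1 + 0j]                      # Phi_0(z) = 1
    polys = [phi]
    for a in alphas:
        # Phi_k^*(z) = z^k conj(Phi_k(1/conj(z))): reverse and conjugate coefficients
        phi_star = [c.conjugate() for c in reversed(phi)]
        shifted = [0j] + phi            # coefficients of z * Phi_k(z)
        star_padded = phi_star + [0j]   # pad to the same length as `shifted`
        a_bar = complex(a).conjugate()
        phi = [s - a_bar * t for s, t in zip(shifted, star_padded)]
        polys.append(phi)
    return polys


# Usage: the first three monic polynomials for alpha = (0.3 + 0.4i, -0.2i)
for p in szego_recursion([0.3 + 0.4j, -0.2j]):
    print(p)
```

Each returned list ends in 1, reflecting that the polynomials are monic.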
A spectral theory gem is an iff relation between properties of the sequence $\{\alpha_n\}_{n=0}^{\infty}$ and properties of the corresponding $\mu$.
One way of obtaining a gem is via a sum rule, i.e. an equation of the form
$$\sum_j \big(\text{a function of a finite number of } \alpha\text{'s}\big) = \int \big(\text{a function of components of } \mu\big).$$
Perhaps the most classical sum rule is Verblunsky's formulation of Szegő's Theorem (1935):
$$\sum_{j=0}^{\infty} \log\left(1 - |\alpha_j|^2\right) = \int_0^{2\pi} \log(w(\theta))\,\frac{d\theta}{2\pi},$$
implying the gem
$$\sum_{j=0}^{\infty} |\alpha_j|^2 < \infty \iff \int_0^{2\pi} \log(w(\theta))\,\frac{d\theta}{2\pi} > -\infty.$$
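Szegő's sum rule can be checked numerically in the simplest nontrivial case. The sketch below assumes the standard Bernstein-Szegő fact (not stated on the slides) that the sequence $\alpha = (a, 0, 0, \dots)$ corresponds to the weight $w(\theta) = (1-|a|^2)/|e^{i\theta} - \bar{a}|^2$ with respect to $d\theta/2\pi$; the function name `szego_sides` is hypothetical.

```python
import cmath
import math


def szego_sides(a, n_grid=4096):
    """Return (coefficient side, measure side) of Szego's sum rule
    for the Verblunsky sequence alpha = (a, 0, 0, ...)."""
    a = complex(a)
    # Coefficient side: only the j = 0 term is nonzero.
    coeff_side = math.log(1 - abs(a) ** 2)
    # Measure side: periodic trapezoid rule for the integral of log w
    # against d(theta) / (2 pi).
    total = 0.0
    for k in range(n_grid):
        z = cmath.exp(2j * math.pi * k / n_grid)
        w = (1 - abs(a) ** 2) / abs(z - a.conjugate()) ** 2
        total += math.log(w)
    return coeff_side, total / n_grid


coeff, meas = szego_sides(0.3 + 0.4j)
print(coeff, meas)  # the two numbers should agree up to rounding error
```

The trapezoid rule is used because the integrand is smooth and periodic, so a uniform grid converges very quickly.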
What if $\log w$ has a 'weak' singularity at a point? Can anything be said about the $\alpha_n$'s?
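In '05 Simon proved the sum rule
$$\exp\left(\int_0^{2\pi} (1-\cos\theta)\log(w(\theta))\,\frac{d\theta}{2\pi}\right) = e^{\frac12\left(1-|1+\alpha_0|^2\right)} \prod_{j=0}^{\infty} \left(1-|\alpha_j|^2\right) e^{|\alpha_j|^2}\, e^{-\frac12|\alpha_{j+1}-\alpha_j|^2}$$
$$= e^{\frac12\left(1-|1+\alpha_0|^2\right)}\, e^{\sum_{j=0}^{\infty}\log\left(1-|\alpha_j|^2\right)}\, e^{\sum_{j=0}^{\infty}|\alpha_j|^2}\, e^{-\frac12\sum_{j=0}^{\infty}|\alpha_{j+1}-\alpha_j|^2}$$
$$= e^{\frac12\left(1-|1+\alpha_0|^2\right)}\, e^{\sum_{k=1}^{\infty}\sum_{j=0}^{\infty}\left(-\frac{|\alpha_j|^{2k}}{k}\right)}\, e^{\sum_{j=0}^{\infty}|\alpha_j|^2}\, e^{-\frac12\sum_{j=0}^{\infty}|\alpha_{j+1}-\alpha_j|^2}.$$
The $k=1$ term cancels against $e^{\sum_j |\alpha_j|^2}$, so the leading surviving term is of order $|\alpha_j|^4$, implying
$$\int_0^{2\pi} (1-\cos\theta)\log(w(\theta))\,\frac{d\theta}{2\pi} > -\infty \iff \sum_{j=0}^{\infty}|\alpha_j|^4 < \infty \ \text{ and } \ \sum_{j=0}^{\infty}|\alpha_{j+1}-\alpha_j|^2 < \infty.$$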
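Simon's sum rule (in logarithmic form) can be tested numerically for finitely many nonzero coefficients. This is a sketch under an assumption not made on the slides: if $\alpha_j = 0$ for $j \ge n$, the measure is taken to be the Bernstein-Szegő measure with $w(\theta) = \prod_{j<n}(1-|\alpha_j|^2)/|\Phi_n(e^{i\theta})|^2$. The function name `simon_sum_rule_sides` is hypothetical.

```python
import cmath
import math


def simon_sum_rule_sides(alphas, n_grid=4096):
    """Return (measure side, coefficient side) of the logarithm of Simon's
    sum rule for the sequence (alphas[0], ..., alphas[n-1], 0, 0, ...)."""
    alphas = [complex(a) for a in alphas]
    norm = 1.0
    for a in alphas:
        norm *= 1 - abs(a) ** 2

    measure_side = 0.0
    for k in range(n_grid):
        theta = 2 * math.pi * k / n_grid
        z = cmath.exp(1j * theta)
        # Evaluate Phi_n(z) pointwise with the coupled Szego recursion.
        phi, phi_star = 1 + 0j, 1 + 0j
        for a in alphas:
            phi, phi_star = (z * phi - a.conjugate() * phi_star,
                             phi_star - a * z * phi)
        w = norm / abs(phi) ** 2
        measure_side += (1 - math.cos(theta)) * math.log(w)
    measure_side /= n_grid  # trapezoid rule against d(theta) / (2 pi)

    padded = alphas + [0j]  # alpha_n = 0 contributes one last difference
    coeff_side = 0.5 * (1 - abs(1 + alphas[0]) ** 2)
    coeff_side += sum(math.log(1 - abs(a) ** 2) + abs(a) ** 2 for a in alphas)
    coeff_side -= 0.5 * sum(abs(padded[j + 1] - padded[j]) ** 2
                            for j in range(len(alphas)))
    return measure_side, coeff_side


print(simon_sum_rule_sides([0.3 + 0.2j, -0.4j, 0.25]))
```

The two returned numbers should agree up to rounding error; the pointwise recursion avoids storing polynomial coefficients.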
Based on this and on two more gems (Simon-Zlatoš, '05), Simon made the following
Conjecture (Simon '05). Fix $\theta_1, \theta_2, \dots, \theta_k$ distinct in $[0, 2\pi)$ and $m_1, \dots, m_k$ positive integers. Then
$$\int_0^{2\pi} \prod_{j=1}^{k} \left(1-\cos(\theta-\theta_j)\right)^{m_j} \log(w(\theta))\,\frac{d\theta}{2\pi} > -\infty$$
iff
$$\prod_{j=1}^{k}\left(S - e^{-i\theta_j}\right)^{m_j}\alpha \in \ell^2 \quad\text{and}\quad \alpha \in \ell^{2(1+\max_j m_j)},$$
where $(S\alpha)_j = \alpha_{j+1}$.
However, Lukic ('13) constructed a counterexample (with $k = 2$, $\theta_1 = 0$, $\theta_2 = \pi$, $m_1 = 2$, $m_2 = 1$) and made a modified
Conjecture (Lukic).
$$\int_0^{2\pi} \prod_{j=1}^{k} \left(1-\cos(\theta-\theta_j)\right)^{m_j} \log(w(\theta))\,\frac{d\theta}{2\pi} > -\infty$$
iff
$$\prod_{j=1}^{k}\left(S - e^{-i\theta_j}\right)^{m_j}\alpha \in \ell^2$$
and, for each $p = 1, \dots, k$,
$$\prod_{j\neq p}\left(S - e^{-i\theta_j}\right)^{m_j}\alpha \in \ell^{2m_p+2}.$$
◮ Lukic's example satisfies $(S-1)^2(S+1)\alpha \in \ell^2$ and $\alpha \in \ell^6$, but
$$\int_0^{2\pi} (1-\cos\theta)^2(1+\cos\theta)\log(w(\theta))\,\frac{d\theta}{2\pi} = -\infty.$$
◮ The form of the conjecture stated above is due to us. Lukic has a different, equivalent, formulation.
◮ Lukic ('16?) showed that in the case of $k = 1$, under the assumption that $\alpha$ has square-summable variation, the remaining two statements are equivalent.
◮ How does one go about generating sum rules of arbitrary order?
Sum Rules from Large Deviations
Recently, Gamboa, Nagel and Rouault ('16) found a beautiful approach to proving Szegő's Theorem (and many existing and new sum rules):
The approach proceeds through recognizing the two sides of the sum rule as different presentations of the rate function in a large deviation principle.
Large Deviations: Let $\{P_n\}_{n=1}^{\infty}$ be a sequence of probability measures on some metric space, $X$. On an informal level, we say that $\{P_n\}_{n=1}^{\infty}$ obey a large deviations principle (LDP) if the $P_n$-probability to be near $x_0$ is $\sim e^{-v_n I(x_0)}$. The function $I(x)$ is called the rate function, and the sequence $v_n$ is called the speed. It is not hard to see that the rate function is unique.
More precisely, let $X$ be a complete metric space and $I : X \to [0, \infty]$ a lower semi-continuous function. Let $v_n$ be a positive sequence satisfying $v_n \to \infty$. We say that the sequence of measures $\{P_n\}_{n=1}^{\infty}$ obeys an LDP with rate function $I$ and speed $\{v_n\}_{n=1}^{\infty}$ if:
◮ For all closed sets $F \subseteq X$,
$$\limsup_{n\to\infty} \frac{1}{v_n}\log P_n(F) \le -\inf_{x\in F} I(x).$$
◮ For all open sets $U \subseteq X$,
$$\liminf_{n\to\infty} \frac{1}{v_n}\log P_n(U) \ge -\inf_{x\in U} I(x).$$
Here's a basic result:
Theorem (Cramér's Theorem). Given a random variable $\xi$, let
$$\Lambda(\lambda) = \log \mathbb{E}\left(e^{\lambda\xi}\right)$$
be its cumulant generating function and
$$I(\eta) = \sup_{\lambda\in\mathbb{R}} \left(\lambda\eta - \Lambda(\lambda)\right) \tag{1}$$
its Legendre transform. Let $P_N$ be the probability distribution for $N^{-1}S_N \equiv N^{-1}(\xi_1 + \dots + \xi_N)$, where $\{\xi_j\}_{j=1}^{\infty}$ are independent copies of $\xi$. Then $P_N$ has an LDP with speed $N$ and rate function $I$.
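As an illustration of the Legendre transform in (1), the sketch below computes the rate function for a Bernoulli($p$) variable numerically and compares it with the closed form $\eta\log(\eta/p) + (1-\eta)\log((1-\eta)/(1-p))$. The Bernoulli example, the function names, and the search interval are choices made here, not taken from the talk.

```python
import math


def cumulant(lam, p):
    """Lambda(lam) = log E[exp(lam * xi)] for xi ~ Bernoulli(p)."""
    return math.log(1 - p + p * math.exp(lam))


def rate_function(eta, p, lo=-40.0, hi=40.0, iters=200):
    """Legendre transform sup_lam (lam * eta - Lambda(lam)).

    The objective is concave in lam, so a ternary search suffices;
    [lo, hi] is assumed to contain the maximizer.
    """
    def objective(lam):
        return lam * eta - cumulant(lam, p)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2)


def bernoulli_relative_entropy(eta, p):
    """Closed form of the Bernoulli rate function for 0 < eta < 1."""
    return eta * math.log(eta / p) + (1 - eta) * math.log((1 - eta) / (1 - p))


print(rate_function(0.7, 0.5), bernoulli_relative_entropy(0.7, 0.5))
```

The rate function vanishes at the mean $\eta = p$ and is positive elsewhere, which is the nonnegativity that the sum-rule argument exploits.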
If $f : X \to Y$ is a homeomorphism, then the push forward of $\{P_n\}_{n=1}^{\infty}$ on $Y$ obeys an LDP with rate function $I \circ f^{-1}$. Thus, if the rate function on $X$ is $I$ and on $Y$ it is $J$, then by uniqueness of the rate function
$$I(x) = J(f(x)).$$
We want
$X$ = probability measures on the unit circle,
$Y$ = Verblunsky coefficient sequences.
A huge bonus is that the rate function is always nonnegative. Thus, one gets sum rules with two nonnegative sides, which makes it possible (but not necessarily easy) to get gems!
What are the $\{P_n\}_{n=1}^{\infty}$?
Large Deviations for Random Matrices
As it turns out, Szegő's Theorem follows from applying the above strategy to Haar measure on $n \times n$ unitary matrices, or in other words, studying the CUE($n$). For a matrix chosen from CUE($n$), any fixed vector is cyclic with probability one, and the corresponding spectral measures have the form
$$d\mu_n = \sum_{j=1}^{n} w_j \delta_{\theta_j}.$$
The $\theta$'s and $w$'s are independent. The $w$'s are uniformly distributed on $\{\sum_{j=1}^{n} w_j = 1\}$ and the $\theta$'s have the distribution
$$\frac{1}{n!}\prod_{1\le j<k\le n}\left|e^{i\theta_j} - e^{i\theta_k}\right|^2 \prod_{j=1}^{n}\frac{d\theta_j}{2\pi}.$$
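A quick sanity check that the eigenvalue density above has total mass one for small $n$. This is a sketch; the function name `weyl_density_mass` and the grid size are choices made here. It relies on the fact that averaging over a uniform grid of more than $n-1$ points per angle integrates this integrand exactly, since it is a trigonometric polynomial of degree at most $n-1$ in each angle.

```python
import cmath
import itertools
import math


def weyl_density_mass(n, grid=12):
    """Total mass of (1/n!) prod_{j<k} |e^{i t_j} - e^{i t_k}|^2 prod dt_j/(2 pi),
    computed by averaging over a uniform grid in each angle."""
    points = [cmath.exp(2j * math.pi * m / grid) for m in range(grid)]
    total = 0.0
    for zs in itertools.product(points, repeat=n):
        vandermonde_sq = 1.0
        for j in range(n):
            for k in range(j + 1, n):
                vandermonde_sq *= abs(zs[j] - zs[k]) ** 2
        total += vandermonde_sq
    return total / (grid ** n * math.factorial(n))


for n in (2, 3):
    print(n, weyl_density_mass(n))
```

The cost grows like `grid ** n`, so this is only meant for very small $n$.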
Ben Arous and Guionnet ('97) have shown that the random measure
$$\frac{1}{n}\sum_{j=1}^{n}\delta_{\theta_j}$$
obeys an LDP with speed $n^2$. (They've actually shown the analogous result on the line.) On the scale of an LDP for the spectral measure with speed $n$, this means that the $\theta_j$'s are, with very high probability, very close to uniformly distributed. This and the independence allow one to prove an LDP for the spectral measure with speed $n$ and rate function
$$I(d\mu) = -\int_0^{2\pi}\log(w(\theta))\,\frac{d\theta}{2\pi}.$$
The distribution of the Verblunsky coefficients for the random spectral measure $d\mu$ was computed by Killip and Nenciu ('04): The $\{\alpha_j\}_{j=0}^{n-1}$ are independent with $\alpha_{n-1}$ distributed uniformly on the unit circle, and $\alpha_j$ having density on the unit disk equal to
$$\frac{n-j-1}{\pi}\left(1-|z|^2\right)^{n-j-2} d^2z = \frac{n-j-1}{\pi}\, e^{(n-j-2)\log\left(1-|z|^2\right)}\, d^2z.$$
Taking $n$ to infinity leads to an LDP with speed $n$ and rate function
$$J(\alpha) = -\sum_{j=0}^{\infty}\log\left(1-|\alpha_j|^2\right).$$
This proves Szegő's Theorem!
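The exponential form of the density makes the single-coordinate rate visible: $-\frac{1}{n}\log(\text{density at } z)$ approaches $-\log(1-|z|^2)$ as $n$ grows. The sketch below checks this numerically and also samples $\alpha_j$, using the observation (derived here from the density, not stated on the slides) that $|\alpha_j|^2$ is Beta$(1, n-j-1)$ distributed with a uniform phase. Function names are hypothetical.

```python
import math
import random


def log_density(z, n, j):
    """Log of the Killip-Nenciu density of alpha_j at z, for 0 <= j <= n - 2."""
    b = n - j - 1
    return math.log(b / math.pi) + (b - 1) * math.log(1 - abs(z) ** 2)


def sample_alpha(n, j, rng):
    """Draw alpha_j for 0 <= j <= n - 2 by inverting the Beta(1, b) CDF
    F(x) = 1 - (1 - x)^b for |alpha_j|^2, with a uniform phase."""
    b = n - j - 1
    r2 = 1 - (1 - rng.random()) ** (1.0 / b)
    phase = rng.uniform(0, 2 * math.pi)
    return math.sqrt(r2) * complex(math.cos(phase), math.sin(phase))


z, n = 0.6, 10 ** 6
print(-log_density(z, n, 0) / n, -math.log(1 - abs(z) ** 2))

rng = random.Random(0)
print([sample_alpha(50, 3, rng) for _ in range(3)])
```

Summing the per-coordinate limits over $j$ is the heuristic behind the rate function $J(\alpha)$; making this rigorous is the subtle step.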
◮ Taking $n \to \infty$ on the coefficient side is subtle. The machinery of projective limits in large deviation theory can be employed to do this.
◮ The measure side of the sum rule is $-S(\nu \mid \mu)$, where $S(\nu \mid \mu)$ is the relative entropy of the limiting empirical measure, $\nu$, with respect to $\mu$. That an LDP exists with rate function given by minus the relative entropy is true for general random matrix ensembles of the form
$$dP_n(M) \sim \exp\left(-n\,\mathrm{tr}\,V(M)\right) dM$$
with nice enough $V$ ($dM$ being Haar measure).
Can we apply this strategy to get higher order Szegő Theorems?
Higher Order Szegő Theorems
We know what the rate function on the measure side needs to be:
$$I(\mu) = -\int_0^{2\pi}\prod_{j=1}^{k}\left(1-\cos(\theta-\theta_j)\right)^{m_j}\log(w(\theta))\,\frac{d\theta}{2\pi} = -S(d\nu \mid \mu)$$
by the previous remark. Thus, we know what $d\nu$ is! Previous work shows that this is the limiting empirical measure for the ensemble with
$$V(e^{i\theta}) = 2\int \log\left|e^{i\theta} - e^{i\varphi}\right| d\nu(\varphi).$$
$V$ turns out to be a trigonometric polynomial in $\theta$.
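As a worked instance of the formula for $V$ (a sketch added here, not from the slides), take the simplest case $k = 1$, $\theta_1 = 0$, $m_1 = 1$, so that $d\nu(\varphi) = (1-\cos\varphi)\,d\varphi/2\pi$. The only input is the standard Fourier expansion of $\log|1 - e^{i\psi}|$.

```latex
% Fourier expansion, valid for \psi \not\equiv 0 \pmod{2\pi}:
\log\left|1 - e^{i\psi}\right| = -\sum_{k=1}^{\infty} \frac{\cos(k\psi)}{k}

% Applying it with \psi = \theta - \varphi and using orthogonality of cosines:
\int_0^{2\pi} \log\left|e^{i\theta} - e^{i\varphi}\right| \frac{d\varphi}{2\pi} = 0,
\qquad
\int_0^{2\pi} \log\left|e^{i\theta} - e^{i\varphi}\right| \cos\varphi \, \frac{d\varphi}{2\pi}
  = -\tfrac{1}{2}\cos\theta

% Hence, for d\nu(\varphi) = (1 - \cos\varphi)\, d\varphi / 2\pi:
V(e^{i\theta}) = 2\int \log\left|e^{i\theta} - e^{i\varphi}\right| d\nu(\varphi)
  = 2\left(0 + \tfrac{1}{2}\cos\theta\right) = \cos\theta
```

In this case $V$ is a trigonometric polynomial of degree one, and the weight $e^{-n\,\mathrm{tr}\,V(M)}$ suppresses eigenvalues near $\theta = 0$, where the density $1-\cos\theta$ vanishes.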
The coefficient side seems to be less straightforward, but isn't: The ensemble is
$$dP_n(M) \sim \exp\left(-n\,\mathrm{tr}\,V(M)\right) dM$$
and $V$ is a polynomial in $M$ and $M^{*}$. We may write $M$ in the CMV basis and so, since the LDP has speed $n$, we see that the rate function is simply the rate function for CUE plus the limit of $\mathrm{tr}\,V(M)$ as $n \to \infty$. Higher order sum rules are immediate! Extracting a gem, however, is not!
A Partial Result
Theorem (B-Simon-Zeitouni). If $(S-1)^2(S+1)\alpha \in \ell^2$, $(S-1)^2\alpha \in \ell^4$, $\alpha \in \ell^6$, then
$$\int_0^{2\pi}(1-\cos\theta)^2(1+\cos\theta)\log(w(\theta))\,\frac{d\theta}{2\pi} > -\infty.$$
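The hypotheses are conditions on polynomials in the shift $S$ applied to $\alpha$. The sketch below shows how such an operator acts on a finite stretch of a sequence; the helper names `apply_shift_polynomial`, `lp_partial_norm`, and `LUKIC_POLY` are hypothetical, and finite partial norms can only suggest, not decide, membership in $\ell^p$.

```python
def apply_shift_polynomial(coeffs, alpha):
    """Apply p(S) = sum_k coeffs[k] * S^k to a finite stretch of a sequence,
    where (S a)_j = a_{j+1}. Only indices with all needed entries are returned."""
    degree = len(coeffs) - 1
    return [sum(c * alpha[j + k] for k, c in enumerate(coeffs))
            for j in range(len(alpha) - degree)]


def lp_partial_norm(seq, p):
    """The l^p norm of a finite stretch of a sequence."""
    return sum(abs(x) ** p for x in seq) ** (1.0 / p)


# (S - 1)^2 (S + 1) = S^3 - S^2 - S + 1, coefficients listed from S^0 to S^3
LUKIC_POLY = [1, -1, -1, 1]

# A slowly decaying, slowly varying sequence used purely as an illustration
alpha = [0.5 / (j + 1) ** 0.2 for j in range(2000)]
print(lp_partial_norm(apply_shift_polynomial(LUKIC_POLY, alpha), 2))
print(lp_partial_norm(alpha, 6))
```

The factor $(S-1)^2$ annihilates sequences that are linear in $j$ and the factor $(S+1)$ annihilates alternating sequences, which is why the condition tolerates slow drift near $\theta = 0$ and oscillation at $\theta = \pi$.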
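The coefficient side of the sum rule is
$$\frac12\left(\mathrm{tr}\,M + \mathrm{tr}\,M^2 - \frac13\,\mathrm{tr}\,M^3\right) - \sum_{j=0}^{\infty}\log\left(1-|\alpha_j|^2\right)$$
and showing this is finite translates to showing that
$$I_2 = \sum_j \left(-\alpha_j\alpha_{j-1} - 2\alpha_j\alpha_{j-2} + \alpha_j\alpha_{j-3} + 2\alpha_j^2\right)$$
$$I_4 = \sum_j \left(\alpha_j^2\alpha_{j-1}^2 + 2\alpha_j\alpha_{j-1}^2\alpha_{j-2} - \alpha_j^2\alpha_{j-1}\alpha_{j-2} - \alpha_j\alpha_{j-1}\alpha_{j-2}^2 - \alpha_j\alpha_{j-1}^2\alpha_{j-3} - \alpha_j\alpha_{j-2}^2\alpha_{j-3} + \alpha_j^4\right)$$
$$I_6 = \sum_j \left(\tfrac13\alpha_j^3\alpha_{j-1}^3 + \alpha_j\alpha_{j-1}^3\alpha_{j-2}^2 + \alpha_j^2\alpha_{j-1}^3\alpha_{j-2} \;[?]\; \alpha_j\alpha_{j-1}^2\alpha_{j-2}^2\alpha_{j-3} + \tfrac23\alpha_j^6\right)$$
are all finite. The other direction seems more difficult. Computations become extremely messy.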