Variational inference, spin glasses, and TAP free energy Song Mei - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Variational inference, spin glasses, and TAP free energy

Song Mei

Stanford University

September 19, 2018

Joint work with Zhou Fan and Andrea Montanari

Song Mei (Stanford University) TAP free energy September 19, 2018 1 / 29

slide-2
SLIDE 2

General motivation

◮ Bayesian inference: high-dimensional integration is hard!

◮ Variational inference: integration/summation → optimization. A popular objective function: “mean field free energy”.

◮ Applications: topic modeling, stochastic block model, low-rank matrix estimation, compressed sensing, ... within which the “MF free energy” is known to be not optimal.

◮ Today: introduce the optimal objective, the “TAP free energy”, and provide rigorous results.

slide-8
SLIDE 8

$\mathbb{Z}_2$ synchronization

◮ Signal: $x = [x_1, \dots, x_n]^T \in \mathbb{Z}_2^n$, with $x_i \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(\mathbb{Z}_2)$ and $\mathbb{Z}_2 = \{+1, -1\}$.

◮ Observation: for $1 \le i < j \le n$,

$$Y_{ij} = \frac{\lambda}{n} x_i x_j + W_{ij}.$$

◮ Noise: $W_{ij} \sim N(0, 1/n)$.

◮ SNR $\lambda \in [0, \infty)$ fixed, dimension $n \to \infty$.

◮ In matrix notation:

$$Y = \frac{\lambda}{n} x x^T + W.$$

◮ Task: given $Y = (Y_{ij})$, estimate $x$ (or, say, $X = x x^T$).
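To make the observation model concrete, here is a minimal sketch that draws one instance of $(x, Y)$ using only the Python standard library. The function name `sample_z2_sync`, the `seed` argument, and the choice to leave the diagonal of $Y$ at zero are assumptions of this sketch, not part of the slides (which only specify $Y_{ij}$ for $i < j$).

```python
import math
import random


def sample_z2_sync(n, lam, seed=0):
    """Draw (x, Y) with Y_ij = (lam / n) * x_i * x_j + W_ij for i < j."""
    rng = random.Random(seed)
    # Signal: i.i.d. uniform signs.
    x = [rng.choice((-1, 1)) for _ in range(n)]
    # Noise standard deviation, since W_ij ~ N(0, 1/n).
    sigma = 1.0 / math.sqrt(n)
    Y = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            y = lam / n * x[i] * x[j] + rng.gauss(0.0, sigma)
            Y[i][j] = y
            Y[j][i] = y  # symmetric extension; diagonal stays 0 (assumption)
    return x, Y


if __name__ == "__main__":
    x, Y = sample_z2_sync(n=5, lam=2.0)
    print(x)
    print(Y[0])
```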

slide-14
SLIDE 14

Bayes estimation in $\mathbb{Z}_2$ synchronization

◮ Setting: $x \sim \mathrm{Unif}(\mathbb{Z}_2^n)$, $Y = (\lambda/n)\, x x^T + W$.

◮ Estimate $X = x x^T$ with loss:

$$\ell(X, \widehat{X}) = (1/n^2)\, \|X - \widehat{X}\|_F^2.$$

◮ For $\lambda < 1$, estimation is impossible.

◮ For $\lambda > 1$, estimation is possible and efficient, e.g., with the spectral estimator (Baik, Ben Arous, Péché phase transition).

◮ The optimal estimator is the Bayes estimator (which is also the minimax estimator):

$$\widehat{X}_{Bayes} = E[x x^T \mid Y].$$

slide-19
SLIDE 19

Bayes estimation in $\mathbb{Z}_2$ synchronization

◮ Setting: $x \sim \mathrm{Unif}(\mathbb{Z}_2^n)$, $Y = (\lambda/n)\, x x^T + W$.

◮ Risk:

$$\mathrm{MSE}_\lambda(\widehat{X}) = (1/n^2)\, E\big[\|x x^T - \widehat{X}\|_F^2\big].$$

[Figure: MSE (vertical axis, 0 to 1) over the range 0.5 to 4 on the horizontal axis; curves: Bayes MSE and PCA MSE.]

slide-20
SLIDE 20

Compute the Bayesian estimator

◮ The Bayesian estimator:

$$\widehat{X}_{Bayes} = E[x x^T \mid Y] = \sum_{\sigma \in \mathbb{Z}_2^n} \sigma \sigma^T p(\sigma \mid Y).$$

◮ The posterior distribution:

$$p(\sigma \mid Y) = \frac{1}{Z} \exp\{\lambda \langle \sigma, Y \sigma \rangle / 2\}.$$
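The sum over $\sigma \in \mathbb{Z}_2^n$ has $2^n$ terms, which is what makes the Bayes estimator hard to compute directly. The sketch below evaluates it by brute force for very small $n$, purely as an illustration; the function name `bayes_estimator` is hypothetical, and `Y` is assumed to be a symmetric list of lists.

```python
import itertools
import math


def bayes_estimator(Y, lam):
    """Posterior mean E[x x^T | Y], enumerating all 2^n sign vectors."""
    n = len(Y)
    X = [[0.0] * n for _ in range(n)]
    Z = 0.0  # partition function
    for sigma in itertools.product((-1, 1), repeat=n):
        quad = sum(Y[i][j] * sigma[i] * sigma[j]
                   for i in range(n) for j in range(n))
        w = math.exp(lam * quad / 2.0)  # unnormalized posterior weight
        Z += w
        for i in range(n):
            for j in range(n):
                X[i][j] += w * sigma[i] * sigma[j]
    return [[X[i][j] / Z for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    print(bayes_estimator([[0.0, 1.0], [1.0, 0.0]], 1.0))
```

The cost grows as $2^n n^2$, so this is only usable as a reference for toy sizes; it is the quantity that the variational objectives in the following slides try to approximate.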

slide-22
SLIDE 22

Mean field variational inference

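◮ The posterior distribution:

$$p(\sigma \mid Y) = \frac{1}{Z} \exp\{\lambda \langle \sigma, Y \sigma \rangle / 2\}.$$

◮ Approximate $p(\sigma \mid Y)$ by $q \in \mathcal{P}_{MF}$:

$$\mathcal{P}_{MF} = \Big\{ q(\sigma) = \prod_{i=1}^n q_i(\sigma_i) : q_i \in \mathcal{P}(\mathbb{Z}_2) \Big\} \cong [-1, 1]^n.$$

◮ Minimize the relative entropy between $q$ and $p(\sigma \mid Y)$:

$$\min_{q \in \mathcal{P}_{MF}} D_{kl}(q \,\|\, p(\sigma \mid Y)).$$

◮ Equivalently, minimize $\min_{m \in [-1,1]^n} F_{MF}(m)$, where

$$F_{MF}(m) \equiv -\sum_{i=1}^n h(m_i) - \lambda \langle m, Y m \rangle / 2 \;\ge\; -\log Z,$$

and $h(m) = -\frac{1-m}{2} \log\big(\frac{1-m}{2}\big) - \frac{1+m}{2} \log\big(\frac{1+m}{2}\big)$.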
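The equivalence between the relative-entropy problem and the mean field free energy can be checked in a few lines. The following worked derivation is a sketch under two assumptions that the slides do not spell out: $q$ is parametrized by its means $m_i = E_q[\sigma_i]$, and the diagonal of $Y$ is taken to be zero, so that $E_q\langle \sigma, Y\sigma\rangle = \langle m, Ym\rangle$.

```latex
% Assumptions: q(\sigma) = \prod_i q_i(\sigma_i), E_q[\sigma_i] = m_i, Y_{ii} = 0.
\begin{align*}
D_{kl}\big(q \,\|\, p(\cdot \mid Y)\big)
  &= E_q[\log q(\sigma)] - E_q[\log p(\sigma \mid Y)] \\
  &= -\sum_{i=1}^n h(m_i) - \frac{\lambda}{2}\, E_q \langle \sigma, Y \sigma \rangle + \log Z \\
  &= -\sum_{i=1}^n h(m_i) - \frac{\lambda}{2}\, \langle m, Y m \rangle + \log Z \\
  &= F_{MF}(m) + \log Z \;\ge\; 0 .
\end{align*}
```

Since $\log Z$ does not depend on $m$, minimizing the relative entropy over product measures is the same as minimizing $F_{MF}$, and the non-negativity of the relative entropy gives the bound $F_{MF}(m) \ge -\log Z$.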

slide-26
SLIDE 26

Mean field variational inference

◮ Mean field free energy:

$$F_{MF}(m) \equiv -\sum_{i=1}^n h(m_i) - \lambda \langle m, Y m \rangle / 2.$$

◮ For $m_\star = \arg\min_m F_{MF}(m)$, we hope

$$m_\star m_\star^T \approx \widehat{X}_{Bayes} = E[x x^T \mid Y].$$

◮ It was shown that $m_\star m_\star^T \not\approx E[x x^T \mid Y]$ [Ghorbani, Javadi, and Montanari, 2017].

◮ The assumption that the posterior distribution can be approximately factorized into the product of its marginals is wrong!

slide-30
SLIDE 30

The TAP free energy

◮ Thouless, Anderson, and Palmer (1977) proposed the TAP free energy when they studied the Sherrington-Kirkpatrick (SK) model, whose Gibbs measure is

$$G_{\beta,\lambda}(\sigma) = \frac{1}{Z_{\beta,\lambda}} \exp\{\beta \langle \sigma, Y \sigma \rangle\}, \quad \text{where } Y_{ij} \sim N(\lambda/n, 1/n).$$

◮ When $\beta = \lambda$, the Gibbs measure of the SK model is the same as the posterior of $\mathbb{Z}_2$ synchronization.

◮ The TAP free energy (when $\beta = \lambda$) is

$$F_{TAP}(m) \equiv \underbrace{-\sum_{i=1}^n h(m_i) - \frac{\lambda}{2} \langle m, Y m \rangle}_{F_{MF}} \; \underbrace{-\, \frac{n \lambda^2}{4} \Big[ 1 - \frac{\|m\|_2^2}{n} \Big]^2}_{\text{Onsager's correction term}}.$$

slide-33
SLIDE 33

The TAP free energy

◮ The TAP free energy:

$$F_{TAP}(m) \equiv \underbrace{-\sum_{i=1}^n h(m_i) - \frac{\lambda}{2} \langle m, Y m \rangle}_{F_{MF}} \; \underbrace{-\, \frac{n \lambda^2}{4} \Big[ 1 - \frac{\|m\|_2^2}{n} \Big]^2}_{\text{Onsager's correction term}}.$$

◮ For $m_\star = \arg\min_m F_{TAP}(m)$, we hope

$$m_\star m_\star^T \approx \widehat{X}_{Bayes} = E[x x^T \mid Y].$$

◮ Our main theorem shows that this is correct.
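As a minimal sketch of how the two objectives differ in practice, the functions below evaluate $F_{MF}$, $F_{TAP}$, and the gradient of $F_{TAP}$ with the standard library only. The names `entropy`, `f_mf`, `f_tap`, and `grad_f_tap` are hypothetical, `Y` is assumed to be a symmetric list of lists, and the gradient formula is obtained by differentiating the expression above term by term (it is not stated on the slides).

```python
import math


def entropy(m):
    """h(m): entropy of a +/-1 variable with mean m (0 * log 0 taken as 0)."""
    total = 0.0
    for p in ((1.0 - m) / 2.0, (1.0 + m) / 2.0):
        if p > 0.0:
            total -= p * math.log(p)
    return total


def f_mf(m, Y, lam):
    """Mean field free energy: -sum_i h(m_i) - (lam / 2) <m, Y m>."""
    n = len(m)
    quad = sum(m[i] * Y[i][j] * m[j] for i in range(n) for j in range(n))
    return -sum(entropy(mi) for mi in m) - lam / 2.0 * quad


def f_tap(m, Y, lam):
    """TAP free energy: mean field term plus Onsager's correction term."""
    n = len(m)
    q = sum(mi * mi for mi in m) / n
    return f_mf(m, Y, lam) - n * lam ** 2 / 4.0 * (1.0 - q) ** 2


def grad_f_tap(m, Y, lam):
    """Gradient of f_tap for m in (-1, 1)^n, assuming Y is symmetric."""
    n = len(m)
    q = sum(mi * mi for mi in m) / n
    Ym = [sum(Y[i][j] * m[j] for j in range(n)) for i in range(n)]
    # -h'(m_i) = atanh(m_i); the Onsager term contributes lam^2 (1 - q) m_i.
    return [math.atanh(m[i]) - lam * Ym[i] + lam ** 2 * (1.0 - q) * m[i]
            for i in range(n)]
```

Setting the gradient to zero gives $m = \tanh(\lambda Y m - \lambda^2 [1 - \|m\|_2^2/n]\, m)$, which are the TAP equations for this model.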

slide-36
SLIDE 36

Proof of the main theorem

Theorem (Fan, M., Montanari, 2018)

Denote $C_{\lambda,n} = \{ m \in [-1, 1]^n : \nabla F_{TAP}(m) = 0,\ F_{TAP}(m) \le -\lambda^2/3 \}$. There exists $\lambda_0 > 0$ such that, for any $\lambda > \lambda_0$, we have

$$\lim_{n \to \infty} E\Big[ \sup_{m \in C_{\lambda,n}} \frac{1}{n^2} \| m m^T - \widehat{X}_{Bayes} \|_F^2 \wedge 1 \Big] = 0. \tag{1}$$

All the critical points (below a threshold) are close to the Bayesian estimator.

slide-38
SLIDE 38

Relationship with AMP

◮ Another way to construct the Bayes estimator is approximate message passing (AMP) [Donoho, Maleki, and Montanari, 2009], [Bolthausen, 2014]:

$$m^{k+1} = \tanh\big( \lambda Y m^k - \lambda^2 [1 - \|m^k\|_2^2/n]\, m^{k-1} \big).$$

◮ A fixed point of AMP is a critical point of the TAP free energy.

◮ The risk of the AMP iterates converges to the Bayes risk [Deshpande, Abbe, and Montanari, 2016], [Montanari and Venkataramanan, 2017]:

$$\lim_{k \to \infty} \lim_{n \to \infty} \frac{1}{n^2} \| m^k (m^k)^T - x x^T \|_F^2 = \lim_{n \to \infty} \mathrm{MSE}_n(\widehat{X}_{Bayes}).$$

◮ But it is not known whether AMP converges to a fixed point (this is still an open problem).
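The AMP recursion above can be sketched directly. In the code below, the function name `amp`, the number of iterations, and the small random initialization with $m^{-1} = 0$ are all assumptions of this sketch; the slides do not specify how the iteration is started.

```python
import math
import random


def amp(Y, lam, iters=50, seed=0):
    """Iterate m^{k+1} = tanh(lam * Y m^k - lam^2 (1 - |m^k|^2 / n) m^{k-1})."""
    n = len(Y)
    rng = random.Random(seed)
    m_prev = [0.0] * n  # m^{-1} = 0 (assumption)
    # Small random start inside (-1, 1)^n (assumption).
    m = [math.tanh(0.1 * rng.gauss(0.0, 1.0)) for _ in range(n)]
    for _ in range(iters):
        q = sum(mi * mi for mi in m) / n
        Ym = [sum(Y[i][j] * m[j] for j in range(n)) for i in range(n)]
        m_next = [math.tanh(lam * Ym[i] - lam ** 2 * (1.0 - q) * m_prev[i])
                  for i in range(n)]
        m_prev, m = m, m_next
    return m


if __name__ == "__main__":
    x = [1, -1, 1, 1]
    lam = 3.0
    # Noiseless rank-one input, used only to exercise the code.
    Y = [[lam / 4 * x[i] * x[j] for j in range(4)] for i in range(4)]
    print(amp(Y, lam, iters=20))
```

If the iterates do converge, so that $m^{k+1} = m^k = m^{k-1} = m$, the update reduces to $m = \tanh(\lambda Y m - \lambda^2[1 - \|m\|_2^2/n]\, m)$, which is the stationarity condition of $F_{TAP}$.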

slide-42
SLIDE 42

Related literature in spin glass theory

TAP free energy in the unbiased SK model.

◮ TAP equations: [Talagrand, 2004], [Chatterjee, 2009], [Chen, 2011], [Auffinger and Jagannath, 2016]. Posterior means/pure states satisfy the TAP equations.

◮ TAP free energy: [Chen and Panchenko, 2017]. Constrained TAP minima are exact.

Calculating the complexity: [Auffinger, Ben Arous, and Cerny, 2010], [Subag, 2016].

slide-44
SLIDE 44

Proof idea - Count the number of critical points

◮ Recall

$$F_{TAP}(m) \equiv -\sum_{i=1}^n h(m_i) - \frac{\lambda}{2} \langle m, Y m \rangle - \frac{n \lambda^2}{4} \Big[ 1 - \frac{\|m\|_2^2}{n} \Big]^2.$$

◮ Define some important statistics of $m$:

$$E(m) = F_{TAP}(m)/n, \qquad Q(m) = \|m\|_2^2/n, \qquad M(m) = \langle m, x \rangle / n.$$

◮ For any $U \subseteq \mathbb{R}^3$, define

$$\mathrm{Crit}_n(U) \equiv \#\{ m : \nabla E(m) = 0,\ (Q(m), M(m), E(m)) \in U \}. \tag{2}$$

Proposition

$$E[\mathrm{Crit}_n(U)] \le \exp\Big\{ n \sup_{(q, \varphi, e) \in U} S_\star(q, \varphi, e) + o(n) \Big\}.$$

slide-45
SLIDE 45

Proof idea - Count the number of critical points

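$$S_\star(q, \varphi, e) = \sup_{a \in \mathbb{R}} \; \inf_{(\mu, \nu, \tau, \gamma) \in \mathbb{R}^4} S(q, \varphi, a, e; \mu, \nu, \tau, \gamma),$$

where

$$S(q, \varphi, a, e; \mu, \nu, \tau, \gamma) = \frac{1}{4\beta^2} \Big[ \frac{a}{q} - \frac{\beta \lambda \varphi^2}{q} - \beta^2 (1 - q) \Big]^2 - q\mu - \varphi\nu - a\tau - \Big[ -\frac{\beta^2}{4} (1 - q^2) + \frac{a}{2} - e \Big] \gamma + \log I,$$

and

$$I = \int_{-\infty}^{\infty} \frac{1}{(2\pi \beta^2 q)^{1/2}} \exp\Big\{ -\frac{(x - \beta\lambda\varphi)^2}{2\beta^2 q} + \mu \tanh^2(x) + \nu \tanh(x) + \tau x \tanh(x) + \gamma \log[2\cosh(x)] \Big\}\, dx.$$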

slide-46
SLIDE 46

Proof idea - Count the number of critical points

◮ Key proposition: for $U \subseteq \mathbb{R}^3$,

$$E[\mathrm{Crit}_n(U)] \le \exp\Big\{ n \overbrace{\sup_{(q, \varphi, e) \in U} S_\star(q, \varphi, e)}^{T(U)} + o(n) \Big\}.$$

◮ For any $U$ such that $T(U) > 0$, there could potentially be critical points of $F_{TAP}$ in $U$.

◮ For any $U$ such that $T(U) < 0$, there are no critical points of $F_{TAP}$ in $U$, with high probability.

◮ If we admit the key proposition, it suffices to show that $T(U) < 0$ unless $U$ contains a neighborhood of the Bayes estimator.

slide-50
SLIDE 50

Proof idea - the complexity function $S_\star$

◮ $S_\star(e) = \sup_{q, \varphi} S_\star(q, \varphi, e)$.

[Figure: complexity $S_\star(e)$ (vertical axis, "Complexity") plotted against $E$ (horizontal axis), for lambda = 2.]

◮ At $e_\star$, $S_\star(e_\star) = 0$.

slide-51
SLIDE 51

Proof idea - the complexity function $S_\star$

◮ $S_\star(\varphi) = \sup_{q, e} S_\star(q, \varphi, e)$.

[Figure: complexity $S_\star(\varphi)$ (vertical axis, "Complexity") plotted against $M$ (horizontal axis), for lambda = 2.]

◮ At $\varphi_\star$, $S_\star(\varphi_\star) = 0$.

slide-52
SLIDE 52

Proof idea - the complexity function $S_\star$

◮ $S_\star(q) = \sup_{\varphi, e} S_\star(q, \varphi, e)$.

[Figure: complexity $S_\star(q)$ (vertical axis, "Complexity") plotted against $Q$ (horizontal axis), for lambda = 2.]

◮ At $q_\star$, $S_\star(q_\star) = 0$.

slide-53
SLIDE 53

Proof idea - the complexity function $S_\star$

There exists $\lambda_0$ such that, for $\lambda \ge \lambda_0$:

◮ $S_\star(q_\star, \varphi_\star, e_\star) = 0$, where $(q_\star, \varphi_\star, e_\star) \approx (Q(m_\star), M(m_\star), E(m_\star))$ for $\widehat{X}_{Bayes} \approx m_\star m_\star^T$.

◮ $S_\star(q, \varphi, e) < 0$ for any $e \le -\lambda^2/3$ and $(q, \varphi, e) \ne (q_\star, \varphi_\star, e_\star)$.

The proof of these two properties takes more than calculus: it requires bounds obtained from concentration inequalities. Combined with the key inequality

$$E[\mathrm{Crit}_n(U)] \le \exp\Big\{ n \sup_{(q, \varphi, e) \in U} S_\star(q, \varphi, e) + o(n) \Big\},$$

these properties readily give the main theorem. It now suffices to prove the key inequality.

slide-56
SLIDE 56

Calculating $\mathrm{Crit}$: the Kac-Rice formula

Lemma (Kac-Rice formula, cf. [Adler and Taylor, 2007])

Let $f : \mathbb{R}^d \to \mathbb{R}$ be a “sufficiently regular” random Morse function. Let $p_m(z)$ be the density of $\nabla f(m)$ at $z$. For any Borel measurable set $T \subseteq \mathbb{R}^d$, denote $\mathrm{Crit}(T) = \#\{ m \in T : \nabla f(m) = 0 \}$. Then

$$E[\mathrm{Crit}(T)] = E\Big[ \int_T |\det \nabla^2 f(m)| \cdot \delta(\nabla f(m)) \, dm \Big] = \int_T E\big[ |\det \nabla^2 f(m)| \,\big|\, \nabla f(m) = 0 \big]\, p_m(0) \, dm.$$

◮ $|\det \nabla^2 f(m)|$ is the correct weight function, so that each critical point is counted exactly once.
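To see why $|\det \nabla^2 f(m)|$ is the right weight, it helps to look at the one-dimensional, deterministic case. The worked computation below is only an illustration under stated assumptions (a smooth deterministic $f$ on an interval $T$ with finitely many non-degenerate critical points $m_1, \dots, m_K$ in its interior); it is not the random-field statement of the lemma.

```latex
% Assumptions: d = 1, f deterministic and smooth, f'(m_k) = 0 and f''(m_k) \neq 0
% for k = 1, ..., K, and \varepsilon small enough that the intervals are disjoint.
\begin{align*}
\int_T |f''(m)| \, \delta(f'(m)) \, dm
  &= \sum_{k=1}^K \int_{m_k - \varepsilon}^{m_k + \varepsilon} |f''(m)| \, \delta(f'(m)) \, dm \\
  &= \sum_{k=1}^K \int \delta(u) \, |du|
     \qquad \big(u = f'(m),\ |du| = |f''(m)| \, dm\big) \\
  &= K .
\end{align*}
```

The Jacobian factor $|f''(m)|$ exactly cancels the $1/|f''(m_k)|$ produced by the delta function, so each critical point contributes one unit to the count. Taking expectations over a random $f$ and conditioning on $\nabla f(m) = 0$ leads to the second form in the lemma.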

slide-57
SLIDE 57

Dealing with determinant of Hessian

◮ The conditional Hessian is distributed as (up to some scaling)

$$\big[ \nabla^2 F_{TAP}(m) \,\big|\, \nabla F_{TAP}(m) = 0 \big] \overset{d}{=} D + W + \text{low rank perturbation},$$

where $D = \mathrm{diag}(d_i)$ and $W \sim \mathrm{GOE}(n)$.

◮ The low-rank perturbation has a vanishing effect. Therefore, we just need to calculate $E[|\det(H)|]$, with $H = D + W$.

slide-59
SLIDE 59

Dealing with determinant of Hessian

$H = D + W = \text{diagonal} + \text{GOE}$.

$$\frac{1}{n} \log E[|\det(H)|] = \frac{1}{n} \log E\Big[ \prod_{i=1}^n |\lambda_i(H)| \Big] \approx \frac{1}{n} \log \Big[ \prod_{i=1}^n |\lambda_i(H)| \Big] = \frac{1}{n} \sum_{i=1}^n \log |\lambda_i(H)| = \int_{\mathbb{R}} \log|x| \cdot \mu_H(dx) \approx E\Big[ \int_{\mathbb{R}} \log|x| \cdot \mu_H(dx) \Big],$$

where $\mu_H = (1/n) \sum_{i=1}^n \delta(\lambda_i(H))$.

◮ Approximate equalities are due to concentration.

◮ The Stieltjes transform of $\mu_H$ can be approximately calculated using free probability theory.

◮ Once the Stieltjes transform of $\mu_H$ is known, the quantity $E\big[ \int_{\mathbb{R}} (\log|x|)\, \mu_H(dx) \big]$ can be computed.

slide-61
SLIDE 61

Free convolution of two distributions

Let $A \in \mathbb{R}^{n \times n}$, and $\mu_A = (1/n) \sum_{i=1}^n \delta(\lambda_i(A))$. For any $z \in \mathbb{C}_+$, the Stieltjes transform of $\mu_A$ is defined as

$$g_A(z) = \int_{\mathbb{R}} \frac{1}{x - z} \, \mu_A(dx) = \frac{1}{n} \sum_{i=1}^n \frac{1}{\lambda_i(A) - z}.$$

Lemma (Due to free probability theory)

Let $D = \mathrm{diag}(d_i)$ be a diagonal matrix, and let $H = D + W$. Then

$$E g_H(z) = \frac{1}{n} \sum_{i=1}^n \frac{1}{d_i - z - E g_H(z)} + o_n(1). \tag{3}$$
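Dropping the $o_n(1)$ term, equation (3) is a self-consistent equation for $g = E g_H(z)$ that can be solved numerically. The sketch below uses plain fixed-point iteration, which is an assumption of this example rather than the method of the talk; the function name `stieltjes_fixed_point` is hypothetical, and convergence is only checked on the example shown, not claimed in general.

```python
import math


def stieltjes_fixed_point(d, z, iters=500):
    """Iterate g <- (1/n) * sum_i 1 / (d_i - z - g) for z in the upper half-plane."""
    g = 1j  # arbitrary starting point with positive imaginary part
    for _ in range(iters):
        g = sum(1.0 / (di - z - g) for di in d) / len(d)
    return g


if __name__ == "__main__":
    # With D = 0 the equation reads g = 1 / (-z - g), i.e. g^2 + z g + 1 = 0.
    # At z = i the root with positive imaginary part is i (sqrt(5) - 1) / 2.
    g = stieltjes_fixed_point([0.0] * 4, 1j)
    print(g, 1j * (math.sqrt(5.0) - 1.0) / 2.0)
```

With $D = 0$ this recovers the Stieltjes transform of the semicircle law, which is a convenient sanity check before plugging in a non-trivial diagonal.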

slide-64
SLIDE 64

Calculate $E\big[ \int_{\mathbb{R}} \log|x| \cdot \mu_H(dx) \big]$

◮ Define

$$B(t) = E \int_{\mathbb{R}} \log(x - it) \, \mu_H(dx).$$

◮ We have

$$\Re B(0^+) = E \int_{\mathbb{R}} \log|x| \cdot \mu_H(dx), \qquad B'(t) = -i E \int_{\mathbb{R}} [1/(x - it)] \, \mu_H(dx) = -i E[g_H(it)].$$

◮ We guess a formula

$$\widetilde{B}(t) = \frac{1}{n} \sum_{i=1}^n \log\big( d_i - it - E g_H(it) \big) + \frac{1}{2} [E g_H(it)]^2.$$

Then $\widetilde{B}(t)$ satisfies all the conditions that $B(t)$ approximately satisfies, so that $\widetilde{B}(t) = B(t) + o_n(1)$.

◮ Hence

$$\frac{1}{n} \log E[|\det(H)|] = \widetilde{B}(0) + o_n(1).$$
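The guessed formula can be sanity-checked numerically in the simplest case $D = 0$, where $\mu_H$ is close to the semicircle law on $[-2, 2]$ (assuming the GOE normalization with off-diagonal variance $1/n$). The sketch below compares the real part of $\widetilde{B}(t)$, using the closed-form solution of $g = 1/(-it - g)$, with a direct midpoint-rule integral of $\log|x - it|$ against the semicircle density. The function names and the quadrature scheme are assumptions of this example.

```python
import cmath
import math


def b_tilde_semicircle(t):
    """Real part of log(-it - g) + g^2 / 2 when all d_i = 0."""
    # Root of g^2 + (it) g + 1 = 0 with positive imaginary part.
    g = 1j * (-t + math.sqrt(t * t + 4.0)) / 2.0
    return (cmath.log(-1j * t - g) + g * g / 2.0).real


def log_potential_semicircle(t, steps=20000):
    """Midpoint rule for the integral of log|x - it| under the semicircle law."""
    h = 4.0 / steps
    total = 0.0
    for k in range(steps):
        x = -2.0 + (k + 0.5) * h
        density = math.sqrt(4.0 - x * x) / (2.0 * math.pi)
        total += 0.5 * math.log(x * x + t * t) * density * h
    return total


if __name__ == "__main__":
    print(b_tilde_semicircle(0.5), log_potential_semicircle(0.5))
```

As $t \to 0^+$ the formula tends to $-1/2$, the classical value of $\int \log|x|$ under the semicircle law, which is consistent with using the $t \to 0$ limit to evaluate $\frac{1}{n} \log E[|\det(H)|]$.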

slide-69
SLIDE 69

Summary

◮ The TAP free energy is accurate for $\mathbb{Z}_2$ synchronization.

◮ It can be generalized to topic modeling, low-rank matrix estimation, compressed sensing, etc.

◮ It is interesting to study and apply variational inference beyond mean field.
