SLIDE 1

Information Recovery from Pairwise Measurements

A Shannon-Theoretic Approach

Yuxin Chen†, Changho Suh∗, Andrea Goldsmith†

Stanford University† KAIST∗

Page 1

SLIDE 2

Recovering data from correlation measurements

  • A large collection of data instances
  • In many applications, it is
    – difficult/infeasible to measure each variable directly
    – feasible to measure pairwise correlations

Page 2

SLIDE 3

Motivating application: multi-image alignment

  • Structure from motion: estimate 3D structures from 2D image sequences
  • Key step: joint alignment

– input: (noisy) estimates of relative camera poses
– goal: jointly recover all camera poses

Page 3

SLIDE 4

Motivating application: graph clustering

  • Real-world networks exhibit community structures
  • input: pairwise similarities between members
  • goal: uncover hidden clusters

Page 4


SLIDE 8

This talk: recovery from pairwise difference measurements

  • Goal: recover a collection of variables {xi}
  • Can only measure several pairwise difference xi − xj (broadly defined)
  • Examples:

— joint alignment

– xi: (angle θi, position zi)
– relative rotation/translation: (θi − θj, zi − zj)

— graph partition

– xi: membership (which partition it belongs to)
– cluster agreement: xi − xj = 1, if i, j ∈ same partition; 0, else.

— pairwise maps, parity reads, ...

Page 5
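The examples above can be made concrete with a toy simulation; `pairwise_differences` is a hypothetical helper, and the mod-M convention (difference 0 for same-cluster pairs) is a relabeling of the slide's agreement variable:

```python
def pairwise_differences(x, M, edges):
    """Toy model of noiseless pairwise difference measurements.

    x: hidden variables in {0, ..., M-1}
    edges: (i, j) pairs where a measurement is taken
    Returns {(i, j): (x[i] - x[j]) mod M} -- the "truth" before noise.
    """
    return {(i, j): (x[i] - x[j]) % M for (i, j) in edges}

# Graph partition as a special case: M = 2, x_i = cluster membership.
# (x_i - x_j) mod 2 == 0 exactly when i and j are in the same cluster,
# which is the slide's agreement variable up to relabeling.
x = [0, 0, 1, 1]                 # two clusters: {0, 1} and {2, 3}
edges = [(0, 1), (1, 2), (2, 3)]
meas = pairwise_differences(x, M=2, edges=edges)
# meas == {(0, 1): 0, (1, 2): 1, (2, 3): 0}
```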


SLIDE 11

A fundamental-limit perspective?

  • A flurry of activity in recovery algorithm design

— convex programs, combinatorial methods, spectral methods

  • What are the fundamental recovery limits?

— minimum sample complexity? how noisy can the measurements be?

  • So far mostly studied in a model-specific manner
  • Seek a more unified framework

Page 6

SLIDE 12

Problem setup: a Shannon-theoretic framework

[figure: information network on vertices x1, · · · , x7]

  • Information network
  • n vertices
  • discrete inputs xi ∈ {0, · · · , M − 1}; alphabet size M could scale with n

Page 7

SLIDE 13

[figure: measurement graph G, with measurements y12, y26, y67, y27, y13, y34, y35, y15, y24 of x1 − x2, x1 − x3, x1 − x5, · · ·]

  • Pairwise difference measurements
  • truth: xi − xj
  • measurements: yij (arbitrary alphabet)
    ∗ can be corrupted by noise, distortion, ...
  • Graphical representation: observe yij ⇐⇒ (i, j) ∈ G

SLIDE 14

[figure: pairs (x1, x2), (x6, x7), (x1, x5), (x2, x7) each passed through the channel p(yij | xi − xj), producing y12, y67, y15, y27]

  • Channel-decoding perspective
  • each measurement is modeled by an i.i.d. channel
  • transition prob. P(yij | xi − xj)
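One minimal way to simulate this channel view; the `observe` helper and its uniform-corruption noise are illustrative assumptions (the framework allows arbitrary transition probabilities P(yij | xi − xj)):

```python
import random

def observe(x, edges, M, eps, rng):
    """Pass each true difference through an i.i.d. noisy channel:
    with prob. 1 - eps output the truth, otherwise a uniform symbol."""
    y = {}
    for (i, j) in edges:
        truth = (x[i] - x[j]) % M
        y[(i, j)] = truth if rng.random() > eps else rng.randrange(M)
    return y

rng = random.Random(0)
x = [0, 1, 2, 0]
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
y = observe(x, edges, M=3, eps=0.0, rng=rng)   # noiseless channel
# With eps = 0, every measurement equals the true difference mod M.
```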

SLIDE 15

  • Goal: recover {xi} exactly (up to global offset)
  • Unified decoding framework
  • captures similarities among various applications

SLIDE 16

What factors dictate hardness of recovery?

[figure: x1, x2 linked by the channel; x1 − x2 = 1 ⇒ y12 ∼ P1, x1 − x2 = 2 ⇒ y12 ∼ P2, where Pl := P(yij | xi − xj = l)]

  • Channel distance/resolution, captured e.g. by
    – KL(Pl ‖ Pk)
    – Hellinger(Pl ‖ Pk)
    – ...

Page 8

SLIDE 17

  • Minimum channel distance/resolution:
    – KLmin := min_{l≠k} KL(Pl ‖ Pk)
    – Hellingermin := min_{l≠k} Hellinger(Pl ‖ Pk)
    – ...
  • Uncoded input
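These channel-distance quantities are easy to compute for a toy discrete channel. A sketch, assuming the normalization Hellinger(P ‖ Q) := Σi (√pi − √qi)², under which KL ≈ 2 · Hellinger for nearby distributions as the deck uses later:

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions (lists of probs)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hellinger_sq(p, q):
    """Squared Hellinger distance, normalized as sum_i (sqrt(p_i)-sqrt(q_i))^2."""
    return sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

# P_l := P(y | x_i - x_j = l) for a toy 3-symbol channel
P = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

kl_min = min(kl(P[l], P[k]) for l in range(3) for k in range(3) if l != k)
hell_min = min(hellinger_sq(P[l], P[k]) for l in range(3) for k in range(3) if l != k)
```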

SLIDE 18

What factors dictate hardness of recovery?

[figure: measurement graph G]

  • Graph connectivity
    – impossible to recover isolated vertices
    – over-sparse connectivity is fragile
    – sufficient connectivity removes fragility!

Page 9
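The connectivity requirement can be checked mechanically. A minimal sketch (BFS reachability only; it tests the necessary connectivity condition from the slide, not the full sufficiency condition):

```python
from collections import defaultdict, deque

def is_connected(n, edges):
    """Necessary condition: the measurement graph must be connected.
    An isolated or disconnected vertex can never be recovered relative
    to the rest, since no pairwise difference ties it to the others."""
    adj = defaultdict(list)
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen = {0}
    q = deque([0])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    return len(seen) == n

# Vertex 3 is isolated, so recovery is impossible:
# is_connected(4, [(0, 1), (1, 2), (0, 2)]) -> False
```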

SLIDE 21

Agenda

Page 10


SLIDE 25

Main result: Erdős–Rényi random graph

Erdős–Rényi graph G(n, pobs): each edge (i, j) is present independently w.p. pobs

[figure: sampled graphs for pobs = 1 and pobs = 0.3]

  • ML decoding works if

Hellingermin > (2 log n + 4 log M) / (pobs · n)

  • Converse: no method works if

KLmin < log n / (pobs · n)

  • Both bounds are non-asymptotic!

Page 11
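The two bounds can be evaluated numerically; a sketch, with `ml_achievable` and `converse_fails` as hypothetical helper names encoding the displayed conditions:

```python
import math

def ml_achievable(hell_min, n, M, p_obs):
    """Achievability (sufficient for ML decoding):
    Hellinger_min > (2 log n + 4 log M) / (p_obs * n)."""
    return hell_min > (2 * math.log(n) + 4 * math.log(M)) / (p_obs * n)

def converse_fails(kl_min, n, p_obs):
    """Converse: no method works if KL_min < log n / (p_obs * n)."""
    return kl_min < math.log(n) / (p_obs * n)

n, M, p_obs = 10_000, 2, 0.3
# threshold = (2 ln n + 4 ln 2) / (0.3 * 10000) ~= 0.0071
print(ml_achievable(0.05, n, M, p_obs))   # True: well above threshold
```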


SLIDE 28

Main result: Erdős–Rényi random graph

  • Recovery conditions
    – ML works if Hellingermin > (2 log n + 4 log M) / (pobs · n)
    – Impossible if Hellingermin < log n / (2 pobs · n)
  • In the hard regime where dPl/dPk ≈ 1:

KLmin ≈ 2 · Hellingermin

Page 12


SLIDE 30

Main result: Erdős–Rényi random graph

  • Fundamental recovery condition (assuming M ≲ poly(n)):

Hellingermin ≳ log n / (pobs · n) ⇐⇒ avg-degree × Hellingermin ≳ log n

  • In the hard regime where dPl/dPk ≈ 1: KLmin ≈ 2 · Hellingermin

Page 12
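The hard-regime approximation can be sanity-checked numerically, again assuming the normalization Hellinger(P ‖ Q) := Σi (√pi − √qi)²:

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hellinger_sq(p, q):
    # Normalization sum_i (sqrt(p_i) - sqrt(q_i))^2, under which
    # KL ~= 2 * Hellinger for nearby distributions (assumed convention).
    return sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

# Hard regime: dP_l / dP_k ~= 1, i.e. the two channel output
# distributions are nearly indistinguishable.
delta = 1e-3
P1 = [0.5, 0.5]
P2 = [0.5 + delta, 0.5 - delta]
ratio = kl(P1, P2) / hellinger_sq(P1, P2)
# ratio -> 2 as delta -> 0
```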

SLIDE 31

Intuition

Fundamental recovery condition (Erdős–Rényi graphs): avg-degree × Hellingermin ≳ log n

[figure: the n × n matrix [xi − xj]1≤i,j≤n]

Page 13

SLIDE 32

  • hypotheses:
    – H0: x = [0, 0, · · · , 0]
    – H1: x = [1, 0, · · · , 0]
  • H0 and H1 differ only at the highlighted region of [xi − xj] (≈ avg-degree pieces of info)

SLIDE 33

  • hypotheses:
    – H0: x = [0, 0, · · · , 0]
    – H2: x = [0, 1, · · · , 0]
  • H0 and H2 differ only at the highlighted region (≈ avg-degree pieces of info)

SLIDE 34

  • hypotheses:
    – H0: x = [0, 0, · · · , 0]
    – Hn: x = [0, 0, · · · , 1]
  • n minimally-separated hypotheses ⇒ needs at least log n bits
  • the consequence of uncoded inputs

Page 13


SLIDE 36

Minimal sample complexity

Fundamental recovery condition (Erdős–Rényi graphs): avg-degree × Hellingermin ≳ log n

  • Sample complexity: n · avg-degree

Min sample complexity ≍ (n log n) / Hellingermin

Page 14
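The order-wise formula is easy to encode, with all constants suppressed (so only ratios between settings are meaningful):

```python
import math

def min_sample_complexity(n, hell_min):
    """Order-wise minimal number of pairwise measurements from the slide:
    n * avg-degree ~ n log n / Hellinger_min (constants suppressed)."""
    return n * math.log(n) / hell_min

# Noisier channels (smaller Hellinger_min) demand proportionally more samples:
m_clean = min_sample_complexity(10_000, 0.5)
m_noisy = min_sample_complexity(10_000, 0.05)
# m_noisy / m_clean == 10
```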

SLIDE 37

How general is this limit?

Fundamental recovery condition (Erdős–Rényi graphs): avg-degree × Hellingermin ≳ log n

  • Can we go beyond Erdős–Rényi graphs?

Page 15

SLIDE 39

Main results: homogeneous graphs

[figure: random geometric graph; (generalized) ring]

  • Homogeneous graphs:
    – min-degree ≍ max-degree ≍ mincut
    – balanced cut-set distributions

Fundamental recovery condition (various homogeneous graphs): avg-degree × Hellingermin ≳ log n

Page 16

SLIDE 40

  • The recovery limits depend almost only on graph sparsity


SLIDE 42

Main results: general graphs

[figure: example graph with mincut = 5]

  • Information across the minimum cut set: ≍ mincut · Hellingermin

Page 17

SLIDE 43

  • Recovery conditions
    – ML works if mincut · Hellingermin ≳ τcut + log n + log M
    – Impossible if mincut · Hellingermin ≲ τcut + (mincut / max-degree) · log n


SLIDE 45

Cut-homogeneity exponent

  • τcut captures¹
    – the growth rate of the cut-set distribution
    – the ratio mincut / avg-degree
  • In general: τcut ≳ 1
  • For homogeneous graphs: τcut ≲ log n

¹ τcut := max_k (1/k) · log N(k · mincut), where N(K) := |{cut : cut-size ≤ K}|

Page 18
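For a tiny graph, τcut can be brute-forced directly from the (reconstructed) footnote definition; `cut_sizes` is an illustrative helper, and counting cuts as nontrivial vertex bipartitions is an assumption:

```python
import itertools
import math

def cut_sizes(n, edges):
    """Enumerate all nontrivial vertex bipartitions (S, S^c) of a tiny
    graph and return each cut's size (number of crossing edges)."""
    sizes = []
    for r in range(1, n // 2 + 1):
        for S in itertools.combinations(range(n), r):
            S = set(S)
            if r == n - r and 0 not in S:
                continue  # avoid double-counting complementary halves
            sizes.append(sum((i in S) != (j in S) for i, j in edges))
    return sizes

# Brute-force tau_cut on a 5-cycle, following the footnote
# tau_cut := max_k (1/k) * log N(k * mincut), N(K) := #{cuts of size <= K}.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
sizes = cut_sizes(n, edges)
mincut = min(sizes)
kmax = max(sizes) // mincut
tau = max(math.log(sum(s <= k * mincut for s in sizes)) / k
          for k in range(1, kmax + 1))
```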

SLIDE 46

Summary of main results

  • General graphs: mincut × Hellingermin ≳ (1 ∼ log n); gap ≲ log n
  • Homogeneous graphs: avg-deg × Hellingermin ≳ log n; gap ≍ 1
  • Erdős–Rényi graphs: avg-deg × Hellingermin ≳ log n; gap ≤ 4(1 + 2 log M / log n)

Page 19


SLIDE 48

Concrete application: stochastic block model

[figure: adjacency matrix]

  • Stochastic block model:
    – 2 clusters
    – edge densities: within-cluster p = α log n / n; across-cluster q = β log n / n (q < p)
  • Our theory: feasible if √α − √β > √2; impossible if √α − √β < 1/2
  • Fundamental limit (Abbe et al. and Mossel et al.): √α − √β > √2

Page 20
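The threshold comparison is a one-liner; `sbm_exact_recovery` is a hypothetical helper encoding the stated limit:

```python
import math

def sbm_exact_recovery(alpha, beta):
    """Exact-recovery limit for the two-cluster SBM with
    p = alpha * log(n)/n, q = beta * log(n)/n (Abbe et al., Mossel et al.):
    feasible iff sqrt(alpha) - sqrt(beta) > sqrt(2)."""
    return math.sqrt(alpha) - math.sqrt(beta) > math.sqrt(2)

print(sbm_exact_recovery(9.0, 1.0))   # sqrt(9) - sqrt(1) = 2 > sqrt(2): True
print(sbm_exact_recovery(4.0, 1.0))   # 2 - 1 = 1 < sqrt(2): False
```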

SLIDE 49

Concluding remarks

  • A unified framework to determine recovery limits
  • Interplay between IT and graph theory
  • Tighten the pre-constants?

arXiv: http://arxiv.org/abs/1504.01369

Page 21