Stay on path: PCA along graph paths. Megasthenis Asteris. PowerPoint PPT Presentation.



SLIDE 1

Stay on path:
 PCA along graph paths

Megasthenis Asteris, Anastasios Kyrillidis, Alexandros Dimakis (Electrical and Computer Engineering)
Han-Gyol Yi, Bharath Chandrasekaran (Communication Sciences and Disorders)

SLIDE 2

Sparse PCA

[Figure: n observations / datapoints y1, …, yn (p variables each) plotted in the (x1, x2) plane, with the direction of maximum variance drawn through the point cloud.]

Find a new variable (feature) that
 captures most of the variance.

SLIDE 3

Sparse PCA

[Figure: the same point cloud y1, …, yn with the direction of maximum variance.]

Find a new variable (feature) that captures most of the variance:

maximize xᵀ Σ̂ x subject to ‖x‖₂ = 1,

where Σ̂ = (1/n) ∑ᵢ₌₁ⁿ yᵢyᵢᵀ is the empirical covariance matrix.
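The empirical covariance matrix on this slide, Σ̂ = (1/n) ∑ᵢ yᵢyᵢᵀ, is straightforward to compute; a minimal sketch on synthetic data (illustrative, not from the talk):

```python
import numpy as np

# Empirical covariance of n observations y_1, ..., y_n in R^p,
# computed as (1/n) * sum_i y_i y_i^T. Data here is synthetic.
rng = np.random.default_rng(0)
n, p = 500, 4
Y = rng.standard_normal((n, p))   # one observation y_i per row

Sigma_hat = (Y.T @ Y) / n         # matrix form of (1/n) * sum_i y_i y_i^T

# Sanity check against an explicit sum of outer products.
Sigma_loop = sum(np.outer(y, y) for y in Y) / n
assert np.allclose(Sigma_hat, Sigma_loop)
```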

SLIDE 4

Sparse PCA

[Figure: the same point cloud, now with the sparse direction of maximum variance, aligned with one of the original axes.]

Find a new variable (feature) that captures most of the variance using only k of the p original variables:

maximize xᵀ Σ̂ x subject to ‖x‖₂ = 1 and ‖x‖₀ = k,

where Σ̂ = (1/n) ∑ᵢ₌₁ⁿ yᵢyᵢᵀ is the empirical covariance matrix. NP-Hard.
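To see why the ‖x‖₀ = k constraint makes the problem hard: for a fixed support S, the optimal unit vector is the leading eigenvector of the k×k submatrix Σ̂[S, S], so exact sparse PCA amounts to searching over all (p choose k) supports. A brute-force sketch (a hypothetical helper, feasible only for tiny p; not the talk's method):

```python
import numpy as np
from itertools import combinations

def sparse_pca_bruteforce(Sigma, k):
    """Exact k-sparse PCA by enumerating all supports (exponential cost)."""
    p = Sigma.shape[0]
    best_val, best_support = -np.inf, None
    for S in combinations(range(p), k):
        # For a fixed support, the best unit vector is the leading
        # eigenvector of the corresponding submatrix.
        val = np.linalg.eigvalsh(Sigma[np.ix_(S, S)])[-1]
        if val > best_val:
            best_val, best_support = val, S
    return best_val, best_support

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 6))
Sigma = (A.T @ A) / 50
val, support = sparse_pca_bruteforce(Sigma, k=2)
```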

SLIDE 5

Sparse PCA

Why sparsity?
[ Statistician ] Recovery of the “true” PC in high dimensions; # observations << # variables.
[ Engineer ] The extracted feature is more interpretable; it depends on only a few original variables.

SLIDE 6

Sparse PCA

More structure…? More interpretable; better sample complexity.

[Baraniuk et al., 2008; Kyrillidis et al., 2014; Friedman et al., 2010; …]

E.g. wavelets of natural images, block structures, periodic neuronal spikes, …

SLIDE 7

Sparse PCA

  • Structured sparse PCA [Jenatton et al., 2010]
  • Sparsity-inducing norm
  • 2D grid, rectangular nonzero patterns
SLIDE 8

[ PCA On Graph Paths ]

SLIDE 9

Problem Definition

[Figure: directed acyclic graph on the variables x1, x2, …, xp, with source vertex S and target vertex T; the active variables are the vertices on an s⤳t path.]

  • Structure captured by an underlying graph (directed, acyclic).
  • Active variables: the variables on an s⤳t path.
SLIDE 10

Problem Definition

Graph Path PCA: maximize xᵀ Σ̂ x subject to ‖x‖₂ = 1, with the support of x lying on an s⤳t path of the underlying directed acyclic graph G.
SLIDE 12

Motivation 1: Neuroscience

  • Variables: “voxels” (points in the brain)
  • Measurements: blood-oxygen levels

[Figure: brain image with a source voxel S and a target voxel T.]

SLIDE 22

Motivation 2: Finance

  • Variables: stocks
  • Measurements: prices over time
  • Goal: Find a subset that explains variance

[Figure: stocks divided into sectors, one stock per sector, e.g. BANKS (Chase, BofA, UBS) and ENERGY (Chevron, Shell), arranged between a source S and a target T.]
SLIDE 23

[ Statistical Analysis ]

SLIDE 28

Data model

[Figure: (p, k, d)-layer graph: source vertex S (vertex 1), target vertex T (vertex p), and k layers of (p−2)/k vertices in between; every vertex has in- and out-degree d.]

Samples (spike along a path):

yᵢ = √β · uᵢ · x⋆ + zᵢ,

with zᵢ i.i.d. Gaussian noise and x⋆ a signal supported on a path of G.
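A sketch of sampling from this spiked model. The path structure of x⋆ is abstracted here to an arbitrary unit-norm sparse vector (a hypothetical support, purely for illustration): with uᵢ ~ N(0, 1) and zᵢ ~ N(0, I), the samples satisfy yᵢ ~ N(0, β·x⋆x⋆ᵀ + I), and the top eigenvector of Σ̂ aligns with x⋆ for large n.

```python
import numpy as np

# Spiked-covariance samples y_i = sqrt(beta) * u_i * x_star + z_i,
# equivalently y_i ~ N(0, beta * x_star x_star^T + I).
# The support below is an arbitrary stand-in for a graph path.
rng = np.random.default_rng(2)
p, n, beta = 20, 10000, 4.0
x_star = np.zeros(p)
x_star[[0, 3, 7, 12, 19]] = 1.0 / np.sqrt(5)    # unit-norm sparse signal

u = rng.standard_normal(n)                      # scalar spikes u_i ~ N(0, 1)
Z = rng.standard_normal((n, p))                 # i.i.d. Gaussian noise z_i
Y = np.sqrt(beta) * u[:, None] * x_star + Z     # one sample per row

Sigma_hat = (Y.T @ Y) / n
v = np.linalg.eigh(Sigma_hat)[1][:, -1]         # top eigenvector of Sigma_hat
```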

SLIDE 32

Bounds

[ Theorem 1 ]
G: (p, k, d)-layer graph (known).
x⋆: signal supported on an s⤳t path of G (unknown).
Observe a sequence y₁, …, yₙ of i.i.d. samples from N(0, β · x⋆x⋆ᵀ + I); form Σ̂ and the estimate x̂.
Then n = O(log(p/k) + k · log d) samples suffice for recovery,
vs. Ω(k · log(p/k)) for sparse PCA.

[ Theorem 2 ] That many samples are also necessary.

NP-HARD

SLIDE 33

Algorithms

SLIDE 34

Algorithm 1

A Power Method-based approach: power iteration
 with a projection step.

Input: Σ̂. Initialize x₀, i ← 0.
Repeat:
  wᵢ ← Σ̂ xᵢ
  xᵢ₊₁ ← projection of wᵢ onto the feasible set
Until done; output x̂ ← xᵢ₊₁.
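The iteration can be sketched as follows. This is a generic projected power method, not the authors' exact implementation; `truncate_to_k` is a hypothetical stand-in projection (keep the k largest-magnitude entries), whereas the talk's projection maps onto unit vectors supported on s⤳t paths.

```python
import numpy as np

def projected_power_method(Sigma, project, x0, iters=100):
    """Power iteration w_i <- Sigma x_i, followed by a projection step."""
    x = x0
    for _ in range(iters):
        w = Sigma @ x          # power step
        x = project(w)         # project back onto the feasible set
    return x

def truncate_to_k(w, k=3):
    """Stand-in projection: keep the k largest-magnitude entries, renormalize."""
    x = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    x[idx] = w[idx]
    return x / np.linalg.norm(x)

rng = np.random.default_rng(3)
A = rng.standard_normal((200, 10))
Sigma = (A.T @ A) / 200
x_hat = projected_power_method(Sigma, truncate_to_k, x0=np.ones(10) / np.sqrt(10))
```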

SLIDE 38

[ Projection Step ]

Project a p-dimensional w onto the feasible set X(G) (unit vectors supported on s⤳t paths):

x̂ = arg min_{x ∈ X(G)} ‖x − w‖₂

Due to the constraints and Cauchy-Schwarz, and because G is acyclic, this is a longest (weighted) path
 problem on G, with
 special weights!
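Why a longest-path problem: for a candidate support S, Cauchy-Schwarz gives max ⟨x, w⟩ over unit vectors supported on S equal to sqrt(∑_{v∈S} wᵥ²), so minimizing ‖x − w‖₂ amounts to finding the s⤳t path maximizing the node weights wᵥ². On a DAG this is solvable by dynamic programming; a minimal sketch on a hypothetical toy graph (names `succ`, `weight` are illustrative):

```python
import functools

def best_path(succ, weight, s, t):
    """Max-weight s->t path in a node-weighted DAG (weight[v] plays the role of w_v^2)."""
    @functools.lru_cache(maxsize=None)
    def dp(v):
        # Best total weight of a v ~> t path, plus the path itself; None if t unreachable.
        if v == t:
            return (weight[t], (t,))
        best = None
        for u in succ.get(v, []):
            sub = dp(u)
            if sub is not None:
                cand = (weight[v] + sub[0], (v,) + sub[1])
                if best is None or cand[0] > best[0]:
                    best = cand
        return best

    return dp(s)

succ = {"s": ["a", "b"], "a": ["t"], "b": ["t"]}    # toy DAG
weight = {"s": 0.0, "a": 4.0, "b": 1.0, "t": 0.0}   # node weights (w_v^2 values)
score, path = best_path(succ, weight, "s", "t")     # picks s -> a -> t
```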

SLIDE 39

[ Experiments ]

SLIDE 40

Synthetic

Data generated according to the (p, k, d)-layer graph model (p = 1000, k = 50, d = 10; 100 MC iterations).

[Plot: error ‖x̂x̂ᵀ − xxᵀ‖_F vs. number of samples n (1000 to 5000), comparing Trunc. Power M., Span. k-sparse, Graph Power M., and Low-D Sampling.]

SLIDE 41

Neuroscience

  • Resting-state fMRI dataset.*
  • 111 regions of interest (ROIs) (variables), extracted based on the Harvard-Oxford Atlas [Desikan et al., 2006].
  • Graph extracted based on Euclidean distances between the centers of mass of the ROIs.
  • Identified core neural components of the brain’s memory network.

*[Human Connectome Project, WU-Minn Consortium]
SLIDE 42

Summary

  • New problem: sparse PCA with support restricted to paths of DAGs.
  • Statistical analysis
    • Introduced a simple graph model.
    • Side information (the underlying graph) reduces statistical complexity.
  • Approximation algorithms
    • Projection step → longest path on a weighted graph.

[ Future ]

  • Other combinatorial structures?
  • Algorithm guarantees
  • Neuroscience applications