Hidden Relations in Data: A Signal Processing on Graphs Approach

SLIDE 1

Carnegie Mellon

Hidden Relations in Data – A Signal Processing on Graphs Approach

José M. F. Moura Phillip L. and Marsha Dowd University Professor

moura@cmu.edu, www.ece.cmu.edu/~moura

Acknowledgements: AFOSR grant FA95501210087 Work with: Kar, Sandryhaila, Deri, Mei

SLIDE 2

  • By 2020, all digital data created, replicated, and consumed in a year (IDC, Dec 2012): 44 ZB = 44,000 EB = 44,000,000 PB = 44,000,000,000 TB = 44,000,000,000,000 GB
  • Big Data: Variety, Volume, Velocity, Veracity, Variability, Value, Visualization
  • Unstructured, Distributed, …

≈ 170 M US LoC


SLIDE 3

Data: Traditional Signals

  • Time signals
  • Images, video

[Figure labels: Speech; Radar signal; Time series; Nice. Credits: Mendrelic, Time Lapse, http://vimeo.com/18554749; Ku-band SAR image, Sandia National Laboratories; Forbes, 03/05/2013]

SLIDE 4

Data: Variety (Social, Web, Companies, …)

[Figure panels: Wireless service providers; Web: hyperlinked blogs (Adamic, Glance); Social networks; LinkedIn contacts; Sensor networks; Friendship networks; Internet. Credit: http://vidi.cs.ucdavis.edu/projects/AggressionNetworks/]

SLIDE 5

Data: The Old and the New

  • Time series
  • Company data

SLIDE 6

Analytics of Data Science: DSP on Graphs

  • (Linear) DSP for social, biological, and physical graph data: graph signal model, filters, filtering and convolution, impulse response, z- and Fourier transforms, spectrum, frequency response, …

Sandryhaila & Moura, “DSP on graphs,” IEEE Tr-SP, 2013

SLIDE 7

Data Science: Graph Supports Associations

  • Graphical models: Markov random fields
  • Machine Learning approaches (Jordan, Willsky, …)
  • Data transforms:
  • Diffusion wavelets (Coifman, 2004), regression analysis, wavelets on irregular sensor networks (Baraniuk, 2005), filterbanks (Vandergheynst, 2011), separable wavelet filterbanks (Ortega, 2012)
  • Graph Laplacian (assumed undirected, non-negative weights) (Vandergheynst, Barbarossa, Ortega, …)

  • Algebraic Signal Processing (ASP):
  • Pueschel and Moura (SIAM 2003, T-SP 2008, May, August)

SLIDE 8

Data: The Old and the New

  • Time series
  • Company data

SLIDE 9

DSP: Time Signals

  • Time signal:

SLIDE 10

DSP: Time Signals – Shift

  • Time signal:
  • Shift operator: $z^{-1}$
  • Shift matrix:
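
The shift matrix itself did not survive extraction. As a minimal sketch, assuming the periodic (cyclic) time model in which the last sample wraps around to the first, it can be built and applied as follows. The name `cyclic_shift_matrix` and the sample signal are illustrative, not taken from the slides.

```python
import numpy as np

def cyclic_shift_matrix(n: int) -> np.ndarray:
    """Return the n x n matrix C with (C s)[k] = s[(k - 1) mod n]."""
    # Rolling the identity's rows down by one puts ones on the subdiagonal
    # and a single one in the top-right corner (the wrap-around).
    return np.roll(np.eye(n), 1, axis=0)

s = np.array([1.0, 2.0, 3.0, 4.0])   # illustrative time signal
C = cyclic_shift_matrix(4)
shifted = C @ s                       # delays the signal by one sample
```

Applying `C` four times returns the original signal, which is the periodicity assumption made explicit.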

SLIDE 11

DSP: Time Signals – Graph

  • Graph: [figure: graph on nodes 0, 1, 2, …, n-1]
  • Shift matrix:

SLIDE 12

DSPG: Graph Signals

  • Time signals:
  • Cosine signal, k=0, …, 5
  • Average temperature in US cities
  • Website topics in hyperlinked blogs
  • Average # tweets

Graph signal, shift, filters, filtering/convolution, impulse response, A-, Fourier-transforms, spectrum, frequency response

SLIDE 13

DSPG: Graph Shift

  • Shift: Adjacency matrix

  • Graph shift: local operation, replace signal value at a node by weighted linear combination of values at neighbors: $\tilde{s}_n = \sum_{m \in \mathcal{N}_n} A_{n,m}\, s_m$
  • 1st order interpolation, weighted averaging, regression on graphs
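
The graph shift on this slide amounts to a matrix-vector product with the adjacency matrix. The sketch below uses a hypothetical 4-node weighted graph. The weights, the signal values, and the convention that `A[i, j]` weights the edge from node j into node i are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical weighted directed graph: A[i, j] is the weight of the
# edge from node j into node i (assumed convention).
A = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7],
    [0.0, 0.0, 1.0, 0.0],
])
s = np.array([10.0, 20.0, 30.0, 40.0])   # illustrative graph signal

def graph_shift(A: np.ndarray, s: np.ndarray) -> np.ndarray:
    """Replace each node's value by the weighted sum of its in-neighbors' values."""
    return A @ s

shifted = graph_shift(A, s)
# Node 0 becomes 0.5 * 20 + 0.5 * 30 = 25; node 2 becomes 0.3 * 20 + 0.7 * 40 = 34.
```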

SLIDE 14

DSPG: Graph Filtering

  • Graph signal:
  • Graph Filtering:

SLIDE 15

DSPG: Graph Filtering

  • Graph signal:
  • Graph Filtering:
  • Shift Invariance: H and A commute
  • Graph filter: polynomial in shift A
  • H and A have same eigenvectors
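
A small numerical check of the statement that a filter built as a polynomial in the shift commutes with the shift. The graph, the taps, and the helper name `graph_filter` are illustrative assumptions, not the slide's own example.

```python
import numpy as np

def graph_filter(A: np.ndarray, taps) -> np.ndarray:
    """Build H = taps[0] * I + taps[1] * A + taps[2] * A^2 + ..."""
    n = A.shape[0]
    H = np.zeros((n, n))
    power = np.eye(n)          # holds A^k, starting with A^0 = I
    for h in taps:
        H += h * power
        power = power @ A
    return H

# Hypothetical directed graph and filter taps.
A = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7],
    [0.0, 0.0, 1.0, 0.0],
])
taps = [1.0, 0.5, 0.25]
H = graph_filter(A, taps)

s = np.array([10.0, 20.0, 30.0, 40.0])
filtered = H @ s                          # graph filtering is a matrix-vector product
commutes = np.allclose(H @ A, A @ H)      # shift invariance
```

Filtering a shifted signal and shifting a filtered signal therefore give the same result for any polynomial filter.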

SLIDE 16

DSPG: Fourier Transform

  • Discrete Fourier Transform (DFT):
  • Fourier Tr.:

SLIDE 17

DSPG: Fourier Transform

  • Discrete Fourier Transform (DFT):
  • Fourier Tr.:

SLIDE 18

DSPG: Graph Shift & FT

  • Diagonalization of the shift:
  • DFT and the shift:
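
The relation between the DFT and the shift can be checked numerically. This sketch assumes the unitary DFT normalization and the cyclic shift matrix of the periodic time model. The formulas on the slide were lost in extraction, so the conventions here are assumptions.

```python
import numpy as np

n = 8
k = np.arange(n)

# Unitary DFT matrix: F[k, m] = exp(-2*pi*i*k*m/n) / sqrt(n).
F = np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)

# Cyclic shift: (C s)[m] = s[(m - 1) mod n].
C = np.roll(np.eye(n), 1, axis=0)

# Conjugating the shift by the DFT should give a diagonal matrix
# whose entries are the n-th roots of unity exp(-2*pi*i*k/n).
Lam = F @ C @ F.conj().T
eigenvalues = np.exp(-2j * np.pi * k / n)
is_diagonal = np.allclose(Lam, np.diag(eigenvalues))
```

In other words, the DFT basis vectors are eigenvectors of the time shift, which is the property the graph Fourier transform generalizes.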


SLIDE 19

DSPG: Graph Fourier Tr.


  • For simplicity: A is diagonalizable
  • Graph Fourier Tr.:
  • Inverse Graph Fourier Tr.:
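
A minimal sketch of the graph Fourier transform pair, assuming a diagonalizable adjacency matrix A = V Λ V⁻¹ and the convention from Sandryhaila & Moura (2013) that the transform is V⁻¹ s and the inverse is V ŝ. The 4-node undirected graph and the function names are illustrative.

```python
import numpy as np

# Hypothetical undirected graph (a triangle with one pendant node);
# symmetry guarantees that A is diagonalizable.
A = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Eigendecomposition A = V diag(eigvals) V^{-1}.
eigvals, V = np.linalg.eig(A)

def gft(V: np.ndarray, s: np.ndarray) -> np.ndarray:
    """Graph Fourier transform: solve V s_hat = s, i.e. s_hat = V^{-1} s."""
    return np.linalg.solve(V, s)

def igft(V: np.ndarray, s_hat: np.ndarray) -> np.ndarray:
    """Inverse graph Fourier transform: s = V s_hat."""
    return V @ s_hat

s = np.array([1.0, -2.0, 0.5, 3.0])
s_hat = gft(V, s)
recovered = igft(V, s_hat)
```

Using `np.linalg.solve` avoids forming V⁻¹ explicitly, which is usually the better-conditioned choice.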

SLIDE 20

Frequency

  • Time frequencies:

SLIDE 21

DSPG: Frequency Response


  • Graph filter
  • Graph filter frequency response
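
As a sketch of the frequency response, assume the filter is a polynomial in A with taps h_0, h_1, …; its response at graph frequency λ is then the same polynomial evaluated at λ. The graph and taps below are illustrative, and a symmetric A is assumed so that the eigenvector matrix is orthogonal.

```python
import numpy as np

# Hypothetical undirected graph.
A = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
eigvals, V = np.linalg.eigh(A)            # A = V diag(eigvals) V^T

taps = [1.0, -0.5, 0.25]                  # h_0, h_1, h_2 (illustrative)
H = sum(h * np.linalg.matrix_power(A, k) for k, h in enumerate(taps))

# Frequency response: h(lambda) = h_0 + h_1 * lambda + h_2 * lambda^2.
# np.polyval expects the highest-degree coefficient first.
response = np.polyval(taps[::-1], eigvals)

# In the graph Fourier basis the filter is diagonal with these entries.
H_spectral = V.T @ H @ V
```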

SLIDE 22

DSPG: Convolution Thm

  • Filtering/convolution theorem
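
The theorem's formula is missing from the extracted slide. A hedged numerical illustration, assuming it states that filtering in the vertex domain equals pointwise multiplication by the frequency response in the graph Fourier domain, is given below. The graph, taps, and signal are made up for the example.

```python
import numpy as np

# Hypothetical undirected graph, so the Fourier basis V is orthogonal.
A = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
eigvals, V = np.linalg.eigh(A)

taps = [0.5, 0.3, 0.2]                               # illustrative filter taps
H = sum(h * np.linalg.matrix_power(A, k) for k, h in enumerate(taps))
response = np.polyval(taps[::-1], eigvals)           # h(lambda) per frequency

s = np.array([1.0, -2.0, 0.5, 3.0])
s_hat = V.T @ s                                      # graph Fourier transform of s

# Vertex-domain filtering followed by the transform ...
lhs = V.T @ (H @ s)
# ... equals pointwise multiplication in the frequency domain.
rhs = response * s_hat
```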

SLIDE 23

DSPG: Graph Low and High Pass Signals

  • Ordering frequencies: Total variation
  • For s a graph frequency component
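
The slide's formula was lost in extraction. The sketch below assumes the total variation definition from Sandryhaila & Moura's frequency-analysis paper, TV(s) = ||s − A s / |λ_max| ||₁, and uses a 5-node undirected path graph as a made-up example. For an eigenvector with eigenvalue λ and unit ℓ1 norm this reduces to |1 − λ/|λ_max||.

```python
import numpy as np

# Hypothetical undirected path graph on 5 nodes.
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

eigvals, V = np.linalg.eigh(A)            # eigenvalues in ascending order
lam_max = np.max(np.abs(eigvals))

def total_variation(A: np.ndarray, s: np.ndarray, lam_max: float) -> float:
    """TV(s) = || s - A s / |lambda_max| ||_1 (assumed definition)."""
    return float(np.sum(np.abs(s - A @ s / lam_max)))

# Normalize each eigenvector to unit l1 norm before comparing variations.
V_unit = V / np.abs(V).sum(axis=0)
tv = np.array([total_variation(A, V_unit[:, i], lam_max) for i in range(n)])
```

Under this definition the eigenvalue closest to |λ_max| gives the smallest variation (the lowest frequency), and the most negative eigenvalue gives the largest.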

SLIDE 24

DSPG: Graph Low and High Pass Signals

SLIDE 25

DSPG: Graph Low and High Pass Signals

SLIDE 26

DSPG: Political Blogs

  • Data: 1224 conservative & liberal political blogs
  • Graph: 1224 nodes (blogs), edges (hyperlinks)
  • Graph signal: blue = 1, red = −1

Adamic, Glance

Sandryhaila, Moura, DSP on Graphs, IEEE Tr-SP, 2013

SLIDE 27

DSPG: Blogosphere Low- vs High Pass

  • 1224 political blogs, hyperlinked: conservative & liberal
  • Adjacency matrix given by hyperlinks
  • Graph signal:
  • Fourier transform:
  • Frequency representation:

Adamic, Glance

SLIDE 28

DSPG: Blogosphere Low- vs High Pass

  • 1224 political blogs, hyperlinked: conservative & liberal
  • Adjacency matrix given by hyperlinks
  • Graph signal:
  • Frequency representation:

Adamic, Glance


SLIDE 29

DSPG: Classification–Political Blogosphere

  • 1224 political blogs, hyperlinked: conservative & liberal
  • Adjacency matrix given by hyperlinks
  • Semisupervised Classifier (filter+threshold):
  • Filter design:
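
The classifier and filter-design formulas are not recoverable from the slide. The following is one plausible reading of "filter + threshold", not the authors' exact design: fit the taps of a polynomial graph filter by least squares so that the filtered known labels reproduce those labels at the known nodes, then threshold the filter output by sign. The toy two-clique graph, the helper names, and the filter order are all assumptions.

```python
import numpy as np

def two_clique_graph(m: int) -> np.ndarray:
    """Toy data: two m-node cliques joined by a single bridge edge."""
    block = np.ones((m, m)) - np.eye(m)
    A = np.zeros((2 * m, 2 * m))
    A[:m, :m] = block
    A[m:, m:] = block
    A[m - 1, m] = A[m, m - 1] = 1.0
    return A

def fit_filter_taps(A, s_known, known_idx, order):
    """Least-squares taps h so that sum_k h[k] A^k s_known matches the known labels."""
    cols = [s_known]
    for _ in range(order):
        cols.append(A @ cols[-1])          # next column is A^k s_known
    B = np.column_stack(cols)
    taps, *_ = np.linalg.lstsq(B[known_idx], s_known[known_idx], rcond=None)
    return taps, B

A = two_clique_graph(4)
truth = np.array([1, 1, 1, 1, -1, -1, -1, -1])   # hidden labels, one class per clique
known_idx = np.array([0, 1, 6, 7])               # nodes whose labels are observed

s_known = np.zeros(8)
s_known[known_idx] = truth[known_idx]            # unknown nodes start at 0

taps, B = fit_filter_taps(A, s_known, known_idx, order=1)
predicted = np.sign(B @ taps).astype(int)        # threshold the filter output
```

On this toy graph a first-order filter is enough, because each unlabeled node is directly connected to labeled nodes of its own class.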

Adamic, Glance


SLIDE 30

DSPG: Classification–Political Blogosphere

  • Political blogs:

Adamic, Glance

[Figure: axis ticks 1 to 40; series: Most connected, Random]

  • Classifier P=10:

AGF: GlobalSIP, 2013: Chen, Sandryhaila, Moura, Kovacevic
DF: Diffusion Functions, 2008, J. Mach. Learn. Res., Szlam, Coifman, Maggioni

SLIDE 31

DSPG: Service Provider–Predict Customer Behavior

  • 10 months log:
  • 3.7 Million customers: 3.6 M non-churners, 100K churners
  • Adjacency matrix:

Learn from few churners in month A who will churn in month A+1

SLIDE 32

DSPG: Service Provider–Predict Customer Behavior

  • Classifier

Deri & Moura, ICASSP, May 2014

SLIDE 33

DSPG: Classification

Globalsip, 2013: Sandryhaila, Moura

  • Graph:
  • Regularization:
  • Classification by regularization
  • News articles dataset: 18,000 news articles, 20 topics (Belkin, Matveeva, Niyogi, 2004)
  • Graph: randomly select 500 articles from each class; each article is a vector of the 6000 most common keywords; use cosine distance between keyword vectors:

SLIDE 34

DSPG: News Article Dataset–Classification

  • News articles dataset: 18,000 news articles, 20 topics (Belkin, Matveeva, Niyogi, 2004)
  • Graph: randomly select 500 articles from each class; each article is a vector of the 6000 most common keywords; use cosine distance between keyword vectors:

Globalsip, 2013: Sandryhaila, Moura

SLIDE 35

DSPG: NIST Digits Database Classification

  • NIST database: 70,000 grayscale images of handwritten digits from 0 to 9.

Randomly select 3000 images of each digit and construct a testing dataset of 30,000 images. The representation graph for this dataset is constructed by viewing each image as a point in a 28² = 784-dimensional vector space, computing Euclidean distances between all images, and connecting each image with its six nearest neighbors.
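
The graph construction described above can be sketched as a k-nearest-neighbor graph. The example uses random vectors in place of the digit images, and it assumes unweighted directed edges from each point to its neighbors. The slide does not state the edge weights, and the function name `knn_adjacency` is illustrative.

```python
import numpy as np

def knn_adjacency(X: np.ndarray, k: int) -> np.ndarray:
    """A[i, j] = 1 if point j is among the k nearest neighbors of point i."""
    n = X.shape[0]
    # Squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x.y
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)              # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]      # indices of the k closest points
    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    A[rows, nbrs.ravel()] = 1.0
    return A

# Stand-in data: 50 random "images" of 28 * 28 = 784 pixels each.
rng = np.random.default_rng(0)
X = rng.random((50, 28 * 28))
A = knn_adjacency(X, k=6)
```

For the full 30,000-image set the dense distance matrix would be large, so a tree-based or approximate neighbor search would likely be preferable in practice.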

SLIDE 36

Big Data

  • Product graphs:
  • Kronecker graphs:
  • Cartesian products:
  • Strong product:
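
The three product graphs listed above have standard adjacency-matrix constructions in terms of the Kronecker product. The formulas on the slide were lost, so the sketch below uses the usual definitions, with two tiny 2-node factor graphs as made-up inputs.

```python
import numpy as np

def kronecker_product(A1: np.ndarray, A2: np.ndarray) -> np.ndarray:
    """Kronecker product graph: A1 (x) A2."""
    return np.kron(A1, A2)

def cartesian_product(A1: np.ndarray, A2: np.ndarray) -> np.ndarray:
    """Cartesian product graph: A1 (x) I + I (x) A2."""
    I1 = np.eye(A1.shape[0])
    I2 = np.eye(A2.shape[0])
    return np.kron(A1, I2) + np.kron(I1, A2)

def strong_product(A1: np.ndarray, A2: np.ndarray) -> np.ndarray:
    """Strong product graph: the sum of the Kronecker and Cartesian products."""
    return kronecker_product(A1, A2) + cartesian_product(A1, A2)

# Two single-edge graphs as illustrative factors.
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])

A_kron = kronecker_product(A1, A2)    # two disjoint edges
A_cart = cartesian_product(A1, A2)    # a 4-cycle
A_strong = strong_product(A1, A2)     # the complete graph on 4 nodes
```

The appeal for large data is that a graph with n1 · n2 nodes is described by two much smaller factors.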

SLIDE 37

Big Data

SLIDE 38

Filtering

  • Filtering: Cartesian graph
  • Filtering: Kronecker graphs
  • Filtering: Strong graphs

SLIDE 39

Big Data

  • Parallel constructs:
  • Vectorizable constructs:

SLIDE 40

Big Data: Fourier Transform

  • Cartesian graphs:
  • Kronecker graphs and Strong graphs:
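
A hedged check of why product graphs help with the Fourier transform: under the standard product definitions, all three products share the eigenvector matrix V1 ⊗ V2, so the large transform factors into two small ones. The factor graphs are illustrative, and symmetric factors are assumed so that `eigh` applies.

```python
import numpy as np

# Illustrative factors: a 3-node path and a single edge.
A1 = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])
I1, I2 = np.eye(3), np.eye(2)

lam1, V1 = np.linalg.eigh(A1)
lam2, V2 = np.linalg.eigh(A2)

# Shared Fourier basis: column i * 2 + j is kron(V1[:, i], V2[:, j]).
V = np.kron(V1, V2)

# Eigenvalues of each product, in the same column order as V.
lam_cart = np.add.outer(lam1, lam2).ravel()        # lam1_i + lam2_j
lam_kron = np.multiply.outer(lam1, lam2).ravel()   # lam1_i * lam2_j
lam_strong = lam_kron + lam_cart

A_kron = np.kron(A1, A2)
A_cart = np.kron(A1, I2) + np.kron(I1, A2)
A_strong = A_kron + A_cart

# Transform of a product-graph signal using the factored basis.
s = np.arange(6, dtype=float)
s_hat = V.T @ s
```

Because V is a Kronecker product, `V.T @ s` can also be computed by reshaping `s` to a 3 × 2 array and applying `V1.T` and `V2.T` along each axis, which is where the parallel and vectorizable structure comes from.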

SLIDE 41

Conclusion

  • DSP on Graphs: complex relations among data captured by a graph
  • Shift: adjacency matrix
  • Filters: polynomials in the adjacency matrix
  • Graph Fourier transform: eigendecomposition of adjacency matrix
  • Range of applications: traditional DSP now with networked data

[1] Sandryhaila & Moura, “DSP on graphs,” IEEE Trans.-SP, vol. 61, Apr 2013
[2] Sandryhaila & Moura, “DSP on graphs: Freq. Analysis,” T-SP, May 2014
[3] Deri & Moura, ICASSP, 2014
[4] Sandryhaila & Moura, “Big Data Analysis with SP on Graphs,” IEEE SP-Magazine, in press, 2014

SLIDE 42

THANKS